Opened 5 months ago

Closed 5 months ago

Last modified 3 months ago

#30012 closed defect (implemented)

When stem receives a signal, log useful information

Reported by: teor Owned by: teor
Priority: Medium Milestone:
Component: Core Tor/Stem Version:
Severity: Normal Keywords: tor-needs
Cc: Actual Points:
Parent ID: #29437 Points:
Reviewer: Sponsor: Sponsor31-can

Description

In #30011, when stem hangs in Tor's CI, we kill it.

But that doesn't give us much information. To fix #29437, we need stem to log a backtrace when it receives a signal. (Or switch into debug mode, or something.)

You can pick any signal you like: maybe SIGUSR2 for debug logs, and SIGABRT for a stacktrace? (If you want stem to be like tor.)

Child Tickets

Change History (10)

comment:1 Changed 5 months ago by teor

Keywords: tor-ci-fail-sometimes added

comment:2 Changed 5 months ago by teor

Keywords: tor-ci-fail-sometimes removed

This ticket isn't urgent any more, because we think we have a fix in #30021.

comment:3 Changed 5 months ago by teor

Parent ID: #29437

comment:4 Changed 5 months ago by atagar

Hi teor. Quick search suggested we might be able to add a signal handler...

https://stackoverflow.com/questions/132058/showing-the-stack-trace-from-a-running-python-application

I won't be able to get to this any time soon, but certainly worthwhile to experiment with at some point.

comment:5 Changed 5 months ago by teor

Keywords: tor-needs added
Parent ID: #29437

We're still seeing stem timeouts after the fixes in #29437, so we need a stacktrace to debug this issue.

comment:6 Changed 5 months ago by teor

Owner: changed from atagar to teor
Status: newassigned

See my pull request:
https://github.com/torproject/stem/pull/11

When I run:

$ make test-stem
...
$ killall -USR1 <run_tests.py pid>

I see:

  control.controller...                               Signal 30 received.
Traceback:
  File "/Users/base/stem/run_tests.py", line 398, in <module>
    main()
  File "/Users/base/stem/run_tests.py", line 239, in main
    run_result = _run_test(args, test_class, output_filters)
  File "/Users/base/stem/run_tests.py", line 379, in _run_test
    run_result = stem.util.test_tools.TimedTestRunner(test_results, verbosity = 2).run(suite)
  File "/Users/base/stem/stem/util/test_tools.py", line 308, in run
    return super(TimedTestRunner, self).run(test)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/runner.py", line 176, in run
    test(result)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/case.py", line 663, in __call__
    return self.run(*args, **kwds)
  File "/Users/base/stem/stem/util/test_tools.py", line 258, in run
    result = super(type(self), self).run(result)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/case.py", line 615, in run
    testMethod()
  File "/Users/base/stem/test/require.py", line 58, in wrapped
    return func(self, *args, **kwargs)
  File "/Users/base/stem/test/require.py", line 58, in wrapped
    return func(self, *args, **kwargs)
  File "/Users/base/stem/test/integ/control/controller.py", line 538, in test_with_invalid_ephemeral_hidden_service_port
    self.assertRaisesWith(stem.ProtocolError, exc_msg, controller.create_ephemeral_hidden_service, ports)
  File "/Users/base/stem/stem/util/test_tools.py", line 288, in assertRaisesWith
    return self.assertRaisesRegexp(exc_type, '^%s$' % re.escape(exc_msg), func, *args, **kwargs)
  File "/Users/base/stem/stem/util/test_tools.py", line 298, in assertRaisesRegexp
    return super(original_type, self).assertRaisesRegexp(exc_type, exc_msg, func, *args, **kwargs)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/case.py", line 1338, in deprecated_func
    return original_func(*args, **kwargs)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/case.py", line 1285, in assertRaisesRegex
    return context.handle('assertRaisesRegex', args, kwargs)
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/unittest/case.py", line 178, in handle
    callable_obj(*args, **kwargs)
  File "/Users/base/stem/stem/control.py", line 3099, in create_ephemeral_hidden_service
    response = self.msg(request)
  File "/Users/base/stem/stem/control.py", line 654, in msg
    response = self._reply_queue.get()
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/queue.py", line 170, in get
    self.not_empty.wait()
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()

The control.controller test doesn't hang for me, but the stacktrace looks promising.

comment:7 Changed 5 months ago by teor

Status: assignedneeds_review

comment:8 Changed 5 months ago by teor

I tried testing this in #30117.

We'll need to propagate the signal to the child processes to get a good backtrace. And we might need a backtrace in non-test processes as well.

comment:9 Changed 5 months ago by atagar

Resolution: implemented
Status: needs_reviewclosed

This is a great patch, thanks teor! Moved this a tad but otherwise looks perfect. Merged.

comment:10 Changed 3 months ago by teor

Sponsor: Sponsor31-can

Setting as sponsor 31 can, because we use these jobs to make sure our refactoring works.

Note: See TracTickets for help on using tickets.