Opened 3 months ago

Last modified 5 weeks ago

#30257 new defect

Propagate USR1 and ABRT signals from stem tests through to tor

Reported by: teor
Owned by: atagar
Priority: Medium
Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Stem
Version: Tor: 0.2.4.8-alpha
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID: #29437
Points:
Reviewer:
Sponsor: Sponsor31-can

Description

In #30234, we got the tor logs, but the USR1 and ABRT signals sent by timelimit to test_stem.py aren't being propagated to tor:

Apr 22 03:32:30.000 [notice] Monitored process 20402 is dead.
Apr 22 03:32:30.000 [notice] Owning controller process has vanished -- exiting now.
Apr 22 03:32:30.000 [notice] Catching signal TERM, exiting cleanly.

https://travis-ci.org/teor2345/tor/jobs/522893523#L4944

We need to work out how to get the signals from this stem test process down to the tor it launches:

================================================================================
Signal SIGABRT received by thread MainThread in process 20402
--------------------------------------------------------------------------------
Event notifier thread stacktrace
  File "/usr/lib/python3.4/threading.py", line 888, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/home/travis/build/teor2345/tor/stem/stem/control.py", line 984, in _event_loop
    self._event_notice.wait(0.05)
  File "/usr/lib/python3.4/threading.py", line 553, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/lib/python3.4/threading.py", line 294, in wait
    gotit = waiter.acquire(True, timeout)
--------------------------------------------------------------------------------
MainThread thread stacktrace
  File "/home/travis/build/teor2345/tor/stem/run_tests.py", line 451, in <module>
    main()
  File "/home/travis/build/teor2345/tor/stem/run_tests.py", line 287, in main
    integ_runner.start(target, args.attribute_targets, args.tor_path)
  File "/home/travis/build/teor2345/tor/stem/test/runner.py", line 262, in start
    self._owner_controller = self.get_tor_controller(True)
  File "/home/travis/build/teor2345/tor/stem/test/runner.py", line 482, in get_tor_controller
    controller.authenticate(password = CONTROL_PASSWORD, chroot_path = self.get_chroot())
  File "/home/travis/build/teor2345/tor/stem/stem/control.py", line 1103, in authenticate
    stem.connection.authenticate(self, *args, **kwargs)
  File "/home/travis/build/teor2345/tor/stem/stem/connection.py", line 530, in authenticate
    protocolinfo_response = get_protocolinfo(controller)
  File "/home/travis/build/teor2345/tor/stem/stem/connection.py", line 1007, in get_protocolinfo
    protocolinfo_response = _msg(controller, 'PROTOCOLINFO 1')
  File "/home/travis/build/teor2345/tor/stem/stem/connection.py", line 1036, in _msg
    return controller.msg(message)
  File "/home/travis/build/teor2345/tor/stem/stem/control.py", line 654, in msg
    response = self._reply_queue.get()
  File "/usr/lib/python3.4/queue.py", line 167, in get
    self.not_empty.wait()
  File "/usr/lib/python3.4/threading.py", line 290, in wait
    waiter.acquire()
  File "/home/travis/build/teor2345/tor/stem/run_tests.py", line 98, in log_traceback
    for thread_name, stacktrace in test.output.thread_stacktraces().items():
  File "/home/travis/build/teor2345/tor/stem/test/output.py", line 110, in thread_stacktraces
    stacktraces[thread.name] = ''.join(traceback.format_stack(frame))
--------------------------------------------------------------------------------
Tor listener thread stacktrace
  File "/usr/lib/python3.4/threading.py", line 888, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "/home/travis/build/teor2345/tor/stem/stem/control.py", line 939, in _reader_loop
    control_message = self._socket.recv()
  File "/home/travis/build/teor2345/tor/stem/stem/socket.py", line 474, in recv
    return self._recv(lambda s, sf: recv_message(sf))
  File "/home/travis/build/teor2345/tor/stem/stem/socket.py", line 274, in _recv
    return handler(my_socket, my_socket_file)
  File "/home/travis/build/teor2345/tor/stem/stem/socket.py", line 474, in <lambda>
    return self._recv(lambda s, sf: recv_message(sf))
  File "/home/travis/build/teor2345/tor/stem/stem/socket.py", line 676, in recv_message
    line = control_file.readline()
  File "/usr/lib/python3.4/socket.py", line 374, in readinto
    return self._sock.recv_into(b)
================================================================================

https://travis-ci.org/teor2345/tor/jobs/522893523#L3830

Change History (3)

comment:1 Changed 3 months ago by atagar

Hi teor. In theory, propagating the signal to our tor process should be easy. In the runner I tried adding the following:

  try:
    integ_runner = test.runner.get_runner()
    os.kill(integ_runner.get_pid(), sig)
  except test.runner.RunnerStopped:
    pass  # integ testing tor instance isn't running
  except OSError as exc:
    # ESRCH means tor already exited (no such process), anything else is unexpected
    if exc.errno != errno.ESRCH:
      raise

However, when I send a SIGABRT while the integ tests are running, things don't terminate as I'd expect. This is gonna need some more investigation on my side.
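
This is not a diagnosis of the behaviour above, just one general point that may be worth ruling out: once a Python-level handler is installed for SIGABRT, the default action (terminating the process) no longer happens unless the handler asks for it. A minimal sketch of a handler that does its work and then dies with the default action follows. The names `die_with_default_action` and `child_exit_status` are hypothetical, and the fork is only there so the effect can be observed from a parent process.

```python
import os
import signal
import time


def die_with_default_action(signum, frame):
  # Any logging or forwarding to another process would happen here first.
  # Then restore the default disposition and re-send the signal to ourselves,
  # so the process ends the way it would have without a Python handler.
  signal.signal(signum, signal.SIG_DFL)
  os.kill(os.getpid(), signum)


def child_exit_status():
  """Fork a child that handles SIGABRT as above and return its wait status."""

  pid = os.fork()

  if pid == 0:
    signal.signal(signal.SIGABRT, die_with_default_action)
    os.kill(os.getpid(), signal.SIGABRT)
    time.sleep(5)  # only reached if the signal did not terminate us
    os._exit(0)

  _, status = os.waitpid(pid, 0)
  return status
```

The returned status can be inspected with `os.WIFSIGNALED()` and `os.WTERMSIG()` to see whether the child was actually terminated by the signal rather than exiting normally.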

comment:2 Changed 7 weeks ago by teor

Sponsor: set to Sponsor31-can

Setting the sponsor to Sponsor31-can, because we use these jobs to make sure our refactoring works.

comment:3 Changed 5 weeks ago by teor

Keywords: tor-ci-fail-sometimes removed

Removing from the CI failure list, because we don't need this change to diagnose failures.
