Opened 5 years ago

Closed 3 years ago

Last modified 18 months ago

#8746 closed defect (fixed)

Tor tries to kill nonexistent proxy PID on second SIGINT

Reported by: dcf Owned by: nickm
Priority: Medium Milestone: Tor: 0.2.5.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-client, pt, 024-deferrable, 025-triaged, 2016-bug-retrospective
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:


This is what 180-pluggable-transport.txt says about proxies and SIGINT:

Proxies should respond to a single INT signal by closing their listener ports and not accepting any new connections, but keeping all connections open, then terminating when connections are all closed. Proxies should respond to a second INT signal by shutting down cleanly.

I implemented the websocket-server transport to work as specified:
I noticed unexpected behavior when the proxy receives a SIGINT, and doesn't have any open connection, and so exits immediately without waiting for a second SIGINT. The parent tor process tries to kill a nonexistent PID:

^CApr 19 17:58:59.000 [notice] Interrupt: we have stopped accepting new connections, and will shut down in 30 seconds. Interrupt again to exit now.
^CApr 19 17:59:05.000 [notice] SIGINT received a second time; exiting now.
Apr 19 17:59:05.000 [notice] Failed to terminate process with PID '18277' ('No such process').

The PID might have been reclaimed in the meantime, and tor could be killing an unrelated process.

(Originally from

Child Tickets

Change History (21)

comment:1 Changed 5 years ago by dcf

I think that doing a non-blocking waitpid before kill would at least reduce the race window. In the mailing list thread I also suggested requiring proxies to stay running until the second SIGINT, but that's not so good because the proxy could have crashed or stopped running for a different reason. Also the SIGINT behavior is only a "should" in the spec.

comment:2 Changed 5 years ago by arma

Milestone: Tor: 0.2.4.x-final

Putting in 0.2.4.x since it's a bug ("Tor could try to kill a random process"), but it's deferable to 0.2.5.

comment:3 Changed 5 years ago by nickm

Keywords: tor-client pt 024-deferrable added

comment:4 Changed 4 years ago by arma

See also #4952 (which I think should maybe be closed as a duplicate of this one?)

comment:5 Changed 4 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: 0.2.5.x-final

comment:6 Changed 4 years ago by andrea

Triage: keep for 0.2.5.x-final; this seems important.

comment:7 Changed 4 years ago by nickm

Also our sigchld handler in main.c needs to turn into a function where we can get notification of dead PIDs back.

Where are the two SIGINTS happening?

comment:8 Changed 4 years ago by nickm

Keywords: 025-triaged added

comment:9 Changed 4 years ago by nickm

I think there's a fix in my branch "bug8746", though I still need to write more tests for it.

On unix, it just does "use waitpid properly". It dumps the "waitpid and discard" loop in main.c, and instead does a proper callback-notification thing.

On windows, it terminates the process by handle, not by pid.

comment:10 Changed 3 years ago by nickm

Owner: set to nickm
Status: newassigned

marking as assigned since I've started coding on these.

comment:11 Changed 3 years ago by nickm

Status: assignedneeds_review

Okay, now it has unit tests. The only non-covered lines are ones that trigger on BUG conditions.

comment:12 Changed 3 years ago by nickm

(And the new branch is in bug8746_v2 )

comment:13 Changed 3 years ago by nickm

(I just looked over this branch and it still seems ok to me)

comment:14 Changed 3 years ago by yawning

Looks good to me.

comment:15 Changed 3 years ago by andrea

Review of bug8746_v2:


  • Shouldn't set_waitpid_callback() tor_free(old_ent) if it's non-NULL?
  • Otherwise, I think this is all okay


  • I think this all looks fine


  • Looks good to me


  • Looks good to me


  • Awwww, no functinos? Don't tell me you're against sfunctions too.


  • Looks good to me


  • Looks good to me


  • This is just the changes file; looks fine to me

comment:16 Changed 3 years ago by asn

Also maybe we want to tor_free(ent); in the LD_BUG of clear_waitpid_callback().

comment:17 Changed 3 years ago by yawning

Assuming the branch in set_waitpid_callback() is taken and the tor_free(old_ent) call suggested by athena is there, it's probably fairly reasonable to assume that there could very well be a different process_handle_t that holds a reference to old_ent.

Some time later when tor_destroy_process_handle(other_handle, 1) is called (I don't know if it ever will, but we're already in "our invariants have been violated" territory), the code path where asn suggests a free will be invoked with ent being the dangling pointer because the hashtable entry for the pid at that point is either missing (the callback fired and the replacement entry got removed from the table) or is the new entry.

Either leave it as is (leaks a trivial amount memory when impossible things happen), add a tor_free() only in clear_waitpid_callback() (may leak a trivial amount of memory, if the previous process handle is never destroyed), or add asserts in both branches because both conditions are invariants.

comment:18 Changed 3 years ago by nickm

Resolution: fixed
Status: needs_reviewclosed

I'm going with "leave it as it is". I'd rather risk a small memory leak on a possible bug here than risk a later double-free. (If we reach either of those cases, our representation is messed up somewhere.)

Thanks for the reviews, everybody! Squashing + merging.

comment:19 Changed 18 months ago by nickm

Keywords: 2016-bug-retrospective added

Mark bugs for 2016 bug retrospective based on hand-examination of changelogs for 0.2.5 onwards.

comment:20 Changed 18 months ago by nickm

Mark bugs for 2016 bug retrospective based on hand-examination of changelogs for 0.2.5 onwards.

comment:21 Changed 18 months ago by nickm

Mark bugs for 2016 bug retrospective based on hand-examination of changelogs for 0.2.5 onwards.

Note: See TracTickets for help on using tickets.