Opened 4 years ago

Closed 4 years ago

Last modified 3 years ago

#15435 closed enhancement (implemented)

Tor should not close stdin on child processes.

Reported by: yawning Owned by:
Priority: Medium Milestone: Tor: 0.2.7.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Normal Keywords: tor-pt, needs-spec-patch pt-wants, SponsorS, 027-triaged-1-in
Cc: dcf, rl1987@…, qbi, asn Actual Points:
Parent ID: Points: small-remaining
Reviewer: Sponsor:

Description (last modified by yawning)

The lack of PTs having a reliable way to detect if tor is dead reared it's ugly head again in the form of #15434. There aren't that many good ways to detect if a parent has died, beyond checking if stdin is still open (#10047 says stdout, however stdin is better), but tor explicitly closes the child process's stdin.

Since a variant of this has been working well, I propose that we make it official like thus:

  • Add a new enviornment variable of the form TOR_PT_EXIT_ON_STDIN_CLOSE=1, indicating that Tor will not close the stdin.
  • PTs SHOULD terminate as if they have received a SIGTERM, on stdin being closed if the environment variable is set.
  • All new tor versions will naturally set said env. var, and setup the pt plugin stdin correctly.

This does mean that Bridges will consume one extra file descriptor per ServerTransportPlugin line in the torrc, but I view this as acceptable given the fact that this will bring some much needed sanity to our cleanup process.

Child Tickets

Change History (25)

comment:1 Changed 4 years ago by yawning

See also #5085 for an example of pts having trouble with parent termination detection.

comment:2 Changed 4 years ago by rl1987

Cc: rl1987@… added

comment:3 Changed 4 years ago by qbi

Cc: qbi added

comment:4 Changed 4 years ago by nickm

Keywords: pt-wants added

seems like a fine idea to me

comment:5 Changed 4 years ago by yawning

This isn't totally urgent for me anymore since I added code that should handle this case on common bridge platforms to obfs4proxy, but I still will implement this because said code is a combination of unportable/unreliable/a NOOP.

If people are curious it's obfs4proxy commit: 4e4c9052f436b86729b853715086e9445c76e185

comment:6 Changed 4 years ago by yawning

Status: newneeds_review

Ok, I'll write up the pt-spec.txt changes momentarily, but since the code is done, needs_reviewing.

The TLDR for the pt-spec.txt changes is "If the env. var 'TOR_PT_EXIT_ON_STDIN_CLOSE' is set to '1', implementations MAY terminate immediately if stdin is closed.".

tor: https://github.com/Yawning/tor/compare/feature15435
obfs4proxy (Example usage): https://github.com/Yawning/obfs4/compare/feature15435

NB: While I think the Windows version of the code is correct, it is entirely untested.

comment:7 Changed 4 years ago by yawning

Description: modified (diff)

comment:8 Changed 4 years ago by infinity0

Why SHOULD or MAY? I suggest MUST.

comment:9 in reply to:  8 Changed 4 years ago by yawning

Replying to infinity0:

Why SHOULD or MAY? I suggest MUST.

MUST works. I guess I should add stuff about "Tor MUST NOT write to the stdin of the managed proxy" and "managed proxies MUST NOT use stdin for termination signaling if the environment variable is not set" as well.

comment:10 Changed 4 years ago by dcf

Let's think about how this might be implemented in pyptlib or goptlib. I left signal handling out of goptlib (instead demonstrating it in the dummy transports) because it's a bit complicated and it requires knowledge of how many connections are in progress—whatever that may mean to the transport.

The signal-handling code in dummy-client and dummy-server is the same, and looks like this in pseudocode:

numHandlers = 0
while true {
    event = wait_for_event()
    if event is "connection begin" {
        numHandlers++
    } else if event is "connection end" {
        numHandlers--
    } else if event is SIGINT or SIGTERM {
        break
    }
}

close_all_listeners()

if event is SIGTERM {
    exit()
}

// event was SIGINT, wait until second signal or no more handlers
while numHandlers > 0 {
    event = wait_for_event()
    if event is "connection end" {
        numHandlers--
    } else if event is SIGINT or SIGTERM {
        break
    }
}

It's only about 30 lines, so not terrible to copy-paste, but if we add interaction with stdin (especially if it's switched through TOR_PT_EXIT_ON_STDIN_CLOSE), we should think of a way to librarify it, to avoid mistakes. Basically, we want to treat "stdin eof" the same as SIGTERM everywhere.

The reason the above code is not in goptlib is that it requires knowing how a "connection begin" and "connection end" happen, and how to do close_all_listeners. I didn't think of a nice universal abstraction for those. A "connection" may be ill-defined; for example, meek-server treats an in-progress HTTP request as a "connection" for the purpose of signal handling. I also wasn't sure if close_all_listeners is all any transport will have to do.

For comparison, here's what meek-client-torbrowser (the first program to test the close-on-stdin-eof idea) does:

event = wait_for_event()
if event is "stdin eof" {
    // pretend it was a SIGTERM
    event = SIGTERM
}
// pass the signal to meek-client so it can react
send_signal_to_child(event)
if event == SIGINT {
    wait_for_event()
}

comment:11 in reply to:  10 ; Changed 4 years ago by yawning

Replying to dcf:

It's only about 30 lines, so not terrible to copy-paste, but if we add interaction with stdin (especially if it's switched through TOR_PT_EXIT_ON_STDIN_CLOSE), we should think of a way to librarify it, to avoid mistakes. Basically, we want to treat "stdin eof" the same as SIGTERM everywhere.

obfs4proxy's feature branch has a goroutine that SIGTERMs the current running process if Stdin ever closes, and exits the process if SIGTERMing itself fails (Which it will, always on Windows, because the only signal os.Process.Signal() can deliver is SIGKILL, aka, "it calls ProcessTerminate").

Since I don't have children to clean up, this is fine for me, though I will clean up the branch a tad.

For what it's worth, I don't think tor ever sends SIGINT to pt subprocesses ever either, so it might make sense just to rip that out of the spec.

Anyway. Assuming "tor will sent an env var indicating to PTs that stdin is valid and will never be written to" is ok behavior, shouldn't the various library side implementations of the handlers be discussed in another ticket?

comment:12 in reply to:  11 ; Changed 4 years ago by dcf

Replying to yawning:

Anyway. Assuming "tor will sent an env var indicating to PTs that stdin is valid and will never be written to" is ok behavior, shouldn't the various library side implementations of the handlers be discussed in another ticket?

Yes, assuming it's implementable. You don't want to spec something that turns out to be awkward to implement.

Removing SIGINT from pt-spec sounds good to me, that always seemed more trouble than it was worth. If all you have to worry about is SIGINT and eof, then that's probably simple enough to implement by hand.

comment:13 in reply to:  12 Changed 4 years ago by yawning

Cc: asn added

Replying to dcf:

Yes, assuming it's implementable. You don't want to spec something that turns out to be awkward to implement.

Ok. pyptlib does zero termination handling, so all that changes there is a method that wraps getenv. The obfsproxy side is trivial, since twisted makes monitoring for events on file descriptors easy (twisted.internet.stdio).

If asn doesn't want to do the changes I will (Though #15471 might be good enough given our Lenooks monoculture).

Removing SIGINT from pt-spec sounds good to me, that always seemed more trouble than it was worth. If all you have to worry about is SIGINT and eof, then that's probably simple enough to implement by hand.

Even with SIGINT in the picture, I don't think an event that should be treated as "run whatever cleanup is needed, terminate now" is that hard to implement. The only reason why I do the "sigterm myself" dance in obsf4proxy is because I didn't feel like passing a channel around (I'll probably change this actually).

There's another side to this that still needs writing (and probably a separate ticket) that changes the pt teardown process to be multi-stage (close stdin, give the child sometime to exit, follow up with more destructive methods to kill the child off). As it stands right now, the main benefit to implementing what is specified is that PTs will always clean up properly if tor crashes for any reason.

(Yes, I will change the PT teardown to be multi-stage, provided that this ticket happens.)

comment:14 Changed 4 years ago by yawning

I went and cleaned up/refactored how I do termination in obfs4proxy and updated my feature branch appropriately.

This is what the termination detection will look like with this feature added:
https://github.com/Yawning/obfs4/blob/feature15435/obfs4proxy/termmon.go

And this is the calling code:
https://github.com/Yawning/obfs4/blob/feature15435/obfs4proxy/obfs4proxy.go#L437

It's a lot like how meek-client-torbrowser does this now, though I have extra code to detect abnormal tor termination even without this feature. I guess I can even make termMonitor.wait() just return a bool indicating immediate shutdown or not, but I'm done messing with this for today I think.

comment:15 Changed 4 years ago by nickm

Merged to master. Are there other patches to merge about this?

comment:16 Changed 4 years ago by yawning

A patch to pt-spec.txt that I need to write still, been busy, I'll poke you when it's ready.

comment:17 Changed 4 years ago by nickm

ok. can I ask you to open a new ticket for that, and close this, so we don't get confused?

comment:18 Changed 4 years ago by yawning

Resolution: implemented
Status: needs_reviewclosed

Sure.

comment:19 Changed 4 years ago by yawning

Derp. This should fix the windows jenkins build. Stupid copy/paste error.

https://github.com/Yawning/tor/compare/bug15435_windows

comment:20 Changed 4 years ago by nickm

Merged that!

comment:21 Changed 4 years ago by dcf

So what I'm going to do is make terminateprocess-buffer set TOR_PT_EXIT_ON_STDIN_CLOSE=1, and modify meek-client-torbrowser to obey that environment variable instead of having a special --exit-on-stdin-eof flag. So you can think of terminateprocess-buffer as now being just a compatibility shim that simulates support for TOR_PT_EXIT_ON_STDIN_CLOSE for versions of tor that don't have it. We can get rid of terminateprocess-buffer once a tor with TOR_PT_EXIT_ON_STDIN_CLOSE lands in Tor Browser.

comment:22 in reply to:  21 Changed 4 years ago by dcf

Replying to dcf:

So what I'm going to do is make terminateprocess-buffer set TOR_PT_EXIT_ON_STDIN_CLOSE=1, and modify meek-client-torbrowser to obey that environment variable instead of having a special --exit-on-stdin-eof flag. So you can think of terminateprocess-buffer as now being just a compatibility shim that simulates support for TOR_PT_EXIT_ON_STDIN_CLOSE for versions of tor that don't have it. We can get rid of terminateprocess-buffer once a tor with TOR_PT_EXIT_ON_STDIN_CLOSE lands in Tor Browser.

#15606.

comment:23 Changed 4 years ago by isabela

Keywords: SponsorS added
Points: small-remaning
Version: Tor: unspecifiedTor: 0.2.7

comment:24 Changed 4 years ago by isabela

Keywords: 027-triaged-1-in added

comment:25 Changed 3 years ago by arma

Points: small-remaningsmall-remaining
Severity: Normal
Note: See TracTickets for help on using tickets.