Opened 3 years ago

Last modified 3 years ago

#15125 new defect

meek-client-wrapper does not use signals well

Reported by: infinity0 Owned by: dcf
Priority: Medium Milestone:
Component: Obfuscation/meek Version:
Severity: Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

When testing meek-client-wrapper, I noticed two things:

  • it does not respond to SIGINT or SIGKILL. also, the signal handling code is different from meek-client. perhaps we should move it to goptlib?
  • it uses sigkill to kill its children, not giving them a chance to clean up. Yes, this is awkward on windows but we can at least do something nicer on posix systems.

Child Tickets

Change History (10)

comment:1 Changed 3 years ago by infinity0

Summary: meek-client-wrapper does not respond to handlers properlymeek-client-wrapper does not use signals well

comment:2 in reply to:  description Changed 3 years ago by dcf

Component: Pluggable transportmeek
Owner: changed from asn to dcf

Replying to infinity0:

  • it does not respond to SIGINT or SIGKILL.

? It does SIGINT and SIGTERM right here:

	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)

And I thought SIGKILL "cannot be caught, blocked, or ignored"?

also, the signal handling code is different from meek-client. perhaps we should move it to goptlib?

That actually is the reason it's not in goptlib—I have had to do the signal handling code in different ways in different programs. The program might want to close sockets, or log something, or whatever. I didn't find a good way to abstract it (that is significantly more expressive than just handling the signals explicitly).

  • it uses sigkill to kill its children, not giving them a chance to clean up. Yes, this is awkward on windows but we can at least do something nicer on posix systems.

That's a fair point. However keep in mind that the code already passes on any signal it receives; so if we got a SIGTERM, so will the child processes. I guess I might prefer to keep the destructive kill, but only do it if we haven't sent some other signal to a child process (rather than unconditionally as a defer as we do now).

I saw your ae0dc0ff ("give child processes a chance to clean up"). I am strongly disinclined to do anything involving a timer. Do signals really work like that? The parent process needs to still be running in order for the signal to be delivered?

I'm skeptical that closing stdin is sufficient to stop Firefox. Does that work on Windows?

comment:3 Changed 3 years ago by infinity0

Ah, I could have been wrong about not passing on the signals properly. At least, the commit I did that you mentioned, fixes things on my Debian system.

I think the real bug is that, whilst you do pass on SIGTERM, you immediately then return from the main thread, which triggers the defer behaviour which is to send SIGKILL. This does not give the child enough time to clean up - for example if it has its own children, which is the case if meek-browser-helper is using xvfb-run. So a timer between when we send SIGTERM and SIGKILL is necessary. This is a fairly standard pattern that you will see in well-written "graceful shutdown" scripts (TERM, sleep, KILL), it's not that unusual. Somewhere around 10-30 seconds is normal.

(Yes, signals are delivered immediately, so if you want a delay then the parent does need to stick around for this duration of delay, *then* send the second signal.)

Closing stdin is probably not sufficient to stop firefox, no. But meek-browser-helper is supposed to be a shell script anyways, so we might be able to work with that on windows. Something along the lines of start firefox; read x; kill firefox. Needs more testing, of course.

comment:4 Changed 3 years ago by infinity0

It's easier in VBScript:

' meek-browser-helper.vbs
Dim ChildProc
Set ChildProc = CreateObject("WScript.Shell").Exec ".\firefox.exe " & _
    "-no-remote -profile TorBrowser\Data\Browser\profile.meek-http-helper"

Do While Not WScript.StdIn.AtEndOfStream
   WScript.StdIn.Skip(1)
Loop

ChildProc.Terminate()

However I need to get a Windows machine to test this.

comment:5 Changed 3 years ago by infinity0

OK, I managed to get access to a Windows 7 machine, and this does what we want:

' meek-browser-helper.vbs
Dim ChildProc
Set ChildProc = CreateObject("WScript.Shell").Exec(".\firefox.exe " & _
    "-no-remote -profile TorBrowser\Data\Browser\profile.meek-http-helper")

WScript.StdOut.WriteLine(ChildProc.StdOut.ReadLine)

Do While Not WScript.StdIn.AtEndOfStream
   WScript.StdIn.Skip(1)
Loop

ChildProc.Terminate()

If you close stdin, it terminates firefox gracefully as possible (WM_CLOSE; wait; ProcessTerminate) - https://msdn.microsoft.com/en-us/library/yk84ffsf.aspx

Unfortunately it must be run as cscript X.vbs rather than as X.vbs which on most Windows systems would run wscript not cscript, and the former does not have StdIn. There is no way to override this on a per-script basis; the closest thing I found online is to do "if we're not being run by cscript, run ourselves using cscript" but this doesn't work for our purposes because it doesn't provide a way to pass stdout back to the calling process.

It does not seem feasible to do the equivalent behaviour from a .bat file either. :(

comment:6 Changed 3 years ago by infinity0

So I tried another approach - turn meek-browser-helper-windows.bat into a wrapper for meek-browser-helper-windows.js (I rewrote the .vbs in .js which also works, and is probably better since VBS is dead):

meek-browser-helper-windows.bat

@echo off
REM meek-browser-helper program for Tor Browser on Windows
cscript //b TorBrowser\Tor\PluggableTransports\meek-browser-helper-windows.js

meek-browser-helper-windows.js

// meek-browser-helper.js
var ChildProc = new ActiveXObject("WScript.Shell").Exec("./firefox.exe " + 
    "-no-remote -profile TorBrowser/Data/Browser/profile.meek-http-helper")

WScript.StdOut.WriteLine(ChildProc.StdOut.ReadLine())

while (!WScript.StdIn.AtEndOfStream) {
   WScript.StdIn.Skip(1)
}
WScript.StdErr.WriteLine("stdin closed")

ChildProc.Terminate()

This works when run from inside cmd:

C:\Users\gaol\Desktop\Tor Browser\Browser>TorBrowser\Tor\PluggableTransports\meek-browser-helper-windows.bat
meek-http-helper: listen 127.0.0.1:51437
^Z
stdin closed

(ctrl-Z,<enter> is the cmd.exe equivalent of ctrl-D)

However, it does not work when double-clicked or when run from Tor Browser. Log output suggests that when a .bat file is executed directly (i.e. as "argv[0]"), Windows automatically closes its stdin. So our "terminate children on stdin close" behaviour doesn't work for this case... I'll try to think of other ways.

comment:7 Changed 3 years ago by infinity0

Ah! It turns out I was testing with your 4.5a4-12716-1 which doesn't contain my commit that fixes this bug. That's why meek-browser-helper was exiting early - meek-client-wrapper was not giving an open stdin for it to hold on to. (I.e. the last sentence in my previous comment was incorrect.)

Now, with the files in my previous comment, everything cleans up nicely after TBB closes (though meek-client-wrapper is currently waiting the full 16 seconds, because meek-client itself doesn't implement --exit-on-stdin-eof).

I will send in an amendment to the patch in #12716.

comment:8 in reply to:  6 Changed 3 years ago by dcf

Replying to infinity0:

I'll try to think of other ways.

Okay, but don't spend too much time going down obscure paths. I'd rather have something that is 90% correct and simple than 100% correct and complex. Since (so far) we only need this change for Unix with xvfb, my current plan is just to remove the explicit calls to Kill on platforms other than Windows, and rely on the forwarded signals to cause termination. I'm fine with leaving unclean termination on Windows, as long as the xvfb use case works; we can always improve it further later on.

comment:9 Changed 3 years ago by infinity0

A less obscure option would be to replace the .bat and .js by a single go program. It would take me longer to figure out how to integrate that into the TBB build, though.

comment:10 Changed 3 years ago by infinity0

So what are your thoughts on this? I believe that this is needed for #12716 to work well on windows... Maybe not my exact patch, but something like it. Arguably we should reduce the timeout to just a few seconds.

Any particular reason you are "strongly disinclined to do anything involving a timer"? This is how all graceful strong shutdowns work.

Note: See TracTickets for help on using tickets.