Once tor-fw-helper works on Windows, we need to be able to launch it and read its output. This works on Linux/BSD/MacOS X, but Windows does process control very differently.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
My branch bug2046 in git://git.torproject.org/sjm217/tor.git (1da5081) should now be ready for review. I haven't rebased it yet because I was assuming you will just compare it to origin/master, but let me know if this is not the case.
It seems sanely written and defensively coded. I wonder about setting process security attributes and primary thread security attributes to NULL in the child process. Is that simply because the parent already has everything we need and it is inherited?
Overall, I think I'd like to test and compile it but I don't currently have a win32 dev machine setup.
20:19 < nickm> sjmurdoch: Initial request, pending more detailed review: Can we split the C string processing stuff out of functions like log_from_handle, so that we can do a unit test for it?20:21 < nickm> sjmurdoch: Also, elsewhere thoughout the code, we return structs on the heap, not by value. Any reason to make an exception for tor_spawn_background?20:21 < sjmurdoch> nickm: Yes, that sounds like a good idea20:21 < nickm> (If there's other string processing goop in the spawn code, same logic applies)20:21 < sjmurdoch> nickm: Not particularly20:22 < sjmurdoch> nickm: You mean pass in a pointer which the function will modify?20:22 < sjmurdoch> or the function allocates the struct and returns a pointer?20:22 < nickm> Either one would be fine IMO.20:22 < sjmurdoch> nickm: OK20:24 < nickm> If the caller will generally just have it on the stack, passing a pointer is fine.20:24 < nickm> If the caller will generally want to have the struct stick around for a while, having the function allocate and return it is better. You'd probably want to add a process_handle_t function somewhere as well.20:25 < nickm> sjmurdoch: (Also, is it okay if I copy-and-paste all of this onto the 2046 ticket when we're done?)20:25 < nickm> err, a process_handle_free function20:26 < sjmurdoch> Currently the process_handle_t instance for the helper is a static variable in tor_check_port_forwarding20:26 < sjmurdoch> So I could pass in a pointer from tor_check_port_forwarding to tor_spawn_background?20:27 < nickm> fine by me20:27 < sjmurdoch> nickm: And yes, feel free to paste into 204620:28 < nickm> ok20:29 < nickm> so, what happens if some argv entry contains a space?20:29 < sjmurdoch> You're looking at the join code?20:30 < nickm> (Yeah.) That'll happen for sure if we ever need to run a program that lives under C:\Program Files\ .20:30 < sjmurdoch> For Windows, it would have to be put in quotes20:30 < nickm> sjmurdoch: I think we should have a separate, unit-tested function, that formats a command line properly for windows.20:31 < sjmurdoch> OK, I can refactor that out20:31 < nickm> great20:32 < nickm> also it should handle internal quotes; there have been way too many shell-based snafus in the history of programming for us to be blase about anything other than stricly correct command lines.20:32 < sjmurdoch> I don't fully understand how command lines are parsed on Windows. I do know that it is the process which is called that does it, but is it their libc?20:35 < nickm> I'm afraid I don't fully know. But look at Python's subprocess.list2cmdline and at http://msdn.microsoft.com/en-us/library/ms88042120:35 < nickm> I think it's actually done in the equivalent of crt.o20:36 < sjmurdoch> Thanks, I'll take a look20:41 < nickm> For tor_get_exit_code: Let's return a define or an enum rather than a magic tristate.20:42 < sjmurdoch> Sounds sensible20:42 < nickm> nitpick: Should "log_from_handle" be called "log_to_handle" ?20:43 < sjmurdoch> I think log_from_handle because it reads from a handle and logs the output20:43 < nickm> Ah!20:44 < nickm> makes sense20:44 < nickm> In tor_read_all_from_handle, I'd be more comfortable if it did an tor_assert(byte_count + numread <= count);20:47 < nickm> sjmurdoch: In tor_log_from_handle: Do we care about strings with embedded NULs?20:48 < sjmurdoch> nickm: Hmm, possibly20:48 < sjmurdoch> nickm: tor-fw-helper won't generate them but other processes might20:49 < nickm> probably something to note; not necessarily something to fix right now.20:49 < sjmurdoch> OK20:49 < sjmurdoch> Related to this, I've assumed I'm working with byte strings everywhere. Is Windows Unicode support going to bite me?20:50 < sjmurdoch> (another bit I don't understand well).20:50 < nickm> Not sure. I think we're probably okay here.20:51 < nickm> Hm. log_from_handle will do badly if the subprocess has mroe than 255 bytes to say.20:52 < sjmurdoch> I think it will output partial lines in that case20:52 < nickm> yeah.20:52 < nickm> It might do bogus newlines if the \r\n is split20:53 < sjmurdoch> Hmm, yes 20:54 < sjmurdoch> I'm not sure what to do here. Code to handle partial lines will be tricky. On *nix I got away with using fgets but fgets on Windows can't ready from a handle20:55 < nickm> I'd say for now, document it as a known limitation20:55 < nickm> For later, you'd need to leave any partial line in a buffer associated with the handle20:55 < nickm> and make the whole thing more stateful20:56 < sjmurdoch> OK20:57 < sjmurdoch> nickm: There is already a ticket related to this: https://trac.torproject.org/projects/tor/ticket/204520:57 < nickm> sjmurdoch: .... and with that, I'm done with the initial review, I think.
For my benefit, and perhaps others, here's the IRC [comment:6 log above] as a TODO list:
Fix:
Split the C string processing stuff out of functions like log_from_handle, so
that we can do a unit test for it. (If there's other string processing goop in
the spawn code, same logic applies).
Pass pointer to process_handle_t into tor_spawn_background, rather than it
return a process_handle_t.
Have a separate, unit-tested function, that formats a command line properly for
windows. It should handle filenames with a space (e.g. C:\Program Files);
also it should handle internal quotes. Look at Python's subprocess.list2cmdline
and at http://msdn.microsoft.com/en-us/library/ms880421
For tor_get_exit_code: Let's return a define or an enum rather than a magic
tristate.
In tor_read_all_from_handle, I'd be more comfortable if it did an
tor_assert(byte_count + numread <= count);
Document and possibly fix:
In tor_log_from_handle: handle strings with embedded NULs; handle strings > 255 bytes (without splitting lines or outputting bogus newlines). See bug #2045 (moved).
OK, I believe I have addressed all of these issues, and the code is ready for review. FYI, the code you previously reviewed was 1da5081. Changes since then are:
Fix:
Split the C string processing stuff out of functions like log_from_handle, so
that we can do a unit test for it. (If there's other string processing goop in
the spawn code, same logic applies).
Have a separate, unit-tested function, that formats a command line properly for
windows. It should handle filenames with a space (e.g. C:\Program Files);
also it should handle internal quotes. Look at Python's subprocess.list2cmdline
and at http://msdn.microsoft.com/en-us/library/ms880421
I think the need_quotes check in format_commandline_argument might be inadequate: Shouldn't it also check for \ and " ?
Thing we could fix later:
In format_commandline_argument(), it would be neat to skip the business of building a smartlist of char*, and just pre-allocate a guaranteed-to-be-long-enough string -- I think that strlen(arg)*2 + 3 would be long enough, since each character in the input turns into at most 2 characters in the output, and there are at most 3 characters of overhead: 1 for the nul-terminator, 2 for quotes.
Let me know if you'd like me to open tickets for those.
Stuff I tweaked:
I think the tor_join_cmdline() function and format_commandline_argument() functions ought to have "win" somewhere in their names. (Fixed this.)
In documentation for tor_split_lines, we should refer to a NUL character. NULL is a pointer. (Pedantically, NUL is a null character, and null-terminated strings are also NUL-terminated, but never NULL-terminated.) (Fixed this.)
In format_commandline_argument(), it would be neat to skip the business of building a smartlist of char*, and just pre-allocate a guaranteed-to-be-long-enough string -- I think that strlen(arg)*2 + 3 would be long enough, since each character in the input turns into at most 2 characters in the output, and there are at most 3 characters of overhead: 1 for the nul-terminator, 2 for quotes.
Yes, I was considering that. An alternative would be to have a smartlist of ranges within the string, since most of the time output strings will consist of contiguous areas of input string. That would however lead to rather tricky code, so I think your approach sounds good.
Let me know if you'd like me to open tickets for those.
Yes please.
Stuff I tweaked:
I think the tor_join_cmdline() function and format_commandline_argument() functions ought to have "win" somewhere in their names. (Fixed this.)
In documentation for tor_split_lines, we should refer to a NUL character. NULL is a pointer. (Pedantically, NUL is a null character, and null-terminated strings are also NUL-terminated, but never NULL-terminated.) (Fixed this.)
Sorry, there was bug in the test suite which triggered a double-free on some platforms, including the buildbot. This should be fixed in cfa9ee5fe708539965f46a31b5d2abe4950179af of git://git.torproject.org/sjm217/tor.git (branch bug2046).
Trac: Resolution: implemented toN/A Status: closed to reopened
Gah, buildbot is unhappy again because the test made some assumptions about how the EOF flag would be delivered. For some reason it works differently between Ubuntu and MacOS, compared to the Debian machine. I've now fixed the test case in f5df96c94f0c40e4bf55e4b4b843df85e9d90a71 of git://git.torproject.org/sjm217/tor.git (branch bug2046).
Trac: Status: closed to reopened Resolution: fixed toN/A