Opened 14 months ago
Closed 11 months ago
#28179 closed enhancement (implemented)
Handle output from PT processes with the event loop
Reported by: | ahf | Owned by: | ahf |
---|---|---|---|
Priority: | Medium | Milestone: | Tor: 0.4.0.x-final |
Component: | Core Tor/Tor | Version: | Tor: unspecified |
Severity: | Normal | Keywords: | tor-pt, 040-roadmap-subtask |
Cc: | Actual Points: | ||
Parent ID: | #25502 | Points: | 31 |
Reviewer: | nickm | Sponsor: | Sponsor19-must |
Description
Currently the output from stdout/stderr of a PT process is only read during the startup of the process. The reading process uses read() on a non-blocking socket, which currently seems to work, but have proved to be flaky.
We should ensure that PT processes' output can be read all the time.
On Windows we cannot attach the pipes to the main loop because of limitations of the select()
API, so we have to do something slightly worse such as reading from the stdout/stderr handle via a timer as long as the processes are alive.
Child Tickets
Attachments (1)
Change History (32)
comment:1 Changed 14 months ago by
Owner: | set to ahf |
---|---|
Status: | new → assigned |
comment:2 Changed 14 months ago by
Keywords: | tor-pt 036-roadmap-subtask added |
---|
comment:3 Changed 14 months ago by
comment:4 Changed 13 months ago by
Milestone: | Tor: 0.3.6.x-final → Tor: 0.4.0.x-final |
---|
Tor 0.3.6.x has been renamed to 0.4.0.x.
comment:5 Changed 13 months ago by
Keywords: | 040-roadmap-subtask added; 036-roadmap-subtask removed |
---|
0.3.6 is now 0.4.0: changing keywords
comment:6 Changed 13 months ago by
Starting to track this in https://github.com/ahf/tor/tree/bugs/28179
comment:7 Changed 13 months ago by
Status: | assigned → needs_review |
---|
This is already pretty big, it's not entirely done yet, but let's start having some review of the design and code:
Known issues with the current code:
- The Python file is missing its actual test implementations in test_unix/test_win32. Need to talk with Nick about that.
- Missing doc strings for a lot of functions.
- Missing process termination detection.
- Some FIXME's still around.
comment:9 Changed 13 months ago by
Reviewer: | → dgoulet |
---|
comment:10 Changed 13 months ago by
Status: | needs_review → needs_revision |
---|
Did a pass on it. No show stopper for now but what I would need also is some documentation on how that API is used since right now it is very "abstract" to the process.c
and not used outside of it. One thing that could be done there is to do some documentation on how to use the API at the start of process.c
?
comment:11 Changed 13 months ago by
Status: | needs_revision → needs_review |
---|
Fixes should be in: https://github.com/torproject/tor/pull/537
comment:12 Changed 13 months ago by
Status: | needs_review → needs_revision |
---|
comment:14 Changed 13 months ago by
https://github.com/torproject/tor/pull/537/files#diff-08bd8435ecfd8690789652091567ff47R1151
/* The format is 'LOG <transport> <message>'. We accept empty messages. */
It looks like this comment is out of date; I thought the <transport>
part was removed. Also "We accept empty messages" seems to contradict the "Managed proxy sent us a LOG line with missing arguments" check above: I suppose that LOG
(with a space) is acceptable but LOG
is not, not sure if that's what's intended.
https://github.com/torproject/tor/pull/537/files#diff-6fdba7a022e591369dcf7054e5f2700aR7396
/** A pluggable transport called <b>transport_name</b> has emitted a log * message found in <b>message</b>. */ void control_event_transport_log(const char *transport_name, const char *message) { send_control_event(EVENT_TRANSPORT_LOG, "650 TRANSPORT_LOG %s %s\r\n", transport_name, message); }
Where does <transport_name>
come from? Is it the path the executable? In general, tor won't know a specific transport name corresponding to any LOG message. I don't see where control_event_transport_log
is called other than in tests.
comment:15 Changed 13 months ago by
Cross-referencing #9957, an older ticket about reading and logging PTs' stderr.
Changed 13 months ago by
Attachment: | bug28179-client.go added |
---|
Client PT program that demonstrates a deadlock when writing too much to stdout/stderr.
comment:16 Changed 13 months ago by
This program may be useful for testing: attachment:bug28179-client.go
It's a modified version of dummy-client from goptlib. The differences are that after PT initialization, it starts writing to stdout and stderr at 4 KB/s. Also, in its core proxying loop, it tries to write a line to stdout/stderr every time it downloads a chunk of data. Eventually the stdout and stderr buffers fill up, and the proxy loop halts because it cannot write its line. The program copies everything it writes to stdout/stderr to a file called mirror.log, so you can see how much was written before it deadlocks.
- Download and put in a directory called bug28179-client.
go get
andgo build
.- Put in torrc:
DataDirectory datadir UseBridges 1 ClientTransportPlugin dummy exec bug28179-client Bridge dummy 128.31.0.61:443
tor -f torrc SOCKSPort 10000
- In another terminal,
tail -F mirror.log
. You will see a mixture ofhello tor world
andreceived XXX bytes
lines. - For me, the system deadlocks after 8 seconds; apparently the stdout/stderr buffers are 64 KB.
$ ls -l mirror.log -rw-r--r-- 1 david david 65425 Nov 27 14:09 mirror.log
If tor was in the middle of bootstrapping, it will stop here. If tor finished bootstrapping, you can verify that it stopped working withcurl -x socks5h://127.0.0.1:10000/ https://example.com/
.
comment:18 Changed 13 months ago by
Status: | needs_review → merge_ready |
---|
PR in previous comment. Branch is in ahf:bugs/28179_pr
.
@nickm, this is rather large branch. It had a lots of back and forth and testing with me and ahf so it is in merge_ready
for your consideration and upstream merge.
The spec changes of this branch are in #28180.
comment:19 Changed 13 months ago by
Status: | merge_ready → needs_revision |
---|
Hi! I've got some requests in the branch. I don't know if the windows code works or not, but if it's well-tested both manually and with automated tests, I'll believe it.
I had thought that we'd be doing multiple threads here to make the windows code work. I think that might be our only way around a timer. Having timers that bypass the periodic timer system risks undoing a bunch of our work for wakeup reduction, unless we are super careful.
For the spec: I think we need to request more structure from these log messages, or we won't be able to actually do anything automatic based on them. Maybe we could define severity, keyword, suggested-action stuff?
comment:20 Changed 12 months ago by
Status: | needs_revision → needs_review |
---|
Fixed all pending comments on the code. Moving to needs_review.
I've postponed integrating any code that needs to make use of the K/V parser (that is the PT stdout handler) for the LOG and STATUS messages. If we think these should go in first, I'll need some help figuring out what the best way to do this with a merge/rebase without making it difficult for the reviewers to review this again.
comment:21 Changed 12 months ago by
Sponsor: | Sponsor8 → Sponsor8-must |
---|
comment:22 Changed 12 months ago by
Status: | needs_review → merge_ready |
---|
Marked as merge_ready, but made a squashed-and-merged PR as https://github.com/torproject/tor/pull/603 so that CI can take one last peek at this branch before I merge it.
comment:24 Changed 12 months ago by
Status: | merge_ready → accepted |
---|
Let's move this to being a tracker ticket for its childs.
comment:25 Changed 12 months ago by
Status: | accepted → needs_review |
---|
Did an experiment on a VM over the night with running a lot of tests. PR in:
comment:26 Changed 12 months ago by
Reviewer: | dgoulet → nickm |
---|
Changing reviewer to nickm as per discussion on IRC.
comment:28 Changed 12 months ago by
Status: | needs_review → assigned |
---|
comment:29 Changed 11 months ago by
Sponsor: | Sponsor8-must → Sponsor19-must |
---|
Will close this ticket when children #28845 and #28849 are closed.
comment:30 Changed 11 months ago by
Points: | → 31 |
---|
comment:31 Changed 11 months ago by
Resolution: | → implemented |
---|---|
Status: | assigned → closed |
Related to this is #26360. tor should read from transport plugins' stderr, if only to prevent processes from deadlocking when they write to it too much.