Opened 6 years ago

Closed 6 years ago

#9781 closed defect (fixed)

Tor with port forwarding shows "Failed to terminate process" messages

Reported by: mgorbach Owned by:
Priority: Medium Milestone: Tor: 0.2.5.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay tor-fw-helper 025-triaged
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

I've got Tor installed on a modern Gentoo linux virtual machine, with the nat-pmp and upnp USE flags enabled in Portage. I've enabled port forwarding in my torrc with: "PortForwarding 1." Port forwarding appears to work as expected, as I can see that the DIRPORT 9030 and the ORPORT 9001 are forwarding via miniupnpd running on my OpenWRT router to the tor relay host. The logs on the router show the port forwarding succeeding via nat-pmp.
However, I am getting log statements every 5 minutes form tor which look like this when seen in ARM: "[NOTICE] Failed to terminate process with PID '9053'".
When I see these statements, I check for the PID in question and don't find it running. I don't have a "PortForwardingHelper" option set in the torrc at all, but I suspect that these PID logging statements are coming from something port-forwarding related. Maybe the helper is quitting unexpectedly?
I can recompile without port forwarding and do it manually, but before I did that I figured I would write up this issue and see if I can help in testing and debugging.

Child Tickets

Attachments (1)

bisect-log (2.9 KB) - added by nickm 6 years ago.

Download all attachments as: .zip

Change History (13)

comment:1 Changed 6 years ago by nickm

Component: - Select a componentTor
Keywords: tor-relay tor-fw-helper added
Milestone: Tor: 0.2.4.x-final

What version of Tor was this using?

comment:2 Changed 6 years ago by mgorbach

0.2.3.25

comment:3 Changed 6 years ago by arma

When I run 0.2.5.1 with "portforwarding 1 bridgerelay 1 orport 4444", I get

Oct 31 15:32:43.405 [warn] tor-fw-helper sent us a string we could not parse (ERR: Failed to spawn background process - code 9/2).

during my bootup.

And then every five minutes I see this:

Oct 31 15:37:43.405 [notice] Failed to terminate process with PID '28891' ('No such process').
Oct 31 15:37:49.405 [warn] tor-fw-helper sent us a string we could not parse (ERR: Failed to spawn background process - code 9/2).

It sounds like, at the least, we should be providing some more useful log messages.

This also doesn't sound like an 0.2.4 ticket.

comment:4 Changed 6 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: 0.2.5.x-final

comment:6 Changed 6 years ago by nickm

Keywords: 025-triaged added

comment:7 Changed 6 years ago by nickm

Status: newneeds_review

The "Failed to terminate process with PID '%s'" is almost certainly a duplicate of #8746 : our "destroy a process" logic was really screwy.

I can reproduce that error by starting without a working tor-fw-helper. Does the message go away when you set "portforwardinghelper /path/to/tor-fw-helper "? (Replace the path with a real path.) If so, then this bug isn't caused by a problem in tor-fw-helper.

The remaining problem is that the error message here is completely useless. We're supposed to catch the error message ("ERR: Failed to spawn background process...") in log_from_pipe, but we don't.

And that happens because log_from_pipe() assumes that if the first read() we do gives us EAGAIN, we haven't gotten the error message. (!)

I've got a fix for the bad error message issue in branch "bug9781". But I think that fix isn't right: when I try 0.2.3, the error message is correct, which makes me think we made these files nonblocking while we were doing pluggable transport work in 0.2.4. Bisecting...

comment:8 Changed 6 years ago by nickm

There are only 'skip'ped commits left to test.
The first bad commit could be any of:
da16c425ef8c17a753ae0abc3fdb26a328004062
443260ffd8370c0fddefad42479b957132e8d275
cd05f35d2cdf50d31108428cf5c19549d468dbc0
44fe717524408d247ab10e3794438232d81d66dc
We cannot bisect more!

comment:9 Changed 6 years ago by nickm

Out of those, I most suspect cd05f35d2cdf50.

Here's a fun mismatch it introduced:

// In the child process:
        nbytes = write(STDOUT_FILENO, error_message, error_message_length);

// In the parent:
    stderr_status = log_from_pipe(child_handle->stderr_handle,
                                  LOG_INFO, filename, &retval);

Note that stdout != stderr.

comment:10 Changed 6 years ago by nickm

Status: needs_reviewneeds_revision

Changed 6 years ago by nickm

Attachment: bisect-log added

comment:11 Changed 6 years ago by nickm

21:51 < nickm> athena: on 9781, I could declare my half-baked fix good enough 
               for now, add a new ticket for "do it right, dummy", declare 
               victory, and move on?
21:54 < athena> works for me

comment:12 Changed 6 years ago by nickm

Resolution: fixed
Status: needs_revisionclosed

Created #11922 for the "do it right" fix; merged the fix above into 0.2.5

Note: See TracTickets for help on using tickets.