Opened 5 years ago

Closed 4 years ago

Last modified 4 years ago

#14334 closed defect (fixed)

Bugs when registering guard status in connection_or_connect()

Reported by: asn Owned by: asn
Priority: Medium Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Normal Keywords: tor-guard, 027-triaged-1-in, PostFreeze027, TorCoreTeam201604, tor-guards-revamp
Cc: Actual Points:
Parent ID: Points: small
Reviewer: Sponsor: SponsorU-can

Description

In connection_or_connect() we can see the following snippet:

  switch (connection_connect(TO_CONN(conn), conn->base_.address,
                             &addr, port, &socket_error)) {
    case -1:
      /* If the connection failed immediately, and we're using
       * a proxy, our proxy is down. Don't blame the Tor server. */
      if (conn->base_.proxy_state == PROXY_INFANT)
        entry_guard_register_connect_status(conn->identity_digest,
                                            0, 1, time(NULL));
      connection_or_connect_failed(conn,
                                   errno_to_orconn_end_reason(socket_error),
                                   tor_socket_strerror(socket_error));
      connection_free(TO_CONN(conn));
      return NULL;

I see two problems here:

a) connection_or_connect() can fail in ways that are unrelated to the guard node. For example it can fail with this codepath, in which case the guard should not be marked as down:

  s = tor_open_socket_nonblocking(protocol_family,SOCK_STREAM,IPPROTO_TCP);
  if (! SOCKET_OK(s)) {
    *socket_error = tor_socket_errno(-1);
    log_warn(LD_NET,"Error creating network socket: %s",
             tor_socket_strerror(*socket_error));
    return -1;
  }

b) Also, he comment seems to state the opposite than what the code does. That is, the comment claims that if we are using a proxy and it's down, we shouldn't mark the guard as down. But the code only marks the guard as down, if the proxy state is PROXY_INFANT which is the state of a proxy when we haven't connected to it yet. I think the bug was introduced by me in a79bea40d.

I think a better solution here is to check for proxy_type == PROXY_NONE before marking the guard as down.

Child Tickets

Change History (17)

comment:1 Changed 5 years ago by nickm

Milestone: Tor: 0.2.6.x-finalTor: 0.2.7.x-final
Owner: set to asn
Status: newassigned

comment:2 Changed 5 years ago by nickm

Keywords: 027-triaged-1-in added

Marking more tickets as triaged-in for 0.2.7

comment:3 Changed 5 years ago by isabela

Keywords: SponsorU added
Points: small
Version: Tor: 0.2.7

comment:4 Changed 4 years ago by nickm

Keywords: PostFreeze027 added

If we wind up with a nice patch for any of these in the appropriate window, we should sure merge it.

comment:5 Changed 4 years ago by nickm

Milestone: Tor: 0.2.7.x-finalTor: 0.2.8.x-final

Not expected patches for these by Monday. :(

comment:6 Changed 4 years ago by nickm

Keywords: SponsorU removed
Sponsor: SponsorU

Bulk-replace SponsorU keyword with SponsorU field.

comment:7 Changed 4 years ago by asn

Severity: Normal

Hmm, I wonder what the right fix is here.

Could we just remove this block of code entirely?

      /* If the connection failed immediately, and we're using
       * a proxy, our proxy is down. Don't blame the Tor server. */
      if (conn->base_.proxy_state == PROXY_INFANT)
        entry_guard_register_connect_status(conn->identity_digest,
                                            0, 1, time(NULL));

After all, is there any chance that connection_connect() will fail immediately with -1 due to the remote guard node being down? It seems to me that if connection_connect() fails immediately then it's probably some networking issue in the local machine and it shows nothing about the state of the guard node.

comment:8 Changed 4 years ago by isabela

Sponsor: SponsorUSponsorU-can

comment:9 Changed 4 years ago by asn

Milestone: Tor: 0.2.8.x-finalTor: 0.2.9.x-final

This seems like a non-triggerable bug, so I'm going to defer this to 0.2.9 so as to not stall the upcoming 0.2.8 release.

I put this ticket on my TODO list so I should have a patch soon.

comment:10 Changed 4 years ago by asn

Status: assignedneeds_review

Please see my branch bug14334 for a fix here based on comment:7.

I opted for removing that whole block of code, because it doesn't seem useful. connection_connect() can fail for various local reasons, which have nothing to do with the status of our guard.

The only problem with this patch would be if there are any platforms where connect() actually connects to the guard in a blocking manner and then immediately returns -1 if the connection fails. If this is the case, then we need a more complex patch where we would distinguish between connection_connect_sockaddr() failing for local reasons and connect() failing for remote reasons before we mark the guard as down.

comment:11 Changed 4 years ago by nickm

Keywords: TorCoreTeam201604 added

Every postponed needs_review ticket should get a review in April

comment:12 Changed 4 years ago by nickm

Could I have a comment in this case to say why we aren't calling entry_guard_register_connect_status? Otherwise I worry it will get added back in.

Otherwise looks fine.

comment:13 in reply to:  12 Changed 4 years ago by asn

Replying to nickm:

Could I have a comment in this case to say why we aren't calling entry_guard_register_connect_status? Otherwise I worry it will get added back in.

Hm, why do you expect entry_guard_register_connect_status() to be called there? If there is a reason, we need to add it back in. Otherwise, I don't know why anyone would add it back in.

My point with comment:10 was that there is no reason to call entry_guard_register_connect_status() right after connection_connect() since we are dealing with asynchronous networking and by that time we have not even connected to the remote host. The right place to call that function is at asynchronous connection callbacks like circuit_build_failed() (where we do call it).

If that's not persuasive, then I might be wrong. What do you think?

comment:14 Changed 4 years ago by nickm

Hm, why do you expect entry_guard_register_connect_status() to be called there?

Because in most cases where the connection fails, we blame the guard. It isn't right to blame the guard in this case, but we should IMO actually say so.

comment:15 in reply to:  14 Changed 4 years ago by asn

Status: needs_reviewmerge_ready

Replying to nickm:

Hm, why do you expect entry_guard_register_connect_status() to be called there?

Because in most cases where the connection fails, we blame the guard. It isn't right to blame the guard in this case, but we should IMO actually say so.

Sounds good. I pushed a fixup commit that adds a comment. :)

[I change this ticket to "ready for merge". Hope that's a good way to use this feature.]

comment:16 Changed 4 years ago by nickm

Resolution: fixed
Status: merge_readyclosed

lgtm; squashed and merged!

comment:17 Changed 4 years ago by nickm

Keywords: tor-guards-revamp added
Note: See TracTickets for help on using tickets.