Opened 10 years ago

Closed 9 years ago

Last modified 7 years ago

#1138 closed defect (fixed)

If bridge authority is unreachable, client doesn't fallback to bridge

Reported by: arma Owned by: mwenge
Priority: High Milestone: Tor: 0.2.2.x-final
Component: Core Tor/Tor Version: 0.2.1.19
Severity: Keywords: easy
Cc: arma, nickm, phobos, Sebastian Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by nickm)

When a Tor client starts up using a bridge, and UpdateBridgesFromAuthority is
set (which it is when Vidalia configures you to use a bridge), Tor will go to
the authority first and look up the bridge by fingerprint.

If the bridge authority doesn't have the bridge, or Tor doesn't know the
fingerprint, then Tor will fall back to asking the bridge directly.

If the bridge authority is filtered, though, then Tor will never notice that
the bridge authority lookup failed. So it will never fall back. Fail.

Our workaround for now is to stop giving out fingerprints via bridgedb.

[Automatically added by flyspray2trac: Operating System: All]

Child Tickets

Attachments (1)

1138.diff (475 bytes) - added by mwenge 9 years ago.
Review patch from mingw-san (applied)

Download all attachments as: .zip

Change History (26)

comment:1 Changed 9 years ago by nickm

Milestone: Tor: 0.2.2.x-final

comment:2 Changed 9 years ago by nickm

Description: modified (diff)
Keywords: easy added

The current retry code in question [that is, the code that isn't getting called when the bridge authority is filtered] is the part at the start of dir_routerdesc_download_failed() that eventually calls {{{ retry_bridge_descriptor_fetch_directly(); So all we need to do is make sure that this retry code also happens when the request to the bridge authority fails because it was filtered.

Two possible places to notice that it was filtered are at connection_dir_reached_eof() [if the filter closes the connection to the authority], or in main.c as we time out connections [if the filter just drops packets to the bridge authority.]

Alternatively, we could change fetch_bridge_descriptors() so that if a bridge descriptor download has failed a few times, we retry directly. For that, you'd want to track the number of failed downloads from the bridge authority in about the same places you currently mess with bridge->download_status.

comment:3 Changed 9 years ago by mwenge

Owner: set to mwenge
Status: newassigned

comment:4 Changed 9 years ago by mwenge

I think the place to hook this is connection_dir_download_routerdesc_failed() when called from connection_dir_request_failed(). That should cover both timeouts and closures.

comment:6 Changed 9 years ago by mwenge

I have only managed to simulate dropping packets in testing. As far as I can tell though, connections that are closed rather than dropped will also end up in connection_dir_request_failed().

comment:7 in reply to:  5 ; Changed 9 years ago by mingw-san

Replying to mwenge:

https://gitweb.torproject.org/hoganrobert/tor.git/shortlog/refs/heads/bug1138

Need to replace "if (which)" with "if (smartlist_len(which))"?

comment:8 in reply to:  7 ; Changed 9 years ago by mingw-san

Replying to mingw-san:

Replying to mwenge:

https://gitweb.torproject.org/hoganrobert/tor.git/shortlog/refs/heads/bug1138

Need to replace "if (which)" with "if (smartlist_len(which))"?

memory leaks with connection_dir_bridge_routerdesc_failed(), smartlist "which" and elements need to free.

comment:9 in reply to:  8 Changed 9 years ago by mwenge

Replying to mingw-san:

Replying to mingw-san:

Replying to mwenge:

https://gitweb.torproject.org/hoganrobert/tor.git/shortlog/refs/heads/bug1138

Need to replace "if (which)" with "if (smartlist_len(which))"?

memory leaks with connection_dir_bridge_routerdesc_failed(), smartlist "which" and elements need to free.

fixed. thanks!

Changed 9 years ago by mwenge

Attachment: 1138.diff added

Review patch from mingw-san (applied)

comment:10 Changed 9 years ago by arma

Triage: it would be great to fix this for 0.2.2.x, but it's already busted for 0.2.1, so there's no reason to block 0.2.2.

comment:11 Changed 9 years ago by mwenge

Sunday 29 August 2010] [16:22:17] Quit j27Z2 has left this server (Ping timeout: 480 seconds).
[Sunday 29 August 2010] [16:33:10] Join j27Z2 has joined this channel (~x@…).
[Sunday 29 August 2010] [16:35:32] <Sebastian> mwenge: reviewing your #1138 patch
[Sunday 29 August 2010] [16:35:48] <mwenge> thanks
[Sunday 29 August 2010] [16:36:20] <Sebastian> You're lacking documentation for was_ei in connection_dir_retry_bridges()
[Sunday 29 August 2010] [16:36:42] <Sebastian> Also it seems better to not have that parameter at all
[Sunday 29 August 2010] [16:36:57] <Sebastian> and rather still check the assert at the caller
[Sunday 29 August 2010] [16:37:32] <Sebastian> Then you also don't need to document it :)
[Sunday 29 August 2010] [16:38:44] <Sebastian> Calling the first argument "failed" seems weird. Maybe just call it descs
[Sunday 29 August 2010] [16:39:22] Join alkovic has joined this channel (~Undefined@…).
[Sunday 29 August 2010] [16:40:33] <mwenge> ok
[Sunday 29 August 2010] [16:40:52] <Sebastian> (Please don't just update the branch, but rather add a new patch to change these things. Maybe nickm started reviewing it too)
[Sunday 29 August 2010] [16:40:58] <Sebastian> (We can still squash before merge)
[Sunday 29 August 2010] [16:41:07] <mwenge> ah ok
[Sunday 29 August 2010] [16:41:56] <Sebastian> in connection_dir_bridge_routerdesc_failed(), do we want to tor_assert(conn->requested_resource) ?
[Sunday 29 August 2010] [16:42:48] <Sebastian> If we can't do that then we have a bug
[Sunday 29 August 2010] [16:44:13] <Sebastian> What do you think?
[Sunday 29 August 2010] [16:46:52] <Sebastian> next is that you're saying "conn->requested_resource + 3", please add a comment to say "skip over fp/" or just do + strlen("fp/")
[Sunday 29 August 2010] [16:49:32] <Sebastian> and then finally I think we're using names like connection_dir_bridge_routerdesc_failed() to ask questions like "did this fail". Maybe we find a better name, maybe something like connection_dir_bridge_routerdesc_handle_failure
[Sunday 29 August 2010] [16:49:34] <Sebastian> or such
[Sunday 29 August 2010] [16:50:00] <Sebastian> Ok please pick from this list what you agree with :)
[Sunday 29 August 2010] [16:50:45] <mwenge> Sebastian: thanks for all that, will work through them and respond on trac. i think i only need to think about the was_ei one.
[Sunday 29 August 2010] [16:51:13] <Sebastian> Should I add everything to trac?
[Sunday 29 August 2010] [16:51:26] <Sebastian> Happy to do that, though I like discussing feedback before adding it.
[Sunday 29 August 2010] [16:52:03] <mwenge> no, it's ok i'll add it there. i agree with everything.

I ended up disagreeing with the rename-functions suggestion. But the rest of Sebastian's suggestions have been added to my bug1138 branch.

comment:12 Changed 9 years ago by nickm

looks plausible, but...

tor_assert(!conn->_base.purpose == DIR_PURPOSE_FETCH_EXTRAINFO) will DEFINITELY not do what you expect, since ! has higher precedence than ==.

comment:13 in reply to:  12 Changed 9 years ago by mwenge

Replying to nickm:

looks plausible, but...

tor_assert(!conn->_base.purpose == DIR_PURPOSE_FETCH_EXTRAINFO) will DEFINITELY not do what you expect, since ! has higher precedence than ==.

Whoops! Updated in the branch.

comment:14 Changed 9 years ago by arma

Our convention is to say smartlist_t *descs, as opposed to smartlist_t* descs

comment:15 Changed 9 years ago by arma

+  tor_assert(conn->requested_resource);
   /* Requests for bridge descriptors are in the form 'fp/', so ignore
      anything else. */
   if (conn->requested_resource && strcmpstart(conn->requested_resource,"fp/"))

The first part of that if is now redundant.

It seems the defensive programming way to do that would be to say

if (!conn->requested_resource
strcmpstart(...))

I'm not comfortable with the assert in general, because conn->requested_resource only gets assigned once we're actually sending the request. That *may* be fine for begin_dir cells, since they shouldn't fail instantaneously if the request is going over Tor.

Still, notice how connection_dir_request_failed() gets called from directory_initiate_command_rend() in the case where connect() fails. That's at least one codepath.

This area sure is a mess. See also the

        /* XXX we only pass 'conn' above, not 'resource', 'payload',
         * etc. So in many situations it can't retry! -RD */

comment right around there.

I vote for the 'defensive programming way' I mentioned above, as the short-term fix.

comment:16 in reply to:  15 ; Changed 9 years ago by arma

Replying to arma:

It seems the defensive programming way to do that would be to say

if (!conn->requested_resource
strcmpstart(...))

Fucking trac. Here it is again.

  if (!conn->requested_resource || strcmpstart(...))
    return;

comment:17 Changed 9 years ago by arma

For posterity, I tested it with

src/or/tor usebridge 1 bridge "188.223.60.50:443 A16012DBD5AC2A1AAB0E37FE7DB463F2A8D67C14" UpdateBridgesFromAuthority 1 alternatebridgeauthority "tonga orport=443 bridge 85.108.88.19:80 4A0C CD2D DC79 9508 3D73 F5D6 6710 0C8A 5831 F1ED"

rather than your iptables trick

comment:18 Changed 9 years ago by arma

+    if (conn->router_purpose == ROUTER_PURPOSE_BRIDGE)
+      connection_dir_bridge_routerdesc_failed(conn);
     connection_dir_download_routerdesc_failed(conn);

This part makes me sad. It would be best to put the bridge-specific function inside the more general function, not to call both of them back-to-back.

But then, I bet these functions just happen to be superficially named similar things, and are called from different enough places that the names are a coincidence.

Somebody should open a "refactor directory.c" trac entry. Not for 0.2.2.x though.

comment:19 Changed 9 years ago by arma

Ok. Other than those, I give this patch a clean bill of health. Let's hope I'm right. :)

It actually works, which is more than can be said for the current situation. Thanks!

comment:20 in reply to:  16 Changed 9 years ago by mwenge

Replying to arma:

Replying to arma:

It seems the defensive programming way to do that would be to say

if (!conn->requested_resource
strcmpstart(...))

Fucking trac. Here it is again.

  if (!conn->requested_resource || strcmpstart(...))
    return;

Updated my branch. Thanks!

comment:21 Changed 9 years ago by nickm

But the assert is still there, being problematic. If I understand right, arma's suggestion was that the assert go away entirely, since it's not nice to crash. And I'm not seeing a patch to fix the "smartlist_t* foo" thing.

I can clean these up before merging, unless you want to fix 'em first.

comment:22 Changed 9 years ago by nickm

See the patch at the head of the bug1138 branch in my public. It tries to fix the remaining issues from Roger's review I saw above. mwenge, arma: do you like it?

comment:23 Changed 9 years ago by nickm

Resolution: Nonefixed
Status: needs_reviewclosed

arma likes it. Will merge. mwenge: please let me know if I screwed up. Thanks again! :)

comment:24 in reply to:  23 Changed 9 years ago by mwenge

Replying to nickm:

arma likes it. Will merge. mwenge: please let me know if I screwed up. Thanks again! :)

That was just sloppy - sorry!

comment:25 Changed 7 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.