Opened 8 years ago

Closed 8 years ago

#2688 closed defect (fixed)

BridgeDB re-assigns unallocated bridges from/to file buckets without need

Reported by: karsten Owned by: kaner
Priority: High Milestone:
Component: Circumvention/BridgeDB Version:
Severity: Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

It looks like BridgeDB re-assigns unallocated bridges from/to file buckets without need. That is, a bridge that keeps running from one network status to the next might be removed from a file bucket and replaced with another bridge. This leads to quick enumeration of all bridges in the unallocated pool when using file buckets.

A second bug seems to be that BridgeDB appends bridges to file buckets instead of overwriting these files. The result is that there are duplicate entries in files that external distributors use.

Here's how one can reproduce the problem using sanitized bridge descriptors. Yes, this description is lengthy and ugly, but it works for testing BridgeDB in general, even if one doesn't have the original bridge descriptors handy.

Download and extract the sanitized bridge descriptors from January 2009:

https://metrics.torproject.org/data/bridge-descriptors-2009-01.tar.bz2

Create a single bridge-descriptors file containing all bridge descriptors from that month.

$ cd bridge-descriptors-2009-01/
$ echo "@purpose bridge" > purpose
$ echo "router-signature" > routersignature
$ find server-descriptors/ -type f | xargs -I{} cat purpose {} routersignature > bridge-descriptors

Copy the new bridge-descriptors file to BridgeDB's working directory (here: ~/run/).

$ cd ~/run/
$ cp bridge-descriptors-2009-01/bridge-descriptors .

Also copy the sanitized network status file from 2009-01-10 00:07:04 to BridgeDB's working directory and rename it to networkstatus-bridges.

$ cp bridge-descriptors-2009-01/statuses/10/20090110-000704-4A0CCD2DDC7995083D73F5D667100C8A5831F16D networkstatus-bridges

Configure BridgeDB to write 4 bridges to file bucket twitter, otherwise keep the default configuration. Start BridgeDB (note that it may take 30 seconds to digest the 8.5M bridge-descriptors file) and dump bridges to file buckets.

The result is a new file twitter-2011-03-09.brdgs with this content (may vary for you):

10.134.79.249:443
10.236.199.173:443
10.116.76.140:9001
10.51.76.151:18443

It also writes this unallocated-2011-03-09.brdgs file (again, content may vary):

10.126.198.237:443
10.251.69.61:9003
10.31.186.235:49001
10.81.88.5:9001

Replace the networkstatus-bridges with the one from roughly 30 minutes later:

$ cp bridge-descriptors-2009-01/statuses/10/20090110-030709-4A0CCD2DDC7995083D73F5D667100C8A5831F16D networkstatus-bridges

Give BridgeDB a HUP, wait at least 30 seconds, and tell it to dump bridges to file buckets again.

Here's my new twitter-2011-03-09.brdgs file:

10.134.79.249:443
10.236.199.173:443
10.116.76.140:9001
10.51.76.151:18443
10.51.76.151:18443
10.237.143.0:443
10.241.115.62:443
10.239.76.198:443

And my unallocated-2011-03-09.brdgs file:

10.126.198.237:443
10.251.69.61:9003
10.31.186.235:49001
10.81.88.5:9001
10.126.198.237:443
10.251.69.61:9003
10.31.186.235:49001
10.81.88.5:9001
10.134.79.249:443
10.236.199.173:443
10.116.76.140:9001

There are two bugs:

  • Why was 10.236.199.173:443 (2nd line in twitter-2011-03-09.brdgs) removed from this file bucket and put back in the unallocated ring again (last but one line in unallocated-2011-03-09.brdgs)? I confirmed that this is the same bridge using the new pool assignments patch that is not merged yet. This is the first bug described above.
  • Why are IP:port lines appended to these files? This is the second bug described above.

Child Tickets

Change History (8)

comment:1 Changed 8 years ago by kaner

Status: newaccepted

Bugs confirmed. I have an idea what it could be.

comment:2 Changed 8 years ago by karsten

Did you make any progress in turning your idea into a patch? :)

comment:3 Changed 8 years ago by kaner

Yup, I've got something cooking, but free cycles for BridgeDB are currently pretty low on my side. Pushing the changes won't happen before next week I think.

comment:4 Changed 8 years ago by kaner

Status: acceptedneeds_review

Please review commit 1f4a5df7b531ad3c8e0d106174464f60d84bb0f3 on kaner/bridgedb.git

comment:5 Changed 8 years ago by karsten

Status: needs_reviewassigned
  • The code changes look good, but I'm having a hard time finding what precise change fixed the bug. Does it have to do with the pseudo distributor prefix? Can you explain?
  • Please pick a better commit message than "Fix #2688". Something self-contained would be good. The message should explain what the bug was and what the fix is.
  • Why did you change the list type bucketList to the dictionary type buckets? If this change is not part of the bugfix, please take it out of this commit and make a separate commit for refactoring stuff.
  • What's the purpose of adding incrementBucket()? Why is this a method of BucketManager and not BucketData? Also, the failure mode of not doing anything if the bucket name cannot be found seems unsafe to me; is this intended? Again, if this is a refactoring thing only, please take it out of this commit.
  • See also the changes in branch bug2866 in my public repository. Note that I didn't test these changes yet.

comment:6 in reply to:  5 Changed 8 years ago by karsten

Replying to karsten:

  • See also the changes in branch bug2866 in my public repository. Note that I didn't test these changes yet.

Of course I meant bug2688. Thanks rransom.

comment:7 in reply to:  5 Changed 8 years ago by kaner

Status: assignedneeds_review

Replying to karsten:

  • The code changes look good, but I'm having a hard time finding what precise change fixed the bug. Does it have to do with the pseudo distributor prefix? Can you explain?

Yes, thats correct. The pseudo distributor prefix was missing in the list of buckets.

  • Why did you change the list type bucketList to the dictionary type buckets? If this change is not part of the bugfix, please take it out of this commit and make a separate commit for refactoring stuff.

I thought there'd be a search performance optimization looking up a specific bucket in the list. Turns out it isn't worth it. Especially with the ugliefication that incrementBucket() introduces. Dismissed this change.

  • See also the changes in branch bug2866 in my public repository. Note that I didn't test these changes yet.

Cherry-picked those.

See branch bug2688_rebased for a cleaned up version.

comment:8 Changed 8 years ago by karsten

Resolution: fixed
Status: needs_reviewclosed

Looks good. I added a small change (fileName -> filename) that you probably overlooked when cherry-picking, rebased to origin/master, and pushed.

Feel free to deploy the new origin/master whenever you're ready.

Closing this bug. Thanks!

Note: See TracTickets for help on using tickets.