Opened 6 years ago

Closed 5 years ago

#9462 closed task (fixed)

BridgeDB networkstatus descriptor parsers need refactoring

Reported by: isis Owned by: isis
Priority: Medium Milestone:
Component: Circumvention/BridgeDB Version:
Severity: Keywords: descriptors, tor-bridge, bridgedb, refactoring
Cc: sysrqb, isis@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Some of this stuff I am not sure if it was part of some super old proposal for some descriptor type. Other stuff is just janky. For example, one of the bugs I just found, on L469 in lib/bridgedb/Bridges.py, wouldn't ever get hit because to my knowledge BridgeDB has never handed out fingerprints. But if it were to, with those lines, it would hand out PT bridges that look like this:

bridge obfs3 1.2.3.4:5555 keyid=0123456789abcdef0123456789abcdef01234567

with the part after keyid= being the bridge's fingerprint.

Another thing which I am currently looking into is BridgeDB's parsing of the networkstatus files, it seems BridgeDB expects there to always be an a line with the ORaddress, whereas this is only the case if the bridge has at least one IPv6 address.

BridgeDB also seems to think that a bridge/relay can't have more than 8 ORaddress:ORport pairs. I remember this discussed years ago, though I thought it was 16, and I don't recall if it was ever implemented in little-t tor.

In short, there is quite a bit of jankiness in BridgeDB's descriptor parsing. I am going to continue fixing things and referencing dir-spec.txt when there is a discrepancy, and just calling BridgeDB out when there appears to be no sane reason for something referenced in any of torspec.

Child Tickets

Change History (8)

comment:1 Changed 6 years ago by isis

Cc: isis@… added
Status: newaccepted

The first one was fixed here:

* commit b5ebab54369de950339a9fa2ff3483ebff32e4f8 (HEAD, tpo-isis/fix/9462A-keyid-equals-fingerprint
| gpg: Signature made Mon 12 Aug 2013 21:58:28 UTC
gpg:                using RSA key A3ADB67A2CDB8B35
gpg: Good signature from "Isis! <isis@patternsinthevoid.net>" [ultimate]
gpg:                 aka "Isis <isis@leap.se>" [ultimate]
gpg:                 aka "Isis <isis@torproject.org>" [ultimate]
gpg:                 aka "Isis! <isis@riseup.net>" [ultimate]
gpg: Signature notation: isis@patternsinthevoid.net=0A6A58A14B5946ABDE18E207A3ADB67A2CDB8B35
gpg: Signature expires Tue 12 Aug 2014 21:58:28 UTC
Author:     Isis Lovecruft <isis@torproject.org>
| AuthorDate: 2 minutes ago
| Commit:     Isis Lovecruft <isis@torproject.org>
| CommitDate: 2 minutes ago
| 
|     Fix prepending of "keyid=" string to PT bridge fingerprints.
| ---
|  lib/bridgedb/Bridges.py | 3 +--
|  1 file changed, 1 insertion(+), 2 deletions(-)
| 
| diff --git a/lib/bridgedb/Bridges.py b/lib/bridgedb/Bridges.py
| index 9fc94e8..846bb77 100644
| --- a/lib/bridgedb/Bridges.py
| +++ b/lib/bridgedb/Bridges.py
| @@ -465,8 +465,7 @@ class PluggableTransport:
|              address = "[%s]" % self.address
|          else: address = self.address
|          host = "%s %s:%d" % (self.methodname, address, self.port)
| -        fp = ''
| -        if includeFingerprint: fp = "keyid=%s" % self.bridge.fingerprint
| +        fp = str(self.bridge.fingerprint) if includeFingerprint else ''
|          args = " ".join(["%s=%s"%(k,v) for k,v in self.argdict.items()]).strip()
|          return "%s %s %s" % (host, fp, args)
|  

comment:2 in reply to:  description Changed 6 years ago by sysrqb

Replying to isis:

Some of this stuff I am not sure if it was part of some super old proposal for some descriptor type. Other stuff is just janky. For example, one of the bugs I just found, on L469 in lib/bridgedb/Bridges.py, wouldn't ever get hit because to my knowledge BridgeDB has never handed out fingerprints. But if it were to, with those lines, it would hand out PT bridges that look like this:

bridge obfs3 1.2.3.4:5555 keyid=0123456789abcdef0123456789abcdef01234567

with the part after keyid= being the bridge's fingerprint.

This seems wrong, but there was a time when BridgeDB did hand out a bridge's ip addr:port and its fingerprint. As I understand it, the fingerprint option was disabled about 2 years ago (or so).

Another thing which I am currently looking into is BridgeDB's parsing of the networkstatus files, it seems BridgeDB expects there to always be an a line with the ORaddress, whereas this is only the case if the bridge has at least one IPv6 address.

BridgeDB also seems to think that a bridge/relay can't have more than 8 ORaddress:ORport pairs. I remember this discussed years ago, though I thought it was 16, and I don't recall if it was ever implemented in little-t tor.

We limit it to 8, but really the BA will not process more than 2, iirc.

In short, there is quite a bit of jankiness in BridgeDB's descriptor parsing. I am going to continue fixing things and referencing dir-spec.txt when there is a discrepancy, and just calling BridgeDB out when there appears to be no sane reason for something referenced in any of torspec.

You should also see #9380 :) I already have a branch for this, which you can use.

comment:3 Changed 6 years ago by asn

fwiw the keyid thing is an artifact of proposal 180:
https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/180-pluggable-transport.txt

I don't think the parsing of 'keyid' was ever implemented in tor.git.

comment:4 in reply to:  3 Changed 6 years ago by isis

Replying to asn:

fwiw the keyid thing is an artifact of proposal 180:
https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/180-pluggable-transport.txt

I don't think the parsing of 'keyid' was ever implemented in tor.git.

Thanks, I missed that. I was looking in dir-spec, not pt-spec or any of the proposals.

Hmm. With the keyid=[fingerprint] part of proposal 180-pluggable-transport, if it were to be implemented, then BridgeDB would have to do all of the awkward things, wherein it searches for the string 'keyid=' if the bridge is a PT, and search for the fingerprint positionally otherwise. That will start to get messy.

I think that the [arglist] format (iirc introduced in 180 also, but it's at least in dir-spec), is a good idea, but perhaps only if the descriptors for vanilla bridges and relays report their fingerprints in this manner as well.

I'm still taking it out for now, until we have descriptors that use 'keyid='.

comment:5 Changed 6 years ago by isis

Keywords: tor-bridge bridgedb refactoring added
Summary: BridgeDB does some strange things when parsing bridge descriptorsBridgeDB descriptor parsers need refactoring
Type: defecttask

Added #9380 as a child ticket, and changed the title to be more general.

comment:6 Changed 6 years ago by isis

My branch, fix/9462C-ipaddr-portlist-module, adds unittests for the bridgedb.parse.addr module with 100% coverage, several fixes to the IP address parsers and the PortList class. This should be reviewed and merged. The build, merged on top of my develop branch, passes CI tests for both Python2.6 and Python2.7.

comment:7 Changed 6 years ago by isis

Status: acceptedneeds_review

Another branch for this ticket, fix/9462-refactor-netstatus-parsers_rdevelop, adds unittests with 100% coverage for networkstatus.py and fixes for bridgedb.parse.networkstatus.

It was tested here (rebased on top of my develop branch) and it passes.

And it was also tested again (rebased on top of my develop branch plus the fix/9462C-ipaddr-portlist-module branch mentioned in the last comment) and it also passes. I would recommend that this version (the fix/9462-refactor-netstatus-parsers_r9462C branch) be merged into develop, rather than rebasing and merging the other two.

comment:8 Changed 5 years ago by isis

Resolution: fixed
Status: needs_reviewclosed
Summary: BridgeDB descriptor parsers need refactoringBridgeDB networkstatus descriptor parsers need refactoring

Changing title to only be about parsing networkstatus descriptors, because the other descriptor parsers still need refactoring.

This is merged into 0.1.0.

Note: See TracTickets for help on using tickets.