Opened 8 years ago

Closed 4 years ago

Last modified 4 years ago

#2895 closed defect (fixed)

BridgeDB assumes that cached-descriptors[.new] are in chronological order

Reported by: karsten
Owned by: isis
Priority: Low
Milestone:
Component: Obfuscation/BridgeDB
Version:
Severity:
Keywords: bridgedb-parsers, bridgedb-0.3.0
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

When parsing bridge descriptors, BridgeDB assumes that descriptors in the bridge descriptor files are in chronological order and that descriptors in cached-descriptors.new are newer than those in cached-descriptors. If this is not the case, BridgeDB overwrites a bridge's IP address and OR port with those from an older descriptor.

I think that the current cached-descriptors* files that Tor produces always have descriptors in chronological order. But once we change that, e.g., when trying to limit the number of descriptors that Tor memorizes, BridgeDB will behave funny.

We should look at the bridge descriptor that is referenced from the bridge network status by its publication time and ignore all other bridge descriptors from the same bridge.
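
The proposal above can be sketched roughly as follows. This is only an illustration, not BridgeDB code: the `StatusEntry` and `Descriptor` types and the `select_referenced()` helper are hypothetical names, and a descriptor is assumed to be identified by its bridge fingerprint plus publication time.

```python
from datetime import datetime
from typing import Dict, Iterable, NamedTuple


class StatusEntry(NamedTuple):
    """Hypothetical view of one bridge entry in the network status."""
    fingerprint: str
    published: datetime


class Descriptor(NamedTuple):
    """Hypothetical view of one parsed bridge descriptor."""
    fingerprint: str
    published: datetime
    address: str
    or_port: int


def select_referenced(statuses: Iterable[StatusEntry],
                      descriptors: Iterable[Descriptor]) -> Dict[str, Descriptor]:
    """Keep only the descriptor whose publication time matches the status.

    The result does not depend on the order of `descriptors`, so it makes
    no assumption about the files being in chronological order.
    """
    wanted = {s.fingerprint: s.published for s in statuses}
    selected = {}
    for desc in descriptors:
        # Descriptors for unknown bridges, or with a different publication
        # time than the one the status references, are ignored.
        if wanted.get(desc.fingerprint) == desc.published:
            selected[desc.fingerprint] = desc
    return selected
```

With this approach an older descriptor that happens to appear later in a file can no longer overwrite the address and OR port taken from the referenced one.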

Change History (6)

comment:1 Changed 6 years ago by sysrqb

Owner: set to aagbsn
Status: changed from new to assigned

Is this still relevant?

comment:2 Changed 4 years ago by isis

Owner: changed from aagbsn to isis
Status: changed from assigned to accepted

comment:3 Changed 4 years ago by isis

Keywords: bridgedb-parsers bridgedb-0.3.0 added
Resolution: fixed
Status: changed from accepted to closed

If this was referring to the cached-extrainfo and cached-extrainfo.new files (to my knowledge, BridgeDB has never had cached-descriptor* files), then this is a bug (related to #11216), and it would mean that the transport lines of newer descriptors would potentially be overwritten by older, duplicate descriptors.

If that's the bug we're talking about, then here's the fix. :) Otherwise, feel free to reopen and/or add more information.

This was fixed by commits 7869e4c7cd1e43f9354480f6cafba0794fd86433, 65f18cd7c31a97274a8277688f0512b69ac1e3e3, and fe70415269693948bdfc5c8ea3abfab2b1d86c49 in my fix/9380-stem_r10 branch.

Those commits introduce the bridgedb.parse.descriptors.deduplicate() function, which is called in the bridgedb.parse.descriptors.parseExtraInfoFiles() function. The former deduplicates all descriptors for every bridge, selecting only the newest descriptor for a particular bridge. Additionally, if any Bridge has multiple @type bridge-extrainfo descriptors with exactly the same timestamps, then a bridgedb.parse.descriptors.DescriptorWarning will be issued. An unmodified tor shouldn't be capable of producing perfectly identical descriptors, so their presence would imply either a drastic regression in tor or that someone has created a possibly-malicious OR implementation. Unit tests and integration tests which verify that these behaviours are functioning as expected have also been added.
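
A minimal sketch of the behaviour described above, for illustration only. It is not the code from those commits: the `ExtraInfo` type and the `deduplicate_newest()` helper are hypothetical, and `DescriptorWarning` here is a local stand-in for the class named above.

```python
import warnings
from datetime import datetime
from typing import Dict, Iterable, NamedTuple, Tuple


class DescriptorWarning(Warning):
    """Local stand-in for bridgedb.parse.descriptors.DescriptorWarning."""


class ExtraInfo(NamedTuple):
    """Hypothetical view of one parsed extrainfo descriptor."""
    fingerprint: str
    published: datetime
    transports: Tuple[str, ...]


def deduplicate_newest(descriptors: Iterable[ExtraInfo]) -> Dict[str, ExtraInfo]:
    """Return the newest descriptor per bridge, regardless of input order."""
    newest: Dict[str, ExtraInfo] = {}
    for desc in descriptors:
        current = newest.get(desc.fingerprint)
        if current is None:
            newest[desc.fingerprint] = desc
        elif desc.published == current.published:
            # Same timestamp as the newest descriptor seen so far for this
            # bridge: keep the first one and warn about the duplicate.
            warnings.warn(
                "Duplicate descriptor timestamp for bridge %s" % desc.fingerprint,
                DescriptorWarning,
            )
        elif desc.published > current.published:
            newest[desc.fingerprint] = desc
        # Otherwise the descriptor is older than the one we hold; drop it.
    return newest
```

Because only timestamps are compared, it makes no difference whether a descriptor came from cached-extrainfo or cached-extrainfo.new, or where in the file it appeared.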

comment:4 in reply to:  3 ; Changed 4 years ago by arma

Replying to isis:

to my knowledge, BridgeDB has never had cached-descriptor* files

Hm? That's how bridgedb used to know what bridges exist -- Tonga would export its cached-descriptor* files and bridgedb would import them.

In fact, I'm a bit confused that it doesn't still have them, yet there are extrainfo descriptors. How do you know which extrainfo descriptor matches up to which bridge descriptor? Isn't that what the "extra-info-digest" line in the bridge descriptor is for?

comment:5 in reply to:  4 ; Changed 4 years ago by isis

Replying to arma:

Replying to isis:

to my knowledge, BridgeDB has never had cached-descriptor* files

Hm? That's how bridgedb used to know what bridges exist -- Tonga would export its cached-descriptor* files and bridgedb would import them.

The files currently given to BridgeDB by Tonga are: networkstatus-bridges, bridge-descriptors, cached-extrainfo, and cached-extrainfo.new.

In fact, I'm a bit confused that it doesn't still have them, yet there are extrainfo descriptors. How do you know which extrainfo descriptor matches up to which bridge descriptor? Isn't that what the "extra-info-digest" line in the bridge descriptor is for?

Yes, that is what it is for.

No, BridgeDB (as of #9380) doesn't currently do this. Instead, it chains the verification of descriptors using the router-signature on the @type bridge-extrainfo document. (I can gladly add code to check the descriptor digest too; that would be part of #9380. It might require more resources during the extrainfo deduplication, stage #6 below, since the deduplication would need to hash each @type bridge-extrainfo descriptor and check that the hashes match. I would still prefer to also check the signature on the @type bridge-extrainfo descriptor, so that both checks would need to pass before updating the Bridge with any of the extrainfo.)

BridgeDB's verification chain for descriptors currently (as of #9380) goes like this:

  1. Parse the @type bridge-networkstatus documents in the networkstatus-bridges file.
  2. Create Bridge class instances for each of the documents parsed in step #1. Call the Bridge.updateFromNetworkStatus() method with the corresponding networkstatus document for each Bridge. This includes storing the descriptor digest for each Bridge.
  3. Parse the @type bridge-server-descriptors found in the bridge-descriptors file.
  4. Update each Bridge only if the descriptor digest matches the digested value of the @type bridge-server-descriptor that was just parsed.
  5. Store the extra-info-digest from each @type bridge-server-descriptor.
  6. Parse and deduplicate the @type bridge-extrainfo descriptors in cached-extrainfo and cached-extrainfo.new.
  7. Verify the router-signature on the @type bridge-extrainfo descriptor for each bridge, using the signing-key from the Bridge's @type bridge-server-descriptor.
  8. Update the Bridge's PluggableTransport class instances.
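
The digest check in the middle of the chain above (update a Bridge only when the digest stored from the networkstatus matches the parsed server descriptor, then store the extra-info-digest) could look roughly like the sketch below. It is a simplification under stated assumptions, not BridgeDB's implementation: the `Bridge` dataclass, `digest_of()`, and `update_from_server_descriptor()` are hypothetical, the digest is computed as SHA-1 over the whole raw input rather than over the exact portion of the descriptor that tor digests, and the router-signature verification step is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bridge:
    """Hypothetical, simplified stand-in for BridgeDB's Bridge class."""
    fingerprint: str
    descriptor_digest: str                  # stored from the networkstatus
    address: Optional[str] = None
    extra_info_digest: Optional[str] = None


def digest_of(raw: bytes) -> str:
    """Simplified digest: uppercase hex SHA-1 over the raw bytes given."""
    return hashlib.sha1(raw).hexdigest().upper()


def update_from_server_descriptor(bridge: Bridge, raw: bytes, address: str,
                                  extra_info_digest: str) -> bool:
    """Update the bridge only if the descriptor is the referenced one."""
    if digest_of(raw) != bridge.descriptor_digest:
        # Not the descriptor the networkstatus points at; leave the
        # bridge untouched.
        return False
    bridge.address = address
    bridge.extra_info_digest = extra_info_digest
    return True
```

Checking the digest before any update is what makes the result independent of the order in which descriptors appear in the bridge-descriptors file.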

comment:6 in reply to:  5 Changed 4 years ago by arma

Replying to isis:

The files currently given to BridgeDB by Tonga are: networkstatus-bridges, bridge-descriptors, cached-extrainfo, and cached-extrainfo.new.

Ah ha. Somewhere in the transfer process I believe it does a "cat cached-descriptors* > bridge-descriptors". So those are indeed the bridge descriptors, just in a different file name, and both files together.

BridgeDB's verification chain for descriptors currently (as of #9380) goes like this:

Looks plausible. Thanks!
