BridgeDB assumes that cached-descriptors[.new] are in chronological order

added bridgedb-0.3.0 bridgedb-parsers component::circumvention/bridgedb owner::isis priority::low resolution::fixed status::closed type::defect labels

Is this still relevant?

Trac:
Owner: N/A to aagbsn
Status: new to assigned

Trac:
Owner: aagbsn to isis
Status: assigned to accepted

If this was referring to the cached-extrainfo and cached-extrainfo.new files (to my knowledge, BridgeDB has never had cached-descriptor* files), then this is a bug (related to #11216 (moved)), and it would mean that the transport lines of newer descriptors would potentially be overwritten by older, duplicate descriptors.

If that's the bug we're talking about, then here's the fix. :) Otherwise, feel free to reopen and/or add more information.

This was fixed by commits 7869e4c7cd1e43f9354480f6cafba0794fd86433, 65f18cd7c31a97274a8277688f0512b69ac1e3e3, and fe70415269693948bdfc5c8ea3abfab2b1d86c49 in my fix/9380-stem_r10 branch.

Those commits introduce the bridgedb.parse.descriptors.deduplicate() function, which is called in the bridgedb.parse.descriptors.parseExtraInfoFiles() function. The former deduplicates all descriptors for every bridge, selecting only the newest descriptor for a particular bridge. Additionally, if any Bridge has multiple @type bridge-extrainfo descriptors with exactly the same timestamps, then a bridgedb.parse.descriptors.DescriptorWarning will be issued, since perfectly identical descriptors shouldn't be something an unmodified tor is capable of producing (and thus would imply that there is either a drastic regression in tor, or that someone has created a possibly-malicious OR implementation). Unit tests and integration tests which verify that these behaviours are functioning as expected have also been added.
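For readers who want the idea without reading the commits, here is a minimal sketch of order-independent deduplication. It is not the code from the branch: the ExtraInfo class and its fields are hypothetical, and DescriptorWarning here is only a local stand-in for bridgedb.parse.descriptors.DescriptorWarning.

```python
import warnings
from dataclasses import dataclass
from datetime import datetime


class DescriptorWarning(Warning):
    """Local stand-in for bridgedb.parse.descriptors.DescriptorWarning."""


@dataclass(frozen=True)
class ExtraInfo:
    # Hypothetical minimal descriptor: only the fields this sketch needs.
    fingerprint: str
    published: datetime
    transports: tuple = ()


def deduplicate(descriptors):
    """Return {fingerprint: newest descriptor}, regardless of input order."""
    newest = {}
    for desc in descriptors:
        seen = newest.get(desc.fingerprint)
        if seen is None or desc.published > seen.published:
            # First descriptor for this bridge, or strictly newer: keep it.
            newest[desc.fingerprint] = desc
        elif desc.published == seen.published:
            # Same bridge, same timestamp: keep the first one and warn.
            warnings.warn(
                "Duplicate extrainfo descriptors with identical timestamps "
                "for bridge %s" % desc.fingerprint,
                DescriptorWarning,
            )
    return newest
```

Because the comparison is on the published timestamp rather than on file position, it should not matter whether cached-extrainfo or cached-extrainfo.new is read first.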

Trac:
Status: accepted to closed
Keywords: N/A deleted, bridgedb-0.3.0, bridgedb-parsers added
Resolution: N/A to fixed

Replying to isis:

to my knowledge, BridgeDB has never had cached-descriptor* files

Hm? That's how bridgedb used to know what bridges exist -- Tonga would export its cached-descriptor* files and bridgedb would import them.

In fact, I'm a bit confused that it doesn't still have them, yet there are extrainfo descriptors. How do you know which extrainfo descriptor matches up to which bridge descriptor? Isn't that what the "extra-info-digest" line in the bridge descriptor is for?

Replying to arma:

Replying to isis:

to my knowledge, BridgeDB has never had cached-descriptor* files

Hm? That's how bridgedb used to know what bridges exist -- Tonga would export its cached-descriptor* files and bridgedb would import them.

The files currently given to BridgeDB by Tonga are: networkstatus-bridges, bridge-descriptors, cached-extrainfo, and cached-extrainfo.new.

In fact, I'm a bit confused that it doesn't still have them, yet there are extrainfo descriptors. How do you know which extrainfo descriptor matches up to which bridge descriptor? Isn't that what the "extra-info-digest" line in the bridge descriptor is for?

Yes, that is what it is for.

No, BridgeDB (as of #9380 (moved)) doesn't currently do this, but instead chains the verification of descriptors using the router-signature on the @type bridge-extrainfo document. (Although, I can gladly add code to check the descriptor digest too… that would be part of #9380 (moved). That might require more resources for the parsing and hashing of the @type bridge-extrainfo descriptors during the extrainfo deduplication, step 6 below, since the deduplication would need to hash each one and check that the hashes match. I would still prefer to additionally check the signature on the @type bridge-extrainfo descriptor, so that both would need to validate before updating the Bridge with any of the extrainfo.)

BridgeDB's verification chain for descriptors currently (as of #9380 (moved)) goes like this:

1. Parse the @type bridge-networkstatus documents in the networkstatus-bridges file.
2. Create Bridge class instances for each of the documents we parsed in step 1. Call the Bridge.updateFromNetworkStatus() method with the corresponding networkstatus document for each Bridge. This includes storing the descriptor digest for each Bridge.
3. Parse the @type bridge-server-descriptors found in the bridge-descriptors file.
4. Update each Bridge only if the descriptor digest matches the digested value of the @type bridge-server-descriptor that was just parsed.
5. Store the extra-info-digest from each @type bridge-server-descriptor.
6. Parse and deduplicate the @type bridge-extrainfo descriptors in cached-extrainfo and cached-extrainfo.new.
7. Verify the router-signature on the @type bridge-extrainfo descriptor for each bridge, using the signing-key from the Bridge's @type bridge-server-descriptor.
8. Update the Bridge's PluggableTransport class instances.
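
The digest check in the middle of that chain (update a Bridge only if the stored descriptor digest matches) could look roughly like the sketch below. It is not BridgeDB's code: the function names and the dict used in place of a Bridge instance are hypothetical, and it assumes the digest is the SHA-1 of the descriptor text from the "router " line through the newline after "router-signature", as described in the Tor directory specification. Signature verification is left out.

```python
import hashlib


def descriptor_digest(text):
    """Upper-case hex SHA-1 over the signed portion of a server descriptor."""
    marker = "\nrouter-signature\n"
    start = text.index("router ")            # skip annotations such as @purpose
    end = text.index(marker) + len(marker)   # include the newline after the keyword
    return hashlib.sha1(text[start:end].encode("ascii")).hexdigest().upper()


def update_if_digest_matches(bridge, descriptor_text):
    """Accept the descriptor only if it hashes to the digest already stored.

    `bridge` is a plain dict standing in for a Bridge instance (assumption).
    """
    if descriptor_digest(descriptor_text) != bridge["descriptor_digest"]:
        return False
    bridge["descriptor"] = descriptor_text
    return True
```

The same pattern would apply to an extra-info-digest check, with the hash taken over the extrainfo document instead.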

Replying to isis:

The files currently given to BridgeDB by Tonga are: networkstatus-bridges, bridge-descriptors, cached-extrainfo, and cached-extrainfo.new.

Ah ha. Somewhere in the transfer process I believe it does a "cat cached-descriptors* > bridge-descriptors". So those are indeed the bridge descriptors, just in a different file name, and both files together.

BridgeDB's verification chain for descriptors currently (as of #9380 (moved)) goes like this:

Looks plausible. Thanks!

closed

mentioned in issue #9380 (moved)

mentioned in issue #11216 (moved)

moved to tpo/anti-censorship/bridgedb#2895 (closed)
