Opened 8 years ago

Closed 7 years ago

#5608 closed defect (fixed)

Order of sanitizing bridge descriptor tarballs matters even though it shouldn't

Reported by: karsten
Owned by: karsten
Priority: Medium
Milestone:
Component: Metrics/CollecTor
Version:
Severity:
Keywords:
Cc:
Actual Points: 3
Parent ID:
Points:
Reviewer:
Sponsor:

Description

There's a bug in how metrics-db tries to repair references between bridge network statuses, server descriptors, and extra-info descriptors. In theory, the order in which tarballs are processed shouldn't matter, because we have a mapping file with hashed bridge fingerprint, descriptor publication time, server descriptor digest, and extra-info digest. But apparently this approach is buggy. I just tried sanitizing one week of tarballs in forward and in reverse order, and the sanitized descriptors differed between the two runs. I suspect the problem is that bridges can publish more than one descriptor in a single second, but I'm not sure yet. I'm also not sure whether this leads to data loss or not. More analysis required. We might have to sanitize all bridge descriptors again.
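To illustrate the suspected failure mode (not the confirmed cause), here is a toy Python sketch. It assumes a mapping keyed only by hashed fingerprint and publication time, where a later entry overwrites an earlier one. The data, the `build_mapping` helper, and the overwrite behavior are all hypothetical and are not taken from metrics-db.

```python
from datetime import datetime

# Made-up descriptors: (hashed fingerprint, publication time, server descriptor digest).
descriptors = [
    ("A1", datetime(2012, 4, 1, 12, 0, 0), "digest-1"),
    ("A1", datetime(2012, 4, 1, 12, 0, 0), "digest-2"),  # same bridge, same second
    ("B2", datetime(2012, 4, 1, 12, 5, 0), "digest-3"),
]


def build_mapping(descs):
    """Hypothetical mapping: (fingerprint, published) -> digest, last write wins."""
    mapping = {}
    for fingerprint, published, digest in descs:
        mapping[(fingerprint, published)] = digest
    return mapping


forward = build_mapping(descriptors)
backward = build_mapping(reversed(descriptors))

key = ("A1", datetime(2012, 4, 1, 12, 0, 0))
print(forward[key], backward[key])  # the two runs keep different digests
print(forward == backward)
```

If the real mapping behaves anything like this, two descriptors published in the same second compete for one key, and which one survives depends on processing order.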

Change History (2)

comment:1 Changed 7 years ago by karsten

Solved, I think.

We don't have to calculate descriptor identifiers from descriptor contents. Instead, we can simply use the SHA1 of the non-scrubbed descriptor identifier as the identifier in the scrubbed descriptors. This tor-dev posting contains an example.
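A minimal Python sketch of that idea follows. It assumes the original identifier is a hex-encoded digest and that the hash is taken over its raw bytes; whether the actual patch hashes the raw bytes or the hex string is not stated here. The function name `sanitized_digest` and the sample digests are hypothetical.

```python
import hashlib


def sanitized_digest(original_digest_hex: str) -> str:
    """Return the SHA1 of the original digest's raw bytes, hex-encoded.

    Assumption: the input is a hex string and we hash the decoded bytes.
    """
    raw = bytes.fromhex(original_digest_hex)
    return hashlib.sha1(raw).hexdigest()


# Made-up "original" digests standing in for non-scrubbed descriptor identifiers.
originals = [hashlib.sha1(f"descriptor {i}".encode()).hexdigest() for i in range(3)]

# Each sanitized identifier depends only on its own original identifier,
# so processing order cannot change the result.
forward = {d: sanitized_digest(d) for d in originals}
backward = {d: sanitized_digest(d) for d in reversed(originals)}
print(forward == backward)
```

Because the sanitized identifier is a pure function of the original one, no mapping file is needed to keep references between statuses, server descriptors, and extra-info descriptors consistent.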

I briefly thought about security implications of writing the SHA1 of a descriptor digest into a modified version of that descriptor. But we're modifying enough of that descriptor to prevent people from guessing what the original descriptor was. For example, we always replace the bridge fingerprint with its SHA1.

The patch is here. As one can see, this change reduces complexity of the bridge descriptor sanitizer a lot!

comment:2 Changed 7 years ago by karsten

Actual Points: 3
Resolution: fixed
Status: new → closed

Merged and tested to sanitize bridge descriptors containing nicknames. Closing.
