Opened 8 years ago

Closed 4 years ago

#4178 closed enhancement (wontfix)

Specify a TorBEL archive data format

Reported by: karsten
Owned by:
Priority: Medium
Milestone:
Component: Core Tor/TorDNSEL
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:


Sebastian says that TorBEL only publishes current scan results in its CSV file (and in its other output files). Once a relay disappears from the consensus, TorBEL removes the scan results for that relay, so that its IP address can get unblocked by Wikipedia et al. as soon as possible. The CSV file is going to be updated every five minutes.

That makes sense for TorBEL's main use case. But we want to archive TorBEL's output files and use them as input for VisiTor, ExoneraTor, and similar tools. We'll want to know whether a relay was found to exit via a given IP address at a given time. But we want to avoid archiving every output file that TorBEL publishes, which would be highly redundant.

How about we define an archive data format that extends TorBEL's CSV format in Section 2.1 of data-spec.txt? The change would be that we never remove an entry for a given ExitAddress and RouterID. There could be many such entries, each with a distinct LastTestedTimestamp. The new uniqueness criterion would be (ExitAddress, RouterID, LastTestedTimestamp).

We could define this archive data format in the same spec or only on formats.html. We could implement the new format by making the TorBEL host create a copy of the CSV file whenever it changes and having metrics-db rsync and merge these files into the archive data format.
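The merge step could look roughly like the sketch below. It is only an illustration: it assumes each CSV copy has a header row naming the ExitAddress, RouterID, and LastTestedTimestamp columns, which data-spec.txt may define differently, and merge_snapshot is a hypothetical helper, not existing metrics-db code.

```python
import csv
import io

# Proposed uniqueness criterion for the archive data format.
# Assumption: these are also the literal CSV column names.
KEY_FIELDS = ("ExitAddress", "RouterID", "LastTestedTimestamp")


def merge_snapshot(archive, snapshot_text):
    """Add the rows of one CSV copy to the archive.

    `archive` is a dict keyed by (ExitAddress, RouterID, LastTestedTimestamp).
    Existing entries are never removed, even when a relay is missing from
    the new copy. Returns the number of newly added entries.
    """
    added = 0
    for row in csv.DictReader(io.StringIO(snapshot_text)):
        key = tuple(row[field] for field in KEY_FIELDS)
        if key not in archive:
            archive[key] = row
            added += 1
    return added


if __name__ == "__main__":
    header = "ExitAddress,RouterID,LastTestedTimestamp\n"
    archive = {}
    # First copy: one scan result.
    merge_snapshot(archive, header + "192.0.2.1,AAAA,1000\n")
    # Second copy: the same relay was tested again.
    merge_snapshot(archive, header + "192.0.2.1,AAAA,1300\n")
    # Third copy: the relay left the consensus, so TorBEL dropped it.
    merge_snapshot(archive, header)
    print(len(archive))
```

Keying on the full triple means a copy that merely repeats old results adds nothing, which is what keeps the archive from being as redundant as storing every published file.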

Once we have this archive data format, we'll have to update VisiTor and ExoneraTor to parse it.
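On the consuming side, an ExoneraTor-style lookup over the archive might be sketched as follows. This is a guess at the shape of the query, not how either tool works: it assumes LastTestedTimestamp has been parsed into integer seconds, and both the exits_via function and its one-hour window are hypothetical choices.

```python
def exits_via(entries, address, when, window=3600):
    """Return the RouterIDs found to exit via `address` around time `when`.

    `entries` is an iterable of (ExitAddress, RouterID, LastTestedTimestamp)
    tuples with integer timestamps. An entry matches if the relay was tested
    at most `window` seconds before `when` (an assumed tolerance).
    """
    return sorted({
        router_id
        for exit_address, router_id, tested in entries
        if exit_address == address and 0 <= when - tested <= window
    })


if __name__ == "__main__":
    entries = [
        ("192.0.2.1", "AAAA", 1000),
        ("192.0.2.1", "BBBB", 5000),
        ("198.51.100.7", "AAAA", 1200),
    ]
    print(exits_via(entries, "192.0.2.1", 1500))
```

Because old entries are never removed, the same query keeps working for relays that have long since left the consensus.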

Change History (1)

comment:1 Changed 4 years ago by arlolra

Resolution: wontfix
Severity: Normal
Status: new → closed

TorBEL is unmaintained.
