Do not let appended descriptor files grow too large

added component::metrics/collector owner::karsten priority::medium severity::normal status::new type::enhancement labels

Trac:
Status: assigned to needs_review

Here's another option: rather than appending multiple descriptors to a single flat file, we could produce a tarball containing the few hundred or few thousand descriptor files. Basically,

https://collector.torproject.org/recent/relay-descriptors/server-descriptors/2020-03-10-14-05-00-server-descriptors

containing 596 descriptors concatenated to a 1.4 MiB file would then be replaced by

https://collector.torproject.org/recent/relay-descriptors/server-descriptors/2020-03-10-14-05-00-server-descriptors.tar

containing 596 descriptor files.
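To make the idea concrete, here is a minimal sketch of packing one descriptor per tar member with Python's standard `tarfile` module. This is only an illustration, not CollecTor's actual code: the function name `build_descriptor_tarball` and the member names are hypothetical, and nothing here settles how members would be named inside the real tarball.

```python
import io
import tarfile


def build_descriptor_tarball(descriptors):
    """Pack each descriptor into its own member of an uncompressed tar archive.

    `descriptors` maps a (hypothetical) member name to raw descriptor bytes.
    Returns the archive as bytes.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # Sort by name so the member order is deterministic.
        for name, content in sorted(descriptors.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()
```

Each descriptor keeps its own boundary inside the archive, so a reader never has to split a concatenated file on descriptor keywords.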

The advantage over the approach sketched out above is that we would no longer have three output file formats (flat file with 1 descriptor, flat file with >= 1 descriptors, tarball). The disadvantage might be that processing tarballs can be less convenient than processing flat files.
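On the convenience point, a rough sketch of what the reading side might look like, again using only Python's standard `tarfile` module. The helper name `iter_descriptors` is hypothetical, and the sketch assumes an uncompressed tar held fully in memory, which the thread does not specify.

```python
import io
import tarfile


def iter_descriptors(tar_bytes):
    """Yield (member name, descriptor bytes) for every regular file in the archive."""
    # "r:" restricts reading to uncompressed tar archives.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as archive:
        for member in archive:
            # Skip directories, links, and other non-file members.
            if not member.isfile():
                continue
            yield member.name, archive.extractfile(member).read()
```

A consumer would loop over `iter_descriptors(...)` much as it loops over descriptors in a flat file today, at the cost of an extra unpacking step.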

irl and I just talked this over and concluded that producing tarballs is the better design here. It solves the large-files issue, and it might even fix data integrity/consistency issues that just haven't surfaced yet. I'm going to write a patch for the tarball idea sometime in the next few weeks. Thanks!

Trac:
Status: needs_review to new
