Consider providing descriptor tarballs as .tar.xz rather than .tar.bz2
nickm notes that xz -9 compresses descriptor tarballs a lot better than bzip2.
Sample 1: file sizes in kB for May consensuses:
22620 consensuses-bzip2.bz2
2532 consensuses-xz.xz
1948 consensuses-xz9.xz
(Will add another sample once yatei is done compressing April votes.)
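To put the Sample 1 numbers in relative terms, here is a minimal sketch that computes each file's size as a fraction of the bzip2 tarball. The sizes are copied from the listing above; the dictionary keys and variable names are only labels chosen for this illustration.

```python
# Sizes in kB, copied from Sample 1 (May consensuses).
sizes_kb = {"bzip2": 22620, "xz": 2532, "xz -9": 1948}

baseline = sizes_kb["bzip2"]

# Fraction of the bzip2 tarball size that each variant needs.
ratios = {name: size / baseline for name, size in sizes_kb.items()}

for name, ratio in ratios.items():
    print(f"{name:6s} {sizes_kb[name]:6d} kB  {ratio:6.1%} of bzip2 size")
```

For this one sample, plain xz comes out at roughly 11% of the bzip2 size and xz -9 at roughly 8.6%. A single month of consensuses is not enough to generalize to votes or other descriptor types.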
Switching is as easy as editing the shell script that is run every 3 days on yatei. Recompressing existing tarballs is also just a shell command away.
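For illustration, recompressing an existing tarball could look roughly like the sketch below. It is not the script that runs on yatei (that one is a shell script and is not shown in this ticket); it is a stand-in using only Python's standard bz2 and lzma modules, and the function name recompress_bz2_to_xz is made up for this example. It streams the data, so the uncompressed tarball never has to fit in memory or on disk.

```python
import bz2
import lzma
import shutil
from pathlib import Path


def recompress_bz2_to_xz(src: Path, preset: int = 9) -> Path:
    """Rewrite a .bz2 file as .xz next to it and return the new path.

    Hypothetical helper for illustration; preset=9 corresponds to xz -9.
    The original .bz2 file is left in place.
    """
    if src.suffix != ".bz2":
        raise ValueError(f"expected a .bz2 file, got {src.name}")
    # Only the last suffix changes: foo.tar.bz2 -> foo.tar.xz
    dst = src.with_suffix(".xz")
    with bz2.open(src, "rb") as fin, lzma.open(dst, "wb", preset=preset) as fout:
        # Copy in 1 MiB chunks so memory use stays bounded.
        shutil.copyfileobj(fin, fout, length=1024 * 1024)
    return dst
```

Because the sketch keeps the .bz2 file, the old and new tarballs can sit side by side until the new one has been verified.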
Are there drawbacks to consider? Maybe:
- Compression will take longer; right now, at the end of a month, yatei spends about 1 hour running bzip2 on the various tarballs. That might become 2 or 3 hours with xz.
- People won't find tarballs under the usual URL, because their file extensions will change. (https://metrics.torproject.org/data.html is going to list the correct URLs though.)
- Anything else?