Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#24983 closed defect (fixed)

Inaccessible semi-recent consensus files

Reported by: robgjansen
Owned by: karsten
Priority: Medium
Milestone:
Component: Metrics/CollecTor
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

I noticed there is a gap between the consensus files that are available in the latest snapshot, e.g.:
https://collector.torproject.org/archive/relay-descriptors/consensuses/consensuses-2018-01.tar.xz

and those available in the list of recent consensus files:
https://collector.torproject.org/recent/relay-descriptors/consensuses/

As of right now, many consensus files from January 19th, 2018 are inaccessible (they are too old to be in the recent list and too new to exist in the latest archived snapshot).

I realize that those files will get added to the January 2018 archive snapshot soon, but it seems to me like a problem in general that at any time some consensus files may be inaccessible.
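
To make the gap concrete, here is a minimal sketch of how a consensus can end up in neither location. It assumes that recent/ keeps roughly the last 72 hours of descriptors; the function name and the timestamps in the usage lines are hypothetical and only illustrate the situation described above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: recent/ serves roughly the last 72 hours of descriptors.
RECENT_WINDOW = timedelta(hours=72)


def locate_consensus(valid_after, now, last_archived, recent_window=RECENT_WINDOW):
    """Return 'recent', 'archive', or 'gap' for a consensus timestamp.

    valid_after   -- timestamp of the consensus we want to fetch
    now           -- current time
    last_archived -- newest consensus contained in the latest tarball
    """
    if valid_after >= now - recent_window:
        return "recent"
    if valid_after <= last_archived:
        return "archive"
    # Too old for recent/, too new for the latest archive snapshot.
    return "gap"


# Hypothetical timestamps, chosen only to reproduce the reported symptom.
utc = timezone.utc
now = datetime(2018, 1, 23, 12, 0, tzinfo=utc)
last_archived = datetime(2018, 1, 18, 23, 0, tzinfo=utc)

print(locate_consensus(datetime(2018, 1, 19, 12, 0, tzinfo=utc), now, last_archived))
print(locate_consensus(datetime(2018, 1, 22, 12, 0, tzinfo=utc), now, last_archived))
print(locate_consensus(datetime(2018, 1, 10, 12, 0, tzinfo=utc), now, last_archived))
```

Any consensus that falls into the "gap" branch stays unavailable until the next archive tarball is published.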

Change History (7)

comment:1 Changed 3 years ago by robgjansen

Is it a reasonable change to just expand the number of files available in the recent directory of the web server?

comment:2 Changed 3 years ago by karsten

Owner: changed from metrics-team to karsten
Status: changed from new to accepted

Yes, this is a known (to me) but probably not yet documented issue. Thanks for creating this ticket! It bothered me from time to time, but not enough to open a ticket. ;)

So, making more files available in the recent/ directory would be one option. But all tools downloading from that directory would then have to fetch even more data. I'm thinking of newly bootstrapped Onionoo instances for development purposes, for example.

Another option would be to create tarballs for the archive/ directory more often. Maybe every 2 days instead of every 3 days. That would solve this issue, too, right? If so, and if there are no concerns, I'll change the cronjob to try this out for a week or so.

Accepting this ticket.
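
A rough way to compare the two schedules is to compute the worst-case gap right before a new tarball is published. This is only a sketch under stated assumptions: the 72-hour recent/ window and the 12-hour tarball creation delay are assumed values, not measured ones, and the function name is hypothetical.

```python
from datetime import timedelta


def worst_case_gap(archive_interval, recent_window, tarball_delay):
    """Longest span of descriptors missing from both recent/ and archive/.

    Just before a new tarball is published, the newest archived descriptor
    is (archive_interval + tarball_delay) old, while recent/ only reaches
    back recent_window. Anything in between is unavailable.
    """
    oldest_unarchived_age = archive_interval + tarball_delay
    return max(oldest_unarchived_age - recent_window, timedelta(0))


# Assumed values, for illustration only.
recent_window = timedelta(hours=72)
tarball_delay = timedelta(hours=12)

for days in (3, 2):
    gap = worst_case_gap(timedelta(days=days), recent_window, tarball_delay)
    print(f"every {days} days: worst-case gap of {gap.total_seconds() / 3600} hours")
```

Under these assumed numbers, the 3-day schedule leaves a 12-hour gap and the 2-day schedule leaves none. A skipped run doubles the effective interval, so a gap can still appear with the shorter schedule if a run is missed.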

comment:3 Changed 3 years ago by robgjansen

This bugged me now because I actually need the files from the 19th :)

> Another option would be to create tarballs for the archive/ directory more often. Maybe every 2 days instead of every 3 days. That would solve this issue, too, right?

I think that would work, yes. I can't think of any problems with that approach other than slightly higher CPU usage for a brief period.

Thanks!

comment:4 in reply to: 2 Changed 3 years ago by iwakeh

Replying to karsten:

> Yes, this is a known (to me) but probably not yet documented issue. Thanks for creating this ticket! It bothered me from time to time, but not enough to open a ticket. ;)
>
> So, making more files available in the recent/ directory would be one option. But all tools downloading from that directory would then have to fetch even more data. I'm thinking of newly bootstrapped Onionoo instances for development purposes, for example.

Valid point.

> Another option would be to create tarballs for the archive/ directory more often. Maybe every 2 days instead of every 3 days. That would solve this issue, too, right? If so, and if there are no concerns, I'll change the cronjob to try this out for a week or so.

That should be fine.
(We should keep this in mind when we get around to 'java-izing' the archiving process.)

comment:5 Changed 3 years ago by karsten

So, it looks like the files from January 19 are still not available, because the cronjob on January 22 was interrupted by a server reboot. The next regular run would start tomorrow on January 25.

I changed the cronjob from every 3 days to every 2 days. And I triggered a manual run which should finish some time this evening (UTC). The next cronjob will start at 03:25 UTC tonight.
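
The actual crontab entry is not shown in this ticket. Assuming the schedule is expressed as a cron day-of-month step (`*/3` before, `*/2` after), the following sketch lists the days on which the job would start in a given month. The function name is hypothetical.

```python
import calendar


def cron_step_days(year, month, step):
    """Days of the month matched by a cron day-of-month field of `*/step`.

    A step over the day-of-month range starts counting at day 1.
    """
    last_day = calendar.monthrange(year, month)[1]
    return list(range(1, last_day + 1, step))


print("every 3 days:", cron_step_days(2018, 1, 3))
print("every 2 days:", cron_step_days(2018, 1, 2))
```

Under that assumption, the 3-day schedule includes January 22 and January 25, which matches the run dates mentioned above, and the 2-day schedule also includes January 25. Note that a day-of-month step restarts at day 1 each month, so runs at a month boundary can be closer together than the nominal interval.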

comment:6 Changed 3 years ago by karsten

Resolution: fixed
Status: changed from accepted to closed

No obvious issues from changing the cronjob from every 3 to every 2 days. Closing. Thanks!

comment:7 Changed 3 years ago by robgjansen

Thanks for taking care of this so quickly, Karsten! :)
