Extend descriptorCutOff in CollecTor's RelayDescriptorDownloader by 6 hours
CollecTor's RelayDescriptorDownloader only downloads server and extra-info descriptors that were published within the 24 hours before the current system time. This cut-off makes sense: it keeps missing descriptors that cannot be obtained from being retried forever.
However, there are cases when a valid consensus or vote references a server descriptor that was published over 24 hours ago:
- CollecTor may run at any time of the hour, at which point the valid-after time of current consensuses and votes may already be up to 1 hour behind the current system time.
- The votes that a consensus is based on are generated 10 minutes before the valid-after time, and they may contain server descriptors that have been published in the past 24 hours.
- Directory authorities may serve an older consensus than the current consensus, say, one that is already 2 hours older than the current one.
All in all, CollecTor should attempt to fetch descriptors that are up to 27:10 hours old (24:00 + 0:10 + 1:00 + 2:00), or let's say 30 hours for simplicity and to account for cases we didn't consider here.
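The sum can be double-checked with a few lines of Java. This is only a sketch of the arithmetic from the list above; the class and method names are made up for illustration and are not part of CollecTor.

```java
import java.time.Duration;

// Hypothetical helper, not CollecTor code: adds up the delays listed above.
public class CutOffEstimate {

  /** Returns the worst-case descriptor age implied by the cases above. */
  static Duration worstCaseAge() {
    Duration maxDescriptorAge = Duration.ofHours(24);  // age of descriptors a vote may reference
    Duration voteLeadTime = Duration.ofMinutes(10);    // votes are generated before valid-after
    Duration runOffset = Duration.ofHours(1);          // CollecTor may run late in the hour
    Duration staleConsensus = Duration.ofHours(2);     // authority serves an older consensus
    return maxDescriptorAge.plus(voteLeadTime).plus(runOffset)
        .plus(staleConsensus);
  }

  public static void main(String[] args) {
    Duration worst = worstCaseAge();
    Duration proposed = Duration.ofHours(30);
    System.out.printf("worst case: %d:%02d hours%n",
        worst.toHours(), worst.toMinutes() % 60);
    System.out.println("covered by 30 hours: "
        + (proposed.compareTo(worst) >= 0));
  }
}
```

The 2 hours for a stale consensus is the example figure from the list, not a hard bound, which is part of why rounding up to 30 hours seems reasonable.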
The downside is that missing descriptors will be retried for 6 more hours, but that doesn't seem to be that much of a problem, given that missing descriptors will be retried in batches of up to 96.
Here's a trivial patch:
diff --git a/src/main/java/org/torproject/collector/relaydescs/RelayDescriptorDownloader.java b/src/main/java/org/torproject/collector/relaydescs/RelayDescriptorDownloader.java
index f4e38f4..21b1ee4 100644
--- a/src/main/java/org/torproject/collector/relaydescs/RelayDescriptorDownloader.java
+++ b/src/main/java/org/torproject/collector/relaydescs/RelayDescriptorDownloader.java
@@ -185,7 +185,9 @@ public class RelayDescriptorDownloader {
   /**
    * Cut-off time for missing server and extra-info descriptors, formatted
    * "yyyy-MM-dd HH:mm:ss". This time is initialized as the current system
-   * time minus 24 hours.
+   * time minus 30 hours (24 hours for the maximum age of descriptors to be
+   * referenced plus 6 hours for the time between generating votes and
+   * processing a consensus).
    */
   private String descriptorCutOff;
@@ -330,7 +332,7 @@ public class RelayDescriptorDownloader {
     long now = System.currentTimeMillis();
     this.currentValidAfter = format.format((now / (60L * 60L * 1000L))
         * (60L * 60L * 1000L));
-    this.descriptorCutOff = format.format(now - 24L * 60L * 60L * 1000L);
+    this.descriptorCutOff = format.format(now - 30L * 60L * 60L * 1000L);
     this.currentTimestamp = format.format(now);
     this.downloadAllDescriptorsCutOff = format.format(now
         - 23L * 60L * 60L * 1000L - 30L * 60L * 1000L);