If this happens, 0 (or uninitialised data) is used instead of the zeroth schedule item. (This matters most for bridges, which have schedules that start at 60 rather than 0.)
I wonder if any of the existing Tor code does this?
We could set a flag in download_status_t to 1 in download_status_reset, then assert it's set in the low-level functions that take a download_status_t:
find_dl_schedule_and_len
download_status_increment_failure
(not download_status_reset, it sets the flag)
download_status_get_n_failures
download_status_get_next_attempt_at
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
...
Show closed items
Linked items
0
Link issues together to show that they're related.
Learn more.
This bug means that bootstrapping clients an authority and a fallback immediately, and then try another fallback 1-3 seconds later, and another 4-9 seconds after that. The intention was to try an authority after we'd tried the first 2-3 fallbacks.
Please see my branch bug17750_029 for a general fix for this.
It could go all the way back to 0.2.9 if we wanted it to, but we should definitely test it in master first.
This bug could only ever have affected ClientBootstrapConsensusAuthorityDownloadSchedule and TestingBridgeDownloadSchedule, because every other schedule starts with 0 (the default).
And TestingBridgeDownloadSchedule is already initialised correctly.
I opened #22403 (moved) as a follow up for those cases where we directly access a download_status_t's fields rather than using an accessor function.
Trac: Summary: A download_status_t can be used before calling download_status_reset on it to Make bootstrapping clients wait before trying an authority Reviewer: N/AtoN/A Milestone: Tor: unspecified to Tor: 0.3.1.x-final Keywords: review, easy deleted, tor-bootstrap added Actualpoints: N/Ato 0.3 Status: new to needs_review Version: N/Ato Tor: 0.2.8.1-alpha
I'm not so good with this code, but I can confirm that the patch seems to have the behavior that teor describes. (The code used to launch two consensus fetches in parallel, one to an authority and one to a guard or fallback. Now it launches only one, not to an authority, and 1 second later if needed it launches another, also not to an authority. I didn't try triggering stuff after that, but hey, what could go wrong. :)
Agreed it is smart to put this patch into 0.3.1. In fact, we might not need/want to backport it before that -- since it's the sort of thing where it will be a long time until we realize our mistake, if there is one.
Please see my branch bug17750_029, which also adds some regression tests for this bug and #20534 (moved).
I marked this for 0.3.1 backport. If we do backport, we only need to backport the squashed commit "Make clients try fallbacks before authorities".
And we might not want to backport at all, given the risk of bugs in older versions. The only thing this bug causes is more load on the directory authorities.
Squashed as "bug17750_029_squashed" and merged to master. Let's let it cook there for a little while before we decide about a backport; it's a bit large for stable or post-freeze.
Trac: Milestone: Tor: 0.3.2.x-final to Tor: 0.3.1.x-final Keywords: review-group-20 deleted, N/Aadded Status: needs_review to merge_ready