Clients try several fallback directory mirrors, and use the first one that connects. Each attempt happens after a short delay, regardless of the state of the previous attempt, until at least one attempt has connected.
When several fallback directory mirrors have failed, clients start trying directory authorities in a similar fashion.
I was looking through the proposals and spec for specific timing and connection info, so that I could make sane initial choices for timeouts for giving up on initial TLS connections to the Tor network for probing, and couldn't find any.
It's also worth noting that the relevant comment in the source code is out of date due to exponential backoff, and the behaviour will also be modified by #17750 (moved):
For a summary, see #22421 (moved). The most relevant lines are:
 * Clients with only authorities will try:
 *  - at least 3 authorities over 10 seconds, then exponentially backoff,
 *    with the next attempt 3-21 seconds later,
 * Clients with authorities and fallbacks will try:
 *  - at least 2 authorities and 4 fallbacks over 21 seconds, then
 *    exponentially backoff, with the next attempts 4-33 seconds later,
Other background info:
The schedules are a list of maximum delays, and the multiplier is 3 (in test networks, 2).
Each attempt occurs after a random delay between (last_delay + 1) and (min(last_delay*3, schedule_max_delay) + 1).
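To make the delay rule above concrete, here is a rough Python sketch. It is not Tor's actual code: the function name next_delay, the inclusive bounds, and the guard for the case where the upper bound falls below the lower bound are all my assumptions for illustration.

```python
import random


def next_delay(last_delay, schedule_max_delay, multiplier=3, rng=random):
    """Pick the next attempt delay, in seconds, from the range described above.

    Hypothetical helper, not taken from the Tor source. multiplier is 3 on
    the public network and 2 in test networks.
    """
    low = last_delay + 1
    high = min(last_delay * multiplier, schedule_max_delay) + 1
    # Assumption: if the computed upper bound is below the lower bound
    # (for example when schedule_max_delay < last_delay), use the lower bound.
    high = max(high, low)
    # Assumption: both ends of the range are inclusive.
    return rng.randint(low, high)


if __name__ == "__main__":
    delay = 0
    for attempt in range(1, 7):
        delay = next_delay(delay, schedule_max_delay=30)
        print(f"attempt {attempt}: wait {delay}s before the next one")
```

Note that with last_delay = 0 both bounds come out as 1, so the first delay in this sketch is always 1 second.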
Calculating for authorities and fallbacks separately:
If 90% of fallbacks are up (and not censored), we expect at least 99.99% of clients to try a fallback that is up within the first 16 seconds (trying at most 4 fallbacks). (We try to rebuild the list when 10% of fallbacks go down.)
If 7/8 authorities are up (and not censored), we expect 100% of clients to try an authority that is up within the first 17 seconds (trying at most 2 authorities). (I'm not sure what the stats are on how many authorities are ever down at the same time.)
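The arithmetic behind those two figures can be checked with a short Python sketch. The modelling choices are my assumptions: fallback attempts are treated as independent, each with a 90% chance of picking a mirror that is up, and authority attempts are treated as picking distinct authorities (no repeats) out of 8 with exactly 1 down. The function names are made up for this example.

```python
from fractions import Fraction


def p_all_down_independent(p_up, attempts):
    """Chance that every attempt hits a down server, with independent picks."""
    return (1 - p_up) ** attempts


def p_all_down_distinct(total, down, attempts):
    """Chance that every attempt hits a down server, picking without repeats."""
    p = Fraction(1)
    for i in range(attempts):
        if down - i <= 0:
            # No down servers left to pick, so at least one attempt is up.
            return Fraction(0)
        p *= Fraction(down - i, total - i)
    return p


# Fallbacks: 90% up, at most 4 attempts.
fallback_ok = 1 - p_all_down_independent(Fraction(9, 10), 4)
print(fallback_ok)   # 9999/10000, i.e. 99.99%

# Authorities: 7 of 8 up, at most 2 distinct attempts.
authority_ok = 1 - p_all_down_distinct(total=8, down=1, attempts=2)
print(authority_ok)  # 1, i.e. 100%
```

If more than one authority is down at once, the second figure drops below 100%; for example, p_all_down_distinct(8, 2, 2) is 1/28.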
So, to answer your original question, a reasonable timeout is 17 seconds + (SSL establishment time) + (the time it takes to download a consensus, certificates, and relay descriptors). We aimed for 30 seconds, because that's when Linda's study found that most users give up.
Or if you can get an event when the client connects to a directory server, you can be smarter about keeping on trying, or giving up at around 20-25 seconds.
These tickets were marked as removed, and nobody has said that they can fix them. Let's remember to look at 033-removed-20180320 as we re-evaluate our triage process, to see whether we're triaging out unnecessarily, and to evaluate whether we're deferring anything unnecessarily. But for now, we can't do these: we need to fix the 033-must stuff now.
Trac: Milestone: Tor: 0.3.3.x-final to Tor: unspecified