Opened 2 years ago

Last modified 11 months ago

#22355 new defect

Update dir-spec with client fallback directory mirror attempt and timeout behaviour

Reported by: teor
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-spec, spec, 033-triage-20180320, 033-removed-20180320
Cc: catalyst
Actual Points:
Parent ID:
Points: 0.5
Reviewer:
Sponsor:

Description

Let's add these lines:

Clients try several fallback directory mirrors, and use the first one that connects. Each attempt happens after a short delay, regardless of the state of the previous attempt, until at least one attempt has connected.

When several fallback directory mirrors have failed, clients start trying directory authorities in a similar fashion.

Somewhere near:

https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n3292

I don't think we designed any explicit timeout behaviour, so we are probably using whatever was there before 0.2.8.
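The staggered-attempt behaviour proposed for the spec text above can be sketched roughly as follows. This is a minimal illustration, not Tor's implementation: `first_to_connect`, the `connect` callback, and the fixed `launch_delay` are all hypothetical names and simplifications (the real delays are randomised from a schedule, and the switch to directory authorities is not modelled).

```python
import asyncio


async def _attempt(connect, mirror, start_after):
    # Hypothetical helper: wait for this attempt's launch time, then connect.
    await asyncio.sleep(start_after)
    await connect(mirror)  # assumed to raise OSError on failure
    return mirror


async def first_to_connect(connect, mirrors, launch_delay):
    """Launch one attempt per mirror, `launch_delay` seconds apart.

    Each attempt starts on its own timer, regardless of whether earlier
    attempts are still pending or have failed. Returns the first mirror
    that connects, or None if every attempt fails.
    """
    tasks = [
        asyncio.ensure_future(_attempt(connect, mirror, i * launch_delay))
        for i, mirror in enumerate(mirrors)
    ]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except OSError:
                continue  # this attempt failed; keep waiting for the others
        return None
    finally:
        # Stop any attempts that are still waiting or connecting.
        for task in tasks:
            task.cancel()
```

A caller would pass its own coroutine as `connect`, for example one that opens a TLS connection to the mirror and raises `OSError` if that fails.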

Change History (12)

comment:1 Changed 2 years ago by mikeperry

I think we should specify the number and timing schedule of how we connect to fallback dirs, in as much detail as the second half of this comment: https://trac.torproject.org/projects/tor/ticket/4483#comment:45.

I was looking through the proposals and spec for specific timing and connection info so that I could make sane initial choices for timeouts for giving up on initial TLS connections to the Tor network for probing, and couldn't find any.

comment:3 Changed 2 years ago by teor

It's also worth noting that the relevant comment in the source code is out of date due to exponential backoff, and the behaviour will also be modified by #17750:

https://gitweb.torproject.org/tor.git/tree/src/or/config.c#n559

For a summary, see #22421. The most relevant lines are:

   * Clients with only authorities will try:
   *  - at least 3 authorities over 10 seconds, then exponentially backoff,
   *    with the next attempt 3-21 seconds later,
   * Clients with authorities and fallbacks will try:
   *  - at least 2 authorities and 4 fallbacks over 21 seconds, then
   *    exponentially backoff, with the next attempts 4-33 seconds later,

Other background info:

The schedules are a list of maximum delays, and the multiplier is 3 (in test networks, 2).

Each attempt occurs after a random delay between (last_delay + 1) and (min(last_delay*3, schedule_max_delay) + 1).
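As a rough sketch of that delay rule (not the code in tor itself), the bounds can be computed like this. The function name, the use of whole seconds, the uniform distribution, and the handling of a `last_delay` that already exceeds the schedule maximum are all assumptions made for illustration.

```python
import random


def next_attempt_delay(last_delay, schedule_max_delay, multiplier=3, rng=random):
    """Pick the next delay using the bounds described in the comment above.

    multiplier is 3 on the public network and 2 in test networks.
    Assumptions: delays are whole seconds and are drawn uniformly.
    """
    low = last_delay + 1
    high = min(last_delay * multiplier, schedule_max_delay) + 1
    # Assumption: if last_delay already exceeds the schedule maximum,
    # stay at the capped upper bound rather than going past it.
    low = min(low, high)
    return rng.randint(low, high)
```

For example, with `last_delay=2` and `schedule_max_delay=10`, the result falls between 3 and 7 inclusive.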

Calculating for authorities and fallbacks separately:

If 90% of fallbacks are up (and not censored), we expect at least 99.99% of clients to try a fallback that is up within the first 16 seconds (trying at most 4 fallbacks). (We try to rebuild the list when 10% of fallbacks go down.)

If 7/8 authorities are up (and not censored), we expect 100% of clients to try an authority that is up within the first 17 seconds (trying at most 2 authorities). (I'm not sure what the stats are on how many authorities are ever down at the same time.)
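The two percentages above can be reproduced with a small calculation. This is only a sketch of one way to arrive at them: it assumes each fallback attempt independently hits a working mirror with probability 0.9, and that the two authority attempts go to two different authorities out of 8 with exactly one down. Neither assumption is stated in the ticket, and the function names are hypothetical.

```python
from math import comb


def p_reach_independent(up_fraction, attempts):
    """Chance that at least one of `attempts` independent tries hits a
    server that is up, when each try succeeds with `up_fraction`."""
    return 1 - (1 - up_fraction) ** attempts


def p_reach_distinct(total, down, attempts):
    """Chance that at least one of `attempts` distinct servers, chosen
    from `total` servers of which `down` are down, is up."""
    if attempts > total:
        raise ValueError("cannot pick more distinct servers than exist")
    # comb(down, attempts) is 0 when fewer servers are down than are tried.
    return 1 - comb(down, attempts) / comb(total, attempts)


fallback_chance = p_reach_independent(0.9, 4)   # about 0.9999
authority_chance = p_reach_distinct(8, 1, 2)    # exactly 1.0
```

With only one authority down, any two distinct authorities must include a working one, which is where the 100% figure comes from under these assumptions.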

So, to answer your original question, a reasonable timeout is 17 seconds + (SSL establishment time) + (the time it takes to download a consensus, certificates, and relay descriptors). We aimed for 30 seconds, because that's when Linda's study found that most users give up.

Or if you can get an event when the client connects to a directory server, you can be smarter about keeping on trying, or giving up at around 20-25 seconds.

comment:4 Changed 23 months ago by arma

Keywords: tor-spec added; torspec removed

switch to tor-spec keyword like most tickets seem to use

comment:5 Changed 21 months ago by nickm

Keywords: spec added

Add 'spec' keyword to items that are just spec fixes. These can land after the feature-freeze.

comment:6 Changed 20 months ago by nickm

Cc: teor added

teor, may I assign this one to you? It's okay to defer to 0.3.3.x, or to do it any time in the 0.3.2.x timeframe.

comment:7 Changed 20 months ago by catalyst

Cc: catalyst added

comment:8 Changed 19 months ago by nickm

Milestone: changed from Tor: 0.3.2.x-final to Tor: 0.3.3.x-final

comment:9 Changed 14 months ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:10 Changed 14 months ago by nickm

Keywords: 033-removed-20180320 added

Mark all not-already-included tickets as pending review for removal from 0.3.3 milestone.

comment:11 Changed 14 months ago by nickm

Milestone: changed from Tor: 0.3.3.x-final to Tor: unspecified

These tickets were marked as removed, and nobody has said that they can fix them. Let's remember to look at 033-removed-20180320 as we re-evaluate our triage process, to see whether we're triaging out unnecessarily, and to evaluate whether we're deferring anything unnecessarily. But for now, we can't do these: we need to fix the 033-must stuff now.

comment:12 Changed 11 months ago by teor

Cc: teor removed

Remove useless CC
