Opened 5 years ago

Last modified 22 months ago

#9968 new enhancement

Time out quicker on microdesc fetch failures while we're bootstrapping

Reported by: arma
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-client bootstrap download sponsor8-maybe
Cc: catalyst
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

#9946 points out that "if one of our three guards is a bust, it will take 120 seconds for requests to it to time out, and even if the other 2 guards are working great, that typically isn't enough to get to 80% bootstrapped. So there will remain cases where we wait 2 minutes to bootstrap -- and maybe this patch even increases the chance of those since any of the three guards can cause them."

We should be more eager to either rotate to a different guard, or try another request in parallel, if a) we're not bootstrapped yet, b) we have directory questions we want answers to, and c) we've recently gotten good answers from our other guards.
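
The three conditions above can be sketched as a single predicate. This is only an illustration: the `DirState` fields, the name `should_retry_early`, and the 10-second window are assumptions, not anything taken from the Tor source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirState:
    """Hypothetical snapshot of client directory state (not a Tor structure)."""
    bootstrap_percent: int                           # 0..100
    pending_dir_requests: int                        # directory questions still unanswered
    secs_since_other_guard_answer: Optional[float]   # None if no other guard has answered


def should_retry_early(state: DirState, recent_window: float = 10.0) -> bool:
    """Return True when conditions (a), (b) and (c) from the ticket all hold.

    recent_window is an arbitrary illustrative value, not a Tor default.
    """
    not_bootstrapped = state.bootstrap_percent < 100             # (a)
    have_questions = state.pending_dir_requests > 0              # (b)
    others_healthy = (                                           # (c)
        state.secs_since_other_guard_answer is not None
        and state.secs_since_other_guard_answer <= recent_window
    )
    return not_bootstrapped and have_questions and others_healthy
```

When the predicate is true, the client could either rotate to a different guard or issue a parallel request. Which of the two is better is left open by the ticket.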

Change History (11)

comment:1 Changed 5 years ago by nickm

I think one of the changes I suggested in #9969 would work well for this: do not have more than N requests of the same type in flight to the same directory guard at the same time. So suppose we have 3 directory guards, one of them broken, and request our microdescriptors in 18 chunks. Under the current system, 1/3 of our requests will be made to the broken guard, and will have to time out. Under the suggestion I made, if N=2, only 2 requests will be made to the broken guard. As the other requests are answered by the other guards, we'll send more requests to them. So only 1/9 of the total requests (2 of 18) would time out, and we could probably bootstrap.
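
A toy simulation of the per-guard in-flight cap described in this comment, written in Python. It is a sketch under simplifying assumptions, not Tor's scheduler: time advances in rounds, every working guard answers all of its outstanding requests each round, and a broken guard never answers. The function name and parameters are hypothetical.

```python
from collections import deque


def simulate(num_chunks, guards, broken, max_in_flight):
    """Return how many chunk requests each guard receives under the cap.

    Requests sent to a guard in `broken` never complete, so they keep
    occupying that guard's in-flight slots for the whole simulation.
    """
    pending = deque(range(num_chunks))
    in_flight = {g: 0 for g in guards}
    sent = {g: 0 for g in guards}
    while pending:
        progressed = False
        # Top up every guard to at most max_in_flight outstanding requests.
        for g in guards:
            while pending and in_flight[g] < max_in_flight:
                pending.popleft()
                in_flight[g] += 1
                sent[g] += 1
                progressed = True
        if not progressed:
            break  # every guard is saturated and none will ever answer
        # Working guards answer everything; broken guards keep their slots.
        for g in guards:
            if g not in broken:
                in_flight[g] = 0
    return sent
```

With the numbers from the comment, `simulate(18, ["A", "B", "C"], {"C"}, 2)` returns `{"A": 8, "B": 8, "C": 2}`: only 2 of the 18 requests are stuck on the broken guard, compared with 6 of 18 when requests are split evenly.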

comment:2 Changed 5 years ago by arma

I'm a fan. The main thing to keep in mind is keeping the pipe from the working guards full -- there's a round-trip cost for asking for more microdescriptors, so we should pipeline our requests well for both fast network connections and slow ones.

comment:3 Changed 5 years ago by nickm

Milestone: changed from Tor: 0.2.5.x-final to Tor: 0.2.???

comment:4 Changed 2 years ago by teor

Milestone: changed from Tor: 0.2.??? to Tor: 0.3.???

Milestone renamed

comment:5 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: changed from Tor: 0.3.??? to Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:6 Changed 22 months ago by arma

Severity: Normal
Summary: changed from "Time out quicker on directory fetches while we're bootstrapping" to "Time out quicker on microdesc fetches while we're bootstrapping"

I'm refocusing this ticket to be about microdescriptor fetches, since we did a bunch of work on parallelizing consensus fetches in proposal 210.

comment:7 Changed 22 months ago by arma

One part of me is thinking that when we get done with all the "one guard to rule them all" tickets, we'll be bottlenecking on our one guard for directory fetches, and this ticket will become moot, so if we can just wait it out, we can close with wontfix.

Another part of me is remembering the lesson from #18963, where we decided that wherever we just fetched the consensus from, that's the perfect place to fetch the certificates from, because we know it works and we already have an established tls conn to it. Would it be totally crazy, when you just fetched the consensus and certs from a fallbackdir or the like, to just go ahead and fetch the microdescs from that relay too?
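
One way to picture the idea in this comment is as an ordering of candidate sources for microdescriptor fetches, with the relay that just served the consensus tried first. The Python below is a hypothetical sketch: the function name and the use of plain strings for relays are assumptions, and it says nothing about how Tor actually selects directory sources.

```python
def order_microdesc_sources(guards, consensus_source=None):
    """Return candidate sources, preferring the relay that served the consensus.

    consensus_source is the relay (for example a fallbackdir) that we already
    have a working connection to, or None if there is no such relay.
    """
    ordered = []
    if consensus_source is not None:
        ordered.append(consensus_source)
    # Keep the remaining guards in their original order, without duplicates.
    ordered.extend(g for g in guards if g != consensus_source)
    return ordered
```

For example, `order_microdesc_sources(["g1", "g2"], "fallback")` puts `"fallback"` first, while calling it without a consensus source leaves the guard list unchanged.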

comment:8 Changed 22 months ago by arma

Summary: changed from "Time out quicker on microdesc fetches while we're bootstrapping" to "Time out quicker on microdesc fetch failures while we're bootstrapping"

comment:9 Changed 22 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:10 Changed 22 months ago by nickm

Keywords: bootstrap download sponsor8-maybe added

comment:11 Changed 22 months ago by catalyst

Cc: catalyst added