Opened 6 years ago
Last modified 3 years ago
#9968 new enhancement
Time out quicker on microdesc fetch failures while we're bootstrapping
Reported by: | arma | Owned by: | |
---|---|---|---|
Priority: | Medium | Milestone: | Tor: unspecified |
Component: | Core Tor/Tor | Version: | |
Severity: | Normal | Keywords: | tor-client bootstrap download sponsor8-maybe |
Cc: | catalyst | Actual Points: | |
Parent ID: | | Points: | |
Reviewer: | | Sponsor: | |
Description
#9946 points out that "if one of our three guards is a bust, it will take 120 seconds for requests to it to time out, and even if the other 2 guards are working great, that typically isn't enough to get to 80% bootstrapped. So there will remain cases where we wait 2 minutes to bootstrap -- and maybe this patch even increases the chance of those since any of the three guards can cause them."
We should be more eager to either rotate to a different guard, or try another request in parallel, if a) we're not bootstrapped yet, b) we have directory questions we want answers to, and c) we've recently gotten good answers from our other guards.
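A rough sketch of how the three conditions above might combine into a single check. Every name here (`GuardState`, `should_retry_early`, `recent_window`) is hypothetical and does not correspond to anything in the Tor codebase, and the 10-second window for "recently" is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GuardState:
    """Hypothetical per-guard bookkeeping, not a real Tor structure."""
    name: str
    last_good_answer: Optional[float]  # timestamp of last good reply, or None


def should_retry_early(bootstrapped: bool,
                       pending_requests: int,
                       slow_guard: str,
                       guards: List[GuardState],
                       now: float,
                       recent_window: float = 10.0) -> bool:
    """Return True if we should rotate away from, or race, the slow guard.

    recent_window is an assumed value for what counts as "recently".
    """
    # (a) only be eager while we are still bootstrapping
    if bootstrapped:
        return False
    # (b) only if there are directory questions still unanswered
    if pending_requests <= 0:
        return False
    # (c) only if some *other* guard has answered well recently
    return any(
        g.name != slow_guard
        and g.last_good_answer is not None
        and now - g.last_good_answer <= recent_window
        for g in guards
    )
```

Whether the result should trigger a guard rotation or a parallel request is left open here, as it is in the description.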
Change History (11)
comment:1 Changed 6 years ago by
comment:2 Changed 6 years ago by
I'm a fan. The main thing to keep in mind is keeping the pipe from the working guards full -- there's a round-trip cost for asking for more microdescriptors, so we should pipeline our requests well for both fast network connections and slow ones.
comment:3 Changed 6 years ago by
Milestone: | Tor: 0.2.5.x-final → Tor: 0.2.??? |
---|
comment:5 Changed 3 years ago by
Keywords: | tor-03-unspecified-201612 added |
---|---|
Milestone: | Tor: 0.3.??? → Tor: unspecified |
Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.
comment:6 Changed 3 years ago by
Severity: | → Normal |
---|---|
Summary: | Time out quicker on directory fetches while we're bootstrapping → Time out quicker on microdesc fetches while we're bootstrapping |
I'm refocusing this ticket to be about microdescriptor fetches, since we did a bunch of work on parallelizing consensus fetches in proposal 210.
comment:7 Changed 3 years ago by
One part of me is thinking that when we get done with all the "one guard to rule them all" tickets, we'll be bottlenecking on our one guard for directory fetches, and this ticket will become moot, so if we can just wait it out, we can close with wontfix.
Another part of me is remembering the lesson from #18963, where we decided that wherever we just fetched the consensus from, that's the perfect place to fetch the certificates from, because we know it works and we already have an established TLS connection to it. Would it be totally crazy, when you just fetched the consensus and certs from a fallbackdir or the like, to just go ahead and fetch the microdescs from that relay too?
comment:8 Changed 3 years ago by
Summary: | Time out quicker on microdesc fetches while we're bootstrapping → Time out quicker on microdesc fetch failures while we're bootstrapping |
---|
comment:9 Changed 3 years ago by
Keywords: | tor-03-unspecified-201612 removed |
---|
Remove an old triaging keyword.
comment:10 Changed 3 years ago by
Keywords: | bootstrap download sponsor8-maybe added |
---|
comment:11 Changed 3 years ago by
Cc: | catalyst added |
---|
I think one of the changes I suggested in #9969 would work well for this: do not have more than N requests of the same type in flight to the same directory guard at the same time. So suppose we have 3 directory guards, and request our microdescriptors in 18 chunks. Under the current system, 1/3 of our requests (6 of 18) will be made to the broken guard, and will have to time out. Under the suggestion I made, if N=2, only 2 requests will be made to the broken guard. As the other requests are answered by the other guards, we'll send more requests to them. So only 1/9 of the total requests (2 of 18) would time out, and we could probably bootstrap.
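To check the arithmetic, here is a toy simulation of that in-flight cap. It is a sketch under simplifying assumptions, not Tor's actual scheduling: the uncapped case is modeled as plain round-robin, working guards answer everything in flight each round, and a broken guard never answers. The function name `simulate_requests` and the guard names are made up for illustration.

```python
from fractions import Fraction


def simulate_requests(num_chunks, guards, broken, max_in_flight=None):
    """Count how many chunk requests land on each guard.

    guards: ordered list of guard names.
    broken: set of guards that never answer (requests to them time out).
    max_in_flight: per-guard cap N; None models even round-robin spreading.
    """
    sent = {g: 0 for g in guards}

    if max_in_flight is None:
        # Assumed model of the current behaviour: spread chunks evenly.
        for i in range(num_chunks):
            sent[guards[i % len(guards)]] += 1
        return sent

    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")

    in_flight = {g: 0 for g in guards}
    pending = num_chunks
    working = [g for g in guards if g not in broken]

    while pending > 0:
        # Top up every guard that still has room under the cap.
        for g in guards:
            while pending > 0 and in_flight[g] < max_in_flight:
                in_flight[g] += 1
                sent[g] += 1
                pending -= 1
        if not working:
            break  # nobody will ever answer; remaining chunks stay pending
        # Working guards answer; broken guards keep their slots occupied.
        for g in working:
            in_flight[g] = 0

    return sent


guards = ["guard1", "guard2", "guard3"]
uncapped = simulate_requests(18, guards, {"guard3"})
capped = simulate_requests(18, guards, {"guard3"}, max_in_flight=2)
print(Fraction(uncapped["guard3"], 18))  # 1/3
print(Fraction(capped["guard3"], 18))    # 1/9
```

With the cap, the broken guard only ever holds its first 2 requests, while the two working guards absorb the remaining 16 as their slots free up.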