Opened 21 months ago

Closed 5 months ago

Last modified 4 months ago

#23605 closed defect (fixed)

expired consensus causes guard selection to stall at BOOTSTRAP PROGRESS=80

Reported by: catalyst
Owned by: catalyst
Priority: High
Milestone: Tor: 0.4.0.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: bootstrap, clock-skew, tor-guard, usability, ux, s8-errors, 035-roadmap-subtask, 035-triaged-in-20180711, s8-bootstrap
Cc: iry, brade, mcs, intrigeri, torproject@…
Actual Points:
Parent ID: #28018
Points:
Reviewer:
Sponsor: Sponsor8

Description

Tor can report BOOTSTRAP_STATUS_CONN_OR (PROGRESS=80, "Connecting to the Tor network") when it actually can do no such thing. In some situations (e.g., clock skew) this causes progress to get stuck at 80% indefinitely, resulting in a very poor user experience.

Right now update_router_have_minimum_dir_info() reports the BOOTSTRAP_STATUS_CONN_OR event if there's a "reasonably live" consensus and enough descriptors have been downloaded. A client with a clock skewed several hours into the future can get stalled here indefinitely because it cannot select a guard: with its clock skewed that far, it will never see the consensus as live. (Guard selection seems to require a non-expired consensus, rather than a merely reasonably live one, at least during bootstrap.)
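The mismatch between the two liveness checks can be sketched as follows. This is a minimal illustration, not Tor's actual code: the type, the function names, and the 24-hour tolerance are all hypothetical stand-ins chosen for the example, and the "reasonably live" check here only looks at the expiry side.

```c
#include <stdbool.h>
#include <time.h>

/* Assumption for illustration: how long past valid_until a consensus
 * still counts as "reasonably live". Not taken from the ticket. */
#define SKETCH_REASONABLY_LIVE_SECS (24 * 60 * 60)

/* Hypothetical stand-in for the consensus validity interval. */
typedef struct sketch_consensus_t {
  time_t valid_after;
  time_t valid_until;
} sketch_consensus_t;

/* Strict check: the local clock falls inside the validity interval. */
bool
sketch_consensus_is_live(const sketch_consensus_t *c, time_t now)
{
  return c->valid_after <= now && now <= c->valid_until;
}

/* Relaxed check: the consensus may have expired, but only recently. */
bool
sketch_consensus_is_reasonably_live(const sketch_consensus_t *c, time_t now)
{
  return now <= c->valid_until + SKETCH_REASONABLY_LIVE_SECS;
}

/* True when the progress report would pass the relaxed check (so 80% is
 * announced) while guard selection, which needs the strict check, cannot
 * proceed. This is the stall described in the ticket. */
bool
sketch_would_stall_at_80(const sketch_consensus_t *c, time_t now,
                         bool have_enough_descriptors)
{
  return have_enough_descriptors &&
         sketch_consensus_is_reasonably_live(c, now) &&
         !sketch_consensus_is_live(c, now);
}
```

With a local clock five hours past valid_until, the relaxed check passes and the strict one fails, so sketch_would_stall_at_80() returns true once enough descriptors are present.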

We should either relax the guard selection consensus liveness requirement, or avoid reporting BOOTSTRAP_STATUS_CONN_OR when we have no reasonable chance of actually connecting to a guard for building application circuits.

Arguably we shouldn't start downloading descriptors until we have a non-expired consensus either, because that gets represented as a considerable chunk of the progress bar (40%->80%) in a way that could be misleading to a user. Making that change without additional work would cause bootstrap to get stuck at 40% instead of 80%, which might be an improvement. This can already happen if the client's clock is skewed several hours in the past.
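To make the 40% versus 80% trade-off concrete, here is a rough sketch of where a client would stop under each policy. The enum, the function, and the 24-hour tolerance are hypothetical; the 40 and 80 figures are the ones quoted in this description, and the assumption that a consensus expired beyond the tolerance also leaves the client at 40% is mine, not something the ticket states.

```c
#include <stdbool.h>
#include <time.h>

/* Assumed tolerance for "reasonably live"; illustrative only. */
#define SKETCH_REASONABLY_LIVE_SECS (24 * 60 * 60)

/* Hypothetical policy switch for when descriptor downloads may start. */
typedef enum {
  SKETCH_POLICY_CURRENT,      /* download with a reasonably live consensus */
  SKETCH_POLICY_REQUIRE_LIVE  /* download only with a non-expired consensus */
} sketch_policy_t;

/* Returns the progress percentage at which bootstrap would stop, given
 * the consensus expiry time and the (possibly skewed) local clock.
 * Only clocks running ahead of the consensus are modeled here. */
int
sketch_stall_point(time_t valid_until, time_t now, sketch_policy_t policy)
{
  bool live = now <= valid_until;
  bool reasonably_live = now <= valid_until + SKETCH_REASONABLY_LIVE_SECS;

  if (live)
    return 100;  /* guard selection can proceed; no stall */
  if (!reasonably_live)
    return 40;   /* assumed: descriptors are never fetched */
  /* Expired but reasonably live: the case this ticket is about. */
  return policy == SKETCH_POLICY_CURRENT ? 80 : 40;
}
```

Under SKETCH_POLICY_REQUIRE_LIVE the stall moves from 80% to 40%, which does not fix the underlying clock problem but stops the progress bar from implying that a connection is under way.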

Child Tickets

Ticket | Status | Owner | Summary | Component
#2878 | closed | | Don't bootstrap from an old consensus if we're about to replace it | Core Tor/Tor

Change History (22)

comment:1 Changed 21 months ago by iry

Cc: iry added

comment:2 in reply to:  description Changed 21 months ago by arma

Replying to catalyst:

> Arguably we shouldn't start downloading descriptors until we have a non-expired consensus either, because that gets represented as a considerable chunk of the progress bar (40%->80%) in a way that could be misleading to a user.

This is a really important thing to do, for the reason you describe but also for the even bigger reason that we're wasting bandwidth on fetching directory stuff that we will then probably not use -- which is an especially big deal on low-bandwidth clients. This is ticket #2878.

comment:3 Changed 21 months ago by catalyst

Sponsor: Sponsor8-can

comment:4 Changed 21 months ago by mcs

Cc: brade mcs added

comment:5 Changed 19 months ago by catalyst

Keywords: s8-errors added

comment:6 Changed 16 months ago by catalyst

Milestone: changed from Tor: 0.3.3.x-final to Tor: unspecified

comment:7 Changed 12 months ago by nickm

Keywords: 035-roadmap-subtask added
Milestone: changed from Tor: unspecified to Tor: 0.3.5.x-final

comment:8 Changed 11 months ago by nickm

Keywords: 035-triaged-in-20180711 added

comment:9 Changed 10 months ago by catalyst

Keywords: s8-bootstrap added
Owner: set to catalyst
Sponsor: changed from Sponsor8-can to Sponsor8
Status: changed from new to assigned

comment:10 Changed 10 months ago by intrigeri

Cc: intrigeri added

comment:11 Changed 9 months ago by catalyst

Milestone: changed from Tor: 0.3.5.x-final to Tor: 0.3.6.x-final

Move my post-freeze 0.3.5 items to 0.3.6.

comment:12 Changed 8 months ago by catalyst

Summary: changed from "BOOTSTRAP PROGRESS=80 is a lie" to "expired consensus causes guard selection to stall at BOOTSTRAP PROGRESS=80"

Edited summary to more accurately reflect the likely cause.

comment:13 Changed 8 months ago by catalyst

Add child ticket #28255 for verifying the consensus expiry behavior.

comment:14 Changed 8 months ago by catalyst

Parent ID: changed from #22266 to #28018

Make a direct child of #28018.

comment:15 Changed 8 months ago by teor

Owner: changed from catalyst to teor

I am doing most of the subtickets, so I guess this is mine.

comment:16 Changed 7 months ago by nickm

Milestone: changed from Tor: 0.3.6.x-final to Tor: 0.4.0.x-final

Tor 0.3.6.x has been renamed to 0.4.0.x.

comment:17 Changed 6 months ago by teor

Owner: changed from teor to catalyst

catalyst, does your bootstrap stage rewrite cover this ticket?

comment:18 in reply to:  17 ; Changed 6 months ago by catalyst

Replying to teor:

> catalyst, does your bootstrap stage rewrite cover this ticket?

If you mean the part where obtaining "enough" directory info improperly reports that we're starting a connection when we might do no such thing, I'm pretty sure #27167 fixes that. (The behavior is an instance of #27308, which #27167 partially fixes.)

comment:19 in reply to:  18 ; Changed 6 months ago by teor

Replying to catalyst:

> Replying to teor:
>
> > catalyst, does your bootstrap stage rewrite cover this ticket?
>
> If you mean the part where obtaining "enough" directory info improperly reports that we're starting a connection when we might do no such thing, I'm pretty sure #27167 fixes that. (The behavior is an instance of #27308, which #27167 partially fixes.)

So we can close this ticket when #27167 closes?
(And un-parent any child tickets we still want to do.)

comment:20 in reply to:  19 Changed 6 months ago by catalyst

Replying to teor:

> Replying to catalyst:
>
> > Replying to teor:
> >
> > > catalyst, does your bootstrap stage rewrite cover this ticket?
> >
> > If you mean the part where obtaining "enough" directory info improperly reports that we're starting a connection when we might do no such thing, I'm pretty sure #27167 fixes that. (The behavior is an instance of #27308, which #27167 partially fixes.)
>
> So we can close this ticket when #27167 closes?
> (And un-parent any child tickets we still want to do.)

That's fine with me. We could also keep this ticket open. I have no strong preferences here.

comment:21 Changed 5 months ago by catalyst

Resolution: fixed
Status: changed from assigned to closed

Closing because the remaining bits are fixed by #27167.

comment:22 Changed 4 months ago by hefee

Cc: torproject@… added