Opened 2 years ago

Closed 10 months ago

#24661 closed defect (fixed)

accept a reasonably live consensus for guard selection

Reported by: catalyst
Owned by: teor
Priority: Medium
Milestone: Tor: 0.3.5.x-final
Component: Core Tor/Tor
Version: Tor: 0.3.0.1-alpha
Severity: Normal
Keywords: bootstrap, clock-skew, ux, 035-backport-maybe, 034-backport-maybe-not, 033-backport-maybe-not, 033-triage-20180320, 034-roadmap-proposed, 034-triage-20180328, 034-included-20180328, 034-deferred-20180602, 035-removed-20180711
Cc: asn, intrigeri, catalyst, torproject@…
Actual Points:
Parent ID: #28018
Points: 1
Reviewer: catalyst
Sponsor:

Description

Clients whose clocks are skewed far enough into the future that they never get a live consensus, but that still have a reasonably live one, end up downloading descriptors and then getting stuck on guard selection. This is a rather bad user experience because bootstrap progress appears to get stuck at 80% or 85% even though something rather fundamental (time of day) is wrong.

It's not clear that a reasonably live consensus is dangerous to use for guard selection, so always accept a reasonably live consensus instead of a live one for guard selection.
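
As an illustration of the distinction above, here is a minimal Python sketch (not Tor's C code). The function names, the 24-hour grace period, and the 3-hour validity interval in the demo are assumptions made for this sketch; the ticket does not state them. The sketch only relaxes the expiry side of the check.

```python
from datetime import datetime, timedelta, timezone

# Assumed grace period for this sketch; the ticket does not give a value.
REASONABLY_LIVE_GRACE = timedelta(hours=24)


def is_live(valid_after, valid_until, now):
    """A consensus is live if 'now' falls inside its validity interval."""
    return valid_after <= now <= valid_until


def is_reasonably_live(valid_after, valid_until, now,
                       grace=REASONABLY_LIVE_GRACE):
    """Like is_live(), but tolerate a consensus that expired up to
    'grace' ago. The lower bound is left unchanged in this sketch."""
    return valid_after <= now <= valid_until + grace


if __name__ == "__main__":
    # Hypothetical consensus: valid for 3 hours from 06:00 UTC.
    valid_after = datetime(2018, 11, 23, 6, 0, tzinfo=timezone.utc)
    valid_until = valid_after + timedelta(hours=3)
    # A client whose clock runs 12 hours ahead of a real time of 07:00.
    skewed_now = datetime(2018, 11, 23, 19, 0, tzinfo=timezone.utc)
    print(is_live(valid_after, valid_until, skewed_now))             # False
    print(is_reasonably_live(valid_after, valid_until, skewed_now))  # True
```

With these assumed values, the skewed client sees the consensus as expired but still reasonably live, which is the case this ticket is about.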

Ticket #2878 covers the case of deferring descriptor downloads if the consensus isn't live, which would also improve the UX but might not be necessary if we implement the solution in this ticket.

Child Tickets

Ticket | Status | Owner | Summary | Component
#28255 | closed | teor | verify guard selection consensus expiry constraints | Core Tor/Tor
#28319 | closed | teor | accept a reasonably live consensus for path selection | Core Tor/Tor
#28554 | closed | teor | Fix memory leaks and missing unmocks in test_entry_guard_outdated_dirserver_exclusion | Core Tor/Tor

Change History (28)

comment:1 Changed 2 years ago by asn

Cc: asn added

comment:2 Changed 21 months ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:3 Changed 21 months ago by nickm

Keywords: 033-removed-20180320 added

Mark all not-already-included tickets as pending review for removal from 0.3.3 milestone.

comment:4 Changed 21 months ago by dgoulet

Keywords: 034-roadmap-proposed added; 033-removed-20180320 removed

This is sponsor 8 stuff. It should be a subtask of a master task from our roadmap. Maybe 034 or 035; not sure, but let's consider it for 034.

comment:5 Changed 21 months ago by nickm

Milestone: changed from Tor: 0.3.3.x-final to Tor: 0.3.4.x-final

comment:6 Changed 21 months ago by nickm

Keywords: 034-triage-20180328 added

comment:7 Changed 21 months ago by nickm

Keywords: 034-included-20180328 added

comment:8 Changed 19 months ago by nickm

Keywords: 034-deferred-20180602 added
Milestone: changed from Tor: 0.3.4.x-final to Tor: 0.3.5.x-final

Deferring non-must tickets to 0.3.5

comment:9 Changed 17 months ago by nickm

Keywords: 035-removed-20180711 added
Milestone: changed from Tor: 0.3.5.x-final to Tor: unspecified

These tickets are being triaged out of 0.3.5. The ones marked "035-roadmap-proposed" may return.

comment:10 Changed 16 months ago by intrigeri

Cc: intrigeri added

comment:11 Changed 14 months ago by catalyst

Parent ID: #23605
Points: 1

Reparent. The actual work to accomplish this is probably fairly small. The hard part is deciding whether it introduces too much risk.

comment:12 Changed 14 months ago by nickm

I think that it is fine to make this change. It's normal for clients to use guards that are chosen based on past consensuses.

It might be good to make sure that we have at least tried to fetch a live consensus before we do this, though. Maybe we should have the test be have_reasonably_live_consensus() || have_received_a_consensus_in_the_last_N_hours()?
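
The combined test suggested above could look roughly like the following. This is a Python sketch, not Tor code: the function name `may_select_guards` and its parameters are hypothetical, and N stays a parameter because the comment leaves its value open.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def may_select_guards(reasonably_live: bool,
                      last_consensus_received: Optional[datetime],
                      now: datetime,
                      max_age: timedelta) -> bool:
    """Hypothetical combination of the two conditions from the comment:
    a reasonably live consensus, OR any consensus received within the
    last N hours (N is passed in as 'max_age')."""
    received_recently = (
        last_consensus_received is not None
        and timedelta(0) <= now - last_consensus_received <= max_age
    )
    return reasonably_live or received_recently
```

In this sketch both `now` and `last_consensus_received` would come from the same local clock, so a constant clock skew cancels out of the "received recently" half of the test.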

Last edited 14 months ago by catalyst

comment:13 Changed 14 months ago by teor

Milestone: changed from Tor: unspecified to Tor: 0.3.6.x-final
Owner: changed from catalyst to teor
Version: Tor: 0.3.0.1-alpha

I have a draft branch for this bug.

I still need to fix the entry nodes unit tests.

There also seem to be other parts of tor's codebase that require a recent consensus:

Nov 05 15:29:04.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
Nov 05 15:29:51.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Nov 05 15:29:52.000 [notice] Bootstrapped 40%: Loading authority key certs
Nov 05 15:29:55.000 [warn] Our clock is 30 minutes, 6 seconds behind the time published in the consensus network status document (2018-11-05 06:00:00 UTC).  Tor needs an accurate clock to work correctly. Please check your time and date settings!
Nov 05 15:29:55.000 [warn] Received microdesc flavor consensus with skewed time (CONSENSUS): It seems that our clock is behind by 30 minutes, 6 seconds, or that theirs is ahead. Tor requires an accurate clock to work: please check your time, timezone, and date settings.
Nov 05 15:29:55.000 [warn] Problem bootstrapping. Stuck at 40%: Loading authority key certs. (Clock skew -1806 in microdesc flavor consensus from CONSENSUS; CLOCK_SKEW; count 1; recommendation warn; host ? at ?)
Nov 05 15:29:55.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no recent usable consensus.
Nov 05 15:29:55.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no recent usable consensus.
Nov 05 17:27:43.000 [notice] Your system clock just jumped 7056 seconds forward; assuming established circuits no longer work.
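
The skew value in the "Problem bootstrapping" line above is in seconds, and it is negative where the warning says our clock is behind. Here is a small Python sketch for pulling it out of a log line and making it readable. The helper names and the regular expression are assumptions of this sketch, based only on the log lines quoted in this ticket.

```python
import re

# Matches the "Clock skew <seconds> in ..." fragment seen in the log above.
SKEW_RE = re.compile(r"Clock skew (-?\d+) in")


def parse_clock_skew(line):
    """Return the skew in seconds from a log line, or None if absent."""
    match = SKEW_RE.search(line)
    return int(match.group(1)) if match else None


def describe_skew(seconds):
    """Render a skew in seconds as hours, minutes, and seconds.
    Negative values are reported as 'behind', others as 'ahead'."""
    direction = "behind" if seconds < 0 else "ahead"
    hours, remainder = divmod(abs(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s {direction}"
```

For the -1806 in the log above, `describe_skew` gives "0h 30m 6s behind", which agrees with the "30 minutes, 6 seconds behind" warning.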

comment:14 Changed 13 months ago by nickm

Milestone: changed from Tor: 0.3.6.x-final to Tor: 0.4.0.x-final

Tor 0.3.6.x has been renamed to 0.4.0.x.

comment:15 Changed 13 months ago by teor

Cc: catalyst added
Status: changed from assigned to needs_revision

Here's my draft pull request:
https://github.com/torproject/tor/pull/532

The test_outdated_dirserver_exclusion unit tests fail, and I can't work out why.
make test-network-all passes, but I still need to test with clock skew.

The stem integ tests will fail due to #28550.

comment:16 Changed 13 months ago by teor

Keywords: 035-backport-maybe 034-backport-maybe-not 033-backport-maybe-not added
Status: changed from needs_revision to needs_review

This code had non-trivial merge conflicts when cherry-picked to 0.3.3 or 0.3.4, so I just made an 0.3.5 branch.

See my pull request:
https://github.com/torproject/tor/pull/536

I am still testing it with clock skew.
I'd like to be able to reproduce clock skew tests; maybe I can work out how to do that with chutney.

The stem test will fail due to #28571.

comment:17 Changed 13 months ago by teor

The appveyor tests fail due to #28574.

comment:18 Changed 13 months ago by teor

When I set my clock 12 hours behind, I see this error with both maint-0.3.5 and my branch:

$ src/app/tor 
Nov 23 05:07:03.296 [notice] Tor 0.3.5.5-alpha-dev (git-a9820f072bf1bd79) running on Darwin with Libevent 2.1.8-stable, OpenSSL 1.0.2p, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.7.
...
Nov 23 05:07:03.000 [notice] Bootstrapped 0%: Starting
Nov 23 05:07:03.000 [notice] Starting with guard context "default"
Nov 23 05:07:04.000 [notice] Bootstrapped 5%: Connecting to directory server
Nov 23 05:07:05.000 [notice] Bootstrapped 10%: Finishing handshake with directory server
Nov 23 05:07:06.000 [notice] Bootstrapped 15%: Establishing an encrypted directory connection
Nov 23 05:07:06.000 [notice] Bootstrapped 20%: Asking for networkstatus consensus
Nov 23 05:07:06.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
Nov 23 05:07:10.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Nov 23 05:07:10.000 [notice] Bootstrapped 40%: Loading authority key certs
Nov 23 05:07:12.000 [warn] Our clock is 10 hours, 52 minutes behind the time published in the consensus network status document (2018-11-23 06:00:00 UTC).  Tor needs an accurate clock to work correctly. Please check your time and date settings!
Nov 23 05:07:12.000 [warn] Received microdesc flavor consensus with skewed time (CONSENSUS): It seems that our clock is behind by 10 hours, 52 minutes, or that theirs is ahead. Tor requires an accurate clock to work: please check your time, timezone, and date settings.
Nov 23 05:07:12.000 [warn] Problem bootstrapping. Stuck at 40%: Loading authority key certs. (Clock skew -39169 in microdesc flavor consensus from CONSENSUS; CLOCK_SKEW; count 1; recommendation warn; host ? at ?)

When I set my clock 12 hours ahead, on maint-0.3.5 I see:

$ src/app/tor 
Nov 24 05:09:33.482 [notice] Tor 0.3.5.5-alpha-dev (git-a9820f072bf1bd79) running on Darwin with Libevent 2.1.8-stable, OpenSSL 1.0.2p, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.7.
...
Nov 24 05:09:33.000 [notice] Bootstrapped 0%: Starting
Nov 24 05:09:33.000 [notice] Starting with guard context "default"
Nov 24 05:09:34.000 [notice] Bootstrapped 5%: Connecting to directory server
Nov 24 05:09:35.000 [notice] Bootstrapped 10%: Finishing handshake with directory server
Nov 24 05:09:36.000 [notice] Bootstrapped 15%: Establishing an encrypted directory connection
Nov 24 05:09:36.000 [notice] Bootstrapped 20%: Asking for networkstatus consensus
Nov 24 05:09:37.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
Nov 24 05:09:41.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Nov 24 05:09:42.000 [notice] Bootstrapped 40%: Loading authority key certs
Nov 24 05:09:44.000 [notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
Nov 24 05:09:44.000 [notice] Bootstrapped 45%: Asking for relay descriptors for internal paths
Nov 24 05:09:44.000 [notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/6279, and can only build 0% of likely paths. (We have 0% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
...
Nov 24 05:09:49.000 [notice] Bootstrapped 50%: Loading relay descriptors for internal paths
Nov 24 05:09:51.000 [notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
Nov 24 05:09:53.000 [notice] Bootstrapped 57%: Loading relay descriptors
Nov 24 05:09:57.000 [notice] Bootstrapped 65%: Loading relay descriptors
Nov 24 05:10:36.000 [notice] Bootstrapped 73%: Loading relay descriptors
Nov 24 05:11:08.000 [notice] Bootstrapped 78%: Loading relay descriptors
Nov 24 05:11:08.000 [notice] Bootstrapped 80%: Connecting to the Tor network
Nov 24 05:11:09.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
Nov 24 05:11:09.000 [notice] Our circuit 0 (id: 34) died due to an invalid selected path, purpose General-purpose client. This may be a torrc configuration issue, or a bug.
...
Nov 24 05:11:15.000 [notice] Bootstrapped 85%: Finishing handshake with first hop
Nov 24 05:11:15.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
...

And on this branch I see:

$ src/app/tor
Nov 24 05:13:01.656 [notice] Tor 0.3.5.5-alpha-dev (git-805f75182a87286a) running on Darwin with Libevent 2.1.8-stable, OpenSSL 1.0.2p, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.7.
...
Nov 24 05:13:02.000 [notice] Bootstrapped 0%: Starting
Nov 24 05:13:02.000 [notice] Starting with guard context "default"
Nov 24 05:13:03.000 [notice] Bootstrapped 5%: Connecting to directory server
Nov 24 05:13:03.000 [notice] Bootstrapped 10%: Finishing handshake with directory server
Nov 24 05:13:04.000 [notice] Bootstrapped 15%: Establishing an encrypted directory connection
Nov 24 05:13:05.000 [notice] Bootstrapped 20%: Asking for networkstatus consensus
Nov 24 05:13:05.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
Nov 24 05:13:09.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Nov 24 05:13:09.000 [notice] Bootstrapped 40%: Loading authority key certs
Nov 24 05:13:11.000 [notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
Nov 24 05:13:11.000 [notice] Bootstrapped 45%: Asking for relay descriptors for internal paths
Nov 24 05:13:11.000 [notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/6251, and can only build 0% of likely paths. (We have 0% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
Nov 24 05:13:13.000 [notice] Bootstrapped 50%: Loading relay descriptors for internal paths
Nov 24 05:13:15.000 [notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
Nov 24 05:13:19.000 [notice] Bootstrapped 57%: Loading relay descriptors
Nov 24 05:13:19.000 [notice] Bootstrapped 63%: Loading relay descriptors
Nov 24 05:13:19.000 [notice] Bootstrapped 68%: Loading relay descriptors
Nov 24 05:13:19.000 [notice] Bootstrapped 75%: Loading relay descriptors
Nov 24 05:13:19.000 [notice] Bootstrapped 80%: Connecting to the Tor network
Nov 24 05:13:19.000 [notice] Bootstrapped 90%: Establishing a Tor circuit
Nov 24 05:13:21.000 [notice] Bootstrapped 100%: Done

So this particular bug appears to be fixed.
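
One way to compare the two runs above mechanically is to take the highest "Bootstrapped N%" value from each log. This is a hedged Python sketch; the helper names and the regex are assumptions based on the log format quoted here, not an existing Tor tool.

```python
import re

# Matches the progress notices, e.g. "Bootstrapped 80%: Connecting to ...".
BOOTSTRAP_RE = re.compile(r"Bootstrapped (\d+)%")


def max_bootstrap_percent(log_lines):
    """Return the highest bootstrap percentage seen, or 0 if none."""
    percents = []
    for line in log_lines:
        match = BOOTSTRAP_RE.search(line)
        if match:
            percents.append(int(match.group(1)))
    return max(percents, default=0)


def bootstrap_completed(log_lines):
    """True if the log reached 'Bootstrapped 100%'."""
    return max_bootstrap_percent(log_lines) == 100
```

Applied to the excerpts above, the maint-0.3.5 run with the clock 12 hours ahead tops out at 85, while the run on this branch reaches 100.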

I opened #28591 for the "future consensus" case.

comment:19 Changed 13 months ago by dgoulet

Reviewer: catalyst

comment:20 Changed 13 months ago by catalyst

Status: changed from needs_review to merge_ready

Thanks! Patch looks good to me.

I verified the fix using libfaketime and manual testing.

comment:21 Changed 13 months ago by nickm

Merged PR 536 to 0.4.0; marking for possible backport to 0.3.5.

(Do we still think this is backport material?)

comment:22 Changed 13 months ago by nickm

Milestone: changed from Tor: 0.4.0.x-final to Tor: 0.3.5.x-final

comment:23 Changed 13 months ago by teor

It would be nice to have the next LTS work with a skewed clock.

comment:24 Changed 11 months ago by catalyst

Parent ID: changed from #23605 to #28018

Reparent so we can close #23605.

comment:25 in reply to: comment:21 Changed 11 months ago by teor

Replying to nickm:

> Merged PR 536 to 0.4.0; marking for possible backport to 0.3.5.
>
> (Do we still think this is backport material?)

I think this change is simple enough to backport. And it would be nice to have in LTS.

comment:26 Changed 11 months ago by gaba

Keywords: s8-bootstrap s8-errors removed
Sponsor: Sponsor8-can

comment:27 Changed 10 months ago by hefee

Cc: torproject@… added