Opened 20 months ago

Last modified 6 months ago

#25783 new defect

Circuit creation loop when primary guards are unreachable

Reported by: asn
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.3.0.10
Severity: Normal
Keywords: tor-guard 035-removed
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

I was offline for a few hours today while Tor was running. When I went back online, I noticed that Tor was stuck in a circuit creation loop, which it did not exit until it marked one of its primary guards as retriable (which can take a long time). While in the loop, Tor made one circuit per second.

I spent a good part of today debugging this. I think the issue is that our guard algorithm changes the state of circuits that don't use primary guards to CIRCUIT_STATE_GUARD_WAIT in circuit_build_no_more_hops(). Then in circuit_expire_building() we treat those waiting circuits as not CIRCUIT_STATE_OPEN and expire them quickly with the 2s build timeout. Then we make more, and then expire them, ad infinitum, until a primary guard becomes retriable and breaks the cycle.
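
The suspected interaction can be sketched as a toy model. This is not Tor's code: the class, constants and function names are hypothetical stand-ins, and only the two state names and the 2s build timeout come from this ticket. It assumes a circuit built through a non-primary guard stays in the waiting state, and that the expiry check only exempts open circuits. It relaunches once per build timeout and does not try to reproduce the one-circuit-per-second rate I observed.

```python
# Toy model of the suspected loop. NOT Tor's code: all names here are
# hypothetical; only the two state names mirror the ticket text.
from dataclasses import dataclass

GUARD_WAIT = "CIRCUIT_STATE_GUARD_WAIT"
OPEN = "CIRCUIT_STATE_OPEN"

BUILD_TIMEOUT_S = 2      # "Set circuit build timeout to 2s" in the log
NEEDED_EXIT_CIRCS = 1    # assumption: one more exit circuit is wanted


@dataclass
class Circuit:
    created_at: int
    state: str
    expired: bool = False


def finish_build(now, primary_guard_usable):
    """Stand-in for circuit_build_no_more_hops(): a circuit built through a
    non-primary guard is parked in GUARD_WAIT instead of being opened."""
    state = OPEN if primary_guard_usable else GUARD_WAIT
    return Circuit(created_at=now, state=state)


def expire_building(circuits, now):
    """Stand-in for circuit_expire_building(): anything that is not OPEN and
    is at least as old as the build timeout is counted as timed out."""
    for circ in circuits:
        if circ.state != OPEN and now - circ.created_at >= BUILD_TIMEOUT_S:
            circ.expired = True


def simulate(seconds, primary_retriable_at):
    """Return how many circuits are launched over `seconds` one-second ticks."""
    circuits, launched = [], 0
    for now in range(seconds):
        expire_building(circuits, now)
        alive = [c for c in circuits if not c.expired]
        if len(alive) < NEEDED_EXIT_CIRCS:
            usable = now >= primary_retriable_at
            circuits.append(finish_build(now, usable))
            launched += 1
    return launched
```

In this model `simulate(20, primary_retriable_at=0)` launches a single circuit, while `simulate(10, primary_retriable_at=10**9)` keeps replacing the waiting circuit every time it expires. The loop only stops once `primary_retriable_at` is reached, which matches the behaviour described above.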

Here is the loop:


Tor thinks it needs a pre-emptive circuit:

Apr 11 14:47:21.000 [info] circuit_build_times_set_timeout(): Set circuit build timeout to 2s (1500.000000ms, 60000.000000ms, Xm: 525, a: 2.177536, r: 0.121588) based on 403 circuit times
Apr 11 14:47:21.000 [info] circuit_predict_and_launch_new(): Have 4 clean circs (3 internal), need another exit circ.
Apr 11 14:47:21.000 [info] origin_circuit_new(): Circuit 139 chose an idle timeout of 2967 based on 2875 seconds of predictive building remaining.

Tor picks a guard, picks timeouts, and connects to it:

Apr 11 14:47:21.000 [warn] No primary guards available. Selected confirmed guard ENiGMA ($42B4F52C5B11E4D39855F654955425B0D5A0598B) for circuit. Will try other guards before using this circuit.
Apr 11 14:47:22.000 [warn] Recorded success for confirmed guard ENiGMA ($42B4F52C5B11E4D39855F654955425B0D5A0598B)
Apr 11 14:47:22.000 [info] circuit_build_no_more_hops(): circuit built!

Tor marks the circuit as timed out by calling circuit_build_times_mark_circ_as_measurement_only() in circuit_expire_building() and starts making a new predictive circuit (loop!):

Apr 11 14:47:23.000 [info] circuit_expire_building(): Deciding to count the timeout for circuit 139
Apr 11 14:47:23.000 [info] circuit_predict_and_launch_new(): Have 4 clean circs (3 internal), need another exit circ.

After a minute, Tor finally ditches the circuit, which has been repurposed as CIRCUIT_PURPOSE_C_MEASURE_TIMEOUT:

Apr 11 14:48:22.000 [info] circuit_expire_building(): Deciding to count the timeout for circuit 139
Apr 11 14:48:22.000 [info] circuit_expire_building(): Abandoning circ 139 5.9.121.207:443:2179853168 (state 0,3:waiting to see how other guards perform, purpose 14, len 3)
Apr 11 14:48:22.000 [info] pathbias_check_close(): Circuit 139 remote-closed without successful use for reason -3. Circuit purpose 14 currently 0,waiting to see how other guards perform. Len 3.
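
To check whether another log shows the same pattern, one option is to count circuit launches per minute of log time. The sketch below is a hypothetical helper, not part of Tor. It assumes info-level logging and the exact line layout of the origin_circuit_new() message quoted above. A sustained high count while the "No primary guards available" warning keeps repeating would be consistent with this loop.

```python
import re
from collections import Counter

# Matches the info-level line quoted in this ticket, for example:
# "Apr 11 14:47:21.000 [info] origin_circuit_new(): Circuit 139 chose ..."
# The layout is assumed from that excerpt; other Tor versions may differ.
LAUNCH_RE = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d{2}:\d{2}):\d{2}\.\d{3} \[info\] "
    r"origin_circuit_new\(\): Circuit (?P<circ>\d+) "
)


def launches_per_minute(lines):
    """Count origin_circuit_new() lines, keyed by 'Mon DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = LAUNCH_RE.match(line)
        if match:
            counts[match.group("stamp")] += 1
    return counts
```

It takes any iterable of lines, so an open log file can be passed directly, e.g. `with open(path) as f: launches_per_minute(f)`, where `path` is wherever your own Log option writes to.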

Change History (6)

comment:1 Changed 20 months ago by cypherpunks

Last edited 18 months ago by cypherpunks

comment:2 Changed 19 months ago by asn

Milestone: Tor: 0.3.4.x-final → Tor: 0.3.5.x-final

No time to debug and fix this during the 034 cycle. Let's go for 035.

comment:3 Changed 18 months ago by asn

Milestone: Tor: 0.3.5.x-final → Tor: unspecified

This is not on the roadmap, and hence it will most probably not happen in 035. Moving out of the milestone.

comment:4 Changed 17 months ago by nickm

Keywords: 035-removed added

comment:5 Changed 6 months ago by gaba

Removing Sponsor V, as we do not have time left to include this ticket in the sponsorship.

comment:6 Changed 6 months ago by gaba

Sponsor: SponsorV-can deleted

Removing the sponsor from tickets that we do not have time to fit into the remainder of this sponsorship.
