Opened 12 days ago

Last modified 30 hours ago

#31652 needs_review defect

hs-v3: Service circuit retry limit should not close a valid circuit

Reported by: dgoulet Owned by: neel
Priority: Medium Milestone: Tor: 0.4.2.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, tor-circuit, 042-should
Cc: asn, neel Actual Points:
Parent ID: #30200 Points: 0.1
Reviewer: asn Sponsor: Sponsor27-must

Description

Context: Lets say a service has 3 intro circuits opened and stable.

Now, imagine one circuit collapses, like for instance the intro point restarted "tor" after an update. Our code is designed to retry 3 times that is once every second and then give up on the intro point.

What it looks like:

Every second, run_build_circuit_event() is run and launches intro circuit(s) if we are missing any. For each IP, it will increment the circuit_retries counter but does not actually look at it to decide to launch or not.

Before that event, also every 1 second, cleanup_intro_points() checks that every intro point has not expired, fell off the consensus or that circuit_retries is greater than (>) MAX_INTRO_POINT_CIRCUIT_RETRIES = 3.

Putting this together, imagine now that the first 3 attempts failed for whatever reasons and then we launch a 4th one because circuit_retries = 3, it does pass validation of > MAX_INTRO_POINT_CIRCUIT_RETRIES so then a circuit is launched and that very one succeeds.

Because circuit_retries is now 4 then the next second, cleanup_intro_points() removes the IP and closes the valid open established circuit...

I've observed the above a surprising amount of time because booting a tor relay takes some seconds mostly due to the diff-cache parsing.

In a nutshell, we should NOT launch a circuit if we reached the max retries for an intro point. This back and forth of opening a circuit and then deciding that we went over the limit makes no sense in the first place.

Child Tickets

Change History (11)

comment:1 Changed 12 days ago by neel

Cc: neel added
Owner: set to neel
Status: newassigned

comment:2 Changed 12 days ago by neel

Status: assignedneeds_review

comment:3 Changed 8 days ago by dgoulet

Reviewer: asndgoulet

comment:4 Changed 7 days ago by dgoulet

Status: needs_reviewneeds_revision

Comment on PR

comment:5 Changed 7 days ago by neel

Status: needs_revisionneeds_review

comment:6 Changed 7 days ago by nickm

Keywords: 042-should added

comment:7 Changed 5 days ago by dgoulet

Status: needs_reviewneeds_revision

Close but slight problem!

comment:8 Changed 5 days ago by neel

Status: needs_revisionneeds_review

Fixed it!

comment:9 Changed 5 days ago by asn

Status: needs_reviewneeds_revision

Hmm, seems some test failed and CI is sad. (Didn't do any source code review)

comment:10 Changed 4 days ago by neel

Status: needs_revisionneeds_review

comment:11 Changed 30 hours ago by asn

Reviewer: dgouletasn
Note: See TracTickets for help on using tickets.