Opened 10 months ago

Last modified 4 months ago

#21600 assigned defect

Hidden service introduction point retries occur at 1 second intervals

Reported by: teor
Owned by: teor
Priority: Medium
Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor
Version: Tor: 0.2.7.2-alpha
Severity: Normal
Keywords: tor-hs, single-onion, prop224
Cc:
Actual Points:
Parent ID: #21446
Points: 1
Reviewer:
Sponsor: SponsorR-can

Description

Tor will try to reconnect to an introduction point up to 3 times.
But rend_consider_services_intro_points() is called every second, which means that it uses up these retries very quickly, particularly if the connection fails quickly, like direct connections sometimes do on single onion services.

It might be more sensible to retry slightly more slowly.

On the other hand, maybe it's good that we fail fast and replace the introduction point.

This behaviour was introduced in commit 1125a4876b4.
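
To make the timing concrete, here is a toy model of the behaviour described above. It is not Tor's code: the function name, the `circuit_failure_delay` parameter, and the loop structure are assumptions made purely for illustration. Only the limit of 3 retries and the 1-second call interval come from this ticket.

```python
MAX_INTRO_RETRIES = 3  # retry limit stated in the description


def seconds_until_retries_exhausted(circuit_failure_delay, tick_interval=1):
    """Return the elapsed seconds at which the last retry is launched.

    Hypothetical model: a maintenance function runs every `tick_interval`
    seconds and relaunches a failed introduction circuit immediately.
    `circuit_failure_delay` is how long each attempt takes to fail.
    """
    now = 0
    retries = 0
    fails_at = circuit_failure_delay  # first attempt is launched at t=0
    while retries < MAX_INTRO_RETRIES:
        now += tick_interval
        if now >= fails_at:
            # This tick notices the failure and retries straight away.
            retries += 1
            fails_at = now + circuit_failure_delay
    return now
```

Under this model, a connection that fails instantly (`circuit_failure_delay=0`) burns through all retries in 3 seconds, while one that takes 10 seconds to fail spreads them over 30 seconds.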

Change History (12)

comment:1 Changed 10 months ago by dgoulet

Status: new → needs_information

Hrm... failing 3 times in 3 seconds means that every circuit creation failed right away? In that case, "tor" might have more problems... But I think the behavior here should be that we open a circuit and then if it fails like in 10 seconds after, we note down the try and retry a second later. That sounds reasonable to me?

comment:2 in reply to:  1 Changed 10 months ago by teor

Replying to dgoulet:

> Hrm... failing 3 times in 3 seconds means that every circuit creation failed right away? In that case, "tor" might have more problems... But I think the behavior here should be that we open a circuit and then if it fails like in 10 seconds after, we note down the try and retry a second later. That sounds reasonable to me?

Let's reword it to be something like:

When a circuit fails, we retry 10 seconds after we first detect the failure.

Then it's dynamic based on circuit failure time.
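
A minimal sketch of that rewording, assuming a hypothetical per-introduction-point state object (the class and method names below are invented for illustration and do not exist in Tor). The idea is to record when a failure was first detected and to allow a retry only once 10 seconds have passed since then.

```python
RETRY_DELAY = 10  # seconds, the value proposed in this comment


class IntroPointState:
    """Hypothetical bookkeeping for one introduction point."""

    def __init__(self):
        self.failure_detected_at = None  # when the current failure was first seen
        self.retries = 0

    def note_failure(self, now):
        # Keep only the first detection time, so that repeated
        # per-second checks do not keep pushing the retry back.
        if self.failure_detected_at is None:
            self.failure_detected_at = now

    def should_retry(self, now):
        if self.failure_detected_at is None:
            return False
        return now - self.failure_detected_at >= RETRY_DELAY

    def note_retry_launched(self):
        self.retries += 1
        self.failure_detected_at = None  # wait for the next failure
```

With this shape, the per-second caller can keep running unchanged: it calls `should_retry(now)` on every pass and launches a new circuit only when the delay has elapsed.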

comment:3 Changed 10 months ago by teor

Status: needs_information → assigned

comment:4 Changed 10 months ago by teor

See #21621 for notes on the timing of the retries here: I think we should retry after 30 seconds to match the connection timeout (and avoid penalising slow hidden services).

comment:5 Changed 10 months ago by teor

(Oh, and we should randomise each interval by multiplying it by a random factor between 0.5 and 1.5, to avoid thundering herds.)

comment:6 Changed 10 months ago by teor

Let's try that again: the maximum we'll ever get is the circuit timeout, so let's make it a random value in [CircuitTimeout/3, CircuitTimeout].
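
As a rough sketch of that interval, assuming the value is drawn uniformly (the comment only says "a random value") and that the circuit timeout is passed in as seconds. The function name `intro_retry_delay` is hypothetical.

```python
import random


def intro_retry_delay(circuit_timeout, rng=random):
    """Return a retry delay in seconds, drawn uniformly from
    [circuit_timeout / 3, circuit_timeout].

    `rng` can be the `random` module or a `random.Random` instance;
    it is a parameter only so the sketch can be tested with a fixed seed.
    """
    if circuit_timeout <= 0:
        raise ValueError("circuit_timeout must be positive")
    return rng.uniform(circuit_timeout / 3, circuit_timeout)
```

For example, with the 30-second timeout mentioned in comment:4, each retry would be delayed by somewhere between 10 and 30 seconds, so services that fail at the same moment would not all retry together.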

comment:7 Changed 9 months ago by dgoulet

Sponsor: SponsorR-can

comment:8 Changed 8 months ago by teor

Owner: teor deleted

I will not have time to do this before the 0.3.1 code freeze.
It would be good if someone else fixed this bug in 0.3.1, because it affects hidden service reliability.

If you defer to 0.3.2, please reassign to me.

comment:9 Changed 7 months ago by dgoulet

Milestone: Tor: 0.3.1.x-final → Tor: 0.3.2.x-final

comment:10 Changed 7 months ago by dgoulet

Owner: set to teor

comment:11 Changed 7 months ago by dgoulet

Keywords: prop224 added

comment:12 Changed 4 months ago by dgoulet

Milestone: Tor: 0.3.2.x-final → Tor: 0.3.3.x-final

Still worth considering!
