Opened 3 years ago

Last modified 5 days ago

#21600 assigned defect

Hidden service introduction point retries occur at 1 second intervals

Reported by: teor
Owned by: asn
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.7.2-alpha
Severity: Normal
Keywords: tor-hs, single-onion, prop224, 034-triage-20180328, 034-removed-20180328, 042-deferred-20190918
Cc:
Actual Points:
Parent ID: #21446
Points: 1
Reviewer:
Sponsor:

Description

Tor will try to reconnect to an introduction point up to 3 times.
However, rend_consider_services_intro_points() is called every second, so these retries are used up very quickly, particularly when the connection fails fast, as direct connections on single onion services sometimes do.
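
As a rough illustration of the timing described above, here is a minimal Python model of a periodic check that relaunches a failed attempt on every tick. It is not Tor's code: the function name, the millisecond units, and the assumption that the 3 retries come on top of one initial attempt are all hypothetical choices made for the sketch.

```python
MAX_INTRO_RETRIES = 3  # from the ticket: up to 3 reconnection attempts


def ms_until_retries_exhausted(fail_after_ms, tick_ms=1000,
                               max_retries=MAX_INTRO_RETRIES):
    """Return the simulated time (ms) at which the last allowed retry fails.

    fail_after_ms: how long each attempt takes to fail.
    tick_ms: how often the periodic check runs (1 second in the ticket).
    Assumption: one initial attempt is in flight at time 0, and the
    retries are counted in addition to it.
    """
    now = 0
    retries = 0
    attempt_fails_at = fail_after_ms  # the initial attempt
    while True:
        now += tick_ms
        if now < attempt_fails_at:
            continue  # attempt still pending, nothing to do on this tick
        if retries == max_retries:
            return attempt_fails_at  # no retries left for this intro point
        retries += 1
        attempt_fails_at = now + fail_after_ms  # relaunch immediately
```

Under these assumptions, an attempt that fails after 100 ms burns through every retry in about 3.1 seconds (`ms_until_retries_exhausted(100)` gives `3100`), whereas an attempt that only fails at a 30 second timeout spreads the same retries over two minutes (`ms_until_retries_exhausted(30000)` gives `120000`).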

It might be more sensible to retry slightly more slowly.

On the other hand, maybe it's good that we fail fast and replace the introduction point.

This behaviour was introduced in commit 1125a4876b4.

Change History (32)

comment:1 Changed 3 years ago by dgoulet

Status: new → needs_information

Hrm... failing 3 times in 3 seconds means that every circuit creation failed right away? In that case, "tor" might have more problems... But I think the behavior here should be that we open a circuit and then if it fails like in 10 seconds after, we note down the try and retry a second later. That sounds reasonable to me?

comment:2 in reply to:  1 Changed 3 years ago by teor

Replying to dgoulet:

> Hrm... failing 3 times in 3 seconds means that every circuit creation failed right away? In that case, "tor" might have more problems... But I think the behavior here should be that we open a circuit and then if it fails like in 10 seconds after, we note down the try and retry a second later. That sounds reasonable to me?

Let's reword it to be something like:

When a circuit fails, we retry 10 seconds after we first detect the failure.

Then it's dynamic based on circuit failure time.
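
One possible reading of this rewording, sketched in Python rather than in Tor's C: the retry time is anchored to the moment the failure is first detected, so the once-per-second check cannot pull it forward or push it back. The class and method names are hypothetical and do not correspond to anything in the Tor source.

```python
RETRY_DELAY_SECS = 10  # delay proposed in the comment above


class IntroPointRetryState:
    """Hypothetical per-intro-point retry bookkeeping."""

    def __init__(self):
        # Time at which the current failure was first seen, or None.
        self.failure_detected_at = None

    def note_circuit_failed(self, now):
        # Record only the first detection, so repeated periodic checks
        # do not keep postponing the retry.
        if self.failure_detected_at is None:
            self.failure_detected_at = now

    def should_retry(self, now):
        if self.failure_detected_at is None:
            return False
        return now - self.failure_detected_at >= RETRY_DELAY_SECS

    def note_retry_launched(self):
        # A new attempt is in flight; wait for its own failure, if any.
        self.failure_detected_at = None
```

With this shape, the periodic callback can still run every second; it only launches a new circuit once `should_retry(now)` returns true.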

comment:3 Changed 3 years ago by teor

Status: needs_information → assigned

comment:4 Changed 3 years ago by teor

See #21621 for notes on the timing of the retries here: I think we should retry after 30 seconds to match the connection timeout (and avoid penalising slow hidden services).

comment:5 Changed 3 years ago by teor

(Oh, and we should randomise each interval to between 0.5 and 1.5 times its base value, to avoid thundering herds.)

comment:6 Changed 3 years ago by teor

Let's try that again: the maximum we'll ever get is the circuit timeout, so let's make it a random value in [10, 30].
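
For comparison, the two randomisation schemes suggested in this thread can be sketched as follows. This is a Python illustration only: the helper names are made up, and Python's `random` module stands in for whatever random source Tor would actually use.

```python
import random

MIN_RETRY_DELAY_SECS = 10
MAX_RETRY_DELAY_SECS = 30  # the circuit timeout figure quoted in the thread


def jittered_delay(base_secs, rng=random):
    """Scale base_secs by a uniform factor in [0.5, 1.5] (comment:5)."""
    return base_secs * rng.uniform(0.5, 1.5)


def bounded_delay(rng=random):
    """Draw a delay uniformly from [10, 30] seconds (comment:6)."""
    return rng.uniform(MIN_RETRY_DELAY_SECS, MAX_RETRY_DELAY_SECS)
```

Applied to a 30 second base, the first scheme yields delays anywhere from 15 to 45 seconds, while the second never exceeds 30 seconds.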

comment:7 Changed 3 years ago by dgoulet

Sponsor: SponsorR-can

comment:8 Changed 2 years ago by teor

Owner: teor deleted

I will not have time to do this before the 0.3.1 code freeze.
It would be good if someone else fixed this bug in 0.3.1, because it affects hidden service reliability.

If you defer to 0.3.2, please reassign to me.

comment:9 Changed 2 years ago by dgoulet

Milestone: Tor: 0.3.1.x-final → Tor: 0.3.2.x-final

comment:10 Changed 2 years ago by dgoulet

Owner: set to teor

comment:11 Changed 2 years ago by dgoulet

Keywords: prop224 added

comment:12 Changed 2 years ago by dgoulet

Milestone: Tor: 0.3.2.x-final → Tor: 0.3.3.x-final

Still worth considering!

comment:13 Changed 20 months ago by teor

Milestone: Tor: 0.3.3.x-final → Tor: 0.3.4.x-final

Moving most of my assigned tickets to 0.3.4

comment:14 Changed 19 months ago by teor

Owner: teor deleted

I'm not going to get time to do this in 0.3.4

comment:15 Changed 18 months ago by nickm

Keywords: 034-triage-20180328 added

comment:16 Changed 18 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:17 Changed 18 months ago by nickm

Milestone: Tor: 0.3.4.x-final → Tor: unspecified

These tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if time permits.

comment:18 Changed 17 months ago by arma

To be clear, this ticket is about the onion service retrying circuits to its already-announced intro points, so it can resume using these intro points, so clients won't be too impacted when e.g. the onion service loses its network connection?

I ask because #25882 seems to be thinking this ticket is about clients who access onion services launching too many requests.

comment:19 Changed 14 months ago by teor

Status: assigned → new

Make everything that is assigned to no-one new again.

comment:20 Changed 13 months ago by teor

Keywords: 035-must added
Milestone: Tor: unspecified → Tor: 0.3.5.x-final

Let's look at this again in 0.3.5. In Tor 0.3.4, we made these callbacks happen a few times a second.

comment:21 Changed 12 months ago by nickm

Sponsor: SponsorR-can

comment:22 Changed 12 months ago by nickm

Keywords: 035-must removed

Worth doing, but a long-deferred ticket can't really be a "must" IMO.

comment:23 Changed 12 months ago by nickm

Priority: Medium → High

comment:24 Changed 12 months ago by nickm

Owner: set to asn
Status: new → assigned

comment:25 Changed 11 months ago by asn

Milestone: Tor: 0.3.5.x-final → Tor: 0.3.6.x-final

No time for this in 035. Pushing to 036.

comment:26 Changed 11 months ago by asn

Priority: High → Medium

comment:27 Changed 11 months ago by nickm

Milestone: Tor: 0.3.6.x-final → Tor: 0.4.0.x-final

Tor 0.3.6.x has been renamed to 0.4.0.x.

comment:28 Changed 7 months ago by asn

Milestone: Tor: 0.4.0.x-final → Tor: 0.4.1.x-final

comment:29 Changed 4 months ago by nickm

Keywords: 041-should added

comment:30 Changed 4 months ago by asn

Milestone: Tor: 0.4.1.x-final → Tor: 0.4.2.x-final

Deferring to 042. We should make a plan here and fit it into 042. It's not really a bugfix for the frozen 041 release.

comment:31 Changed 4 months ago by asn

Keywords: 041-should removed

comment:32 Changed 5 days ago by nickm

Keywords: 042-deferred-20190918 added
Milestone: Tor: 0.4.2.x-final → Tor: unspecified

Deferring various tickets from 0.4.2 to Unspecified.
