A 304 "Not Modified" should update the time to when we next expect a modification

changed milestone to %Tor: unspecified

added component::core tor/tor milestone::Tor: unspecified points::0.5 priority::medium regression retry severity::normal status::new tor-client tor-directory-protocol type::defect labels

Trac:
Parent: N/A to #20499 (moved)

This also affects #19969 (moved)

Here's my analysis:

On bootstrap:

download_status_increment_attempt increments the schedule for each attempt.
Each attempt increases the delay exponentially, rather than using the actual hard-coded Bootstrap schedules.
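
To make the difference concrete, here is a minimal sketch contrasting a fixed schedule lookup with plain exponential backoff. It is not Tor's code: the function names are hypothetical, the schedule values are illustrative only, and the backoff here is a deterministic doubling, which is an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; NOT Tor's real bootstrap schedule. */
static const int example_schedule[] = { 0, 1, 4, 11, 3600 };
#define EXAMPLE_SCHEDULE_LEN \
  (sizeof(example_schedule) / sizeof(example_schedule[0]))

/* Hypothetical helper: fixed-schedule lookup. Attempt N waits
 * example_schedule[N]; the last entry repeats once the table runs out. */
int
fixed_schedule_delay(size_t attempt)
{
  if (attempt >= EXAMPLE_SCHEDULE_LEN)
    attempt = EXAMPLE_SCHEDULE_LEN - 1;
  return example_schedule[attempt];
}

/* Hypothetical helper: deterministic exponential backoff. The delay
 * doubles once per attempt, starting at initial_delay and capped at
 * max_delay. The table above is never consulted. */
int
exponential_delay(int initial_delay, size_t attempt, int max_delay)
{
  assert(initial_delay > 0 && max_delay >= initial_delay);
  int delay = initial_delay;
  while (attempt-- > 0 && delay < max_delay) {
    /* Check against max_delay / 2 first so the doubling cannot overflow. */
    delay = (delay > max_delay / 2) ? max_delay : delay * 2;
  }
  return delay;
}
```

With these example values, the fourth attempt (index 3) waits 11 seconds under the fixed schedule but 8 seconds under doubling from 1 second, so the two approaches give different retry timing during bootstrap.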

After downloading the consensus:

download_status_increment_failure doesn't increase the schedule on 503 (Service Unavailable); it probably should, rather than retrying immediately.
download_status_increment_failure increases the schedule exponentially on 304 (Not Modified), or perhaps doesn't increase the schedule at all (see #20499 (moved)). It should probably only increase it up to the next time we expect the document to be modified (1 hour).
download_status_schedule_get_delay uses the schedule to increase the backoff: if the schedule isn't increased, the backoff isn't either (rather than using the actual hard-coded Bootstrap schedules).
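
One possible shape for the 304 case, as a rough sketch only: cap the backoff delay at the time remaining until the next expected modification. The function name and all parameters are hypothetical and nothing here is taken from Tor's source. The fallback when the expected time is already in the past (for example, because the clock is wrong) is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: choose the retry delay (in seconds) after a 304.
 *   now                - current time
 *   expected_next_mod  - when we next expect the document to change
 *   backoff_delay      - delay the normal backoff would have chosen
 *   min_delay          - floor, so we never retry in a tight loop */
int64_t
capped_delay_after_304(int64_t now, int64_t expected_next_mod,
                       int64_t backoff_delay, int64_t min_delay)
{
  assert(min_delay >= 0);
  int64_t until_next_mod = expected_next_mod - now;
  int64_t delay = backoff_delay;

  /* Only cap when the expected modification time is still in the
   * future; otherwise keep the ordinary backoff (assumption). */
  if (until_next_mod > 0 && delay > until_next_mod)
    delay = until_next_mod;

  if (delay < min_delay)
    delay = min_delay;
  return delay;
}
```

For example, with a 2 hour backoff and the next modification expected in 1 hour, this returns 3600 seconds. It does not address a relay that lies (or is wrong) about the document being unmodified.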

In 0.2.8, we carefully tuned the schedules so that a client could survive 7 relay failures and still bootstrap in 30 seconds with 99.9% reliability. In 0.2.9, we implemented exponential backoff in a way that retries 5 times in 10 seconds in some cases, and twice in the first 30 seconds in others.

I don't think this is easy to fix, so it shouldn't go in 0.2.9.

There are far too many edge cases here - what happens when the client's clock is wrong, or if a relay lies (or is wrong) about the document not being modified?

Trac:
Milestone: Tor: 0.2.9.x-final to Tor: 0.3.0.x-final

Replying to teor:

> After downloading the consensus:
>
> download_status_increment_failure doesn't increase the schedule on 503 (Service Unavailable); it probably should, rather than retrying immediately.

Part of my code review in #20499 (moved).

> download_status_schedule_get_delay uses the schedule to increase the backoff: if the schedule isn't increased, the backoff isn't either (rather than using the actual hard-coded Bootstrap schedules).

Part of my code review in #20499 (moved).

Replying to teor:

> In 0.2.8, we carefully tuned the schedules so that a client could survive 7 relay failures and still bootstrap in 30 seconds with 99.9% reliability. In 0.2.9, we implemented exponential backoff in a way that retries 5 times in 10 seconds in some cases, and twice in the first 30 seconds in others.

I'm going to suggest some schedule tweaks in #20534 (moved).

I can't see anything else in this bug that we would ever fix. Perhaps someone else can work out how to deal with 304s sensibly.

Trac:
Parent: #20499 (moved) to N/A

Triaged out in December 2016 from 030 to 031.

Trac:
Milestone: Tor: 0.3.0.x-final to Tor: 0.3.1.x-final
Keywords: N/A deleted, triage-out-030-201612 added

A bridge operator has reported an upload/download disparity that may result from an 0.2.9.8 bridge repeatedly trying to download a microdesc consensus even though it gets 304 statuses. (The ns consensus is only downloaded occasionally.)

Deferring all 0.3.1 tickets with status == new, owner == nobody, sponsor == nobody, points > 0.5, and priority < high.

I'd still take patches for most of these -- there's just nobody currently lined up to work on them in this timeframe.

Trac:
Keywords: N/A deleted, triaged-out-20170308 added
Milestone: Tor: 0.3.1.x-final to Tor: unspecified

Trac:
Keywords: triage-out-030-201612 deleted, N/A added

Trac:
Keywords: triaged-out-20170308 deleted, tor-client tor-directory-protocol retry added

changed time estimate to 4h

moved to tpo/core/tor#20535 (closed)
