Opened 12 months ago

Last modified 6 weeks ago

#28511 new defect

Limit the number of open testing circuits, and the total number of testing circuits

Reported by: teor
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: fast-fix, tor-bwauth, tor-dos, 035-backport, 029-backport, 041-proposed, needs-proposal, 033-unreached-backport
Cc:
Actual Points:
Parent ID: #22453
Points:
Reviewer:
Sponsor:

Description (last modified by teor)

Tor relays can open many more testing circuits than they need:

When Tor does its first ORPort reachability test, it initiates one testing circuit after the first successful circuit, then one testing circuit per second until the ORPort is found reachable. If the ORPort is never found reachable, it gives up after 20 minutes, by which point it has initiated about 1200 circuits. (1200 circuits is definitely too many.)

When tor receives any descriptor or consensus, it does another ORPort reachability test, and initiates a testing circuit.

When a testing circuit opens, and there aren't enough testing circuits to test bandwidth, then tor initiates another testing circuit.

When a testing circuit expires, tor doesn't stop opening testing circuits to replace it.

We should place a timeout on bandwidth testing (the same as reachability tests?), a limit on the number of in-progress and open testing circuits (NUM_PARALLEL_TESTING_CIRCS*3/2 ?), and a limit on the total number of testing circuits that tor will build over a certain time (NUM_PARALLEL_TESTING_CIRCS*3 an hour?).

We should also reduce the frequency of the initial ORPort testing circuit callback, so those circuits are spread out over the 20 minute ORPort testing interval.

We should be careful to make these limits apply to relays, but not authorities. Authorities need to test a large number of relays every hour.
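The limits suggested above could be sketched roughly as follows. This is a minimal illustration in Python, not tor code: the class `RelayTestingCircuitLimiter` and its methods are hypothetical, the value 4 for `NUM_PARALLEL_TESTING_CIRCS` is an assumption (it matches the 4 circuits needed for bandwidth testing), and the "total over a certain time" limit is interpreted here as a rolling one-hour window.

```python
from collections import deque

# Assumption: 4 parallel testing circuits; the real constant lives in tor's C source.
NUM_PARALLEL_TESTING_CIRCS = 4
MAX_ACTIVE = NUM_PARALLEL_TESTING_CIRCS * 3 // 2  # in-progress + open circuits
MAX_PER_HOUR = NUM_PARALLEL_TESTING_CIRCS * 3     # launches per rolling hour
HOUR = 3600  # seconds


class RelayTestingCircuitLimiter:
    """Hypothetical limiter for testing circuits (illustration only)."""

    def __init__(self, is_authority=False):
        self.is_authority = is_authority
        self.active = 0              # in-progress or open testing circuits
        self.launch_times = deque()  # launch timestamps within the last hour

    def may_launch(self, now):
        """Return True if a new testing circuit may be launched at time `now`."""
        if self.is_authority:
            # Authorities test many relays every hour, so they are exempt.
            return True
        # Forget launches that are at least one hour old.
        while self.launch_times and now - self.launch_times[0] >= HOUR:
            self.launch_times.popleft()
        return (self.active < MAX_ACTIVE
                and len(self.launch_times) < MAX_PER_HOUR)

    def on_launch(self, now):
        """Record that a testing circuit was launched at time `now`."""
        self.active += 1
        self.launch_times.append(now)

    def on_close(self):
        """Record that a testing circuit failed, expired, or was closed."""
        self.active = max(0, self.active - 1)
```

With these assumed values, a relay that tries to launch a circuit every second and never sees one close would stop after 6 launches instead of 1200. The per-hour cap of 12 bounds the total even if circuits keep expiring and being replaced.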

Edit: suggest some limits

Change History (13)

comment:1 Changed 12 months ago by teor

Description: modified

comment:2 Changed 12 months ago by teor

Description: modified

comment:3 Changed 12 months ago by teor

Keywords: fast-fix 034-backport 033-backport 029-backport added; 034-backport-maybe 033-backport-maybe 029-backport-maybe-not removed

One fast fix is to test ORPorts and DirPorts every 20 seconds, rather than every 1 (ORPort) or 5 (DirPort) seconds.

If a relay tries 60 internal circuits and 60 exit circuits, and all of them fail, it is almost certainly unreachable.

A lower number of tests would probably lead to some small error rate. (But relays do retry the tests after every new consensus, so rare errors are acceptable.)
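The effect of the test interval on the worst case can be checked with a little arithmetic. The sketch below assumes a relay that stays unreachable for the whole 20 minute testing window and launches exactly one test per interval; the function name `circuits_launched` is made up for this illustration.

```python
REACHABILITY_WINDOW = 20 * 60  # seconds before tor gives up


def circuits_launched(interval_seconds, window_seconds=REACHABILITY_WINDOW):
    """Worst-case number of tests launched at one test per interval."""
    return window_seconds // interval_seconds


for interval in (1, 5, 20):
    print(f"every {interval:2d} s -> {circuits_launched(interval)} tests")
# every  1 s -> 1200 tests  (current ORPort interval)
# every  5 s -> 240 tests   (current DirPort interval)
# every 20 s -> 60 tests    (proposed fast fix)
```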

comment:4 Changed 12 months ago by teor

Bandwidth testing won't start until 4 testing circuits are open. But circuits only expire after 10 minutes, so at one new testing circuit every 20 seconds we expect to have 30 circuits open in that time.
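The figure of 30 open circuits follows from dividing the circuit lifetime by the launch interval. Here is a rough sketch, assuming one launch every 20 seconds (the interval from the fast fix), every circuit opening immediately, and every circuit living for exactly 10 minutes. The helper `open_circuits_at` is hypothetical.

```python
def open_circuits_at(t, launch_interval, lifetime):
    """Count circuits launched at 0, interval, 2*interval, ... not yet expired at t."""
    launches = range(0, t + 1, launch_interval)
    return sum(1 for start in launches if t - start < lifetime)


LAUNCH_INTERVAL = 20  # seconds, assumed from the proposed fast fix
LIFETIME = 10 * 60    # seconds until a testing circuit expires

print(open_circuits_at(60, LAUNCH_INTERVAL, LIFETIME))    # 4: enough to start bandwidth testing
print(open_circuits_at(600, LAUNCH_INTERVAL, LIFETIME))   # 30: steady state
print(open_circuits_at(1200, LAUNCH_INTERVAL, LIFETIME))  # 30: expiries balance launches
```

Under these assumptions the relay holds far more open testing circuits than the 4 it needs, which is the motivation for capping the number of in-progress and open circuits.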

comment:5 Changed 10 months ago by nickm

Keywords: 041-proposed added
Milestone: changed from Tor: 0.4.0.x-final to Tor: unspecified

comment:6 Changed 9 months ago by teor

Keywords: 033-backport removed

These open, non-merge_ready tickets cannot be backported to 0.3.3, because 0.3.3 is now unsupported.

comment:7 Changed 9 months ago by teor

Keywords: 033-backport-unreached added

Hmm, I guess they should still get 033-backport-unreached.

comment:8 Changed 8 months ago by neel

Cc: neel added
Owner: set to neel
Status: changed from new to assigned

comment:9 Changed 7 months ago by teor

Keywords: needs-proposal added

These tickets need a proposal before we write any code.

The two competing proposals in #22453 are:

  1. Remove the bandwidth self-test
  2. Increase the bandwidth self-test so it measures a reasonable amount of bandwidth

comment:10 Changed 5 months ago by nickm

Keywords: 034-backport removed

Removing 034-backport from all open tickets: 034 has reached EOL.

comment:11 Changed 3 months ago by teor

Keywords: 033-unreached-backport added; 033-backport-unreached removed

Fix 033-unreached-backport spelling.

comment:12 Changed 6 weeks ago by neel

Cc: neel removed
Owner: neel deleted

comment:13 Changed 6 weeks ago by neel

Status: changed from assigned to new