Opened 9 years ago

Closed 8 years ago

#1984 closed enhancement (fixed)

Bw Auths should penalize nodes for circ extend failures

Reported by: mikeperry
Owned by: mikeperry
Priority: Medium
Milestone:
Component: Core Tor/Torflow
Version:
Severity:
Keywords: MikePerryIterationFires20111106
Cc: aagbsn
Actual Points: 3
Parent ID:
Points: 3
Reviewer:
Sponsor:

Description (last modified by mikeperry)

Right now we have about 50 extremely overloaded guard nodes (the Pandora* set) that are failing TLS connections, dir connections, and just about everything else, due to maxing out their CPU load on crypto.

However, on the rare occasions when they do manage to complete a circuit, they have huge bandwidth capacity available.

What we should do is assign a measurement of 0 every time we try to use a node as a first hop and it fails to accept our extend.

We can try to do this to the 2nd hop too, but that is less reliable, since it won't be clear if that extend failed because the 1st hop sucks or if 2nd hop is actually broken... We could ensure that each exit is measured at least twice as an entry, or something, to improve this property (maybe).

We may want to ensure that each exit is measured at least N times as an entry anyways (for N=1 or 2).
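To illustrate the effect of the proposal, here is a minimal sketch (not TorFlow's actual code) of how folding failed first-hop extends into a node's average as 0 measurements pulls its value down. The function name `average_with_failures` and the sample numbers are hypothetical, chosen only for this example.

```python
from statistics import mean


def average_with_failures(successful_bw, failed_extends):
    """Mean over all attempts, counting each failed extend as a 0 sample.

    successful_bw:  bandwidth samples from circuits that completed
    failed_extends: number of times the node refused our extend
    (Both parameters are hypothetical; this is not TorFlow's data model.)
    """
    samples = list(successful_bw) + [0.0] * failed_extends
    if not samples:
        return None  # node was never tried, so there is nothing to report
    return mean(samples)


# A node that looks fast when it works, but rejects most extends.
print(average_with_failures([900.0, 1100.0], failed_extends=8))
```

Without the zeros this node would average 1000; with eight failed extends counted, the average over ten attempts is 200.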

Change History (11)

comment:1 Changed 9 years ago by mikeperry

We should also spend some time thinking about whether 0 is the right number here. It's not clear whether there are any other good choices, though... We have no idea exactly how much load is pushing these nodes to the point of failure; all we see is a binary outcome.

comment:2 Changed 8 years ago by mikeperry

Description: modified (diff)

comment:3 Changed 8 years ago by mikeperry

Description: modified (diff)

comment:4 Changed 8 years ago by mikeperry

Description: modified (diff)

comment:5 Changed 8 years ago by mikeperry

Type: defect → enhancement

comment:6 Changed 8 years ago by mikeperry

The first test here is to export failure rate information for first and second hops and take a look at it.
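A rough sketch of what such an export could compute, assuming each circuit attempt is logged as a `(node_id, hop_position, succeeded)` tuple. That record shape and the function name `failure_rates` are assumptions for illustration, not TorFlow's actual format.

```python
from collections import defaultdict


def failure_rates(attempts):
    """Tally the extend failure rate per (node, hop position).

    attempts: iterable of (node_id, hop_position, succeeded) tuples,
              a hypothetical log format used only for this sketch.
    Returns a dict mapping (node_id, hop_position) to a rate in [0, 1].
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for node_id, hop, succeeded in attempts:
        totals[(node_id, hop)] += 1
        if not succeeded:
            failures[(node_id, hop)] += 1
    return {key: failures[key] / totals[key] for key in totals}


log = [
    ("A", 1, False),
    ("A", 1, True),
    ("A", 2, True),
    ("B", 2, False),
]
for (node, hop), rate in sorted(failure_rates(log).items()):
    print(node, hop, rate)
```

Keeping first-hop and second-hop rates separate matters because a second-hop failure may be the first hop's fault.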

comment:7 Changed 8 years ago by mikeperry

Cc: aagbsn added

The bw auths do not currently use exits as the first hop, so if we only counted failures for the first hop we would never see exit overload of this nature.

comment:8 Changed 8 years ago by mikeperry

Keywords: MikePerryIterationFires2011106 added

Yeah, we're going to have to fix this ASAP or #1976 is going to destroy the Tor network :/.

comment:9 Changed 8 years ago by mikeperry

Keywords: MikePerryIterationFires20111106 added; MikePerryIterationFires2011106 removed

comment:10 Changed 8 years ago by mikeperry

My plan for this is to count circ failures against the node that was being extended to, and stream failures against the exit.

The current plan is to make each of these failures count as a 0 measurement for that node, and not count as a measurement at all for the other node in the path.
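The attribution rule described in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was committed: `attribute_failure`, the `"circ"`/`"stream"` labels, and the zero-based `failed_hop` index are all hypothetical names.

```python
def attribute_failure(path, failure_kind, failed_hop=None):
    """Decide which node in a path receives a 0 measurement.

    path:         list of node ids, entry first, exit last
    failure_kind: "circ" (an extend failed) or "stream" (a stream failed)
    failed_hop:   zero-based index of the node being extended to;
                  required for "circ" failures
    Returns {blamed_node: 0.0}. Other nodes in the path are absent from
    the result, meaning they get no measurement at all for this attempt.
    """
    if failure_kind == "circ":
        if failed_hop is None:
            raise ValueError("circ failures need the index of the failed hop")
        blamed = path[failed_hop]
    elif failure_kind == "stream":
        blamed = path[-1]  # stream failures count against the exit
    else:
        raise ValueError("unknown failure kind: %r" % (failure_kind,))
    return {blamed: 0.0}


print(attribute_failure(["guard1", "exit1"], "circ", failed_hop=0))
print(attribute_failure(["guard1", "exit1"], "stream"))
```

Returning only the blamed node keeps the other node's average from being affected in either direction by an attempt that says nothing reliable about it.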

comment:11 Changed 8 years ago by mikeperry

Actual Points: 3
Points: 3
Resolution: fixed
Status: new → closed
Summary: Bw Auths should assign 0 bw to first hops that fail → Bw Auths should penalize nodes for circ extend failures

The plumbing was already here for this one, but it was a little tricky to get the dampening right. We will still need to watch it (in #4425).
