Opened 9 months ago

Closed 6 months ago

#33009 closed defect (fixed)

sbws bandwidth scans should require a minimum exit bandwidth

Reported by: teor Owned by: gk
Priority: Medium Milestone: sbws: 1.1.x-final
Component: Core Tor/sbws Version: sbws: 1.1.0
Severity: Normal Keywords: sbws-majority-blocker, easy, intro, sbws-roadmap, GeorgKoppen202004
Cc: juga Actual Points: 2.5
Parent ID: #33121 Points: 1
Reviewer: juga Sponsor:

Description

When sbws is constructing a two-hop measurement circuit to run a test, it tries to pick an exit that has at least twice the consensus weight of the current relay-under-test:
https://github.com/torproject/sbws/blob/master/sbws/core/scanner.py#L216

So this means that in this case, sbws would have picked any exit that was not a BadExit, has an acceptable ExitPolicy, and has a consensus weight of at least, well, 2. That's not a lot.

As it turns out, something like 10% of exits have under a 600Kbyte/sec advertised bandwidth. So it seems pretty easy from this weight=1 bootstrap scenario to get paired with an exit that will give poor test results.

Perhaps bwauth path selection should also choose a testing pair from exits/relays with a certain absolute minimum of weight or advertised bandwidth?

Reported by Jimmy on tor-relays:
https://lists.torproject.org/pipermail/tor-relays/2020-January/018027.html

Child Tickets

Change History (14)

comment:1 Changed 9 months ago by teor

Keywords: easy intro added

Note: Torflow's partitions have a similar issue, but it's actually worse: a relay can get stuck in a low-bandwidth partition forever.

So perhaps this isn't strictly a blocker, because the new behaviour is eventually better. But the fix is so easy, we should just do it.

I suggest we only use the top 75% of exits as the second hop, which should remove the long tail of slow exits. (These slow exits are also more likely to fail to connect, and therefore fail the entire bandwidth test. So this change should also improve sbws efficiency.)

comment:2 Changed 9 months ago by gaba

Keywords: sbws-roadmap added

comment:3 Changed 9 months ago by gaba

Parent ID: #33121

The goal is to deploy sbws in all bw authorities. We need to fix critical bugs to do this.

comment:4 Changed 9 months ago by carlbordum

Hi there. I wrote a patch that adds minimum absolute bandwidth for the exit node. What do you suggest I set the value to?

comment:5 Changed 9 months ago by juga

i've not checked which consensus weight corresponds to 600Kbyte/sec advertised bandwidth.
i'm not sure what teor meant with "top 75% of exits", ie. which bandwidth value they think about there.
Looking https://metrics.torproject.org/rs.html#toprelays, showing 100 relays per page, it says "Top Relays by Consensus Weight", the bandwidth column says "Advertised bandwidth", the first relay in the 3rd page (~1/3 ~= 75%) shows 58.1 MiB/s.
We should look which is the actual consensus bandwidth of that 1st relay, if we want to use a hardcoded minimum value.
Instead of that i'd suggest, since bandwidth values changes and we choose exits based on consensus weight, is that:

  1. create a method in relaylist that order them by consensus weight. You could based it RelayList.exits, then order them sorted(self.relays, key=lambda r: r.consensus_bandwidth)
  2. calculate which would be the relay that corresponds to the 75% in that order
  3. use that value as the minimum value

Hope it helps!

Last edited 9 months ago by juga (previous) (diff)

comment:6 in reply to:  5 Changed 9 months ago by teor

Replying to juga:

i've not checked which consensus weight corresponds to 600Kbyte/sec advertised bandwidth.
i'm not sure what teor meant with "top 75% of exits", ie. which bandwidth value they think about there.
Looking https://metrics.torproject.org/rs.html#toprelays, showing 100 relays per page, it says "Top Relays by Consensus Weight", the bandwidth column says "Advertised bandwidth", the first relay in the 3rd page (~1/3 ~= 75%) shows 58.1 MiB/s.
We should look which is the actual consensus bandwidth of that 1st relay, if we want to use a hardcoded minimum value.
Instead of that i'd suggest, since bandwidth values changes and we choose exits based on consensus weight, is that:

  1. create a method in relaylist that order them by consensus weight. You could based it RelayList.exits, then order them sorted(self.relays, key=lambda r: r.consensus_bandwidth)
  2. calculate which would be the relay that corresponds to the 75% in that order
  3. use that value as the minimum value

Yes, I meant a calculated value.

Ideally, we should calculate the value each time we get a new consensus. If we recalculate it for every circuit, it might slow down sbws circuit building.

comment:7 Changed 7 months ago by teor

I have opened ticket #33775, which is about low relay measurements from sbws (but not torflow). It could be related to this issue.

comment:8 Changed 7 months ago by gk

Actual Points: 2.5
Reviewer: juga
Status: newneeds_review

bug_33009_v5 in my public sbws repo has two patches for review (I've tried to create a merge request for torproject/network-health/sbws maint-1.1 but failed so far; additionally the two patches are based on the branch for #30905 which needs revision, so maybe having a merge request does not make so much sense).

That said, while having this up for review I am still thinking about some meaningful integration test but thought a lack of that is not a blocker for initial review at least. In particular, given that our current bw authorities situation might make this bug fix more urgent to get deployed.

comment:9 Changed 7 months ago by gk

Owner: set to gk
Status: needs_reviewassigned

comment:10 Changed 7 months ago by gk

Status: assignedneeds_review

comment:11 Changed 7 months ago by gk

Keywords: GeorgKoppen202004 added

comment:12 Changed 7 months ago by juga

Status: needs_reviewmerge_ready

lgtm, just left comments on which were my thoughts

comment:13 Changed 6 months ago by juga

Merged

comment:14 Changed 6 months ago by juga

Resolution: fixed
Status: merge_readyclosed
Note: See TracTickets for help on using tickets.