Relays should regularly do a larger bandwidth self-test

changed milestone to %Tor: unspecified

added 033-unreached-backport-maybe 034-removed-20180328 034-triage-20180328 034-unreached-backport-maybe 035-backport component::core tor/tor milestone::Tor: unspecified needs-proposal owner::juga priority::medium reviewer-was-teor-20190422 severity::normal status::needs-information tor-bwauth type::defect labels

The competing proposal here would be to raise the 500KB number (the total bandwidth we push in our self-test) by an order of magnitude. That way relays who test well would publish a respectable 5000/10 = 500 in their self-declared consensus weight.

But I think, so long as we keep the bwauths, that self-declared numbers aren't worth the extra code and design hassle.

It's useful as a fallback if we lose 2 bwauths, which almost happened yesterday. And also, the bwauths depend on this test to provide their ratios between the observed and measured bandwidths.

The reason it's set to 500kB is that's the circuit window (so it's not impacted by latency). I think we should use MIN_FAST_BW instead, so every new relay tests if it can be Fast.

Replying to teor:

The reason it's set to 500kB is that's the circuit window (so it's not impacted by latency).

No, the reason it's 500KB is because the threshold for Fast used to be 50KB, and 50KB times 10 seconds = 500KB.

Remember that we're sending the 500KB over 4 circuits, so it's well under the circuit window, per circuit.

/** Number of testing circuits we want open before testing our bandwidth. */
#define NUM_PARALLEL_TESTING_CIRCS 4

Replying to teor:

It's useful as a fallback if we lose 2 bwauths, which almost happened yesterday. And also, the bwauths depend on this test to provide their ratios between the observed and measured bandwidths.

No, I think measuring and self-reporting bandwidth is useful for both the reasons you describe, but doing a little 500KB blitz of traffic at start time is not really helpful for either of those things.

Ok, so should we send AuthDirFastGuarantee*10 (1MB) instead? (Enough to get the Fast flag?)

Or AuthDirGuardGuarantee*10 (20MB)? (Enough to get the guard flag?)

Trac:
Milestone: Tor: 0.3.2.x-final to Tor: 0.3.4.x-final

Trac:
Parent: N/A to #24499 (moved)

Trac:
Keywords: N/A deleted, 034-triage-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

Trac:
Keywords: N/A deleted, 034-removed-20180328 added

These tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if time permits.

Trac:
Milestone: Tor: 0.3.4.x-final to Tor: unspecified

Replying to teor:

Ok, so should we send AuthDirFastGuarantee*10 (1MB) instead? (Enough to get the Fast flag?)

Or AuthDirGuardGuarantee*10 (20MB)? (Enough to get the guard flag?)

Let's regularly (every 12-24 hours, at random) send a bandwidth self-test of AuthDirGuardGuarantee*10 bytes, but make sure it times out after 30-60 seconds to avoid saturating relays on small links.

Trac:
Summary: We should rip out the bandwidth self-test to Relays should regularly do a larger bandwidth self-test

Trac:
Cc: N/A to juga

Trac:
Parent: #24499 (moved) to 25925

Trac:
Parent: 25925 to #25925 (moved)

Trac:
Cc: juga to N/A
Status: new to assigned
Owner: N/A to juga
Keywords: N/A deleted, tor-bwauth added

Diving into the code i found that the function where the relay bandwidth is compared with options->AuthDirGuardBWGuarantee is in set_routerstatus_from_routerinfo [0].

To obtain the relay bandwidth, that function calls dirserv_get_credible_bandwidth_kb [1] which seems to be using either the measured bandwidth (by bwauths) or the self advertised bandwidth, but not any self measured bandwidth.

If i understand it correctly, the self measured bandwidth (bandwidthcapacity) is obtained by rep_hist_bandwidth_assess [2].

The function that is making use of the circuits to test the bandwidth is router_perform_bandwidth_test [3] but i don't see how it is storing the results or how is related with the previous.

So, should dirserv_get_credible_bandwidth_kb make use of the bandwidth calculated by self tests?, which are the self tests here to be used?

Where in the code should go the times out after 30-60?, and the 12-24 hours, at random?

[0] https://gitweb.torproject.org/tor.git/tree/src/or/dirserv.c#n2058 [1] https://gitweb.torproject.org/tor.git/tree/src/or/dirserv.c#n1780 [2] https://gitweb.torproject.org/tor.git/tree/src/or/rephist.c#n1207 [3] https://gitweb.torproject.org/tor.git/tree/src/or/router.c#n1626

Replying to juga:

Diving into the code i found that the function where the relay bandwidth is compared with options->AuthDirGuardBWGuarantee is in set_routerstatus_from_routerinfo [0].

You should make the default value of AuthDirGuardBWGuarantee into a #define, and use it when you work out how much data to send during a relay self-test.

To obtain the relay bandwidth, that function calls dirserv_get_credible_bandwidth_kb [1] which seems to be using either the measured bandwidth (by bwauths) or the self advertised bandwidth, but not any self measured bandwidth.

You should not change the directory authority code, because this ticket is about relay self-tests.

If i understand it correctly, the self measured bandwidth (bandwidthcapacity) is obtained by rep_hist_bandwidth_assess [2].

The function that is making use of the circuits to test the bandwidth is router_perform_bandwidth_test [3] but i don't see how it is storing the results or how is related with the previous.

So, should dirserv_get_credible_bandwidth_kb make use of the bandwidth calculated by self tests?, which are the self tests here to be used?

When a relay performs a bandwidth self-test, the bandwidth usage from the self-test is recorded in the bandwidth history. Then every 24 hours, the relay puts the highest bandwidth from its history in its descriptor as the observed bandwidth. Then torflow reads the descriptor, and uses the observed bandwidth to calculate the consensus weight.

Where in the code should go the times out after 30-60?

When you start the test, schedule a timer for 30-60 seconds that cancels the test of it is still going.

and the 12-24 hours, at random?

When the relay completes a bandwidth self-test, you should create a timer that changes the bandwidth self-test flag from "completed" to "do a self-test" in 12-24 hours. If the self-test times out, you should schedule the next test in 1-2 hours.

[0] https://gitweb.torproject.org/tor.git/tree/src/or/dirserv.c#n2058 [1] https://gitweb.torproject.org/tor.git/tree/src/or/dirserv.c#n1780 [2] https://gitweb.torproject.org/tor.git/tree/src/or/rephist.c#n1207 [3] https://gitweb.torproject.org/tor.git/tree/src/or/router.c#n1626

I still think there's a strong argument for simplifying our design and code and ripping out the bandwidth self-tests. So long as there are bwauths of any sort running, they will produce load on the relays, which will have the same effect as the self-test. And if there aren't bwauths running, then either the existing client traffic is enough to make relays realize they are fast or it isn't, but the tiny amount of traffic from the self-test isn't going to help that much.

i thought that the descriptor observed bandwidth (used by bwauths) comes only from the self-test, but arma comments in irc:

the descriptor observed bandwidth comes from all of the use of the relay. a little bit of that use is the bandwidth self-test. more of it is bwauth load. and ideally, most of it is from actual clients

so then maybe it can be removed

Replying to arma:

I still think there's a strong argument for simplifying our design and code and ripping out the bandwidth self-tests. So long as there are bwauths of any sort running, they will produce load on the relays, which will have the same effect as the self-test. And if there aren't bwauths running, then either the existing client traffic is enough to make relays realize they are fast or it isn't, but the tiny amount of traffic from the self-test isn't going to help that much.

I still think there's a strong argument for doing a larger bandwidth self-test.

On the public network, torflow's partitions give some relays a ridiculously small amount of bandwidth. Until the relay picks up traffic from elsewhere, it is stuck in the small partition. I have literally had to use EntryNodes to run fake client traffic through relays to get them unstuck. Removing the small test is risky, because it makes the problem worse.

On test networks, there may not be any bandwidth scanners. So the bandwidth self-test should be on in test networks (and it should be larger).

I don't think sbws suffers from similar issues, so once all the torflow scanners are gone, we can disable the bandwidth self-test code. Until then, it should be on and larger.

Replying to arma:

Replying to teor:

The reason it's set to 500kB is that's the circuit window (so it's not impacted by latency).

No, the reason it's 500KB is because the threshold for Fast used to be 50KB, and 50KB times 10 seconds = 500KB.

Remember that we're sending the 500KB over 4 circuits, so it's well under the circuit window, per circuit.
/** Number of testing circuits we want open before testing our bandwidth. */
#define NUM_PARALLEL_TESTING_CIRCS 4
}}}

If i understand correctly the code [0]:

{{{ int num_cells = (int)(get_options()->BandwidthRate * 10 / CELL_MAX_NETWORK_SIZE); int max_cells = num_cells < CIRCWINDOW_START ? num_cells : CIRCWINDOW_START; int cells_per_circuit = max_cells / num_circs;


the 500KB were due to `CIRCWINDOW_START` (1000), the 1st line was not having effect.

Do we want that the first test to send only 500KB (and then AuthDirGuardBWGuarantee*10 in the consecutives ones)?. `max_cells` then it just needs a condition.

I'm not sure why there was `BandwidthRate` there, that's 1GB by default (bigger than AuthDirGuardBWGuarantee), and i think the 1st line was also missing to divide by the number of circuits (not only the cell size).

The 10 in the first line is there because of the 10 seconds?

[0] https://gitweb.torproject.org/tor.git/tree/src/feature/relay/selftest.c?h=release-0.3.5#n278

Where in the code should go the times out after 30-60?

When you start the test, schedule a timer for 30-60 seconds that cancels the test of it is still going.

The cells are queued by [0]:

      if (relay_send_command_from_edge(0, TO_CIRCUIT(circ),
                                       RELAY_COMMAND_DROP,
                                       NULL, 0, circ->cpath->prev)<0) {
        return; /* stop if error */

And there is an event that expire old circuits every 11 seconds [1]

  /* every 11 seconds, so not usually the same second as other such events */
  circuit_expire_old_circuits_serverside(now);
  return 11;

So, i guess that would also clean the circuits created by the bandwidth test?

[0] https://gitweb.torproject.org/tor.git/tree/src/feature/relay/selftest.c?h=release-0.3.5#n294 [1] https://gitweb.torproject.org/tor.git/tree/src/core/mainloop/mainloop.c?h=release-0.3.5#n2340

Replying to juga:

If i understand correctly the code [0]:

  int num_cells = (int)(get_options()->BandwidthRate * 10 /
                        CELL_MAX_NETWORK_SIZE);
  int max_cells = num_cells < CIRCWINDOW_START ?
                    num_cells : CIRCWINDOW_START;
  int cells_per_circuit = max_cells / num_circs;
}}}

the 500KB were due to `CIRCWINDOW_START` (1000), the 1st line was not having effect.

Do we want that the first test to send only 500KB (and then AuthDirGuardBWGuarantee*10 in the consecutives ones)?. `max_cells` then it just needs a condition.

Doing something different the first time makes the code more complicated. We want to get a good observed bandwidth from new relays and slow relays. How does running a small first test help?

I'm not sure why there was BandwidthRate there, that's 1GB by default (bigger than AuthDirGuardBWGuarantee)

If a relay has a small bandwidth rate, this code limits the self-test bandwidth to that rate.

and i think the 1st line was also missing to divide by the number of circuits (not only the cell size).

The division by the number of circuits is done later: {{{ int cells_per_circuit = max_cells / num_circs;


> The 10 in the first line is there because of the 10 seconds?

Yes.

> [0] https://gitweb.torproject.org/tor.git/tree/src/feature/relay/selftest.c?h=release-0.3.5#n278

Replying to [juga](#note_22):
> > > Where in the code should go the `times out after 30-60`?
> > 
> > When you start the test, schedule a timer for 30-60 seconds that cancels the test of it is still going.
> 
> The cells are queued by [0]:
> ```
>       if (relay_send_command_from_edge(0, TO_CIRCUIT(circ),
>                                        RELAY_COMMAND_DROP,
>                                        NULL, 0, circ->cpath->prev)<0) {
>         return; /* stop if error */
> }}}

Yes.

> And there is an event that expire old circuits every 11 seconds [1]
> 
> {{{
>   /* every 11 seconds, so not usually the same second as other such events */
>   circuit_expire_old_circuits_serverside(now);
>   return 11;
> }}}
> 
> So, i guess that would also clean the circuits created by the bandwidth test?

No, this event is only for non-origin circuits:
https://github.com/torproject/tor/blob/b058f64cc002b44e6dd48616ca3163a01c3f3e14/src/core/or/circuituse.c#L1551

The self-test circuits are origin circuits, so you want circuit_expire_old_circs_as_needed(), which calls circuit_expire_old_circuits_clientside():
https://github.com/torproject/tor/blob/b058f64cc002b44e6dd48616ca3163a01c3f3e14/src/core/or/circuituse.c#L1345
https://github.com/torproject/tor/blob/b058f64cc002b44e6dd48616ca3163a01c3f3e14/src/core/or/circuituse.c#L1459

circuit_expire_old_circuits_clientside() expires bandwidth testing circuits 10 minutes after they start testing. That is ok. We don't need to change anything.

> [0] https://gitweb.torproject.org/tor.git/tree/src/feature/relay/selftest.c?h=release-0.3.5#n294
> [1] https://gitweb.torproject.org/tor.git/tree/src/core/mainloop/mainloop.c?h=release-0.3.5#n2340

> Where in the code should go ... the 12-24 hours, at random?

The regular bandwidth test code already exists, but we need to change the "bandwidth is low" check from 51200 to AuthDirGuardBWGuarantee:

{{{
      if (!first_time && me &&
          me->bandwidthcapacity < me->bandwidthrate &&
          me->bandwidthcapacity < 51200) {
        reset_bandwidth_test();
      }

https://gitweb.torproject.org/tor.git/tree/src/core/mainloop/mainloop.c?h=release-0.3.5#n2283

Replying to teor:

Replying to juga:

If i understand correctly the code [0]:

{{{ int num_cells = (int)(get_options()->BandwidthRate * 10 / CELL_MAX_NETWORK_SIZE); int max_cells = num_cells < CIRCWINDOW_START ? num_cells : CIRCWINDOW_START; int cells_per_circuit = max_cells / num_circs; }}}

the 500KB were due to CIRCWINDOW_START (1000), the 1st line was not having effect.

Do we want that the first test to send only 500KB (and then AuthDirGuardBWGuarantee*10 in the consecutives ones)?. max_cells then it just needs a condition.

Doing something different the first time makes the code more complicated. We want to get a good observed bandwidth from new relays and slow relays. How does running a small first test help?

Then there should not be the CIRCWINDOW_START condition because it limits the number of the cells to 1000.

I'm not sure why there was BandwidthRate there, that's 1GB by default (bigger than AuthDirGuardBWGuarantee)

If a relay has a small bandwidth rate, this code limits the self-test bandwidth to that rate.

Then the number of cells should be calculated using min(BandwidthRate, AuthDirGuardBWGuarantee), not only BandwidthRate?. Otherwise we will be sending 10GB if BandwidthRate has not been modified.

and i think the 1st line was also missing to divide by the number of circuits (not only the cell size).

The division by the number of circuits is done later: {{{ int cells_per_circuit = max_cells / num_circs; }}}

The 10 in the first line is there because of the 10 seconds?

Yes.

[0] https://gitweb.torproject.org/tor.git/tree/src/feature/relay/selftest.c?h=release-0.3.5#n278

Replying to juga:

Where in the code should go the times out after 30-60?

When you start the test, schedule a timer for 30-60 seconds that cancels the test of it is still going.

The cells are queued by [0]: {{{ if (relay_send_command_from_edge(0, TO_CIRCUIT(circ), RELAY_COMMAND_DROP, NULL, 0, circ->cpath->prev)<0) { return; /* stop if error */ }}}

Yes.

And there is an event that expire old circuits every 11 seconds [1]

{{{ /* every 11 seconds, so not usually the same second as other such events */ circuit_expire_old_circuits_serverside(now); return 11; }}}

So, i guess that would also clean the circuits created by the bandwidth test?

No, this event is only for non-origin circuits: https://github.com/torproject/tor/blob/b058f64cc002b44e6dd48616ca3163a01c3f3e14/src/core/or/circuituse.c#L1551

The self-test circuits are origin circuits, so you want circuit_expire_old_circs_as_needed(), which calls circuit_expire_old_circuits_clientside(): https://github.com/torproject/tor/blob/b058f64cc002b44e6dd48616ca3163a01c3f3e14/src/core/or/circuituse.c#L1345 https://github.com/torproject/tor/blob/b058f64cc002b44e6dd48616ca3163a01c3f3e14/src/core/or/circuituse.c#L1459

circuit_expire_old_circuits_clientside() expires bandwidth testing circuits 10 minutes after they start testing. That is ok. We don't need to change anything.

[0] https://gitweb.torproject.org/tor.git/tree/src/feature/relay/selftest.c?h=release-0.3.5#n294 [1] https://gitweb.torproject.org/tor.git/tree/src/core/mainloop/mainloop.c?h=release-0.3.5#n2340

Where in the code should go ... the 12-24 hours, at random?

The regular bandwidth test code already exists, but we need to change the "bandwidth is low" check from 51200 to AuthDirGuardBWGuarantee:

{{{ if (!first_time && me && me->bandwidthcapacity < me->bandwidthrate && me->bandwidthcapacity < 51200) { reset_bandwidth_test(); } }}}

https://gitweb.torproject.org/tor.git/tree/src/core/mainloop/mainloop.c?h=release-0.3.5#n2283

I agree. I think you're good to start work. Let me know if you have more questions.

https://github.com/torproject/tor/pull/520

Trac:
Status: assigned to needs_review

Trac:
Milestone: Tor: unspecified to Tor: 0.4.0.x-final

Replying to teor:

Replying to arma:

I still think there's a strong argument for simplifying our design and code and ripping out the bandwidth self-tests. So long as there are bwauths of any sort running, they will produce load on the relays, which will have the same effect as the self-test. And if there aren't bwauths running, then either the existing client traffic is enough to make relays realize they are fast or it isn't, but the tiny amount of traffic from the self-test isn't going to help that much.

I still think there's a strong argument for doing a larger bandwidth self-test.

On the public network, torflow's partitions give some relays a ridiculously small amount of bandwidth. Until the relay picks up traffic from elsewhere, it is stuck in the small partition. I have literally had to use EntryNodes to run fake client traffic through relays to get them unstuck. Removing the small test is risky, because it makes the problem worse.

On test networks, there may not be any bandwidth scanners. So the bandwidth self-test should be on in test networks (and it should be larger).

I don't think sbws suffers from similar issues, so once all the torflow scanners are gone, we can disable the bandwidth self-test code. Until then, it should be on and larger.

I am not sure if sbws suffers from similar issues, but it does need accurate self-reported observed relay bandwidths.

And bridges need self-measured bandwidths, because they don't have a bandwidth scanner:

This is especially important for bridges, since they might go long periods without much use.

https://github.com/torproject/tor/blob/12175987fc744f5ca6559a821867631457911451/src/core/mainloop/mainloop.c#L2271

We should consider how far we backport this fix. There's a strong case for backporting it to 0.3.5, because we'll be supporting 0.3.5 until 2022. I'm not so sure about the other releases, because we'll only be supporting them for about another year.

Trac:
Keywords: N/A deleted, 029-backport-maybe-not, 035-backport, 033-backport-maybe, 034-backport-maybe added
Type: enhancement to defect

I did a review on this ticket - it looks good, but we could make a few improvements.

The code we're changing is old and has a few bugs. I found more during the review, and opened some child tickets.

Trac:
Status: needs_review to needs_revision
Reviewer: N/A to teor

Replying to teor:

I don't think sbws suffers from similar issues, so once all the torflow scanners are gone, we can disable the bandwidth self-test code. Until then, it should be on and larger.

I don't think we need to wait for all of the torflows to be gone. Isn't it enough to have one sbws scanner? Since that scanner would generate appropriate load on relays? And if we're worried that one isn't a stable situation, we should work on getting a second one, rather than messing with this design?

I am not sure if sbws suffers from similar issues, but it does need accurate self-reported observed relay bandwidths.

Nobody is proposing to rip out the "relays passively observe how much traffic they carry and publish their capacity estimate in their descriptor" design here. So, sounds good, sbws will still get the same thing we have now in terms of accurate self-reported observed relay bandwidths.

And bridges need self-measured bandwidths, because they don't have a bandwidth scanner:

I think we need to figure out a use for bridge bandwidth values before we worry about how to make them a bit more accurate. For bridges that don't see much use anyway, having a bigger number or not isn't going to be a big change, right?

To be more concrete, I think approximately nothing uses the bridge descriptor bandwidth values -- I have a dim memory in the distant past of choosing by weight but capping the weight at like 100KBytes so we don't send too much traffic toward a tiny number of bridges. But now in select_and_add_guard_item_for_sample() I see that we just call smartlist_choose() (i.e. uniformly) when sampling a new bridge.

This is especially important for bridges, since they might go long periods without much use.

https://github.com/torproject/tor/blob/12175987fc744f5ca6559a821867631457911451/src/core/mainloop/mainloop.c#L2271

I wrote that comment (commit 43dce232), back when we were going to do a lot more with the bridge design than we ended up doing -- and I now think the comment is wrong. :)

Replying to arma:

Replying to teor:

I don't think sbws suffers from similar issues, so once all the torflow scanners are gone, we can disable the bandwidth self-test code. Until then, it should be on and larger.

I don't think we need to wait for all of the torflows to be gone. Isn't it enough to have one sbws scanner? Since that scanner would generate appropriate load on relays? And if we're worried that one isn't a stable situation, we should work on getting a second one, rather than messing with this design?

Currently sbws prioritizes measuring relays for which it doesn't have results, it probably should then prioritize the relays with less uptime and not being measured according to the consensus. I can try to figure out how long it takes currently to sbws to measure those relays. If any of this sounds like a good idea, then i'll create tickets for sbws.

Replying to juga:

Currently sbws prioritizes measuring relays for which it doesn't have results, it probably should then prioritize the relays with less uptime and not being measured according to the consensus. I can try to figure out how long it takes currently to sbws to measure those relays. If any of this sounds like a good idea, then i'll create tickets for sbws.

Yes, sbws should prioritise relays:

with no sbws measurements, and
with no measured bandwidth in the consensus

I am not so sure about uptime.

Replying to arma:

Replying to teor:

I don't think sbws suffers from similar issues, so once all the torflow scanners are gone, we can disable the bandwidth self-test code. Until then, it should be on and larger.

I don't think we need to wait for all of the torflows to be gone. Isn't it enough to have one sbws scanner? Since that scanner would generate appropriate load on relays? And if we're worried that one isn't a stable situation, we should work on getting a second one, rather than messing with this design?

== Typical sbws measurement times

Let's try to define "appropriate load" more precisely.

When sbws measures a relay, it starts by pairing it with a randomly chosen relay. Then it takes the median of all the measurements. So, on average, sbws starts new fast relays with the median relay bandwidth. Once a relay has a bandwidth measurement, sbws tries to pair it with relays that have twice its bandwidth. So sbws can achieve 2x exponential growth every 5 days, from a base of the median relay bandwidth.

The median relay bandwidth is 20 megabits a second: https://metrics.torproject.org/advbwdist-relay.html?start=2018-08-21&end=2018-11-19&n=3000

The fastest relay bandwidth is 1500 megabits a second: https://metrics.torproject.org/advbwdist-relay.html?start=2018-08-21&end=2018-11-19&n=1

Therefore, it would take log,,2,,(1500 / 20)*5 = 31 days for sbws to get an accurate bandwidth for the fastest relay, if there was no client traffic.

More realistically, the top 10% of relays are at 125 megabits per second: https://metrics.torproject.org/advbwdist-relay.html?start=2018-08-21&end=2018-11-19&n=500

Therefore, it would take log,,2,,(125 / 20)*5 = 13 days for sbws to get an accurate bandwidth for most (90%) of relays, if there was no client traffic.

Do you think that's ok?

== Asking other developers

I also think we need to open this discussion up to a wider audience. I'll ask for opinions in today's network team meeting. If there's no consensus, let's use the tor proposals process to decide.

I wrote a tor-dev email with the questions I think we need to answer: https://lists.torproject.org/pipermail/tor-dev/2018-November/013546.html

If we decide to remove bandwidth self-tests, then we can close this ticket, and every child ticket except #28511 (moved). (Failed ORPort reachability tests open 1200 circuits, and failed DirPort tests open 240 circuits: that's too many.)

Trac:
Status: needs_revision to needs_information

Replying to teor:

Yes, sbws should prioritise relays:

with no sbws measurements, and

with no measured bandwidth in the consensus

I am not so sure about uptime.

Created #28519 (moved) for this.

If we decide we need this test, Nick suggested this design:

new relays should start their bandwidth tests at the 10th percentile of observed bandwidths in all relay descriptors
relays should send traffic for 11 seconds based on their previous observed bandwidth

I also think:

we should use some small multiple (2x?) of the previous observed bandwidth
there should be a minimum and maximum bandwidth, probably min(1st percentile, minimum relay bandwidth) and min(90th percentile, maximum believable bandwidth)

Trac:
Description: Inspired by #8247 (moved) ("In sum. a vestigial tiny bw self-test seems silly to keep around"), I wonder if we're at the point where we can just take out all the bandwidth self-test infrastructure.

In favor of ripping it out: there's some complexity at relay startup where we try to delay publishing our descriptor until we've done the self-test, since we know we'll have a newer bandwidth number to include soon. We've had bugs in this delay step.

In favor of ripping it out: in the current design we try to build 4 separate circuits, without using our guards in order to have actually independent paths, for pushing our 500KB. Relays that aren't reachable end up with hundreds or thousands of connections open, because they keep making new circuits and each one probably is to a new relay. Not a big deal but kind of unfortunate.

In favor of ripping it out: 50KB, which is the most that the current bandwidth test can tell you, is super tiny compared to current descriptor bandwidths and current consensus weights. In fact, as prophesied in #8247 (moved), the threshold for the Fast flag is now above 50KB, so publishing 0 vs 50 is essentially just moving you around within the "don't use, they're too slow" bucket.

In favor of keeping it: maybe the bandwidth authorities have some sort of psychotic behavior in the face of relays that have a 0 in their descriptor? Like, they multiply the 0 by a factor for how much better than the other 0's they are? I have no idea. In case they do, I propose that we proceed with ripping out the self-test, and simply replace it with the number "20KB" to guard against psychotic bwauth behavior. (I picked that number because the directory authorities already use the number 20 when assigning a weight to a relay that (A) is unmeasured and (B) self-declares at least 20KB in its descriptor.)

But what about bridges, you might ask? Public relays might have the bwauths to measure them remotely, but bridges don't have that. I think nothing uses the bandwidths in bridge descriptors. Are there any plans for that to change in the future? Even if there are, I think raising the floor from 0 to 20, and retaining the behavior where we publish a bigger number if we actually see a bigger number due to client load, should make us compatible with whatever these plans might be.

to

Inspired by #8247 (moved) ("In sum. a vestigial tiny bw self-test seems silly to keep around"), I wonder if we're at the point where we can just take out all the bandwidth self-test infrastructure.

In favor of ripping it out: there's some complexity at relay startup where we try to delay publishing our descriptor until we've done the self-test, since we know we'll have a newer bandwidth number to include soon. We've had bugs in this delay step.

In favor of ripping it out: in the current design we try to build 4 separate circuits, without using our guards in order to have actually independent paths, for pushing our 500KB. Relays that aren't reachable end up with hundreds or thousands of connections open, because they keep making new circuits and each one probably is to a new relay. Not a big deal but kind of unfortunate.

In favor of ripping it out: 50KB, which is the most that the current bandwidth test can tell you, is super tiny compared to current descriptor bandwidths and current consensus weights. In fact, as prophesied in #8247 (moved), the threshold for the Fast flag is now above 50KB, so publishing 0 vs 50 is essentially just moving you around within the "don't use, they're too slow" bucket.

In favor of keeping it: maybe the bandwidth authorities have some sort of psychotic behavior in the face of relays that have a 0 in their descriptor? Like, they multiply the 0 by a factor for how much better than the other 0's they are? I have no idea. In case they do, I propose that we proceed with ripping out the self-test, and simply replace it with the number "20KB" to guard against psychotic bwauth behavior. (I picked that number because the directory authorities already use the number 20 when assigning a weight to a relay that (A) is unmeasured and (B) self-declares at least 20KB in its descriptor.)

Note: if we do keep it in, here's a better design: https://trac.torproject.org/projects/tor/ticket/22453#comment:35

But what about bridges, you might ask? Public relays might have the bwauths to measure them remotely, but bridges don't have that. I think nothing uses the bandwidths in bridge descriptors. Are there any plans for that to change in the future? Even if there are, I think raising the floor from 0 to 20, and retaining the behavior where we publish a bigger number if we actually see a bigger number due to client load, should make us compatible with whatever these plans might be.

Hi juga, arma,

Nick and I spoke this morning in the patch party meeting.

He suggested that we deploy sbws first, then make changes to relay bandwidth self-tests. But if there are critical bugs, we should fix those bugs.

Core Tor/Tor tickets:

#22453 (moved) Relays should regularly do a larger bandwidth self-test

We won't know if we need bandwidth self-tests until we deploy sbws. Defer this ticket until 0.4.1 or later.

#28509 (moved) Limit relay bandwidth self-tests based on RelayBandwidthRate, not BandwidthRate

This is a fast-fix ticket that has almost no impact

#28510 (moved) When relays reset bandwidth tests, the test waits until the next directory document is received

We won't know if we need bandwidth self-tests until we deploy sbws. Defer this ticket until 0.4.1 or later.

#28511 (moved) Limit the number of open testing circuits, and the total number of testing circuits

This is a potential DoS vector which we should fix. One fast fix is simply to reduce the rate of testing.

#28512 (moved) Increase NUM_PARALLEL_TESTING_CIRCS to avoid circuit windows

We won't know if we need bandwidth self-tests until we deploy sbws. Defer this ticket until 0.4.1 or later.

#28514 (moved) Open NUM_PARALLEL_TESTING_CIRCS when a bandwidth test is initiated

We won't know if we need bandwidth self-tests until we deploy sbws. Defer this ticket until 0.4.1 or later.

Core Tor/sbws tickets:

#28519 (moved) Prioritize relays with no measured bandwidth in the consensus

This is a really important fix to sbws, to make sure all relays are measured.

#28545 (moved) Use an 11 second download in sbws

This change to sbws makes sure that sbws can grow slow relay bandwidths in the absence of relay bandwidth self-tests.

#28547 (moved) Monitor relays that are not measured by each sbws instance

We need to make sure that all relays are measured by sbws, after the #28519 (moved) fix.

Juga, arma, what do you think of these changes?

This plan sounds like a great plan to me.

I think if we can get the sbws doing what we want ("reliably testing all relays, especially relays that it feels it doesn't have enough measurements about"), then everything else will fall into place.

Replying to teor:

More realistically, the top 10% of relays are at 125 megabits per second: https://metrics.torproject.org/advbwdist-relay.html?start=2018-08-21&end=2018-11-19&n=500

Therefore, it would take log,,2,,(125 / 20)*5 = 13 days for sbws to get an accurate bandwidth for most (90%) of relays, if there was no client traffic.

Do you think that's ok?

This is a fun analysis!

First, I'll start with a "yes we could live with that."

But second, if we have six sbws's going, and there's a lot of variance with each test (sometimes it's faster than expected, sometimes slower), I think the time until one of the tests happens to hit some great throughput, on a relay that's way faster than its self-advertised number, would end up quite a bit less than this analysis predicts. That is, I think it would be quite common to more-than-double, not just double, at each iteration.

And third, what's the "every five days" parameter? Should we teach sbws that when its recent measurements have showed the relay to be way faster than its consensus weight, that means we "don't have enough good recent measurements" and we need to get more (and better) measurements? It seems like there's a lot of space for sbws to be smarter about doing tests for relays that appear to be on an upward trajectory. (But this said, I would still want to try to keep the priority function as simple as possible, to avoid making it hard to analyze what's going wrong: #28519 (moved))

Ultimately, in a design where we base our changes proportional to the self-advertised bandwidth, we are limited by the feedback cycle between "we induce load on relay" and "relay publishes descriptor with higher number". We intentionally slowed down that feedback cycle in the #23856 (moved) fix, so I don't see a way around accepting that -- even best case -- it will take some days to get to the proper number.

Replying to teor:

Juga, arma, what do you think of these changes?

yeah, sounds good to me

Replying to arma:

Replying to teor:

More realistically, the top 10% of relays are at 125 megabits per second: https://metrics.torproject.org/advbwdist-relay.html?start=2018-08-21&end=2018-11-19&n=500

Therefore, it would take log,,2,,(125 / 20)*5 = 13 days for sbws to get an accurate bandwidth for most (90%) of relays, if there was no client traffic.

Do you think that's ok?

This is a fun analysis!

I think it's slightly wrong, because sbws' target download time is 5-10 seconds, not 11 seconds. (We'd need 11 seconds to make sure we covered tor's 10 seconds, when the seconds don't line up.)

But still, we could live with 1x - 2x growth. The important thing is that relays won't go backwards (like they sometimes do with Torflow).

First, I'll start with a "yes we could live with that."

But second, if we have six sbws's going, and there's a lot of variance with each test (sometimes it's faster than expected, sometimes slower), I think the time until one of the tests happens to hit some great throughput, on a relay that's way faster than its self-advertised number, would end up quite a bit less than this analysis predicts. That is, I think it would be quite common to more-than-double, not just double, at each iteration.

It is possible that we would get more than 2x.

And third, what's the "every five days" parameter?

Results are valid for 5 days, and we take the median.

Should we teach sbws that when its recent measurements have showed the relay to be way faster than its consensus weight, that means we "don't have enough good recent measurements" and we need to get more (and better) measurements? It seems like there's a lot of space for sbws to be smarter about doing tests for relays that appear to be on an upward trajectory. (But this said, I would still want to try to keep the priority function as simple as possible, to avoid making it hard to analyze what's going wrong: #28519 (moved))

I think we could open a ticket to try to make fast relays grow faster. But there's a tradeoff: fast growth means less network stability and more chance of a sybil.

Let's do this kind of optimisation in sbws 1.1.

Ultimately, in a design where we base our changes proportional to the self-advertised bandwidth, we are limited by the feedback cycle between "we induce load on relay" and "relay publishes descriptor with higher number". We intentionally slowed down that feedback cycle in the #23856 (moved) fix, so I don't see a way around accepting that -- even best case -- it will take some days to get to the proper number.

And that's ok. Again, there's a growth/stability/security tradeoff.

We won't know if the growth is acceptable until we deploy. So let's stay with what we've got for now?

Trac:
Parent: #25925 (moved) to N/A
Milestone: Tor: 0.4.0.x-final to Tor: unspecified

Replying to teor:

The important thing is that relays won't go backwards (like they sometimes do with Torflow).

In terms of relay self-observation, we need to make sure they see load (from real users or from bwauths) each day (more specifically, maybe it's in the 24 hours before writing their descriptor) or they'll forget about it.

That is, if we're trying to have a feedback cycle where we induce load, the self-assessed number goes higher, and then we induce more load because we see a higher number, then it's risky to let a day go by without hitting the relay with at least that level of load. (I say 'risky', not 'bad', because it could still turn out ok, either because some other scanner induces that load, or because actual users show up and induce it.)

Let's do this kind of optimisation in sbws 1.1.

Sounds like a great plan.

These open, non-merge_ready tickets can not get backported to 0.3.3, because 0.3.3 is now unsupported.

Hmm, I guess they should still get 033-backport-unreached

Trac:
Keywords: N/A deleted, 033-backport-unreached added

These tickets need a proposal before we write any code.

The two competing proposals in #22453 (moved) are:

Remove the bandwidth self-test
Increase the bandwidth self-test so it measures a reasonable amount of bandwidth

Trac:
Keywords: N/A deleted, needs-proposal added

Here's how we might calculate a reasonable bandwidth for the self-test:

Use 2 * NUM_SECONDS_ROLLING_MEASURE * min(RelayBandwidthRate, AuthDirGuardBWGuarantee) bytes
Split those bytes into CIRCWINDOW_START cells per circuit
But keep the number of circuits well below the DoS limits
If there are extra cells left over, just split them evenly across the circuits

Relays should regularly do a larger bandwidth self-test

Child items ...

Activity