How long should sbws keep measured and observed bandwidths?

changed milestone to %sbws: 1.0.x-final

added component::core tor/sbws milestone::sbws: 1.0.x-final parent::27108 priority::medium resolution::implemented sbws-1.0-must-closed-moved-20181128 severity::normal status::closed type::task labels

Torflow uses the latest observed bandwidth, and uses a decaying average for measurements. (I couldn't work out the exact decay factor, because it's a complex feedback loop.)

https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/aggregate.py#n117

Oh, and I think we should always use the latest {Relay,}Bandwidth{Rate,Burst}.

We need to decide which strategy to use, update the bandwidth file spec, and implement this feature in sbws.

I'm not sure how we're going to decide this. Should we try each of the methods for 1 week and graph the results and/or calculate the % differences with Torflow?

In case it's useful, the descriptors' observed bandwidth collected at the time of doing measurements in the results, for 2 days:

number of relays' descriptor observed bandwidth: 6462
mean of all relays' descriptor observed bandwidth taking the last for each relay: 5621550
mean of all relays' descriptor observed bandwidth taking the mean from the relay's results: 5609508
median of all relays' descriptor observed bandwidth taking the last for each relay: 2065215
median of all relays' descriptor observed bandwidth taking the mean from the relay's results: 2060907
number of relays for which it was collected 1 descriptor observed bandwidth: 5087 (79%)
number of relays for which it was collected 1 descriptor observed bandwidth: 1368 (21%)
number of relays for which it was collected 1 descriptor observed bandwidth: 7 (0.11%)
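
The two summaries above ("taking the last" and "taking the mean" for each relay) could be computed roughly as follows. This is only a sketch: the relay names and values are made up, and `summarize` is a hypothetical helper, not part of sbws.

```python
from statistics import mean, median

# Hypothetical data: relay -> descriptor observed bandwidths (bytes/s),
# in the order they were collected.
observed = {
    "RELAY_A": [1000, 3000],
    "RELAY_B": [500],
    "RELAY_C": [8000, 6000, 7000],
}


def summarize(observed_by_relay):
    """Summarize observed bandwidths using the last value or the per-relay mean."""
    last = [values[-1] for values in observed_by_relay.values()]
    per_relay_mean = [mean(values) for values in observed_by_relay.values()]
    # How many relays have 1, 2, 3, ... collected observed bandwidths.
    relays_by_count = {}
    for values in observed_by_relay.values():
        relays_by_count[len(values)] = relays_by_count.get(len(values), 0) + 1
    return {
        "relays": len(observed_by_relay),
        "mean_of_last": mean(last),
        "mean_of_means": mean(per_relay_mean),
        "median_of_last": median(last),
        "median_of_means": median(per_relay_mean),
        "relays_by_count": relays_by_count,
    }


summary = summarize(observed)
```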

I've also been collecting descriptors' observed bandwidth every hour (in a separate script). Would it be useful to compare only the descriptors' observed bandwidth collected in these 3 different ways?

I have a lot of new code because of all the changes, tests and graphs. I could:

1. continue with the experiments and make a PR only when we have decided this
2. keep the experiments code so that we can reproduce them in the future, and start creating PRs with it

Is option 2 ok?

For instance, if we collect descriptors' observed bandwidth, that's new code. I think it's fine if I keep the code to store descriptors' observed bandwidth only at the time of doing measurements? I can configure it so that the method to be used can be passed as a parameter.

Oh, and I think we should always use the latest {Relay,}Bandwidth{Rate,Burst}.

Do you mean descriptors' bandwidth burst [0]? We have not used it for anything yet. How should we use it? We have only used descriptors' bandwidth average [1] to cap the measurements.

[0] https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n427 [1] https://github.com/pastly/simple-bw-scanner/blob/master/sbws/lib/v3bwfile.py#L314

Replying to juga:

Oh, and I think we should always use the latest {Relay,}Bandwidth{Rate,Burst}.

Do you mean descriptors' bandwidth burst [0]? We have not used it for anything yet. How should we use it? We have only used descriptors' bandwidth average [1] to cap the measurements.

Bandwidth in consensus [2] is min(observed bandwidth, bandwidth rate limit, 10MB/s). I guess bandwidth rate limit here is bandwidth burst, right? Should the torflow or sbws scaled bandwidth be limited to the bandwidth burst? AFAIK torflow is not doing that.

[0] https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n427 [1] https://github.com/pastly/simple-bw-scanner/blob/master/sbws/lib/v3bwfile.py#L314 [2] https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2595

Replying to juga:

We need to decide which strategy to use, update the bandwidth file spec, and implement this feature in sbws.

I'm not sure how we're going to decide this. Should we try each of the methods for 1 week and graph the results and/or calculate the % differences with Torflow?

No, we have a method that is good enough, because it is close enough to torflow.

So we need to use what we know about the tor network to make sure we have a good design. Let's make some rules for the minimum viable product. Then we can merge any design that fits those rules.

Here's what I suggest:

The minimum viable product must:

Use the latest descriptor bandwidth limit, because:
- the latest descriptor contains the limit that the operator has asked for
Use at least 2 sbws measured bandwidths over at least 2 days, because:
- tor relay usage varies on a daily cycle
- each sbws measurement depends on the time of day
Use at least 2 descriptor observed bandwidths over at least 2 days, because:
- a single download by a single client can increase the observed bandwidth
- for security, we want results that don't depend on a single client's behaviour
Don't keep bandwidths for more than 1 week
- old bandwidths do not help us work out current relay capacity

What do you think? Can you implement something based on these suggestions?

If you want, I can write or review patches, or write a detailed spec.

I put some other suggestions in #27346 (moved). They are complicated. We don't need them for the MVP release.

In case it's useful, the descriptors' observed bandwidth collected at the time of doing measurements in the results, for 2 days:

Since most relays only observe bandwidth once per day, a 2 day collection is not long enough to be useful.

number of relays' descriptor observed bandwidth: 6462

mean of all relays' descriptor observed bandwidth taking the last for each relay: 5621550

mean of all relays' descriptor observed bandwidth taking the mean from the relay's results: 5609508

median of all relays' descriptor observed bandwidth taking the last for each relay: 2065215

median of all relays' descriptor observed bandwidth taking the mean from the relay's results: 2060907

number of relays for which it was collected 1 descriptor observed bandwidth: 5087 (79%)

number of relays for which it was collected 1 descriptor observed bandwidth: 1368 (21%)

number of relays for which it was collected 1 descriptor observed bandwidth: 7 (0.11%)

Do you mean 1, 2, 3 on the last 3 lines?

I've also been collecting descriptors' observed bandwidth every hour (in a separate script). Would it be useful to compare only the descriptors' observed bandwidth collected in these 3 different ways?

It might be useful, but it is not essential. Let's focus on getting a minimal viable product. Then we can make small improvements later.

I have a lot of new code because of all the changes, tests and graphs. I could:

1. continue with the experiments and make a PR only when we have decided this
2. keep the experiments code so that we can reproduce them in the future, and start creating PRs with it

Is option 2 ok?

Please create a PR that fits the minimum viable product rules above. Prefer code that is simple, fast to write, and easy to read.

For instance, if we collect descriptors' observed bandwidth, that's new code. I think it's fine if I keep the code to store descriptors' observed bandwidth only at the time of doing measurements? I can configure it so that the method to be used can be passed as a parameter.

Please implement one simple method for MVP 1.0. We don't need alternative methods.

If you want, you can make the number of measured and observed bandwidths configurable. I suggest 2 measurements over 2 days is a good default.

Replying to juga:

Replying to juga:

Oh, and I think we should always use the latest {Relay,}Bandwidth{Rate,Burst}.

Do you mean descriptors' bandwidth burst [0]? We have not used it for anything yet. How should we use it?

The bandwidth burst can be ignored.

We have only used descriptors' bandwidth average [1] to cap the measurements.

That is ok.

Bandwidth in consensus [2] is min(observed bandwidth, bandwidth rate limit, 10MB/s). I guess bandwidth rate limit here is bandwidth burst, right?

How does the consensus help us, when we are looking at relay descriptors?

If the "Measured=" bandwidth is available in the consensus, clients use it: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2601

The bandwidth in the "w" line in the consensus is only used if the network has less than 3 bandwidth authorities voting.

Should the torflow or sbws scaled bandwidth be limited to the bandwidth burst? AFAIK torflow is not doing that.

The descriptor has:

"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL

Torflow does:

bw_observed = min(bandwidth-avg, bandwidth-burst, bandwidth-observed)

https://gitweb.torproject.org/pytorctl.git/tree/TorCtl.py#n459

But that's redundant, because tor relays do:

"bandwidth" min(RelayBandwidthRate, RelayBandwidthBurst, BandwidthRate, BandwidthBurst, MaxAdvertisedBandwidth) min(RelayBandwidthBurst, BandwidthBurst) bandwidth-observed NL

See get_effective_bwrate() and get_effective_bwburst().

So you can use min(bandwidth-avg, bandwidth-burst) or just bandwidth-avg. The results will be the same.
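
A small sketch of why the two forms agree, assuming (as described above) that relays never publish a bandwidth-avg larger than their bandwidth-burst. The function names are hypothetical and do not come from Torflow or sbws, and the descriptor values are invented for illustration.

```python
def parse_bandwidth_line(line):
    """Parse a descriptor line: "bandwidth" avg burst observed."""
    keyword, avg, burst, observed = line.split()
    if keyword != "bandwidth":
        raise ValueError("not a bandwidth line: %r" % line)
    return int(avg), int(burst), int(observed)


def torflow_style(avg, burst, observed):
    # The form quoted above: min over all three descriptor values.
    return min(avg, burst, observed)


def simplified(avg, burst, observed):
    # Drops burst. Only equivalent under the assumption avg <= burst.
    return min(avg, observed)


# Invented example: a relay limited to 1000 bytes/s that observed more.
limited = parse_bandwidth_line("bandwidth 1000 2000 5000")
# Invented example: a relay whose observed bandwidth is below its limits.
unlimited = parse_bandwidth_line("bandwidth 1073741824 2147483648 5621550")
```

With avg <= burst, both functions return the same value for each example. If a descriptor ever violated that assumption, only the three-way minimum would honour the burst limit.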

[0] https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n427 [1] https://github.com/pastly/simple-bw-scanner/blob/master/sbws/lib/v3bwfile.py#L314 [2] https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2595

Replying to teor:

The minimum viable product must:

Use the latest descriptor bandwidth limit, because:

the latest descriptor contains the limit that the operator has asked for

Use at least 2 sbws measured bandwidths over at least 2 days, because:

tor relay usage varies on a daily cycle

each sbws measurement depends on the time of day

Use at least 2 descriptor observed bandwidths over at least 2 days, because:

a single download by a single client can increase the observed bandwidth

for security, we want results that don't depend on a single client's behaviour

Currently, it's possible that after 2 (or more) days we have collected fewer than 2 measurements and descriptor observed bandwidths for some relays. Prioritization might need some changes, which I think might be related to https://github.com/pastly/simple-bw-scanner/issues/136. If prioritization can't change that, it might not be possible to obtain 2 measurements in the last 2 days.

Don't keep bandwidths for more than 1 week

old bandwidths do not help us work out current relay capacity

For bandwidth files, that's the default. Raw measurements are kept for 90 days by default.

What do you think? Can you implement something based on these suggestions?

Yes, except for the comments above

If you want, I can write or review patches, or write a detailed spec.

I already have the code, except for the comments above (I need to clean up the commits a bit). Reviews and a spec would help.

Do you mean 1, 2, 3 on the last 3 lines?

Yes, sorry, distracted copy & paste...

The descriptor has: {{{ "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL }}}

Torflow does: {{{ bw_observed = min(bandwidth-avg, bandwidth-burst, bandwidth-observed) }}} https://gitweb.torproject.org/pytorctl.git/tree/TorCtl.py#n459

But that's redundant, because tor relays do: {{{ "bandwidth" min(RelayBandwidthRate, RelayBandwidthBurst, BandwidthRate, BandwidthBurst, MaxAdvertisedBandwidth) min(RelayBandwidthBurst, BandwidthBurst) bandwidth-observed NL }}} See get_effective_bwrate() and get_effective_bwburst().

I've been collecting and documenting all these possible values and the different names they could have, so I don't get confused. I just haven't put those notes online yet, but I intend to do so.

Replying to juga:

Replying to teor:

The minimum viable product must:

Use the latest descriptor bandwidth limit, because:

the latest descriptor contains the limit that the operator has asked for

Use at least 2 sbws measured bandwidths over at least 2 days, because:

tor relay usage varies on a daily cycle

each sbws measurement depends on the time of day

Use at least 2 descriptor observed bandwidths over at least 2 days, because:

a single download by a single client can increase the observed bandwidth

for security, we want results that don't depend on a single client's behaviour

Currently, it's possible that after 2 (or more) days we have collected fewer than 2 measurements and descriptor observed bandwidths for some relays. Prioritization might need some changes, which I think might be related to https://github.com/pastly/simple-bw-scanner/issues/136. If prioritization can't change that, it might not be possible to obtain 2 measurements in the last 2 days.

Ok, I think those rules are confusing.

Let's try to split them up:

If any of these things are true, do not put the relay in the bandwidth file:

- there are fewer than 2 sbws measured bandwidths
- all the sbws measured bandwidths are within 24 hours of each other
- there are fewer than 2 descriptor observed bandwidths
- all the descriptor observed bandwidths are within 24 hours of each other

We will need to make these settings configurable, so we can get test network results in less than 1 day.
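
A minimal sketch of these exclusion rules with configurable thresholds. The function and parameter names are hypothetical and are not the sbws API. One choice here is an assumption: values exactly `min_span` apart count as far enough apart.

```python
from datetime import datetime, timedelta


def has_enough_values(timestamps, min_count=2, min_span=timedelta(days=1)):
    """Return True if there are enough values, spread over enough time."""
    if len(timestamps) < min_count:
        return False
    # "Within 24 hours of each other" means the newest and the oldest
    # value are less than min_span apart.
    return max(timestamps) - min(timestamps) >= min_span


def include_relay(measured_times, observed_times, **settings):
    """Apply the same rule to sbws measurements and descriptor observations."""
    return (has_enough_values(measured_times, **settings)
            and has_enough_values(observed_times, **settings))


t0 = datetime(2018, 9, 1, 12, 0)
spread = [t0, t0 + timedelta(hours=26)]
clustered = [t0, t0 + timedelta(hours=2)]
```

On a test network, the thresholds could be lowered, for example `include_relay(clustered, clustered, min_span=timedelta(hours=1))`.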

Don't keep bandwidths for more than 1 week

old bandwidths do not help us work out current relay capacity

For bandwidth files, that's the default. Raw measurements are kept for 90 days by default.

Sorry, I meant:

Don't use sbws measured bandwidths that are older than 1 week
Don't use descriptor observed bandwidths that are older than 1 week
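
The age limit can be sketched as a filter applied before any other rule. `recent_only` and `max_age` are hypothetical names, the values are invented, and treating a value that is exactly 1 week old as still usable is an assumption.

```python
from datetime import datetime, timedelta


def recent_only(values, now, max_age=timedelta(days=7)):
    """Keep (timestamp, bandwidth) pairs that are not older than max_age."""
    cutoff = now - max_age
    return [(ts, bw) for ts, bw in values if ts >= cutoff]


now = datetime(2018, 9, 10, 0, 0)
values = [
    (now - timedelta(days=8), 100),  # older than 1 week: dropped
    (now - timedelta(days=7), 200),  # exactly 1 week old: kept
    (now - timedelta(days=1), 300),  # recent: kept
]
kept = recent_only(values, now)
```

The same filter would apply to both sbws measured bandwidths and descriptor observed bandwidths.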

If you want, I can write or review patches, or write a detailed spec.

I already have the code, except for the comments above (I need to clean up the commits a bit). Reviews and a spec would help.

Ok, when you finish a ticket, let me know, and I will do the review and spec.

The descriptor has: {{{ "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL }}}

Torflow does: {{{ bw_observed = min(bandwidth-avg, bandwidth-burst, bandwidth-observed) }}} https://gitweb.torproject.org/pytorctl.git/tree/TorCtl.py#n459

But that's redundant, because tor relays do: {{{ "bandwidth" min(RelayBandwidthRate, RelayBandwidthBurst, BandwidthRate, BandwidthBurst, MaxAdvertisedBandwidth) min(RelayBandwidthBurst, BandwidthBurst) bandwidth-observed NL }}} See get_effective_bwrate() and get_effective_bwburst().

I've been collecting and documenting all these possible values and the different names they could have, so I don't get confused. I just haven't put those notes online yet, but I intend to do so.

Thanks!

Blocked by #27398 (moved). Implemented in https://github.com/juga0/simple-bw-scanner/commits/dev

Trac:
Status: new to needs_review

Graph generated scaling results using #27108 (moved), #27337 (moved), #27336 (moved) and this ticket:

- there are fewer than 2 sbws measured bandwidths
- all the sbws measured bandwidths are within 24 hours of each other
- there are fewer than 2 descriptor observed bandwidths
- all the descriptor observed bandwidths are within 24 hours of each other

Implemented in https://github.com/torproject/sbws/pull/256

Assign child #27346 (moved) to parent #27107 (moved), since this is implemented

Trac:
Status: needs_review to closed
Resolution: N/A to implemented

Move all closed sbws 1.0 must tickets to sbws 1.0.x-final

Trac:
Keywords: N/A deleted, sbws-1.0-must-closed-moved-20181128 added
Milestone: sbws 1.0 (MVP must) to sbws: 1.0.x-final

closed

mentioned in issue #27346 (moved)

mentioned in issue #27690 (moved)

mentioned in issue #28041 (moved)

mentioned in issue #28042 (moved)

mentioned in issue #28103 (moved)
