In #27135 (moved), sbws starts keeping observed bandwidths for relays:
it takes the descriptor observed bandwidth only when the relay is measured, and calculates the mean when there are several observed bandwidth values for the same relay.
Here are some options:
use the latest measured and observed bandwidth
take the latest measured and observed bandwidth every hour, and
average the last N days of bandwidths
apply an exponentially decaying average to all bandwidths
We need to decide which strategy to use, update the bandwidth file spec, and implement this feature in sbws.
Torflow uses the latest observed bandwidth, and uses a decaying average for measurements. (I couldn't work out the exact decay factor, because it's a complex feedback loop.)
We need to decide which strategy to use, update the bandwidth file spec, and implement this feature in sbws.
I'm not sure how we're going to decide this. Should we try each of the methods for 1 week, then graph the results and/or calculate the % differences with Torflow?
In case it's useful, here are the descriptors' observed bandwidths collected at measurement time and stored in the results, for 2 days:
number of relays' descriptor observed bandwidth: 6462
mean of all relays' descriptor observed bandwidth taking the last for each relay: 5621550
mean of all relays' descriptor observed bandwidth taking the mean from the relay's results: 5609508
median of all relays' descriptor observed bandwidth taking the last for each relay: 2065215
median of all relays' descriptor observed bandwidth taking the mean from the relay's results: 2060907
number of relays for which it was collected 1 descriptor observed bandwidth: 5087 (79%)
number of relays for which it was collected 1 descriptor observed bandwidth: 1368 (21%)
number of relays for which it was collected 1 descriptor observed bandwidth: 7 (0.11%)
I've also been collecting descriptors' observed bandwidth every hour (in a separate script). Would it be useful to compare only the descriptors' observed bandwidth collected in these 3 different ways?
I have a lot of new code because of all the changes, tests and graphs. I could:
continue with the experiments and make a PR only when we have decided this
keep the experiments code so that we can reproduce them in the future, and start creating PRs with it.
Is option 2 ok?
For instance, if we collect descriptors' observed bandwidth, that's new code. I think it's fine if I keep the code that stores descriptors' observed bandwidth only at the time of doing measurements? I can make it configurable, so that the method to use can be passed as a parameter.
Oh, and I think we should always use the latest {Relay,}Bandwidth{Rate,Burst}.
Do you mean the descriptors' bandwidth burst [0]? We have not used it for anything yet. How should we use it?
We have only used descriptors' bandwidth average [1] to cap the measurements.
Bandwidth in consensus[2] is min(observed bandwidth, bandwidth rate limit, 10MB/s)
I guess bandwidth rate limit here is bandwidth burst, right?
Should the torflow or sbws scaled bandwidth be limited to the bandwidth burst? AFAIK torflow is not doing that.
We need to decide which strategy to use, update the bandwidth file spec, and implement this feature in sbws.
I'm not sure how we're going to decide this. Should we try each of the methods for 1 week, then graph the results and/or calculate the % differences with Torflow?
No, we have a method that is good enough, because it is close enough to torflow.
So we need to use what we know about the tor network to make sure we have a good design. Let's make some rules for the minimum viable product. Then we can merge any design that fits those rules.
Here's what I suggest:
The minimum viable product must:
Use the latest descriptor bandwidth limit, because:
the latest descriptor contains the limit that the operator has asked for
Use at least 2 sbws measured bandwidths over at least 2 days, because:
tor relay usage varies on a daily cycle
each sbws measurement depends on the time of day
Use at least 2 descriptor observed bandwidths over at least 2 days, because:
a single download by a single client can increase the observed bandwidth
for security, we want results that don't depend on a single client's behaviour
Don't keep bandwidths for more than 1 week
old bandwidths do not help us work out current relay capacity
What do you think?
Can you implement something based on these suggestions?
If you want, I can write or review patches, or write a detailed spec.
I put some other suggestions in #27346 (moved). They are complicated. We don't need them for the MVP release.
In case it's useful, here are the descriptors' observed bandwidths collected at measurement time and stored in the results, for 2 days:
Since most relays only observe bandwidth once per day, a 2 day collection is not long enough to be useful.
number of relays' descriptor observed bandwidth: 6462
mean of all relays' descriptor observed bandwidth taking the last for each relay: 5621550
mean of all relays' descriptor observed bandwidth taking the mean from the relay's results: 5609508
median of all relays' descriptor observed bandwidth taking the last for each relay: 2065215
median of all relays' descriptor observed bandwidth taking the mean from the relay's results: 2060907
number of relays for which it was collected 1 descriptor observed bandwidth: 5087 (79%)
number of relays for which it was collected 1 descriptor observed bandwidth: 1368 (21%)
number of relays for which it was collected 1 descriptor observed bandwidth: 7 (0.11%)
Do you mean 1, 2, 3 on the last 3 lines?
I've also been collecting descriptors' observed bandwidth every hour (in a separate script). Would it be useful to compare only the descriptors' observed bandwidth collected in these 3 different ways?
It might be useful, but it is not essential. Let's focus on getting a minimal viable product. Then we can make small improvements later.
I have a lot of new code because of all the changes, tests and graphs. I could:
continue with the experiments and make a PR only when we have decided this
keep the experiments code so that we can reproduce them in the future, and start creating PRs with it.
Is option 2 ok?
Please create a PR that fits the minimum viable product rules above. Prefer code that is simple, fast to write, and easy to read.
For instance, if we collect descriptors' observed bandwidth, that's new code. I think it's fine if I keep the code that stores descriptors' observed bandwidth only at the time of doing measurements? I can make it configurable, so that the method to use can be passed as a parameter.
Please implement one simple method for MVP 1.0.
We don't need alternative methods.
If you want, you can make the number of measured and observed bandwidths configurable. I suggest 2 measurements over 2 days is a good default.
Use the latest descriptor bandwidth limit, because:
the latest descriptor contains the limit that the operator has asked for
Use at least 2 sbws measured bandwidths over at least 2 days, because:
tor relay usage varies on a daily cycle
each sbws measurement depends on the time of day
Use at least 2 descriptor observed bandwidths over at least 2 days, because:
a single download by a single client can increase the observed bandwidth
for security, we want results that don't depend on a single client's behaviour
Currently, it's possible that after 2 (or more) days we have collected fewer than 2 measurements and descriptor observed bandwidths for some relays.
It's possible that prioritization might need some changes, which I think might be related to https://github.com/pastly/simple-bw-scanner/issues/136.
If prioritization can't change that, it might not be possible to obtain 2 measurements in the last 2 days.
Don't keep bandwidths for more than 1 week
old bandwidths do not help us work out current relay capacity
For bandwidth files, that's the default. Raw measurements are kept for 90 days by default.
What do you think?
Can you implement something based on these suggestions?
Yes, except for the points I commented on above.
If you want, I can write or review patches, or write a detailed spec.
I already have the code, except for the points I commented on above (I need to clean up the commits a bit). Reviews and a spec would help.
Do you mean 1, 2, 3 on the last 3 lines?
Yes, sorry, distracted copy & paste...
The descriptor has:
{{{
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL
}}}
But that's redundant, because tor relays do:
{{{
"bandwidth" min(RelayBandwidthRate, RelayBandwidthBurst, BandwidthRate, BandwidthBurst, MaxAdvertisedBandwidth) min(RelayBandwidthBurst, BandwidthBurst) bandwidth-observed NL
}}}
See get_effective_bwrate() and get_effective_bwburst().
I've been collecting and documenting all these possible values and the different names they can have, so I don't get confused.
I haven't put those notes online anywhere yet, but I intend to.
Use the latest descriptor bandwidth limit, because:
the latest descriptor contains the limit that the operator has asked for
Use at least 2 sbws measured bandwidths over at least 2 days, because:
tor relay usage varies on a daily cycle
each sbws measurement depends on the time of day
Use at least 2 descriptor observed bandwidths over at least 2 days, because:
a single download by a single client can increase the observed bandwidth
for security, we want results that don't depend on a single client's behaviour
Currently, it's possible that after 2 (or more) days we have collected fewer than 2 measurements and descriptor observed bandwidths for some relays.
It's possible that prioritization might need some changes, which I think might be related to https://github.com/pastly/simple-bw-scanner/issues/136.
If prioritization can't change that, it might not be possible to obtain 2 measurements in the last 2 days.
Ok, I think those rules are confusing.
Let's try to split them up:
If any of these things are true, do not put the relay in the bandwidth file:
there are less than 2 sbws measured bandwidths
all the sbws measured bandwidths are within 24 hours of each other
there are less than 2 descriptor observed bandwidths
all the descriptor observed bandwidths are within 24 hours of each other
We will need to make these settings configurable, so we can get test network results in less than 1 day.
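The exclusion rules above could be sketched roughly like this. The function names and parameters are hypothetical, not existing sbws code, and the boundary case is an assumption: here, samples exactly `min_span` apart count as far enough apart. The minimum count and span are parameters so that a test network can use much smaller values.

```python
from datetime import datetime, timedelta


def has_enough_samples(timestamps, min_count=2,
                       min_span=timedelta(hours=24)):
    """Return True when there are at least `min_count` samples and
    they are not all within `min_span` of each other.

    Assumption: a span of exactly `min_span` is accepted.
    """
    if len(timestamps) < min_count:
        return False
    # If the oldest and newest samples are closer than min_span,
    # then all the samples are within min_span of each other.
    return max(timestamps) - min(timestamps) >= min_span


def relay_is_eligible(measured_ts, observed_ts, min_count=2,
                      min_span=timedelta(hours=24)):
    """Apply the same rule to the sbws measured bandwidths and to the
    descriptor observed bandwidths. Both must pass for the relay to
    go in the bandwidth file.
    """
    return (has_enough_samples(measured_ts, min_count, min_span)
            and has_enough_samples(observed_ts, min_count, min_span))


t0 = datetime(2018, 9, 1)
day_apart = [t0, t0 + timedelta(hours=25)]
hour_apart = [t0, t0 + timedelta(hours=1)]

print(relay_is_eligible(day_apart, day_apart))
print(relay_is_eligible(hour_apart, day_apart))
# A test network could shrink the required span:
print(relay_is_eligible(hour_apart, hour_apart,
                        min_span=timedelta(minutes=30)))
```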
Don't keep bandwidths for more than 1 week
old bandwidths do not help us work out current relay capacity
For bandwidth files, that's the default. Raw measurements are kept for 90 days by default.
Sorry, I meant:
Don't use sbws measured bandwidths that are older than 1 week
Don't use descriptor observed bandwidths that are older than 1 week
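As a rough illustration of the age rule, the same filter can be applied to both kinds of bandwidth before any averaging. The function name and the `(timestamp, bandwidth)` tuple layout are assumptions for this sketch, not sbws code.

```python
from datetime import datetime, timedelta


def drop_old(samples, now, max_age=timedelta(days=7)):
    """Keep only the (timestamp, bandwidth) samples that are at most
    `max_age` old. Works for sbws measured bandwidths and for
    descriptor observed bandwidths alike.
    """
    return [(ts, bw) for ts, bw in samples if now - ts <= max_age]


now = datetime(2018, 9, 8)
samples = [
    (now - timedelta(days=8), 100),  # older than 1 week: dropped
    (now - timedelta(days=6), 200),
    (now, 300),
]
print(drop_old(samples, now))
```

The raw results can still be kept on disk for longer; this only limits which ones are used when writing the bandwidth file.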
If you want, I can write or review patches, or write a detailed spec.
I already have the code, except for the points I commented on above (I need to clean up the commits a bit). Reviews and a spec would help.
Ok, when you finish a ticket, let me know, and I will do the review and spec.
The descriptor has:
{{{
"bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL
}}}
But that's redundant, because tor relays do:
{{{
"bandwidth" min(RelayBandwidthRate, RelayBandwidthBurst, BandwidthRate, BandwidthBurst, MaxAdvertisedBandwidth) min(RelayBandwidthBurst, BandwidthBurst) bandwidth-observed NL
}}}
See get_effective_bwrate() and get_effective_bwburst().
I've been collecting and documenting all these possible values and the different names they can have, so I don't get confused.
I haven't put those notes online anywhere yet, but I intend to.
there are less than 2 sbws measured bandwidths
all the sbws measured bandwidths are within 24 hours of each other
there are less than 2 descriptor observed bandwidths
all the descriptor observed bandwidths are within 24 hours of each other