A relay that self-reports high bandwidth values will get an inflated consensus weight. I believe that TorFlow somehow uses the self-reported values when producing a measurement result for a relay. We should fix TorFlow so that it better handles self-reported values in order to prevent a relay from accidentally or maliciously getting uncharacteristically high consensus weights.
A) I wonder how Aaron (cc'ed) is doing at his OTF fellowship project on exactly this topic? Aaron?
B) A helpful workaround in the short-term might be for the bwauths to never increase their weight for a relay by more than some multiple. Then it would take a while for the weights to crank themselves up to crazy numbers, giving folks more of a chance to notice that something is weird.
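The workaround in (B) could be sketched roughly as follows. This is only an illustration, not TorFlow code: the function name `capped_weight`, the default multiple of 2, and the decision to leave relays without a previous weight uncapped are all assumptions made for the example.

```python
def capped_weight(previous, proposed, max_multiple=2.0):
    """Limit how far a relay's weight may rise in a single update.

    Hypothetical helper: `previous` is the last weight voted for the relay,
    `proposed` is the newly computed weight. Decreases are never limited.
    """
    if previous <= 0:
        # Assumption: a relay with no usable history is not capped.
        return proposed
    return min(proposed, previous * max_multiple)


# A relay whose computed weight suddenly jumps from 1000 to 50000
# only climbs step by step, leaving time to notice something is weird.
weight = 1000
history = []
for _ in range(4):
    weight = capped_weight(weight, 50000)
    history.append(weight)
```

With a multiple of 2 the weight doubles at most once per update, so the number of rounds needed to reach a crazy value grows with the logarithm of the jump.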
May be related to #16696 (moved). I noticed that the fallback to self-measure might be applied per-relay rather than globally. This is a casual past observation that I did not re-check for this post.
How much do advertised weights count for bwauths? What is the algorithm? Obviously bwauths don't (100%) trust/use the self-advertised weights, otherwise they wouldn't be called bwauths.
IIRC the bwauths can measure any relay, regardless of their own speed. We host one on a 250 Mbit/s line and it has no problems measuring the fast relays. Maybe we can come up with an algorithm that handles the self-advertised weights more intelligently.
In relation to the SBWS effort I think it makes sense to preview this by running Torflow with the behavior adjustment on Tom Ritter's test scanner. That is of course if Tom doesn't mind and perhaps likes the idea.
The change: Have aggregate.py substitute the average of self-measure bandwidths for the appropriate class in place of self-report bandwidth of individual relays when calculating each final vote. It could even make sense to bias off a constant for each class to improve stability of votes and consensus medians while retaining class voting biases, e.g. 10000 for exits, 9700 for guards, 1200 for middle-only. Or go a bit radical and apply a single constant such as 5000 while folding all relays into a single class with an edit to Node::node_class(). Perhaps try out both approaches.
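A minimal sketch of the baseline substitution described above, assuming each relay is a plain dict with hypothetical keys `cls` and `self_bw`, and that the scanner offset arrives as a ready-made ratio. None of these names come from aggregate.py; the constants are the ones floated in the comment.

```python
from collections import defaultdict

# Per-class constants suggested above, as an alternative to computed averages.
CLASS_CONSTANTS = {"Exit": 10000, "Guard": 9700, "Middle": 1200}


def class_average_baselines(relays):
    """Return the mean self-reported bandwidth for each relay class."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for relay in relays:
        sums[relay["cls"]] += relay["self_bw"]
        counts[relay["cls"]] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}


def vote_with_baseline(ratio, cls, baselines):
    """Scale the class baseline (not the relay's own report) by the offset."""
    return baselines[cls] * ratio


relays = [
    {"cls": "Exit", "self_bw": 12000},
    {"cls": "Exit", "self_bw": 8000},
    {"cls": "Middle", "self_bw": 1200},
]
baselines = class_average_baselines(relays)
```

Passing `CLASS_CONSTANTS` instead of `baselines` gives the fixed-constant variant, and a dict with a single class gives the "no classes" variant.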
Looking at it now, it seems to me Node::node_class() should return Exit/Guard/Middle and forget the rest, as separate offsets for the different roles relays can participate under are not calculated. Treating Exit+Guard and Exit (only) as independent bandwidth classes no longer makes sense. A case might still exist for Guard and Middle, since Middle-only relays comprise just over half the relay population, though with an average bandwidth around 12% of each of the Exit and Guard classes. Or just two classes, 'Exit' and 'NonExit', might work better... or one, that is, no classes.
While Torflow votes are unitless, they resemble actual bandwidths owing that they are interpretations of bandwidth measurements taken at each node. Using class-average bandwidths as baselines for calculating votes retains this property. Probably the correct method is to decay-average the values in the usual manner, to mitigate the impact of jitter and drift on the effective valuation of older votes before they are replaced. On the other hand, hammering in reasonable constants is expedient for a test and will save some trouble. While I'm on the subject of aging votes, it seems to me that measuring guard relays less frequently than non-guards is unhelpful and should be binned.
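For illustration, decay-averaging here could mean an exponentially weighted moving average over successive class averages. The function name and the smoothing factor `alpha` are assumptions for this sketch; nothing in the thread specifies them.

```python
def decay_average(values, alpha=0.25):
    """Exponentially weighted moving average, oldest value first.

    `alpha` is the weight given to each new value (an assumed default);
    older values decay by a factor of (1 - alpha) per step.
    Returns None for an empty sequence.
    """
    average = None
    for value in values:
        if average is None:
            average = value
        else:
            average = alpha * value + (1 - alpha) * average
    return average
```

A smaller `alpha` damps jitter more strongly but makes the baseline slower to follow real drift in the class average.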
Was recently reading Torflow code and wrote a script approximating Torflow calculations. Am dangerous enough now to write a patch implementing the above.
Thank you for your offer to submit a patch to torflow.
But fixing torflow is not on our roadmap.
We are already running sbws instances on the public tor network.
If we are going to put effort into comparisons, I would like to focus on comparing sbws with the existing torflow instances.
If we are going to put effort into modifying code, I would like to focus on developing sbws.
Fair enough. Appears to be a terrible idea anyway.
Cooked up the attached spreadsheet, and eliminating self-measure does not seem to work. Also tried applying a 20% linear factor to Torflow's progressive-offset vote generation method; perhaps retaining the scanner-biased self-advertised bandwidth while de-emphasizing its consensus impact has merit.
This spreadsheet might be useful for brainstorming and what-if analysis.
While Torflow votes are unitless, they resemble actual bandwidths owing that they are interpretations of bandwidth measurements taken at each node.
Careful here. I think TorFlow measures something closer to residual bandwidth capacity at the time of the measurement, not the full capacity of the link. And it doesn't even measure residual capacity exactly, because of scheduling and fairness. For example, if my relay is operating at 100% link utilization and TorFlow tries to measure it, TorFlow isn't going to get 0 bandwidth and it isn't going to get 100% bandwidth; TorFlow is probably only going to get roughly 1/N of my bandwidth where N is the number of other active flows.
Or am I misunderstanding and the authorities interpret the measurements differently?
I will enjoy having a bandwidth measurement specification, because then I won't have to ask questions like:
when you say "authorities", which part of the bandwidth measurement system are you referring to?
I think all the interpretation is within the bandwidth measurement system, or within tor clients.
Here's a summary of the process:
Torflow measures the available bandwidth at the relay, which is approximately max(current residual bandwidth, available bandwidth / number of current flows)
Torflow converts this figure into kilobytes per second and stores it
Torflow aggregates measurements and self-reported bandwidths to produce a figure that is technically unitless, but is practically kilobytes per second
The authorities read the bandwidths file and put the numbers from the file in their votes
The consensus contains the low-median bandwidth for each relay as the consensus weight
Clients use consensus weights and position weights to choose randomly weighted paths through the network
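Two of the steps above lend themselves to a small sketch: the approximate quantity a scanner observes, and the low-median taken across authority votes. The function names and example numbers are hypothetical, and the measurement model is just the max(...) approximation from the summary, not a description of real scheduling behaviour.

```python
import statistics


def approx_measured_bw(capacity, used, flows):
    """Approximate what a scanner sees at a relay.

    Follows the summary: max(residual bandwidth, capacity / current flows).
    `flows` is the number of current flows; it is clamped to at least 1.
    """
    residual = max(capacity - used, 0)
    fair_share = capacity / max(flows, 1)
    return max(residual, fair_share)


def consensus_weight(votes):
    """Low-median of the per-authority bandwidth values for one relay."""
    return statistics.median_low(votes)


# A saturated relay with 10 flows still yields about 1/10 of its capacity.
saturated = approx_measured_bw(capacity=1000, used=1000, flows=10)
# One outlier vote does not move the low-median.
weight = consensus_weight([900, 1000, 1200, 5000])
```

With an even number of votes the low-median picks the lower of the two middle values, which is why the 5000 outlier has no effect here.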
While Torflow votes are unitless, they resemble actual bandwidths owing that they are interpretations of bandwidth measurements taken at each node.
Careful here. I think TorFlow measures something closer to residual bandwidth capacity at the time of the measurement, not the full capacity of the link.
Yes, of course.
And it doesn't even measure residual capacity exactly, because of scheduling and fairness. For example, if my relay is operating at 100% link utilization and TorFlow tries to measure it, TorFlow isn't going to get 0 bandwidth and it isn't going to get 100% bandwidth; TorFlow is probably only going to get roughly 1/N of my bandwidth where N is the number of other active flows.
Or am I misunderstanding and the authorities interpret the measurements differently?
I may not have this perfectly, but it seems to me that Torflow calculates the ratio (percent offset) of the measurement for each relay relative to the average of all relay measurements (or of all relays handled by the particular scanner, not sure). This value then feeds into the "PID error", which presently is limited to just the "P", or proportional, component and so is in effect a pass-through of the scanner offset. It is then applied to the self-measure of the node under consideration, thereby mirroring the residual bandwidth offset onto the actual declared bandwidth. That's why Torflow votes somewhat resemble real bandwidth capacities. Votes could just as easily have an arbitrary basis so long as the consensus fractions work out the same, but it's nice to have semi-reasonable values to look at.
The request in this ticket, and one of the stated design goals of SBWS, is to take self-measure out of the voting process, but I am skeptical that this will turn out to be practical. Certainly, plugging averages of whole-class (exit/guard/middle) self-measurements in place of per-relay self-measure looks terrible in the spreadsheet. This can be seen by sorting on a hypothetical vote column and glancing over at the Maatuska vote and consensus weight columns. I may try revising it with averages specific to each scanner to see if it helps, but I doubt it.
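As a rough sketch of the calculation as described in this comment (and not taken from the TorFlow source), the vote is the relay's self-reported bandwidth scaled by its measurement relative to the average. The dict keys `name`, `measured` and `self_bw` are hypothetical.

```python
def proportional_votes(relays):
    """Scale each relay's self-reported bandwidth by its measurement offset.

    The offset is the relay's measurement divided by the mean measurement
    over all relays passed in (assumed here to be one scanner's set).
    """
    average = sum(r["measured"] for r in relays) / len(relays)
    votes = {}
    for relay in relays:
        ratio = relay["measured"] / average
        votes[relay["name"]] = relay["self_bw"] * ratio
    return votes


honest = [
    {"name": "a", "measured": 150, "self_bw": 2000},
    {"name": "b", "measured": 50, "self_bw": 2000},
]
# Same measurements, but relay "b" now self-reports ten times as much.
inflated = [
    {"name": "a", "measured": 150, "self_bw": 2000},
    {"name": "b", "measured": 50, "self_bw": 20000},
]
honest_votes = proportional_votes(honest)
inflated_votes = proportional_votes(inflated)
```

Under this model the vote is linear in the self-reported value, so a tenfold inflated report produces a tenfold vote even though the measurement is unchanged, which is the concern raised at the top of the ticket.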
. . . And it doesn't even measure residual capacity exactly, because of scheduling and fairness. For example, if my relay is operating at 100% link utilization and TorFlow tries to measure it, TorFlow isn't going to get 0 bandwidth and it isn't going to get 100% bandwidth; TorFlow is probably only going to get roughly 1/N of my bandwidth where N is the number of other active flows.
Excellent point! I had not considered this previously.