Opened 6 weeks ago
Last modified 11 days ago
#29330 new enhancement
Do something with advertised bandwidth distribution graphs
Reported by: | karsten | Owned by: | metrics-team |
---|---|---|---|
Priority: | Medium | Milestone: | |
Component: | Metrics/Website | Version: | |
Severity: | Normal | Keywords: | |
Cc: | metrics-team, arma | Actual Points: | |
Parent ID: | | Points: | |
Reviewer: | | Sponsor: | |
Description
We currently have two graphs on advertised bandwidth distribution on Tor Metrics: "Advertised bandwidth distribution" and "Advertised bandwidth of n-th fastest relays".
Unfortunately, the aggregation code that produces the data behind these graphs has always been somewhat painful to maintain. It was never written for the long term; rather, it was a one-off analysis that we then made available on Tor Metrics. And now it's blocking a refactoring project where we want to share code between modules (#28342). This is not a good situation to be in.
We discussed this briefly in Brussels, and I put some more thought into this today. Basically, I can see four ways to move forward from here:
- Retain: We accept that this code is hard to maintain, but we retain it. We exclude the module from the refactoring project and keep it as a legacy module. This seems like an ugly solution from a bit-rot perspective, unless we're only doing it for a limited time before removing the graphs, in which case this could work.
- Rewrite: We rewrite this module by designing a new database schema that is more flexible than the current approach. This is an awful amount of work, and we should only do it if we really think that these graphs are useful and will stay around for a long time.
- Remove: We remove the graphs, because we don't see the need for them anymore. We can do this with a few weeks of warning, and we can archive the .csv files and graphs and put them into an attic kind of thing, just like we're planning to do with the Tor Messenger graphs (#26030).
- Replace: We replace these two graphs with two that are much easier to provide, namely consensus weight distribution graphs. I'm going to attach two samples shortly. The code changes are almost trivial, and the resulting code will be much easier to maintain with regard to the refactoring project mentioned earlier.
Attachments (3)
Change History (12)
Changed 6 weeks ago by
Attachment: | cwdist-n-2019-02-04.png added |
---|
Changed 6 weeks ago by
Attachment: | cwdist-p-2019-02-04.png added |
---|
comment:1 Changed 6 weeks ago by
Status: | new → needs_review |
---|
comment:3 Changed 3 weeks ago by
Status: | needs_review → new |
---|
Switching to consensus weight is a good compromise where the alternative is removing the graphs. I don't think we need both percentiles and n-th fastest. Drop the n-th fastest and just have percentiles. Can we do 100, 99, 98, 95, 75, 50, 25, 3, 2, 1, 0? These don't need to be configurable, just fixed is OK.
Changed 3 weeks ago by
Attachment: | cwdist-irl-percentiles-2019-02-28.png added |
---|
comment:4 Changed 3 weeks ago by
Replying to irl:
> Switching to consensus weight is a good compromise where the alternative is removing the graphs.
Works for me.
> I don't think we need both percentiles and n-th fastest. Drop the n-th fastest and just have percentiles.
Works for me, too.
> Can we do 100, 99, 98, 95, 75, 50, 25, 3, 2, 1, 0? These don't need to be configurable, just fixed is OK.
This one is tricky. We're looking at a distribution that is far from normal. I made a quick graph with those percentiles:
(That graph would need some more love, like using labels on the y axis that are not in scientific notation, reordering percentiles in the legend, and using more intuitive labels for the two subplots than TRUE and NA. I didn't spend the time on that yet, but those things would get fixed.)
The only really visible percentiles are 100, 99, 98, and maybe 95. All others are hard to distinguish in the graph.
I also tried a log scale, but you can imagine how that's rather unintuitive to read. Another uncool aspect of the log scale is that the minimum consensus weight (of unmeasured relays) is 0.
I'd say, if we switch to consensus weight percentiles, let's keep percentiles configurable. Maybe one person is interested in the extremes, and another person wants to look at the center. Giving them just a single graph might make at least one of them unhappy.
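To make the percentile discussion concrete, here is a minimal sketch of how the proposed percentiles could be computed from one consensus's weights. It assumes the nearest-rank definition of a percentile, which is only one of several common definitions; the function names and the sample weights are made up for illustration and are not taken from the Tor Metrics code.

```python
# Percentiles proposed in comment:3; passing a different list would
# correspond to keeping them configurable.
PROPOSED_PERCENTILES = [100, 99, 98, 95, 75, 50, 25, 3, 2, 1, 0]


def percentile(sorted_weights, p):
    """Nearest-rank percentile of a non-empty, ascending-sorted list."""
    if not sorted_weights:
        raise ValueError("need at least one consensus weight")
    n = len(sorted_weights)
    # ceil(p * n / 100) in integer arithmetic, clamped to at least rank 1
    # so that the 0th percentile is the minimum.
    rank = max(1, (p * n + 99) // 100)
    return sorted_weights[rank - 1]


def weight_percentiles(weights, percentiles=PROPOSED_PERCENTILES):
    """Map each requested percentile to a consensus weight."""
    ordered = sorted(weights)
    return {p: percentile(ordered, p) for p in percentiles}


# Hypothetical consensus weights, including two unmeasured relays at 0.
sample_weights = [0, 0, 20, 50, 80, 120, 300, 900, 4000, 60000]
result = weight_percentiles(sample_weights)
```

With a skewed sample like this one, the top few percentiles dwarf the rest, and the lowest percentiles are 0, which has no position on a log axis.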
In fact, we could even keep the n-th fastest if that keeps folks happy. This part doesn't cost us much maintenance effort. It's the advertised bandwidth stuff that I'd really want to get rid of.
arma, what do you think?
comment:5 Changed 2 weeks ago by
I often use n-th fastest to work out the fastest relay(s) over time.
I expect to use n-th fastest a bit while developing PrivCount to answer questions like:
- what's the highest bandwidth?
- what do we get if we aggregate the top N relays?
- what's the minimum relay count and consensus weight we should require to create an aggregate total?
(we can't have a bandwidth requirement, because we don't know the bandwidth until *after* we aggregate)
comment:6 follow-up: 8 Changed 2 weeks ago by
I am ok with the 'replace' plan, where we switch from descriptor bandwidths to consensus weights.
These graphs are all about (or at least, were started for) visualizing how centralized the Tor network is. For example, they aimed to help answer the questions "how much of the Tor network are the top x relays, or the top x% of the relays?" There are many other ways we might visualize the centralization of the network over time, which for me might be at least as good as these current graphs. For example:
- "how many relays are in the top 50% of the network by bandwidth or by consensus weight?"
- "if we think of the current Tor network in terms of equally weighted relays, rather than the current wildly unbalanced weights, how many uniformly-weighted relays would it be the equivalent of?"
- "if a client builds 100 circuits, what's the expected number of relays (maybe broken out into first / second / third hop) that it will interact with?"
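As a rough sketch, the three questions above could be approximated from a list of consensus weights as follows. Several things here are assumptions rather than anything stated in this ticket: "equivalent number of uniformly-weighted relays" is interpreted as the inverse Simpson index, and the circuit question is simplified to independent weighted picks for a single position, ignoring position weights, flags, and path constraints. All function names are hypothetical, and the total weight is assumed to be greater than 0.

```python
def relays_in_top_share(weights, share=0.5):
    """Smallest number of top relays whose weights sum to `share` of the total."""
    total = sum(weights)
    running = 0
    for count, weight in enumerate(sorted(weights, reverse=True), start=1):
        running += weight
        if running >= share * total:
            return count
    return len(weights)


def effective_relay_count(weights):
    """Inverse Simpson index: 1 / sum of squared weight fractions.

    Equals the number of relays when all weights are equal and shrinks
    as the weights become more unbalanced.
    """
    total = sum(weights)
    return 1 / sum((weight / total) ** 2 for weight in weights)


def expected_distinct_relays(weights, picks=100):
    """Expected number of distinct relays seen in `picks` independent,
    weight-proportional selections for one position (a simplification)."""
    total = sum(weights)
    return sum(1 - (1 - weight / total) ** picks for weight in weights)


# Hypothetical weights for a tiny, unbalanced network.
example = [50, 25, 15, 5, 5]
top_half = relays_in_top_share(example)
effective = effective_relay_count(example)
seen = expected_distinct_relays(example, picks=100)
```

Each function returns a single number per consensus, so any of them could be plotted over time if one of these measures turns out to be useful.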
comment:7 Changed 2 weeks ago by
Replying to teor:
> I often use n-th fastest to work out the fastest relay(s) over time.
Assuming we keep the parameter for n-th fastest, you'd still learn the n-th fastest relay by consensus weight, just not by advertised bandwidth. Depending on how you define how fast a relay is, this would still be possible with the new graph.
> I expect to use n-th fastest a bit while developing PrivCount to answer questions like:
> - what's the highest bandwidth?
After the suggested change you'd have to look at descriptors yourself for this. However, this particular question is relatively easy: just grep for `bandwidth` lines and compute the maximum advertised bandwidth.
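A minimal sketch of that grep-and-maximum step, assuming the server descriptor `bandwidth` line carries three integers (average, burst, observed, in bytes per second) and that advertised bandwidth is taken as the minimum of the three. The function names and the sample lines are made up for illustration.

```python
def advertised_bandwidths(lines):
    """Yield one advertised bandwidth per well-formed `bandwidth` line."""
    for line in lines:
        parts = line.split()
        if len(parts) != 4 or parts[0] != "bandwidth":
            continue
        try:
            values = [int(value) for value in parts[1:]]
        except ValueError:
            continue  # skip lines with non-numeric fields
        # Assumption: advertised bandwidth = min(average, burst, observed).
        yield min(values)


def max_advertised_bandwidth(lines):
    """Return the highest advertised bandwidth, or None if none was found."""
    return max(advertised_bandwidths(lines), default=None)


# Hypothetical excerpt from a file of concatenated server descriptors.
sample_lines = [
    "router example1 192.0.2.1 9001 0 0",
    "bandwidth 1073741824 1073741824 5242880",
    "router example2 192.0.2.2 9001 0 0",
    "bandwidth 2048000 4096000 3000000",
]
highest = max_advertised_bandwidth(sample_lines)
```

To run this over real data, the lines of a descriptor file could be passed in directly, for example by iterating over an open file object.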
> - what do we get if we aggregate the top N relays?
This question would require some more code. However, it's also not immediately answered by the Tor Metrics graphs, except maybe for N < 4.
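For what it's worth, the extra code for the top-N question could be quite small. The following is only a sketch over a plain list of per-relay values (consensus weights or advertised bandwidths); the function names and numbers are hypothetical.

```python
def nth_fastest(values, n):
    """Return the n-th largest value (1-based), or None if out of range."""
    ordered = sorted(values, reverse=True)
    if 1 <= n <= len(ordered):
        return ordered[n - 1]
    return None


def top_n_total(values, n):
    """Sum of the n largest values."""
    return sum(sorted(values, reverse=True)[:n])


def top_n_share(values, n):
    """Fraction of the overall total held by the n largest values."""
    total = sum(values)
    return top_n_total(values, n) / total if total else 0.0


# Hypothetical per-relay values.
relay_values = [300, 60000, 900, 4000]
second = nth_fastest(relay_values, 2)
top_three = top_n_total(relay_values, 3)
share = top_n_share(relay_values, 3)
```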
> - what's the minimum relay count and consensus weight we should require to create an aggregate total?
> (we can't have a bandwidth requirement, because we don't know the bandwidth until *after* we aggregate)
This is another question that would probably require you to write code yourself.
In conclusion, the suggested graph would answer your questions just as well as the current graphs, right?
comment:8 Changed 2 weeks ago by
Replying to arma:
> I am ok with the 'replace' plan, where we switch from descriptor bandwidths to consensus weights.
Good to know!
> These graphs are all about (or at least, were started for) visualizing how centralized the Tor network is. For example, they aimed to help answer the questions "how much of the Tor network are the top x relays, or the top x% of the relays?" There are many other ways we might visualize the centralization of the network over time, which for me might be at least as good as these current graphs. For example:
> - "how many relays are in the top 50% of the network by bandwidth or by consensus weight?"
> - "if we think of the current Tor network in terms of equally weighted relays, rather than the current wildly unbalanced weights, how many uniformly-weighted relays would it be the equivalent of?"
> - "if a client builds 100 circuits, what's the expected number of relays (maybe broken out into first / second / third hop) that it will interact with?"
I'll start a new ticket for doing a one-off analysis to answer these questions. Then we can decide whether we want to add any of these graphs to the website.
Until then I'll move forward with this switch. I guess I'll first add the new graph and declare the existing two graphs as deprecated, and two weeks later I'll remove those two graphs.
Thanks!
comment:9 Changed 11 days ago by
Reviewer: | irl |
---|
Removing myself as reviewer for now. I'll probably be the reviewer when it comes back around but there is no reason a hypothetical third member of the metrics team couldn't also be a reviewer.
Here are two examples with advertised bandwidth distribution (at the top of each graph) compared to consensus weight distribution (at the bottom of each graph). The first example is for the n-th fastest relays, the second for percentiles.
What should we do? Retain, rewrite, remove, or replace the two advertised bandwidth distribution graphs?