I'm attaching a first scatterplot with advertised bandwidth on the x axis and consensus weight on the y axis, along with the underlying CSV file.
Here are the top 20 relays by consensus weight to advertised bandwidth ratio:
||=fingerprint=||=consensus weight=||=advertised bandwidth=||
||9B5AE3EDDD47689C22FAA7A06A9DA8B87728F2A1||5620||0||
||FF070FEC0C1AA523F248F9827EE3FAA9B32E5B17||4840||0||
||C92EAF09EB7B2B6D63DB776F7A7C025075765D29||3590||0||
||7562FD1E39F2CDDD1410A1E0BF28ABB2D7814E4A||2740||0||
||805B665CC4650D066B5B4816357FFBAC78489642||362||0||
||47F17268F0A46B34F06409B0CFA117456251E3AE||263||0||
||358202B828A85B1FB0786C398BDB639A90D3BF92||5||0||
||5514445F204D9E57E30F407C0C077E592E7B297E||5||0||
||7E7869829BF20078BE00BCEA3B42971F5BD8D585||3||0||
||A5F1656F4D2374BB9F85E6B6F143D2C9F22D6A7F||3||0||
||260FD3C4382D2AF1A56A5ADE6714EFBE0CA6F83E||6930||16009||
||5374181A93662E4EEF75AFDCDEE0E58ADB2FD85D||423||18093||
||1070DA28A81BF2C112DDF172649CC4C5AB06CD56||11000||524288||
||11AA99B76B465333441E3000F477995C70499B92||1610||76800||
||44E7D73B23A5D53C09D6FEFB8BC38A5DCD711E5E||8390||609280||
||05FFA39D71DA116F7669EA4EE53A0BAEA315BA7F||24600||2097152||
||972DCF27BBE901A0640CCA82F400C9DF8E190123||11200||983741||
||624B6BC3EF75448FCBB57DAFD1819A129A24565E||155||15945||
||AF14515B5B9B48F24AFD2BF760662003E8E1C6CA||17300||2326278||
||4483097867F6444533EAAA2D38B5479BE1F36412||7350||1024000||
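For anyone who wants to reproduce a ranking like this from the CSV, here is a minimal sketch. The column names `fingerprint`, `consensus_weight` and `advertised_bandwidth` are assumptions about the attached file, and treating an advertised bandwidth of 0 as an infinite ratio (with ties broken by consensus weight) is only one plausible reading of the ordering above, not a confirmed method.

```python
import csv
import io
import math


def ratio(consensus_weight, advertised_bandwidth):
    """Consensus weight divided by advertised bandwidth; infinite if bandwidth is 0."""
    if advertised_bandwidth == 0:
        return math.inf
    return consensus_weight / advertised_bandwidth


def top_by_ratio(rows, n=20):
    """Return the n rows with the highest ratio, ties broken by consensus weight."""
    parsed = [
        (r["fingerprint"], int(r["consensus_weight"]), int(r["advertised_bandwidth"]))
        for r in rows
    ]
    parsed.sort(key=lambda r: (ratio(r[1], r[2]), r[1]), reverse=True)
    return parsed[:n]


# Three rows taken from the table above, in the assumed CSV layout.
sample = io.StringIO(
    "fingerprint,consensus_weight,advertised_bandwidth\n"
    "4483097867F6444533EAAA2D38B5479BE1F36412,7350,1024000\n"
    "9B5AE3EDDD47689C22FAA7A06A9DA8B87728F2A1,5620,0\n"
    "260FD3C4382D2AF1A56A5ADE6714EFBE0CA6F83E,6930,16009\n"
)
top = top_by_ratio(csv.DictReader(sample))
for fingerprint, weight, bandwidth in top:
    print(fingerprint, weight, bandwidth)
```

With the real file, `csv.DictReader(open(path, newline=""))` would take the place of the in-memory sample.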
Would we expect something entirely different? There are a few outliers, but that seems plausible given the delay between a relay advertising its bandwidth and the majority of bandwidth authorities measuring it.
#21394 turned out to be a problem with DNS, which is now in the process of being fixed. So we don't think it indicates a problem with bandwidth allocation. That's not to dissuade anyone from making this measurement as well!
Okay, glad to hear the problem is (in the process of being) fixed!
So, if you don't need this analysis anymore, I don't think we'll want to pursue it any further. The initial analysis above did not provide any new insights, so I guess whoever wants to debug bandwidth authorities had better start over.
I'll close this ticket. If somebody else wants to re-open and continue working on it, please feel free to do that. But please also take it to some component outside of the Metrics/* components which is where we keep tickets that are in some way actionable to metrics team members. Thanks!
Trac: Resolution: N/A to wontfix Status: new to closed
Hey Karsten, thanks for the graph. It's good to see that most relays have a reasonable consensus weight to bandwidth ratio.
But for a log plot, that spread seems a lot: in some cases, it looks like relays with similar advertised bandwidths get 10x the consensus weight. And we still think that bandwidth allocation isn't balanced geographically. From what people tell us, US East and EU West coasts get a large allocation, and it falls off from there.
Can you show the important section of this graph (the high bandwidths and the high consensus weights) using a linear plot?
Is there a way of showing the consensus weight to advertised bandwidth ratios on a map? Maybe as geolocated points?
> Hey Karsten, thanks for the graph. It's good to see that most relays have a reasonable consensus weight to bandwidth ratio.
> But for a log plot, that spread seems a lot: in some cases, it looks like relays with similar advertised bandwidths get 10x the consensus weight. And we still think that bandwidth allocation isn't balanced geographically. From what people tell us, US East and EU West coasts get a large allocation, and it falls off from there.
> Can you show the important section of this graph (the high bandwidths and the high consensus weights) using a linear plot?
I'm adding a linear plot below:
Here's how I made it, using the attached CSV file:
> Is there a way of showing the consensus weight to advertised bandwidth ratios on a map? Maybe as geolocated points?
Not easily, and I'm afraid I'm juggling too many other things to dive into such a new visualization now. But maybe this is something where others can step in. Basically, all relevant data is available via Onionoo, including longitude and latitude of relays.
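As a possible starting point for anyone stepping in, here is a rough sketch of turning an Onionoo details document into map points. The query URL, its `fields` parameter and the field names (`consensus_weight`, `advertised_bandwidth`, `latitude`, `longitude`) are assumptions based on the Onionoo protocol and should be checked against its documentation; the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumed Onionoo query; verify against the protocol documentation before relying on it.
DETAILS_URL = (
    "https://onionoo.torproject.org/details?type=relay&running=true"
    "&fields=fingerprint,consensus_weight,advertised_bandwidth,latitude,longitude"
)


def fetch_details(url=DETAILS_URL):
    """Download and parse an Onionoo details document (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def ratio_points(details):
    """Yield (longitude, latitude, ratio) for relays that can be placed on a map."""
    for relay in details.get("relays", []):
        lat, lon = relay.get("latitude"), relay.get("longitude")
        weight = relay.get("consensus_weight")
        bandwidth = relay.get("advertised_bandwidth")
        if lat is None or lon is None or weight is None or not bandwidth:
            continue  # missing location or weight, or zero/missing advertised bandwidth
        yield lon, lat, weight / bandwidth


# Small hand-written document in the assumed shape, so no network access is needed.
example = {"relays": [
    {"fingerprint": "A" * 40, "consensus_weight": 100,
     "advertised_bandwidth": 1000, "latitude": 52.5, "longitude": 13.4},
    {"fingerprint": "B" * 40, "consensus_weight": 50,
     "advertised_bandwidth": 0, "latitude": 40.7, "longitude": -74.0},
    {"fingerprint": "C" * 40, "consensus_weight": 20, "advertised_bandwidth": 400},
]}
points = list(ratio_points(example))
print(points)
```

Calling `ratio_points(fetch_details())` would run the same extraction against live data; relays with an advertised bandwidth of 0 are dropped here, so they would need separate handling if they should appear on the map.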
I'm moving this ticket to Metrics/Analysis where we collect analyses like this one without having concrete plans to integrate results into metrics code.
Trac: Component: Metrics/Statistics to Metrics/Analysis
Currently the map only shows aggregations, but I guess it could also show individual relays as points. I will ask Ana about this, as she wrote the actual map plotting code, to see if that is something she would like to work on. If not, I could have a go at this next year.
Ana is working on this along with a non-aggregated version of the current map to integrate into Relay Search, so I'll reassign this ticket to that component.
Trac: Component: Metrics/Analysis to Metrics/Atlas
Merged. I'm going to add some explanations of what the map is showing before deploying.
I'll keep the ticket open as we should also attempt to map individual relays. For the map of individual relays it may be necessary to add some jitter to the positions as many relays have the same position. If we can do this deterministically, that would be even better (perhaps based on fingerprint).
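One way the deterministic jitter could work is to derive the offset from a hash of the fingerprint. This is only a sketch: the use of SHA-256, the function name `jitter` and the default offset of 0.5 degrees are all arbitrary choices for illustration, not anything the map code is known to do.

```python
import hashlib


def jitter(fingerprint, latitude, longitude, max_offset=0.5):
    """Shift a position by up to max_offset degrees, derived from the fingerprint."""
    digest = hashlib.sha256(fingerprint.upper().encode("ascii")).digest()
    # Map two separate 4-byte chunks of the digest to the range [-1, 1].
    dx = int.from_bytes(digest[0:4], "big") / 0xFFFFFFFF * 2 - 1
    dy = int.from_bytes(digest[4:8], "big") / 0xFFFFFFFF * 2 - 1
    return latitude + dy * max_offset, longitude + dx * max_offset


# Two relays reported at the same position end up at different, but stable, points.
a = jitter("9B5AE3EDDD47689C22FAA7A06A9DA8B87728F2A1", 52.5, 13.4)
b = jitter("FF070FEC0C1AA523F248F9827EE3FAA9B32E5B17", 52.5, 13.4)
print(a, b)
```

Because the offset depends only on the fingerprint, a relay stays at the same spot every time the map is redrawn.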
Thanks for this, Ana!
It will really help us work out if our bandwidth authority changes are having the desired effect.
At one pixel per relay, 6000 relays completely fill a 100x60 image.
So maybe we need some way of reducing the number of relays?
(Maybe we should only do individual relays by flag?)
Another trick is to use semi-transparent symbols for relays.
Then there is a visual effect for multiple relays, particularly if we add jitter.
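To get a feel for the transparency idea: with ordinary "over" alpha compositing, n symbols of opacity a drawn on top of each other have a combined opacity of 1 - (1 - a)^n. A small sketch of that arithmetic follows; the opacity of 0.1 is picked arbitrarily for illustration.

```python
def stacked_opacity(alpha, n):
    """Combined opacity of n overlapping symbols, each drawn with opacity alpha."""
    return 1 - (1 - alpha) ** n


# How quickly a spot darkens as more relays share it.
for n in (1, 5, 10, 30):
    print(n, round(stacked_opacity(0.1, n), 2))
```

At an opacity of 0.1, ten overlapping relays come out at roughly 0.65, so dense locations stay distinguishable from single relays without saturating right away.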
Thank you for the ideas! I'm having a play now to see what technique works best.