We should write a script to parse past consensuses and display the ratio values computed for different speeds and types of nodes. (The ratio is the "w Bandwidth=" line divided by the descriptor's observed bandwidth value.) This would provide us with output similar to the statsplitter.py script (https://ides.fscked.org/transient/stats.log).
It would be useful to plot this data over time.
We can also export more raw stats from the scanners themselves, including measured stream data and time pairs, and circuit failure rates.
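A minimal sketch of what such a script could look like in Python. The file paths and the nickname-based matching are placeholders (a real script should match on identity digests), and the unit comments are assumptions worth double-checking against dir-spec:

```python
# Minimal sketch (not the final script): compute the ratio of consensus
# bandwidth weight to descriptor-observed bandwidth for each relay.
import re

def consensus_weights(path):
    """Map relay nickname -> 'w Bandwidth=' value from a consensus file.
    (A real script should key on the identity digest, not the nickname.)"""
    weights, nick = {}, None
    with open(path) as f:
        for line in f:
            if line.startswith("r "):
                nick = line.split()[1]
            elif line.startswith("w ") and nick:
                m = re.search(r"Bandwidth=(\d+)", line)
                if m:
                    weights[nick] = int(m.group(1))  # kB/s per dir-spec
    return weights

def observed_bandwidth(path):
    """Read the 'bandwidth <average> <burst> <observed>' line of a
    server descriptor; values are in bytes per second."""
    with open(path) as f:
        for line in f:
            if line.startswith("bandwidth "):
                return int(line.split()[3])
    return None

# ratio = weight_kBps / (observed_Bps / 1000.0), assuming 1 kB = 1000 B
```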
> We should write a script to parse past consensuses and display the ratio values computed for different speeds and types of nodes. (The ratio is the "w Bandwidth=" line divided by the descriptor's observed bandwidth value.) This would provide us with output similar to the statsplitter.py script (https://ides.fscked.org/transient/stats.log).
> It would be useful to plot this data over time.
We already have metrics-db which parses past consensuses and server descriptors and puts the described information into a database. We also have metrics-web which uses R to query the database and generate CSV files or graphs. If we want these data on a regular basis and without having to care about yet another script, let's extend metrics-db and metrics-web.
> We can also export more raw stats from the scanners themselves, including measured stream data and time pairs, and circuit failure rates.
We can extend the metrics-db database schema to hold these measurements, too.
Maybe we should discuss the required changes on IRC and add a summary here?
Right now I am thinking it would be illustrative to graph ratio history for Guard, Middle, Exit, and Guard+Exit for the whole network, as well as for the top 10% and the bottom 10%. We can do that with the data in the consensus documents+descriptors already.
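A rough sketch of how that bucketing could work, assuming we already have (flags, observed, weight) triples from parsing (the tuple layout is made up for illustration; both bandwidth values should use the same unit before taking ratios):

```python
# Sketch: bucket relays into Guard / Middle / Exit / Guard+Exit and compare
# mean ratios for the bottom and top 10% by observed bandwidth. `relays` is
# an assumed list of (flags, observed_bw, consensus_weight) tuples.

def category(flags):
    guard, exit_flag = "Guard" in flags, "Exit" in flags
    if guard and exit_flag:
        return "Guard+Exit"
    if guard:
        return "Guard"
    if exit_flag:
        return "Exit"
    return "Middle"

def decile_ratios(relays):
    by_cat = {}
    for flags, observed, weight in relays:
        if observed > 0:
            by_cat.setdefault(category(flags), []).append((observed, weight))
    for cat, vals in sorted(by_cat.items()):
        vals.sort()  # ascending by observed bandwidth
        n = max(1, len(vals) // 10)
        bottom = [w / o for o, w in vals[:n]]
        top = [w / o for o, w in vals[-n:]]
        yield cat, sum(bottom) / len(bottom), sum(top) / len(top)
```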
Adding new stats from the individual bw authorities is a longer term task. We should possibly make a different ticket for that one.
I'm not exactly sure what history graphs you have in mind. Also, maybe we should look closer at a single consensus before aggregating everything over time. Please have a look at the attached PDF which is based on the consensus published at 2011-01-24 00:00:00.
Page 1 shows a scatterplot of measured bandwidth as reported in consensuses and observed bandwidth as reported in descriptors. Note that observed really means observed, not advertised = min(observed, average). The graph shows that we prefer fast relays by giving them weights of more than 5 times their observed bandwidth (dashed line). There's no apparent preference for fast relays with or without the Guard/Exit flags.
Page 2 zooms in on the lower-left part of page 1. We can see that we're discriminating against slow Exit and Middle relays, maybe even against all slow relays.
Page 3 shows cumulative distribution functions of the ratios. The point where a line crosses ratio 1 is where relays turn from being rated down to being rated up. For example, we rate only 1/3 of Guard and Guard+Exit nodes down, but more than 3/4 of the Exit and Middle nodes. Also, about 5% of Middle and 10% of Exit relays have a ratio of exactly 1.
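For anyone reproducing page 3, a sketch of the ECDF computation; the `ratios` dict mapping category name to a list of ratio values is an assumed input:

```python
# Sketch: empirical CDFs of the ratios per category, plus the fraction of
# relays "rated down" (ratio < 1).

import numpy as np
import matplotlib.pyplot as plt

def plot_ecdfs(ratios):
    for cat, values in sorted(ratios.items()):
        xs = np.sort(values)
        ys = np.arange(1, len(xs) + 1) / len(xs)
        plt.step(xs, ys, where="post", label=cat)
        print("%s: %.0f%% rated down" % (cat, 100 * np.mean(xs < 1)))
    plt.axvline(1.0, linestyle="--")  # ratio 1 = neither up- nor down-rated
    plt.xscale("log")
    plt.xlabel("measured / observed bandwidth ratio")
    plt.ylabel("cumulative fraction of relays")
    plt.legend()
    plt.show()
```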
Did you already open a ticket for new stats from bw authorities? If so, we can reassign this ticket to component Metrics.
I'm not sure if this ticket should be a child of #2532 (moved). The original task description had two ideas in it: 1) compare self-reported to measured bandwidth and 2) export more raw data from the bandwidth scanners. We already have the data for 1) and only need to come up with useful graphs (see comments above). But in order to implement 2), we'll have to find a good data format, start collecting the data, analyze them to see what we can learn from them, and finally come up with useful graphs. I'd like to keep 1) and 2) separate tasks.
I'm changing this ticket to component Metrics and removing the parent ticket relation.
The next step for this ticket would be to discuss the attached graph or what graphs in general we want for comparing self-reported to measured bandwidth.
Trac: Component: Torflow to Metrics; Owner: mikeperry to karsten; Parent: #2532 (moved) to N/A
Karsten, while you're working on the system to generate this data for the consensus, can you also try to make your code easily tunable to produce it for the votes too? Seeing that data in comparison with the consensus would also be very informative.
Karsten: Re your PDF, I think you're right. Let's just snapshot each vote and consensus individually for now, but archive about a week's worth.
I think that graph 3 is most useful for the consensus itself, but can we also produce a version of the CDF that weights itself by Fraction of Consensus Bandwidth?
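Something like this sketch is what I have in mind: each relay contributes its share of total consensus bandwidth instead of 1/n (the input pair list is hypothetical):

```python
# Sketch: a CDF weighted by fraction of consensus bandwidth. `pairs` is an
# assumed list of (ratio, consensus_weight) tuples for one relay category.

import numpy as np

def weighted_ecdf(pairs):
    pairs = sorted(pairs)  # ascending by ratio
    xs = np.array([r for r, _ in pairs])
    ws = np.array([w for _, w in pairs], dtype=float)
    ys = np.cumsum(ws) / ws.sum()  # cumulative bandwidth fraction
    return xs, ys  # plot with plt.step(xs, ys, where="post")
```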
For the votes, it may also be useful to include those labeled scatterplots from graphs 1 and 3, so we can spot-check how the different bw authorities are voting for each relay. (Unless you can think of a better way to visualize this?)
Oh, we also want a separate category on each graph for relays that run the default exit policy. Maybe call them SuperExits, so we would also have Guard+SuperExits.
> observation 1 is that there are basically no unmeasured guards
<mikeperry> is that spike at 1 due to unmeasured exits?
> yes, i think so.
> so more than half of the guards are measured at better than their advertised
> 60% or so
> whereas more than half the exits and middles are measured at less
> i wonder if the guards that are measured as better are that way because they recently became guards, so they don't have as much load yet
I focused on the data part first and left out the visualization part for the moment.
In the attachment you'll find a CSV file with relays contained in consensuses on January 24, 2011 (the same day that I also used in the graphs above). This CSV file contains 6 bandwidth values for each relay: observed bandwidth from the descriptor, bandwidth weight from the consensus, and the bandwidth weights as measured by the 4 bandwidth authorities. Also, there are 6 categories for each relay for the combinations of (Guard, !Guard) x (Exit with default exit policy, Exit with custom exit policy, !Exit).
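As a starting point for processing it, a sketch like the following could compute vote and consensus ratios per relay. The column names below are hypothetical placeholders, since the actual header isn't reproduced in this comment:

```python
# Sketch for reading the CSV; column names are hypothetical placeholders.

import csv

def relay_ratios(path):
    with open(path) as f:
        for row in csv.DictReader(f):
            observed = float(row["observed"])  # descriptor value
            if observed <= 0:
                continue
            consensus_ratio = float(row["consensus"]) / observed
            vote_ratios = [float(row[k]) / observed
                           for k in ("auth1", "auth2", "auth3", "auth4")
                           if row.get(k)]  # not every auth measures everything
            yield row["category"], consensus_ratio, vote_ratios
```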
The code to process descriptor tarballs and generate such a CSV file is in the metrics-tasks repository here.
I'm going to look into visualizations next week. If you have any suggestions, please let me know.
There are three new graphs attached to this ticket:
The first graph shows empirical cumulative distribution functions of ratios of measured to self-reported bandwidth for six categories of relays in the consensus from January 24, 2011 at 00:00:00 UTC.
The second graph compares ratios in votes to the ratios in the consensus for the six categories. The black line is the consensus, the gray lines are the four votes.
The third graph uses measured bandwidth weights on the y axis rather than absolute relay numbers. This graph is somewhat misleading, because it over-represents, well, relays with high measured bandwidth (surprise!). I wonder if there's a better weight to use here. The second graph is probably more useful than this one.
Can we have a refresh of graphs 2 and 3? We're just about to upgrade the bw scanners. I am particularly interested in the votes from urras. If those could be drawn in a distinct color, that would help us notice anything strange from the new grouping process (#3444 (moved)) and file sizes (#1975 (moved)).
See the updated code. There are still a few manual steps and you still need to download and extract server descriptor tarballs, but at least you don't have to download 2G vote tarballs anymore.
Is it possible to provide a way to get all descriptors with a certain range of dates for "published"? That might make automating the graphing process possible for a given date range. The monthly server descriptors are still a little painful to handle.
We should then showcase these URLs and scripts on the data.html page for other people who want to study bw auths.
> Is it possible to provide a way to get all descriptors with a certain range of dates for "published"? That might make automating the graphing process possible for a given date range. The monthly server descriptors are still a little painful to handle.
Hmm, I don't like the idea of giving out custom selections of server descriptors covering a few days. But I could imagine providing a URL to download all server descriptors referenced from a given consensus. That URL could be https://metrics.torproject.org/serverdesc?valid-after=2011-07-13-05-00-00 (DOES NOT WORK YET). In this case these are exactly the descriptors you want to have. I'll put it on my TODO list.
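Client-side, usage could then be as simple as this sketch (again, the URL scheme above doesn't work yet, so this is purely illustrative):

```python
# Purely illustrative client for the proposed URL above (which, as noted,
# DOES NOT WORK YET): fetch all server descriptors referenced from the
# consensus with the given valid-after time.

import urllib.request

def fetch_descriptors(valid_after="2011-07-13-05-00-00"):
    url = ("https://metrics.torproject.org/serverdesc"
           "?valid-after=" + valid_after)
    with urllib.request.urlopen(url) as resp:
        return resp.read()
```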
> We should then showcase these URLs and scripts on the data.html page for other people who want to study bw auths.
data.html is probably not the place where people would go to find out they want to study bw auths. Is there a bwauth page on the Tor homepage where we could put the URLs and scripts?