Opened 9 years ago

Closed 2 years ago

#2394 closed enhancement (wontfix)

Visualize self-reported vs. measured bandwidth of relays

Reported by: mikeperry
Owned by:
Priority: Medium
Milestone:
Component: Metrics/Analysis
Version:
Severity:
Keywords: performance loadbalancing
Cc: tomb, karsten
Actual Points:
Parent ID: #2769
Points:
Reviewer:
Sponsor:

Description

We should write a script to parse past consensuses and display the ratio values computed for different speeds and types of nodes. (The ratio is the "w Bandwidth=" line divided by the descriptor observed bandwidth value). This would provide us with similar output to the statsplitter.py script (https://ides.fscked.org/transient/stats.log).
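A minimal sketch of the ratio described above, for a single relay. It assumes the consensus line has the form "w Bandwidth=<n>" and the descriptor line has the form "bandwidth <average> <burst> <observed>". The `consensus_unit` parameter is an assumption, not something stated in this ticket: it scales the consensus value (taken here to be kilobytes per second) to the bytes per second assumed for the descriptor. All function names are hypothetical.

```python
def parse_consensus_bandwidth(w_line):
    """Return the integer after 'Bandwidth=' in a 'w' line, or None if absent."""
    for token in w_line.split()[1:]:
        key, _, value = token.partition("=")
        if key == "Bandwidth":
            return int(value)
    return None


def parse_observed_bandwidth(bandwidth_line):
    """Return the observed value from a descriptor 'bandwidth' line.

    Assumed layout: bandwidth <average> <burst> <observed>
    """
    parts = bandwidth_line.split()
    return int(parts[3])


def bandwidth_ratio(w_line, bandwidth_line, consensus_unit=1000):
    """Consensus bandwidth divided by observed bandwidth, or None if undefined.

    consensus_unit is an assumption: it converts the consensus value into
    the same unit as the descriptor value before dividing.
    """
    consensus_bw = parse_consensus_bandwidth(w_line)
    observed = parse_observed_bandwidth(bandwidth_line)
    if consensus_bw is None or observed == 0:
        return None
    return consensus_bw * consensus_unit / observed
```

A ratio above 1 means the relay is weighted higher than its own observation suggests; a ratio below 1 means it is weighted lower.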

It would be useful to plot this data over time.

We can also export more raw stats from the scanners themselves, including measured stream data and time pairs, and circuit failure rates.

Child Tickets

Attachments (7)

ticket2394-graphs.pdf (201.0 KB) - added by karsten 9 years ago.
bandwidth-comparison.csv.bz2 (1.1 MB) - added by karsten 9 years ago.
Bandwidth values for relays on January 24, 2011
bandwidth-comparison-relays-2011-04-20.png (125.2 KB) - added by karsten 8 years ago.
Ratio between measured and self-reported relay bandwidth
bandwidth-comparison-relays-votes-2011-04-20.png (138.3 KB) - added by karsten 8 years ago.
Measured vs. self-reported bandwidth ratios in consensus and votes
bandwidth-comparison-measured-votes-2011-04-20.png (146.3 KB) - added by karsten 8 years ago.
Weighted ECDFs for measured vs. self-reported bandwidth ratios in consensus and votes
bandwidth-comparison-measured-votes-2011-06-29.png (117.9 KB) - added by karsten 8 years ago.
bandwidth-comparison-relays-votes-2011-06-29.png (117.8 KB) - added by karsten 8 years ago.


Change History (40)

comment:1 in reply to:  description Changed 9 years ago by karsten

Replying to mikeperry:

We should write a script to parse past consensuses and display the ratio values computed for different speeds and types of nodes. (The ratio is the "w Bandwidth=" line divided by the descriptor observed bandwidth value). This would provide us with similar output to the statsplitter.py script (https://ides.fscked.org/transient/stats.log).

It would be useful to plot this data over time.

We already have metrics-db which parses past consensuses and server descriptors and puts the described information into a database. We also have metrics-web which uses R to query the database and generate CSV files or graphs. If we want these data on a regular basis and without having to care about yet another script, let's extend metrics-db and metrics-web.

We can also export more raw stats from the scanners themselves, including measured stream data and time pairs, and circuit failure rates.

We can extend the metrics-db database schema to hold these measurements, too.

Maybe we should discuss the required changes on IRC and add a summary here?

comment:2 Changed 9 years ago by mikeperry

Yeah, ok.

Right now I am thinking it would be illustrative to graph ratio history for Guard, Middle, Exit, and Guard+Exit for the whole network, as well as for the top 10% and the bottom 10%. We can do that with the data in the consensus documents+descriptors already.
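One way to summarize the ratios for the whole network and for the top and bottom 10% is sketched below. The comment does not say how relays are ranked, so ranking by observed bandwidth is an assumption here, as is using the median as the summary value. The function name and the input layout are hypothetical.

```python
from statistics import median


def decile_ratio_summary(relays):
    """Summarize ratios for all relays and for the slowest/fastest 10%.

    relays: list of (observed_bandwidth, ratio) tuples for one relay group,
    for example all Guard relays in one consensus.
    """
    if not relays:
        raise ValueError("need at least one relay")
    # Rank by observed bandwidth (an assumption, see the text above).
    ordered = sorted(relays)
    # Always keep at least one relay in each tail.
    k = max(1, len(ordered) // 10)
    return {
        "bottom_10": median(ratio for _, ratio in ordered[:k]),
        "top_10": median(ratio for _, ratio in ordered[-k:]),
        "all": median(ratio for _, ratio in ordered),
    }
```

Calling this once per consensus and per flag group (Guard, Middle, Exit, Guard+Exit) would yield the time series to plot.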

Adding new stats from the individual bw authorities is a longer term task. We should possibly make a different ticket for that one.

comment:3 Changed 9 years ago by mikeperry

Cc: tomb karsten added

comment:4 Changed 9 years ago by karsten

I'm not exactly sure what history graphs you have in mind. Also, maybe we should look closer at a single consensus before aggregating everything over time. Please have a look at the attached PDF which is based on the consensus published at 2011-01-24 00:00:00.

  • Page 1 shows a scatterplot of measured bandwidth as reported in consensuses and observed bandwidth as reported in descriptors. Note that observed really means observed, not advertised = min(observed, average). The graph shows that we prefer fast relays by giving them ratios of more than 5 times their observed bandwidth (dashed line). There's no apparent preference for fast relays with/without Guard/Exit flag.
  • Page 2 zooms in on the lower left part of page 1. We can see that we're discriminating against slow Exit and Middle relays, maybe even against all slow relays.
  • Page 3 shows cumulative distribution functions of the ratios. The point where lines cross 1 is where ratios turn from relays being rated down to relays being rated up. For example, we rate only 1/3 of Guard and Guard+Exit nodes down, but more than 3/4 of the Exit and Middle nodes. Also, about 5 % of Middle and 10 % of Exit relays have a ratio of exactly 1.
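The reading of page 3 can be reproduced from a plain list of ratios. The following is a minimal sketch, not the R code behind the PDF; the function names are hypothetical.

```python
def ecdf(ratios):
    """Return sorted (ratio, cumulative fraction of relays) points."""
    ordered = sorted(ratios)
    n = len(ordered)
    return [(ratio, (i + 1) / n) for i, ratio in enumerate(ordered)]


def fraction_rated_down(ratios):
    """Fraction of relays whose ratio is strictly below 1."""
    return sum(1 for ratio in ratios if ratio < 1) / len(ratios)


def fraction_unchanged(ratios):
    """Fraction of relays whose ratio is exactly 1."""
    return sum(1 for ratio in ratios if ratio == 1) / len(ratios)
```

The value of `fraction_rated_down` for a group is the height at which that group's line crosses a ratio of 1, and `fraction_unchanged` corresponds to the vertical jump at exactly 1.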

Did you already open a ticket for new stats from bw authorities? If so, we can reassign this ticket to component Metrics.

Changed 9 years ago by karsten

Attachment: ticket2394-graphs.pdf added

comment:5 in reply to:  4 Changed 9 years ago by arma

Parent ID: #2532

Replying to karsten:

Did you already open a ticket for new stats from bw authorities? If so, we can reassign this ticket to component Metrics.

I made a parent at #2532.

comment:6 Changed 9 years ago by arma

Summary: changed from "Export Bw Authorty stats for metrics/researchers" to "Export Bw Authority stats for metrics/researchers"

comment:7 Changed 9 years ago by karsten

Component: changed from Torflow to Metrics
Owner: changed from mikeperry to karsten
Parent ID: #2532 deleted

I'm not sure if this ticket should be a child of #2532. The original task description had two ideas in it: 1) compare self-reported to measured bandwidth and 2) export more raw data from the bandwidth scanners. We already have the data for 1) and only need to come up with useful graphs (see comments above). But in order to implement 2), we'll have to find a good data format, start collecting the data, analyze them to see what we can learn from them, and finally come up with useful graphs. I'd like to keep 1) and 2) separate tasks.

I'm changing this ticket to component Metrics and removing the parent ticket relation.

The next step for this ticket would be to discuss the attached graph or what graphs in general we want for comparing self-reported to measured bandwidth.

comment:8 Changed 9 years ago by karsten

Summary: changed from "Export Bw Authority stats for metrics/researchers" to "Visualize self-reported vs. measured bandwidth of relays"

(Also changing the ticket summary according to my last comment.)

comment:9 Changed 9 years ago by mikeperry

Karsten, while you're working on the system to generate this data for the consensus, can you also try to make your code easily tunable to produce it for the votes too? Seeing that data in comparison with the consensus would also be very informative.

comment:10 Changed 9 years ago by mikeperry

Karsten: Re your PDF, I think you're right. Let's just snapshot each vote and consensus individually for now, but archive like a week's worth.

I think that graph 3 is most useful for the consensus itself, but can we also produce a version of the CDF that weights itself by Fraction of Consensus Bandwidth?
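A weighted version of the CDF could be sketched as follows. Each relay contributes its share of the total weight instead of counting as one relay. Using the consensus bandwidth value as the weight is one reading of "Fraction of Consensus Bandwidth" and is an assumption; the function name is hypothetical.

```python
def weighted_ecdf(ratios, weights):
    """Return sorted (ratio, cumulative fraction of total weight) points.

    ratios and weights are parallel lists with one entry per relay.
    """
    if len(ratios) != len(weights):
        raise ValueError("ratios and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    points = []
    cumulative = 0.0
    # Walk relays from lowest to highest ratio, accumulating their weight.
    for ratio, weight in sorted(zip(ratios, weights)):
        cumulative += weight
        points.append((ratio, cumulative / total))
    return points
```

With equal weights this reduces to the unweighted CDF, which makes it easy to check against the existing graph.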

For the votes, it may also be useful to include those labeled scatterplots from graph 1 and 3, so we can spot-check how the different bw authorities are voting for each relay. (Unless you can think of a better way to visualize this?)

comment:11 Changed 9 years ago by mikeperry

Oh, we also want a separate category for people who run the default exit policy on each graph. Maybe calling them SuperExits, so we would also have Guard+SuperExits.

comment:12 Changed 9 years ago by arma

> observation 1 is that there are basically no unmeasured guards
<mikeperry> is that spike at 1 due to unmeasured exits?
> yes, i think so.
> so more than half of the guards are measured at better than their advertised
> 60% or so
> whereas more than half the exits and middles are measured at less
> i wonder if the guards that are measured as better are that way because they recently became guards so they don't have as much load yet

comment:13 Changed 9 years ago by mikeperry

Keywords: #2769 added

comment:14 Changed 9 years ago by mikeperry

Keywords: #2769 removed
Parent ID: #2769

comment:15 Changed 9 years ago by karsten

Status: changed from new to assigned

I focused on the data part first and left out the visualization part for the moment.

In the attachment you'll find a CSV file with relays contained in consensuses on January 24, 2011 (the same day that I also used in the graphs above). This CSV file contains 6 bandwidth values for each relay: observed bandwidth from the descriptor, bandwidth weight from the consensus, and the bandwidth weights as measured from the 4 bandwidth authorities. Also, there are 6 categories for each relay for the combinations of (Guard, !Guard) x (Exit with default exit policy, Exit with custom exit policy, !Exit).
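The six categories can be expressed as a small classification function. This is a sketch and not the code in the metrics-tasks repository. How a default exit policy is detected is not described in this ticket, so it is passed in as a boolean; the function name and the category labels are hypothetical.

```python
def relay_category(flags, has_default_exit_policy):
    """Map a relay to one of the six (Guard, !Guard) x (Exit kinds) categories.

    flags: collection of flag names from the relay's consensus entry.
    has_default_exit_policy: True if the relay runs the default exit policy
    (how this is determined is left to the caller).
    """
    guard_part = "Guard" if "Guard" in flags else "no Guard"
    if "Exit" not in flags:
        exit_part = "no Exit"
    elif has_default_exit_policy:
        exit_part = "Exit (default policy)"
    else:
        exit_part = "Exit (custom policy)"
    return guard_part + " / " + exit_part
```

For example, a relay with both flags and the default policy falls into "Guard / Exit (default policy)", the group called Guard+SuperExits in comment 11.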

The code to process descriptor tarballs and generate such a CSV file is in the metrics-tasks repository here.

I'm going to look into visualizations next week. If you have any suggestions, please let me know.

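Changed 9 years ago by karsten

Attachment: bandwidth-comparison.csv.bz2 added

Bandwidth values for relays on January 24, 2011

Changed 8 years ago by karsten

Attachment: bandwidth-comparison-relays-2011-04-20.png added

Ratio between measured and self-reported relay bandwidth

Changed 8 years ago by karsten

Attachment: bandwidth-comparison-relays-votes-2011-04-20.png added

Measured vs. self-reported bandwidth ratios in consensus and votes

Changed 8 years ago by karsten

Attachment: bandwidth-comparison-measured-votes-2011-04-20.png added

Weighted ECDFs for measured vs. self-reported bandwidth ratios in consensus and votes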

comment:16 Changed 8 years ago by karsten

There are three new graphs attached to this ticket:

  • The first graph shows empirical cumulative distribution functions of ratios of measured to self-reported bandwidth for six categories of relays in the consensus from January 24, 2011 at 00:00:00 UTC.
  • The second graph compares ratios in votes to the ratios in the consensus for the six categories. The black line is the consensus, the gray lines are the four votes.
  • The third graph uses measured bandwidth weights on the y axis rather than absolute relay numbers. This graph is somewhat misleading, because it over-represents, well, relays with high measured bandwidth (surprise!). I wonder if there's a better weight to use here. The second graph is probably more useful than this one.

comment:17 Changed 8 years ago by karsten

Component: changed from Metrics to Analysis

comment:18 Changed 8 years ago by karsten

Owner: karsten deleted

Mike, I think the ball is in your court now. Please re-assign to me if this ticket needs more work. If not, please close it.

comment:19 Changed 8 years ago by mikeperry

Can we have a refresh of graphs 2 and 3? We're just about to upgrade the bw scanners. I am particularly interested in the votes from urras. If those could be colored in a new color, that would help to notice anything strange from the new grouping process (#3444) and file sizes (#1975).

comment:20 in reply to:  16 Changed 8 years ago by mikeperry

Replying to karsten:

There are three new graphs attached to this ticket:

  • The second graph compares ratios in votes to the ratios in the consensus for the six categories. The black line is the consensus, the gray lines are the four votes.
  • The third graph uses measured bandwidth weights on the y axis rather than absolute relay numbers. This graph is somewhat misleading, because it over-represents, well, relays with high measured bandwidth (surprise!). I wonder if there's a better weight to use here. The second graph is probably more useful than this one.

By "Refresh of graphs 2 and 3" I mean these two.

comment:21 Changed 8 years ago by mikeperry

If this is made easier by not coloring urras specially, that is fine also. I just want to see if on the whole everything is producing sane numbers.

comment:22 Changed 8 years ago by karsten

See the two new attached graphs. They are based on the consensus from June 29, 2011 at 15:00:00 UTC. urras' line is the purple one.

comment:23 in reply to:  18 ; Changed 8 years ago by mikeperry

Replying to karsten:

Mike, I think the ball is in your court now. Please re-assign to me if this ticket needs more work. If not, please close it.

Can we find a condensed way of publishing just enough data to easily remake these graphs without a full 200GB vote history download?

comment:24 in reply to:  23 ; Changed 8 years ago by karsten

Replying to mikeperry:

Can we find a condensed way of publishing just enough data to easily remake these graphs without a full 200GB vote history download?

For the given analysis you need a single consensus and corresponding votes plus all referenced server descriptors. You can already download a single consensus, e.g., https://metrics.torproject.org/consensus?valid-after=2011-07-13-05-00-00. How about I create a similar URL for all votes published for that consensus, e.g., https://metrics.torproject.org/votes?valid-after=2011-07-13-05-00-00 (DOES NOT WORK YET)? Then you'd only have to download the descriptor tarball, but not the 2G vote tarball.

comment:25 in reply to:  24 ; Changed 8 years ago by karsten

You can now download all votes from a given valid-after time using the following URL: https://metrics.torproject.org/votes?valid-after=2011-07-13-05-00-00

comment:26 in reply to:  25 Changed 8 years ago by mikeperry

Replying to karsten:

You can now download all votes from a given valid-after time using the following URL: https://metrics.torproject.org/votes?valid-after=2011-07-13-05-00-00

Sweet. Can you script the graph creation and update the README so it is easy for me to generate these graphs given a particular valid-after time?

comment:27 Changed 8 years ago by karsten

See the updated code. There are still a few manual steps and you still need to download and extract server descriptor tarballs, but at least you don't have to download 2G vote tarballs anymore.

comment:28 in reply to:  27 ; Changed 8 years ago by mikeperry

Replying to karsten:

See the updated code. There are still a few manual steps and you still need to download and extract server descriptor tarballs, but at least you don't have to download 2G vote tarballs anymore.

Is it possible to provide a way to get all descriptors with a certain range of dates for "published"? That might make automating the graphing process possible for a given date range. The monthly server descriptors are still a little painful to handle.

We should then showcase these urls and scripts on the data.html page for other people who want to study bw auths.

comment:29 in reply to:  28 ; Changed 8 years ago by karsten

Replying to mikeperry:

Is it possible to provide a way to get all descriptors with a certain range of dates for "published"? That might make automating the graphing process possible for a given date range. The monthly server descriptors are still a little painful to handle.

Hmm, I don't like the idea of giving out custom selections of server descriptors covering a few days. But I could imagine providing a URL to download all server descriptors referenced from a given consensus. That URL could be https://metrics.torproject.org/serverdesc?valid-after=2011-07-13-05-00-00 (DOES NOT WORK YET). In this case these are exactly the descriptors you want to have. I'll put it on my TODO list.

We should then showcase these urls and scripts on the data.html page for other people who want to study bw auths.

data.html is probably not the place where people would go to find out they want to study bw auths. Is there a bwauth page on the Tor homepage where we could put the URLs and scripts?

comment:30 in reply to:  29 Changed 8 years ago by karsten

There, https://metrics.torproject.org/serverdesc?valid-after=2011-07-13-05-00-00 works now. See also the updated README for instructions to use this URL instead of downloading and extracting server descriptor tarballs.
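For scripting, the three download URLs mentioned in this ticket can be built from a valid-after time. The sketch below only constructs the URL strings in the form quoted above and does not fetch anything. Whether the server still answers these URLs is not something this ticket confirms, and the function name is hypothetical.

```python
from datetime import datetime

BASE_URL = "https://metrics.torproject.org"


def descriptor_urls(valid_after):
    """Return the consensus, votes and serverdesc URLs for one valid-after time.

    valid_after: datetime of the consensus, formatted as YYYY-MM-DD-hh-mm-ss
    to match the URLs quoted in this ticket.
    """
    stamp = valid_after.strftime("%Y-%m-%d-%H-%M-%S")
    return {
        name: f"{BASE_URL}/{name}?valid-after={stamp}"
        for name in ("consensus", "votes", "serverdesc")
    }


if __name__ == "__main__":
    for name, url in descriptor_urls(datetime(2011, 7, 13, 5, 0, 0)).items():
        print(name, url)
```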

comment:31 Changed 8 years ago by arma

Keywords: performance added

comment:32 Changed 8 years ago by arma

Keywords: loadbalancing added

comment:33 Changed 2 years ago by karsten

Resolution: wontfix
Status: changed from assigned to closed

Closing tickets in Metrics/Analysis that have been created 5+ years ago and not seen progress recently, except for the ones that "nickm-cares" about.
