Opened 5 weeks ago

Last modified 33 hours ago

#29772 needs_review enhancement

Plot nearly worst-case bandwidth when downloading from [public|onion] server

Reported by: karsten
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Website
Version:
Severity: Normal
Keywords: scalability
Cc: metrics-team
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:


We have been asked to add graphs on (nearly) worst-case performance of our OnionPerf measurements, in addition to the average-case performance graphs we already have. In particular, we were asked to plot latency and bandwidth numbers. This ticket is about bandwidth numbers. It's based on team-internal discussions in Brussels and follow-up discussions.

With OnionPerf we measure download times for 50 KiB/1 MiB/5 MiB files that we download from our own public web server or onion server. We could use our DATAPERC* timestamps to extract how long it takes to download a specific part of our files and use that to compute average bandwidth.

We'd like to exclude the start of the transfer, with all the circuit establishment and TCP slow start, and focus only on the part where things have stabilized. More precisely, we could look at the 5 MiB downloads and consider only the time between finishing 2.5 MiB and finishing the full 5 MiB. Or we could look at the time between reaching 0.5 MiB and reaching 1 MiB, which we have data for from both our 1 MiB and 5 MiB downloads.
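To make the idea concrete, here is a minimal sketch of computing bandwidth over only the stabilized part of a download. It assumes the DATAPERC* fields are timestamps in seconds taken at 10% steps (so DATAPERC50 and DATAPERC100 would bracket the second half of a 5 MiB download); the field names, the dictionary layout, and the function name `partial_bandwidth` are assumptions for illustration, not the actual Tor Metrics code.

```python
MIB = 1024 * 1024


def partial_bandwidth(filesize_bytes, start_frac, end_frac, start_ts, end_ts):
    """Average bandwidth in bytes/second between two completion fractions.

    start_frac/end_frac are fractions of the file (e.g. 0.5 and 1.0),
    start_ts/end_ts are the timestamps (seconds) when those fractions
    were reached. Returns None if the timestamps are not usable.
    """
    elapsed = end_ts - start_ts
    if elapsed <= 0:
        return None
    return filesize_bytes * (end_frac - start_frac) / elapsed


# Hypothetical measurement: a 5 MiB download where the second half
# (2.5 MiB) took 2.5 seconds. Field names are assumed.
measurement = {
    "FILESIZE": 5 * MIB,
    "DATAPERC50": 1000.0,
    "DATAPERC100": 1002.5,
}

bandwidth = partial_bandwidth(
    measurement["FILESIZE"],
    0.5,
    1.0,
    measurement["DATAPERC50"],
    measurement["DATAPERC100"],
)
print(bandwidth / MIB, "MiB/s")
```

Under the same 10%-step assumption, the 0.5 MiB to 1 MiB window would correspond to fractions 0.5 and 1.0 of a 1 MiB download, or 0.1 and 0.2 of a 5 MiB download.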

The ask was to plot nearly worst-case bandwidth. So, my guess is that we shouldn't plot the minimum, because we'd only be looking at outliers, but instead the 1st or 5th or 10th percentile. Let's maybe start with the 1st percentile.
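As a rough sketch of what "1st percentile per day" could look like, the following groups bandwidth samples by day and computes a low percentile for each. The linear-interpolation percentile used here is one common definition and only an assumption; the database query actually used for the graphs may compute it differently. The names `percentile` and `daily_percentile` and the `(day, bandwidth)` input layout are hypothetical.

```python
from collections import defaultdict


def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between ranks."""
    xs = sorted(values)
    if not xs:
        raise ValueError("no values")
    rank = (len(xs) - 1) * p / 100
    lo = int(rank)  # rank is non-negative, so int() floors it
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)


def daily_percentile(samples, p):
    """Map each day to the p-th percentile of that day's bandwidths.

    samples is an iterable of (day, bandwidth) pairs.
    """
    by_day = defaultdict(list)
    for day, bandwidth in samples:
        by_day[day].append(bandwidth)
    return {day: percentile(vals, p) for day, vals in sorted(by_day.items())}


# Hypothetical data: 101 measurements on a single day.
samples = [("2019-04-01", v) for v in range(1, 102)]
result = daily_percentile(samples, 1)
print(result)
```

Switching to the 5th or 10th percentile is only a matter of changing the `p` argument, which makes it easy to compare the candidates side by side.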

I'm attaching two graphs, one for the public server case and one for the onion server case. They both show the respective 1st percentile bandwidth of successful 1 MiB and 5 MiB downloads on a given day.

The coding and deployment effort for bringing this graph to the Tor Metrics website would be comparatively small, because we already have all required data in the database. However, I'm not attaching a patch yet, because I'd first want to discuss the general idea of having such a graph.

Attachments (3)

onionperf-bandwidth-public.png (172.1 KB) - added by karsten 5 weeks ago.
onionperf-bandwidth-onion.png (181.7 KB) - added by karsten 5 weeks ago.
onionperf-boxplots-annotated-2019-04-17.pdf (770.9 KB) - added by karsten 33 hours ago.

Change History (8)

Changed 5 weeks ago by karsten

Changed 5 weeks ago by karsten

comment:1 Changed 5 weeks ago by karsten

comment:2 Changed 2 weeks ago by karsten

Status: new → needs_review

comment:3 Changed 8 days ago by gaba

Keywords: scalability added

comment:4 Changed 3 days ago by irl

Status: needs_review → needs_revision

I'm not sure that the 1st percentile is the right way to do this. Can we instead exclude minor/major outliers, that is, measurements slower than the 1st quartile minus 1.5 times (minor) or 3 times (major) the interquartile range, and then take the minimum? How does this change the way the plot looks?

I don't think we should make a public graph on Tor Metrics for it, but can you also do a box plot for a month of measurements so I can understand just how variable the results are? I don't think I've done that before.
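For reference, the outlier-exclusion approach suggested above could be sketched roughly as follows. It uses `statistics.quantiles` from the Python standard library with the "inclusive" method, which is just one of several quartile definitions and an assumption here; the function name `robust_minimum` and the sample values are hypothetical.

```python
import statistics


def robust_minimum(bandwidths, k=1.5):
    """Minimum bandwidth after dropping low outliers.

    Values below Q1 - k * IQR are excluded. k=1.5 drops minor and
    major outliers, k=3 drops only major outliers. Needs at least
    two data points on older Python versions.
    """
    q1, _, q3 = statistics.quantiles(bandwidths, n=4, method="inclusive")
    lower_fence = q1 - k * (q3 - q1)
    kept = [b for b in bandwidths if b >= lower_fence]
    return min(kept)


# Hypothetical bandwidths for one day (arbitrary units).
day = [1, 30, 52, 54, 56, 58, 60, 62, 64]

# Q1 = 52, Q3 = 60, IQR = 8.
min_without_minor = robust_minimum(day, k=1.5)  # fence at 40
min_without_major = robust_minimum(day, k=3)    # fence at 28
print(min_without_minor, min_without_major)
```

Compared with a fixed percentile, the fence adapts to how spread out each day's measurements are, so the number of excluded points varies from day to day.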

Changed 33 hours ago by karsten

comment:5 Changed 33 hours ago by karsten

Status: needs_revision → needs_review

Great idea! I made some new graphs and annotated them by hand. Please find them attached. And please take another look. Thanks!