Opened 2 years ago

Last modified 13 months ago

#21994 new enhancement

Consensus Health: what is the distribution of a bandwidth authority's measurements?

Reported by: teor Owned by: tom
Priority: Very Low Milestone:
Component: Metrics/Consensus Health Version:
Severity: Normal Keywords:
Cc: nusenu, karsten, metrics-team, juga Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by teor)

Once we know how many relays a bandwidth authority controls (#21992), we might want to know how much it can change their figures, or how their measurements are distributed.

I'm not sure if we would use this, but I am writing it down so others can decide if they want it, and which option they want.

Quartile Option

This option doesn't tell us the exact range for each relay, but it's easy to calculate and understand.

For each bandwidth authority, and for the measured bandwidth medians:

  • the quartiles of its bandwidth measurements

This could look like:
https://consensus-health.torproject.org/#downloadstats
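
A minimal sketch of the quartile option, assuming the votes have already been parsed into a hypothetical mapping of bwauth nickname to {relay: measured bandwidth}. The function names, the data layout, and the sample numbers are illustrative and are not part of consensus-health. Using the low median for the per-relay medians is also an assumption.

```python
from collections import defaultdict
from statistics import median_low, quantiles


def quartile_summary(values):
    """Return (Q1, median, Q3) for a list of numbers, or None if too few."""
    if len(values) < 2:
        return None
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (q1, q2, q3)


def bwauth_quartiles(votes):
    """Quartiles of each bwauth's own measurements."""
    return {
        bwauth: quartile_summary(list(measurements.values()))
        for bwauth, measurements in votes.items()
    }


def median_quartiles(votes):
    """Quartiles of the per-relay medians across all bwauths."""
    per_relay = defaultdict(list)
    for measurements in votes.values():
        for relay, bw in measurements.items():
            per_relay[relay].append(bw)
    # Assumption: the low median is used when a relay has an even
    # number of measurements.
    medians = [median_low(bws) for bws in per_relay.values()]
    return quartile_summary(medians)


# Hypothetical sample data, not real measurements.
votes = {
    "longclaw": {"r1": 100, "r2": 200, "r3": 300, "r4": 400, "r5": 500},
    "gabelmoo": {"r1": 300, "r2": 200, "r3": 100, "r4": 600, "r5": 500},
}

for bwauth, summary in bwauth_quartiles(votes).items():
    print(bwauth, summary)
print("medians", median_quartiles(votes))
```

The "inclusive" method treats the measurements as the whole population rather than a sample, which seems to fit a vote that lists every measured relay.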

Range Option

This option is more specific, but might be harder to understand and use.

For each bandwidth authority:

  • the median of the measurements it controls (the relays for which it is the median)
  • the median of the next highest measurement where it is the next highest measurement
  • the median of the next lowest measurement where it is the next lowest measurement

We should probably summarise each of these with a median rather than a mean, because we really don't care about extremes.
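
One possible reading of the range option, sketched under assumptions: for every relay, sort the bwauth measurements, treat the low median as the controlling measurement, and credit the neighbouring measurements to the bwauths that made them. The data layout (bwauth to {relay: measured bandwidth}), the function name, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def range_option(votes):
    """Return {bwauth: {"lower": x, "median": y, "higher": z}}.

    Each value is the median of that bwauth's measurements in the cases
    where it was the next lowest, the median, or the next highest
    measurement for a relay. None means the case never occurred.
    """
    per_relay = defaultdict(list)
    for bwauth, measurements in votes.items():
        for relay, bw in measurements.items():
            per_relay[relay].append((bw, bwauth))

    buckets = defaultdict(lambda: {"lower": [], "median": [], "higher": []})
    for entries in per_relay.values():
        entries.sort()
        # Assumption: the low median controls the relay's figure.
        mid = (len(entries) - 1) // 2
        bw, bwauth = entries[mid]
        buckets[bwauth]["median"].append(bw)
        if mid > 0:
            bw, bwauth = entries[mid - 1]
            buckets[bwauth]["lower"].append(bw)
        if mid + 1 < len(entries):
            bw, bwauth = entries[mid + 1]
            buckets[bwauth]["higher"].append(bw)

    return {
        bwauth: {name: (median(vals) if vals else None)
                 for name, vals in cases.items()}
        for bwauth, cases in buckets.items()
    }


# Hypothetical sample data, not real measurements.
votes = {
    "A": {"r1": 10, "r2": 50, "r3": 5},
    "B": {"r1": 20, "r2": 40, "r3": 7},
    "C": {"r1": 30, "r2": 60, "r3": 6},
}
summary = range_option(votes)
for bwauth in sorted(summary):
    print(bwauth, summary[bwauth])
```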

Maybe this could look something like:

Bandwidth Authority Variance

authority  lower  median  higher
longclaw   27887  58578   89090
gabelmoo   34585  69344   84323

Or maybe it would be better to express them as percentages of the median.
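
As a rough illustration of the percentage form, the sketch below rescales the example figures from the table so that each median is 100%. The function name is hypothetical.

```python
def as_percent_of_median(rows):
    """Map {bwauth: (lower, median, higher)} to {bwauth: (lower %, higher %)}."""
    return {
        bwauth: (100.0 * lower / mid, 100.0 * higher / mid)
        for bwauth, (lower, mid, higher) in rows.items()
    }


# Example figures copied from the table in this description.
rows = {
    "longclaw": (27887, 58578, 89090),
    "gabelmoo": (34585, 69344, 84323),
}

percentages = as_percent_of_median(rows)
for bwauth, (low_pct, high_pct) in percentages.items():
    print(f"{bwauth:<10} {low_pct:6.1f}%  100.0%  {high_pct:6.1f}%")
```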


Change History (10)

comment:1 Changed 2 years ago by teor

Description: modified (diff)

I think the range option is very confusing, and we probably just want quartiles.

comment:2 Changed 2 years ago by teor

Summary: changed from "Consensus Health: what is the distrubution of a bandwidth authority's measurements?" to "Consensus Health: what is the distribution of a bandwidth authority's measurements?"

comment:3 Changed 2 years ago by teor

Description: modified (diff)

It might also be helpful to have quartiles for the overall measurements/medians.

comment:4 Changed 2 years ago by tom

Cc: nusenu added

So I'm not sure exactly what this is asking for, or how to implement it. But as far as bwauth debugging information goes, what I have wanted/envisioned are the following:

  1. A graph on Atlas, per-relay, that shows each bwauth's votes for that relay over time
  2. Something (maybe a graph) on Atlas, per-relay, that shows which scanner the bwauth made the measurement on
  3. A series of graphs that shows below/median/above bwauth buckets for relays where each graph only applies to one country's relays.
  4. A graph or series of graphs that shows bwauth variance on relays per-country.

(1) is for relay operators to understand why their bandwidth usage may have changed without requiring them to do some ugly consensus/vote grepping. But it requires changes to OnionOO and Atlas.

(2) is also for relay operators, but also for bwauth operators, to confirm that relays sometimes slip between scanner cracks. To perform the analysis at all, bwauths need to apply this patch: https://gitweb.torproject.org/torflow.git/commit/?id=7e4ef735858acf5d2fbb183b6f8418b7fc2b364a To get it into Atlas, we need the data in OnionOO; to get the data into OnionOO, we need it in Collector (#21378); and to get it into Collector, we need it exposed by the bwauths (#21377).

(3) Should confirm (or reject) the hypothesis that some bwauths make more high or low measurements because their geographic location hurts or helps them measure a disproportionate amount of the network.

(4) Should confirm (or reject) the hypothesis that maybe, just maybe, our bwauths tend to agree with similarly-located bwauths for similarly-located relays.

I'm not sure if any of those is what you said though.
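
A minimal sketch of the bucket counting behind (3), assuming parsed votes in a hypothetical {bwauth: {relay: measured bandwidth}} mapping and a hypothetical relay-to-country lookup. It only produces the counts; plotting one graph per country is left out. Using the low median is an assumption.

```python
from collections import Counter, defaultdict
from statistics import median_low


def bucket_counts(votes, country_of):
    """Count below/median/above measurements per (country, bwauth)."""
    per_relay = defaultdict(dict)
    for bwauth, measurements in votes.items():
        for relay, bw in measurements.items():
            per_relay[relay][bwauth] = bw

    counts = defaultdict(Counter)
    for relay, by_auth in per_relay.items():
        country = country_of.get(relay, "??")  # "??" marks an unknown country
        # Assumption: the low median is the relay's controlling value.
        mid = median_low(by_auth.values())
        for bwauth, bw in by_auth.items():
            if bw < mid:
                bucket = "below"
            elif bw > mid:
                bucket = "above"
            else:
                bucket = "median"
            counts[(country, bwauth)][bucket] += 1
    return counts


# Hypothetical sample data, not real measurements.
votes = {
    "A": {"r1": 10, "r2": 50, "r3": 5},
    "B": {"r1": 20, "r2": 40, "r3": 7},
    "C": {"r1": 30, "r2": 60, "r3": 6},
}
country_of = {"r1": "de", "r2": "de", "r3": "us"}

counts = bucket_counts(votes, country_of)
for key in sorted(counts):
    print(key, dict(counts[key]))
```

A bwauth whose "above" or "below" count dominates in only some countries would be weak evidence for the location hypothesis.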

comment:5 in reply to:  4 ; Changed 2 years ago by teor

Replying to tom:

So I'm not sure exactly what this is asking for, or how to implement it. But as far as bwauth debugging information goes, what I have wanted/envisioned are the following:

  1. A graph on Atlas, per-relay, that shows each bwauth's votes for that relay over time
  2. Something (maybe a graph) on Atlas, per-relay, that shows which scanner the bwauth made the measurement on
  3. A series of graphs that shows below/median/above bwauth buckets for relays where each graph only applies to one country's relays.
  4. A graph or series of graphs that shows bwauth variance on relays per-country.

(1) is for relay operators to understand why their bandwidth usage may have changed without requiring them to do some ugly consensus/vote grepping. But it requires changes to OnionOO and Atlas.

(2) is also for relay operators, but also for bwauth operators, to confirm that relays sometimes slip between scanner cracks. To perform the analysis at all, bwauths need to apply this patch: https://gitweb.torproject.org/torflow.git/commit/?id=7e4ef735858acf5d2fbb183b6f8418b7fc2b364a To get it into Atlas, we need the data in OnionOO; to get the data into OnionOO, we need it in Collector (#21378); and to get it into Collector, we need it exposed by the bwauths (#21377).

(3) Should confirm (or reject) the hypothesis that some bwauths make more high or low measurements because their geographic location hurts or helps them measure a disproportionate amount of the network.

(4) Should confirm (or reject) the hypothesis that maybe, just maybe, our bwauths tend to agree with similarly-located bwauths for similarly-located relays.

These seem like great ideas.
Which do you think we should do first?

I'm not sure if any of those is what you said though.

I think your suggestions are better than mine.

comment:6 in reply to:  5 Changed 2 years ago by tom

Cc: karsten added

Replying to teor:

These seem like great ideas.
Which do you think we should do first?

Definitely 3 and 4. The first two have a ton of dependencies.

I think adding these graphs will tip the scales of what's feasible in the current design of the consensus-health graphs page though; and we need to rethink how it is organized, and make it interactive. Maybe it should not live on consensus-health at all, and instead on Metrics...

comment:7 Changed 2 years ago by teor

I would like some consensus weight and advertised bandwidth vs number of relays graphs, per authority:

The next question was: what estimates were actually assigned to those
bandwidth spikes? Maybe all zeroes? This led me to more charts:
https://s8.hostingkartinok.com/uploads/images/2017/06/8cefb70fce667a1b89c783ed2bfc9442.png
https://s8.hostingkartinok.com/uploads/images/2017/06/2e42634ea3f9b71df8a7fd17c27660d9.png
x here is "Advertised Bandwidth", y is "Consensus Weight".
x is KiB/s, y is count
(yellow bars are for "Advertised Bandwidth", blue - for
"Consensus Weight", grey mean both values)

https://lists.torproject.org/pipermail/tor-relays/2017-June/012466.html
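
A rough sketch of how such a count-per-bin comparison could be computed for one authority, assuming the advertised bandwidths and consensus weights are already available as plain lists in comparable units. The function names, bin width, and sample values are hypothetical.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per fixed-width bin, keyed by the bin's lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)


def side_by_side(advertised, weights, bin_width):
    """Return rows of (bin start, advertised count, consensus weight count)."""
    adv = histogram(advertised, bin_width)
    cw = histogram(weights, bin_width)
    return [(start, adv[start], cw[start])
            for start in sorted(set(adv) | set(cw))]


# Hypothetical sample data, not real relay figures.
advertised = [10, 120, 130, 250]
weights = [0, 0, 110, 300]

table = side_by_side(advertised, weights, bin_width=100)
for start, adv_count, cw_count in table:
    print(f"{start:>6} {adv_count:>4} {cw_count:>4}")
```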

comment:8 Changed 20 months ago by teor

I opened #24834 as a follow-up to map the consensus weight vs bandwidth per bandwidth authority.

comment:9 Changed 17 months ago by irl

Cc: metrics-team added

Adding metrics-team to cc

comment:10 Changed 13 months ago by juga

Cc: juga added

Adding myself to Cc.
