Consensus Health: what is the distribution of a bandwidth authority's measurements?

Labels added: component::metrics/consensus health, owner::tom, priority::very low, severity::normal, status::new, type::enhancement

I think the range option is very confusing, and we probably just want quartiles.

Trac:
Description:

Once we know how many relays a bandwidth authority controls (#21992 (moved)), we might want to know how much it can change their figures, or how their measurements are distributed.

I'm not sure if we would use this, but I am writing it down so others can decide if they want it, and which option they want.

Quartile Option

This option doesn't tell us the exact range for each relay, but it's easy to calculate and understand.

For each bandwidth authority:

the quartiles of its bandwidth measurements

This could look like: https://consensus-health.torproject.org/#downloadstats

Range Option

This option is more specific, but might be harder to understand and use.

For each bandwidth authority:

the median of the measurements it controls (the relays for which it is the median)
the median of the next highest measurement where it is the next highest measurement
the median of the next lowest measurement where it is the next lowest measurement

We should probably use medians for the average, because we really don't care about extremes.

Maybe this could look something like:

Bandwidth Authority Variance

	lower	median	higher
longclaw	27887	58578	89090
gabelmoo	34585	69344	84323

Or maybe it would be better to express them as percentages of the median.

Trac:
Summary: Consensus Health: what is the distrubution of a bandwidth authority's measurements? to Consensus Health: what is the distribution of a bandwidth authority's measurements?

It might also be helpful to have quartiles for the overall measurements/medians.

Trac:
Description:

Once we know how many relays a bandwidth authority controls (#21992 (moved)), we might want to know how much it can change their figures, or how their measurements are distributed.

I'm not sure if we would use this, but I am writing it down so others can decide if they want it, and which option they want.

Quartile Option

This option doesn't tell us the exact range for each relay, but it's easy to calculate and understand.

For each bandwidth authority, and for the measured bandwidth medians:

the quartiles of its bandwidth measurements

This could look like: https://consensus-health.torproject.org/#downloadstats
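As a rough sketch of the quartile option, assuming the votes have already been parsed into a dict of the form {bwauth: {relay: measurement}} (that input shape and the function name are illustrative, not existing consensus-health code), and assuming the per-relay median is the low median:

```python
from statistics import median_low, quantiles


def quartiles_per_bwauth(votes):
    """Return {name: (q1, q2, q3)} for each bwauth and for the per-relay medians.

    votes: {bwauth_name: {relay_fingerprint: measured_bandwidth}} (assumed shape).
    """
    rows = {name: sorted(m.values()) for name, m in votes.items()}

    # Collect every authority's measurement for each relay.
    per_relay = {}
    for measurements in votes.values():
        for relay, bw in measurements.items():
            per_relay.setdefault(relay, []).append(bw)

    # Extra row for the measured bandwidth medians (low median is an assumption).
    rows["median"] = sorted(median_low(v) for v in per_relay.values())

    # quantiles() needs at least two data points, so skip shorter rows.
    return {
        name: tuple(quantiles(values, n=4, method="inclusive"))
        for name, values in rows.items()
        if len(values) >= 2
    }
```

The "median" row would clash with an authority that is actually named "median"; a real implementation would need a different key for it.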

Range Option

This option is more specific, but might be harder to understand and use.

For each bandwidth authority:

the median of the measurements it controls (the relays for which it is the median)
the median of the next highest measurement where it is the next highest measurement
the median of the next lowest measurement where it is the next lowest measurement

We should probably use medians for the average, because we really don't care about extremes.

Maybe this could look something like:

Bandwidth Authority Variance

	lower	median	higher
longclaw	27887	58578	89090
gabelmoo	34585	69344	84323

Or maybe it would be better to express them as percentages of the median.
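One possible reading of the range option, sketched under the same assumed input shape ({bwauth: {relay: measurement}}): for each relay, rank the measurements, treat the low median as the controlling measurement, and record which authority sits at the median, one step below it, and one step above it. The function names and the tie-breaking rule (by authority name) are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def range_option(votes):
    """Return {bwauth: {"lower": x, "median": y, "higher": z}} (keys may be missing)."""
    per_relay = defaultdict(list)
    for name, measurements in votes.items():
        for relay, bw in measurements.items():
            per_relay[relay].append((bw, name))

    buckets = defaultdict(lambda: defaultdict(list))
    for ranked in per_relay.values():
        ranked.sort()  # ties are broken by authority name (assumption)
        mid = (len(ranked) - 1) // 2  # index of the low median (assumption)
        for offset, column in ((-1, "lower"), (0, "median"), (1, "higher")):
            i = mid + offset
            if 0 <= i < len(ranked):
                bw, name = ranked[i]
                buckets[name][column].append(bw)

    # Summarise each column with a median, since extremes are not interesting.
    return {
        name: {column: median(values) for column, values in columns.items()}
        for name, columns in buckets.items()
    }


def as_percent_of_median(row):
    """Express one authority's row as percentages of its median column."""
    base = row.get("median")
    if not base:
        return {}  # the authority is never the median, or its median is zero
    return {column: round(100 * value / base, 1) for column, value in row.items()}
```

An authority that is never the median for any relay has no "median" column, so its row cannot be expressed as percentages.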

So I'm not sure exactly what this is asking for, or how to implement it. But as far as bwauth debugging information goes, what I have wanted/envisioned is the following:

1. A graph on Atlas, per-relay, that shows each bwauth's votes for that relay over time
2. Something (maybe a graph) on Atlas, per-relay, that shows which scanner the bwauth made the measurement on
3. A series of graphs that shows below/median/above bwauth buckets for relays, where each graph only applies to one country's relays.
4. A graph or series of graphs that shows bwauth variance on relays per-country.

(1) is for relay operators to understand why their bandwidth usage may have changed without requiring them to do some ugly consensus/vote grepping. But it requires changes to OnionOO and Atlas.

(2) is for relay operators and also bwauth operators, to confirm that relays sometimes slip between scanner cracks. To perform the analysis at all, bwauths need to apply this patch: https://gitweb.torproject.org/torflow.git/commit/?id=7e4ef735858acf5d2fbb183b6f8418b7fc2b364a To get it into Atlas, we need the data in OnionOO; to get the data into OnionOO, we need it in Collector (#21378 (moved)); and to get it into Collector, we need it exposed by the bwauths (#21377 (moved)).

(3) Should confirm (or reject) the hypothesis that some bwauths make more high or low measurements because their geographic location hurts or helps them measure a disproportionate amount of the network.

(4) Should confirm (or reject) the hypothesis that maybe, just maybe, our bwauths tend to agree with similarly-located bwauths for similarly-located relays.

I'm not sure if any of those is what you said though.
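The below/median/above buckets in (3) could be counted along these lines before any graphing. This is only a sketch: the {bwauth: {relay: measurement}} input shape, the relay_country lookup, the "??" placeholder for unknown countries, and the use of the low median are all assumptions.

```python
from collections import Counter, defaultdict
from statistics import median_low


def buckets_by_country(votes, relay_country):
    """Return counts[country][bwauth] as a Counter of below/median/above."""
    per_relay = defaultdict(dict)
    for name, measurements in votes.items():
        for relay, bw in measurements.items():
            per_relay[relay][name] = bw

    counts = defaultdict(lambda: defaultdict(Counter))
    for relay, by_auth in per_relay.items():
        country = relay_country.get(relay, "??")  # placeholder for unknown country
        mid = median_low(by_auth.values())  # low median is an assumption
        for name, bw in by_auth.items():
            if bw < mid:
                bucket = "below"
            elif bw > mid:
                bucket = "above"
            else:
                bucket = "median"
            counts[country][name][bucket] += 1
    return counts
```

Each country's counters could then feed one graph, which would show whether a bwauth lands above or below the median more often for relays in particular countries.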

Trac:
Cc: N/A to nusenu

Replying to tom:

So I'm not sure exactly what this is asking for, or how to implement it. But as far as bwauth debugging information goes, what I have wanted/envisioned is the following:

1. A graph on Atlas, per-relay, that shows each bwauth's votes for that relay over time

2. Something (maybe a graph) on Atlas, per-relay, that shows which scanner the bwauth made the measurement on

3. A series of graphs that shows below/median/above bwauth buckets for relays, where each graph only applies to one country's relays.

4. A graph or series of graphs that shows bwauth variance on relays per-country.

(1) is for relay operators to understand why their bandwidth usage may have changed without requiring them to do some ugly consensus/vote grepping. But it requires changes to OnionOO and Atlas.

(2) is for relay operators and also bwauth operators, to confirm that relays sometimes slip between scanner cracks. To perform the analysis at all, bwauths need to apply this patch: https://gitweb.torproject.org/torflow.git/commit/?id=7e4ef735858acf5d2fbb183b6f8418b7fc2b364a To get it into Atlas, we need the data in OnionOO; to get the data into OnionOO, we need it in Collector (#21378 (moved)); and to get it into Collector, we need it exposed by the bwauths (#21377 (moved)).

(3) Should confirm (or reject) the hypothesis that some bwauths make more high or low measurements because their geographic location hurts or helps them measure a disproportionate amount of the network.

(4) Should confirm (or reject) the hypothesis that maybe, just maybe, our bwauths tend to agree with similarly-located bwauths for similarly-located relays.

These seem like great ideas. Which do you think we should do first?

I'm not sure if any of those is what you said though.

I think your suggestions are better than mine.

Replying to teor:

These seem like great ideas. Which do you think we should do first?

Definitely 3 and 4. The first two have a ton of dependencies.

I think adding these graphs will tip the scales of what's feasible in the current design of the consensus-health graphs page though; and we need to rethink how it is organized, and make it interactive. Maybe it should not live on consensus-health at all, and instead on Metrics...

Trac:
Cc: nusenu to nusenu, karsten

I would like some consensus weight and advertised bandwidth vs number of relays graphs, per authority:

The next question was: what estimates were actually assigned to those bandwidth spikes? Maybe all zeroes? This led me to more charts: https://s8.hostingkartinok.com/uploads/images/2017/06/8cefb70fce667a1b89c783ed2bfc9442.png https://s8.hostingkartinok.com/uploads/images/2017/06/2e42634ea3f9b71df8a7fd17c27660d9.png x here is "Advertised Bandwidth", y is "Consensus Weight". x is KiB/s, y is count (yellow bars are for "Advertised Bandwidth", blue bars are for "Consensus Weight", grey means both values)

https://lists.torproject.org/pipermail/tor-relays/2017-June/012466.html

I opened #24834 (moved) as a follow-up to map the consensus weight vs bandwidth per bandwidth authority.

Adding metrics-team to cc

Trac:
Cc: nusenu, karsten to nusenu, karsten, metrics-team

Adding myself to Cc

Trac:
Cc: nusenu, karsten, metrics-team to nusenu, karsten, metrics-team, juga

mentioned in issue #25687 (moved)
