Opened 2 years ago

Closed 2 years ago

#26015 closed enhancement (fixed)

Remove inconsistency between bandwidth history graphs

Reported by: karsten Owned by: karsten
Priority: High Milestone:
Component: Metrics/Statistics Version:
Severity: Normal Keywords:
Cc: metrics-team Actual Points:
Parent ID: Points:
Reviewer: iwakeh Sponsor:

Description

Today I found an inconsistency between our various bandwidth history graphs:

The reason is that we're only matching consensuses and extra-info descriptors for the second and third graph, but not for the first. And we need to do that in order to break down totals by guards/exits.

While it may seem simpler to just skip that matching step in the first graph, it leads to inconsistent data. Consider the following data taken from bandwidth.csv:

date isexit isguard advbw bwread bwwrite dirread dirwrite
2018-03-14 f f 6757900851 2454444601 2493893288
2018-03-14 f t 15218678985 7024742679 7191640536
2018-03-14 t f 1592294787 562694042 558274048
2018-03-14 t t 6189896122 3322794675 3356316394
2018-03-14 29758770745 13367602689 13603416291 6877369 187770410

In theory, the sum of the first four rows should match the fifth row, modulo rounding errors.

This works for advertised bandwidth (which is based on server descriptor data, not extra-info descriptors). But it does not work for bandwidth histories:

6757900851 + 15218678985 + 1592294787 + 6189896122 - 29758770745 = 0
2454444601 + 7024742679 + 562694042 + 3322794675 - 13367602689 = -2926692
2493893288 + 7191640536 + 558274048 + 3356316394 - 13603416291 = -3292025

The difference comes from relays that reported bandwidth histories but that the directory authorities did not list as running.

Suggestion: we simply omit the bandwidth totals for cases where we have values by exit/guard flags:

date isexit isguard advbw bwread bwwrite dirread dirwrite
2018-03-14 f f 6757900851 2454444601 2493893288
2018-03-14 f t 15218678985 7024742679 7191640536
2018-03-14 t f 1592294787 562694042 558274048
2018-03-14 t t 6189896122 3322794675 3356316394
2018-03-14 29758770745 13367602689 13603416291 6877369 187770410

We'd remove an inconsistency by doing so, and we'd remove some code. The graphing code would have to do one more step to aggregate data from four rows, but that's not critical.

If this sounds reasonable to others, I'll prepare a patch. Setting to needs_review for the idea, not for code.

Child Tickets

Change History (8)

comment:1 Changed 2 years ago by karsten

Status: assignedneeds_review

comment:2 Changed 2 years ago by karsten

Priority: MediumHigh

Setting priority to high, because that would allow me to move forward with sponsor 13 deliverable 2.

comment:3 Changed 2 years ago by iwakeh

Yep, that all makes sense.
Not changing the status in case others should review the idea, too.

(I actually overlooked the final line of the description earlier and was waiting for a patch to be reviewed.)

comment:4 Changed 2 years ago by karsten

Status: needs_reviewaccepted

Great! I'll prepare a patch then. Thanks!

comment:5 Changed 2 years ago by karsten

Status: acceptedneeds_review

comment:6 Changed 2 years ago by iwakeh

Reviewer: iwakeh

comment:7 Changed 2 years ago by iwakeh

Status: needs_reviewmerge_ready

Looks fine; assuming you performed test runs for db&R code (I don't have a useful environment at hand) ready for merge.

comment:8 Changed 2 years ago by karsten

Resolution: fixed
Status: merge_readyclosed

Yes, local tests have been successful before. Merged and deployed. It will first run tonight. Thanks! Closing.

Note: See TracTickets for help on using tickets.