#26002 closed enhancement (fixed)

Simplify graph with number of bytes spent on answering directory requests

Reported by: karsten
Owned by: karsten
Priority: High
Milestone:
Component: Metrics/Statistics
Version:
Severity: Normal
Keywords:
Cc: metrics-team
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

While looking at the code that aggregates data for our "Number of bytes spent on answering directory requests" graph, I found two things:

  1. In contrast to the graph description, we're only including directory traffic from directory mirrors, not from directory authorities.
  2. As the graph description says, we're extrapolating whatever statistics we get to an estimated network total; however, that formula is really complex and not very intuitive.

I suggest we simplify this graph by a) showing traffic from all directories (including mirrors and authorities) and b) taking out the extrapolation step.

For what it's worth, that extrapolation step was useful in the beginning when only a few relays reported these statistics. But that was many years ago. By now, all running tor versions support these statistics, and they have always been turned on by default.

I'm attaching a graph that compares the current approach to the approach suggested here. It only covers April 2018, because we don't have older data in the database anymore. I'd have to re-import the archives for this locally, which I'd be happy to do.

The main advantage of making this change is that our data will be easier to specify and reproduce for others.

Setting to needs_review to get input on the question whether we should do it. Because if there's a reason not to do it, I wouldn't start reprocessing the archives. But currently I don't see such a reason. Thoughts?

Attachments (3)

dirbytes.png (272.1 KB) - added by karsten 16 months ago.
dirbytes.2.png (291.5 KB) - added by karsten 16 months ago.
dirbytes.3.png (333.1 KB) - added by karsten 16 months ago.

Change History (15)

Changed 16 months ago by karsten

Attachment: dirbytes.png added

comment:1 Changed 16 months ago by karsten

Status: new → needs_review


comment:2 Changed 16 months ago by irl

LGTM.

Let's keep a comment in the code to point out that the git history will have a version that can extrapolate in case we need it in the future but forgot that it exists (but no need to document the extrapolation steps for reproducible metrics).

comment:3 Changed 16 months ago by iwakeh

Looking at the graph in comment:1 the old and new calculations only differ by a constant factor during April 2018. Would be interesting to see a longer history for this comparison. And true, the extrapolation of the total is really not that useful and complicates reproducibility unnecessarily.

As an improvement (that ought to be easy to implement on first thought) maybe add the counts reporting vs. not-reporting (in some useful way)?

comment:4 in reply to:  2 ; Changed 16 months ago by karsten

Replying to irl:

> LGTM.

GTFL. (Great, thanks for looking.)

> Let's keep a comment in the code to point out that the git history will have a version that can extrapolate in case we need it in the future but forgot that it exists (but no need to document the extrapolation steps for reproducible metrics).

In theory, I agree. However, this extrapolation code is so legacy that we wouldn't look at it again even if we decide we want to extrapolate numbers again. To give you an idea, this code is based on a previous approach for estimating user numbers, and this graph is the only reason why we still have this code in the codebase. I'd rather want to start cleanly when writing the next extrapolation code than looking again at this very old code. So, good suggestion in theory, but in this case I think it simply doesn't make as much sense.

comment:5 in reply to:  3 Changed 16 months ago by karsten

Owner: changed from metrics-team to karsten
Status: needs_review → accepted

Replying to iwakeh:

> Looking at the graph in comment:1 the old and new calculations only differ by a constant factor during April 2018. Would be interesting to see a longer history for this comparison.

Agreed. I'll fire up the engine now to produce a longer history. That will take a while, though.

> And true, the extrapolation of the total is really not that useful and complicates reproducibility unnecessarily.

Agreed!

> As an improvement (that ought to be easy to implement on first thought) maybe add the counts reporting vs. not-reporting (in some useful way)?

I'll add some counts for consideration here, though I'm unsure if we want to display those to the user. It would be a graph similar to the "Fraction of relays reporting onion-service statistics" graph, and I'm not yet sure whether that's useful or just confusing for 90% of our users.
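For illustration only, the reporting vs. not-reporting counts could be reduced to a per-day fraction roughly like this. It is a minimal sketch, not the code used for the existing onion-service graph: the function name and the two input mappings (day to set of relay fingerprints) are assumptions made up for this example.

```python
def fraction_reporting(running_by_day, reporting_by_day):
    """Return {day: fraction of running relays that reported statistics}.

    Both arguments are hypothetical inputs: dicts mapping a day to a set
    of relay fingerprints (running relays, and relays that reported).
    """
    fractions = {}
    for day, running in running_by_day.items():
        if not running:
            continue  # no running relays known for this day; skip it
        # Only count reporters that were also listed as running that day.
        reporting = reporting_by_day.get(day, set()) & running
        fractions[day] = len(reporting) / len(running)
    return fractions
```

Whether such a fraction is worth plotting is a separate question; the sketch only shows that the counts themselves are cheap to derive.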

Grabbing the ticket and setting status to accepted.

Thanks for the feedback so far!

comment:6 in reply to:  4 Changed 16 months ago by irl

Replying to karsten:

> In theory, I agree. However, this extrapolation code is so legacy that we wouldn't look at it again even if we decide we want to extrapolate numbers again. To give you an idea, this code is based on a previous approach for estimating user numbers, and this graph is the only reason why we still have this code in the codebase. I'd rather want to start cleanly when writing the next extrapolation code than looking again at this very old code. So, good suggestion in theory, but in this case I think it simply doesn't make as much sense.

Ok, your reasoning as to why this does not make sense makes sense.

Changed 16 months ago by karsten

Attachment: dirbytes.2.png added

comment:7 Changed 16 months ago by karsten

Here's a graph comparing current (light blue) and suggested (light red) approach over the years. It only shows four weeks in the respective Aprils, and some years are missing, because I ran out of disk space. But the general idea should be clear.


I think this looks okay, so I'll go ahead and re-import the rest of the data. This will probably take 1 to 2 weeks.

Changed 16 months ago by karsten

Attachment: dirbytes.3.png added

comment:8 Changed 16 months ago by karsten

Status: accepted → needs_review

I'm changing my mind! After waiting for over 5 (!) days for the re-import of just the 2018 descriptors to succeed, I'm giving up on this plan. I don't know how far that import got when I aborted it, but even if it had succeeded 1 second after, the whole import would take many weeks if not months. It's not a good enough use of my time to babysit this import. Nor is it good timing to re-implement this code now. After all, this graph is by far not our most important one. Let's do something else to simplify this code and make it easier to specify.

Suggestion #2: We simplify this graph by a) updating the graph description to say that the graph only includes directory traffic from directory mirrors and b) taking out the extrapolation step.
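To make the idea concrete, suggestion #2 amounts to a plain per-day sum over what directory mirrors reported, with nothing scaled up for relays that did not report. The following is a minimal sketch under that assumption; the `DirBytesRecord` layout and the function name are hypothetical and do not reflect the actual database schema or the code in the task-26002 branch.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DirBytesRecord:
    """One relay's reported directory bytes for one day (hypothetical layout)."""
    day: date
    fingerprint: str
    is_authority: bool
    dir_read_bytes: int
    dir_written_bytes: int


def sum_mirror_dir_bytes(records):
    """Sum reported directory bytes per day over directory mirrors only.

    No extrapolation: relays that did not report contribute nothing.
    Returns {day: (read_bytes, written_bytes)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for rec in records:
        if rec.is_authority:
            continue  # suggestion #2 keeps directory mirrors only
        totals[rec.day][0] += rec.dir_read_bytes
        totals[rec.day][1] += rec.dir_written_bytes
    return {day: (read, written) for day, (read, written) in totals.items()}
```

The point of the simplification is that this is all there is to specify: a filter and a sum, which others can reproduce from the same inputs.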

The difference to suggestion #1 is that all data is already in the database. Here's a graph that compares the current approach with suggestion #2:


This graph shows that the extrapolation step may have been useful in 2010 and 2011. It also shows that it started producing weird results in late 2016 when the extrapolated number got smaller than the reported number. I'm not even sure what happened there. Yet one more reason to get rid of it, in addition to the main goal of simplifying our statistics and making them easier to reproduce.
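The anomaly described above (an extrapolated total below the reported total) is easy to check for mechanically. A small sketch, assuming two hypothetical mappings from day to total bytes; neither the names nor the input shape come from the actual codebase:

```python
def days_with_suspicious_extrapolation(reported_by_day, extrapolated_by_day):
    """Return sorted days on which the extrapolated total is below the reported one.

    An estimate of the whole network should never be smaller than the part
    that was actually reported, so any such day points at a problem.
    """
    return sorted(
        day
        for day, reported in reported_by_day.items()
        if day in extrapolated_by_day and extrapolated_by_day[day] < reported
    )
```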

Please review commit a0cd20e in my task-26002 branch.

comment:9 Changed 16 months ago by karsten

Priority: Medium → High

Setting priority to high, because it would be neat to finalize the specification for this graph.

comment:10 Changed 15 months ago by iwakeh

Status: needs_review → merge_ready

Changes look fine as well as the changes in the graphed results.

comment:11 Changed 15 months ago by karsten

Thanks for checking! I'll deploy the database change when the current run has finished and the website change when the next run tonight finishes without issues. And if this all goes well, I'll merge and resolve tomorrow.

comment:12 Changed 15 months ago by karsten

Resolution: fixed
Status: merge_ready → closed

Merged and deployed. Closing. Thanks!
