Opened 20 months ago

Closed 9 days ago

Last modified 6 days ago

#25924 closed enhancement (fixed)

Improve execution time of onion service statistics module

Reported by: karsten
Owned by: karsten
Priority: Medium
Milestone:
Component: Metrics/Statistics
Version:
Severity: Normal
Keywords:
Cc: metrics-team
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Two years ago, in February 2016, we noticed that some of our back-end modules had really long execution times. Back then we made improvements to two of these modules. I'm going to post a graph of execution times in the comments.

Looks like we'll have to do it again, and this time work even harder on improving execution times. The onion service statistics module again takes roughly 2 hours to complete, with a clear upward trend. I assume that we'll reach execution times of 3 hours by the end of 2018 and 5 to 6 hours by the end of 2019.

I'm going to post another graph with recent execution times. (Note that we don't have logs for just the onion service statistics module, only for that module and the previous module combined. That other module is still relatively fast, though, contributing just a few minutes of execution time to the graph.)

I don't have concrete suggestions for improvements. And it's still early enough to try to get this work funded before we dive into it. That's why I'm setting priority to low. But it's clear that we'll have to do something here in 2019.

Attachments (2)

metrics-modules-2016-01-07.png (102.2 KB) - added by karsten 20 months ago.
metrics-modules-2018-04-26.png (96.3 KB) - added by karsten 20 months ago.

Change History (10)

Changed 20 months ago by karsten

comment:1 Changed 20 months ago by karsten



comment:2 Changed 12 months ago by karsten

Priority: changed from Low to High

comment:3 Changed 10 months ago by gaba

Priority: changed from High to Medium

comment:4 Changed 3 weeks ago by karsten

Owner: changed from metrics-team to karsten
Status: changed from new to accepted

I just noticed that execution time of this module went up to 7.5 hours on average during the last week. This is too much. Looking into this now.

comment:5 Changed 3 weeks ago by karsten

Status: changed from accepted to needs_review

Alright, I think I know what's going on. Runtime keeps growing because, on every execution, we re-process an ever-increasing number of onion service statistics reported by relays for which we computed a network fraction of zero for having observed these statistics. It does make sense not to extrapolate these numbers, because we cannot divide by zero. But we should remember that we already looked at them and not attempt to extrapolate them again in all future executions. That's what we already do with statistics reported by relays with non-zero network fractions: we notice that we extrapolated them before and skip them. The fix is simply to write extrapolated numbers for all reported statistics, including those with a network fraction of zero.
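To illustrate the effect described above, here is a minimal Python sketch. It is not the module's actual code: `ReportedStat`, `extrapolate_new`, the `remember_zero` switch, and the use of `None` as the stored result for a zero network fraction are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical key identifying one reported statistic: (relay, date).
Key = Tuple[str, str]


@dataclass(frozen=True)
class ReportedStat:
    """Simplified stand-in for one statistic reported by a relay."""
    relay: str
    date: str
    value: float


def extrapolate_new(reported: List[ReportedStat],
                    fractions: Dict[Key, float],
                    stored: Dict[Key, Optional[float]],
                    remember_zero: bool) -> int:
    """Process statistics not yet in `stored`; return how many were processed."""
    processed = 0
    for stat in reported:
        key = (stat.relay, stat.date)
        if key in stored:
            continue  # already handled in an earlier execution, skip it
        processed += 1
        fraction = fractions.get(key, 0.0)
        if fraction > 0.0:
            # Extrapolate the reported value to a network total.
            stored[key] = stat.value / fraction
        elif remember_zero:
            # Assumed fix: store a result anyway (None here, since we
            # cannot divide by zero) so later executions skip this entry.
            stored[key] = None
    return processed


reported = [ReportedStat("relayA", "day1", 10.0),
            ReportedStat("relayB", "day1", 4.0)]
fractions = {("relayA", "day1"): 0.25, ("relayB", "day1"): 0.0}

# Old behavior: zero-fraction statistics are never remembered.
old_store: Dict[Key, Optional[float]] = {}
old_runs = [extrapolate_new(reported, fractions, old_store, False)
            for _ in range(3)]

# Fixed behavior: every reported statistic gets a stored result.
new_store: Dict[Key, Optional[float]] = {}
new_runs = [extrapolate_new(reported, fractions, new_store, True)
            for _ in range(3)]

print(old_runs)  # [2, 1, 1]
print(new_runs)  # [2, 0, 0]
```

In the old variant the zero-fraction entry is picked up again in every run, so the amount of re-processed input grows with each relay that reports such statistics. In the fixed variant each statistic is processed exactly once.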

Please review commit ed4f75d in my task-25924 branch. I did test this branch locally, but before deploying it we should make a fresh backup of work/modules/hidserv/, just in case.

comment:6 Changed 9 days ago by irl

Status: changed from needs_review to merge_ready

We got lucky here with this low-hanging fruit. Next time will be harder.

LGTM.

comment:7 Changed 9 days ago by karsten

Resolution: fixed
Status: changed from merge_ready to closed

Merged to master. Will make a backup and deploy either tonight or tomorrow morning before the next daily update starts. Thanks for checking! Closing.

comment:8 Changed 6 days ago by karsten

It's deployed now. The most recent execution times of daily update runs, in hours, are: 13.6, 14.4, 17.8, 15.8, 5.4. The 5.4 hours are from the first execution running this branch. Yay!