Opened 5 months ago

Last modified 4 months ago

#25815 assigned enhancement

Speed up hourly updater performance

Reported by: karsten Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Onionoo Version:
Severity: Normal Keywords:
Cc: metrics-team Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

When looking at Onionoo's logs with irl the other day we noticed that the hourly updater sometimes takes as long as 45 minutes.

We obviously need to be careful not to let this time get closer to 60 minutes or even exceed that time.

I just looked a little bit closer at the logs and made a graph. It shows three phases of the hourly updater:

  • before: from starting the updater to writing details documents;
  • write: writing details documents;
  • after: from after writing details documents to the end.
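A minimal sketch of how these three phases could be timed so that the numbers end up in the logs directly. `PhaseTimer` and its methods are hypothetical names for illustration and not existing Onionoo code; timestamps are passed in explicitly so the helper is easy to test.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Onionoo): records elapsed time per updater phase. */
class PhaseTimer {

  /** Elapsed milliseconds per phase, in the order the phases were started. */
  private final Map<String, Long> elapsedMillis = new LinkedHashMap<>();

  private String currentPhase;

  private long phaseStartedNanos;

  /** Ends the running phase, if any, and starts the given one. */
  void start(String phase, long nowNanos) {
    finish(nowNanos);
    currentPhase = phase;
    phaseStartedNanos = nowNanos;
  }

  /** Ends the running phase, if any, and adds its duration to the totals. */
  void finish(long nowNanos) {
    if (currentPhase != null) {
      long millis = (nowNanos - phaseStartedNanos) / 1_000_000L;
      elapsedMillis.merge(currentPhase, millis, Long::sum);
      currentPhase = null;
    }
  }

  Map<String, Long> elapsedMillis() {
    return elapsedMillis;
  }

  public static void main(String[] args) throws InterruptedException {
    PhaseTimer timer = new PhaseTimer();
    timer.start("before", System.nanoTime());
    Thread.sleep(20); // stands in for everything before writing details documents
    timer.start("write", System.nanoTime());
    Thread.sleep(50); // stands in for writing details documents
    timer.start("after", System.nanoTime());
    Thread.sleep(10); // stands in for the remaining work
    timer.finish(System.nanoTime());
    timer.elapsedMillis().forEach(
        (phase, millis) -> System.out.println(phase + ": " + millis + " ms"));
  }
}
```

Logging these three durations at the end of each run would make it possible to redraw the graph without parsing start and end timestamps out of the logs.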

The graph compares total runtimes on the x axis with the three phases on the y axis. I'm reading two things in this graph.

  • Writing details documents really takes a long time. It's never faster than 15 minutes and can take longer than 30 minutes in some cases.
  • Whenever an hourly update takes longer than, say, 40 minutes, it's not just writing details documents that takes longer than usual; the phase before takes longer, too. I guess the system is just generally slower at those times.

I'd say let's look at details document writing first when optimizing runtime. Maybe we can decide early that we don't have to update a file. But that's just a guess. We should add more logging to that code, as right now the logs are silent for those 30 to 45 minutes.
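One possible shape for that extra logging is a counter that emits a progress line every so many files. This is only a sketch: `WriteProgress`, its parameters, and the use of `java.util.logging` are assumptions made for illustration, not the logging setup Onionoo actually uses.

```java
import java.util.logging.Logger;

/** Hypothetical helper (not part of Onionoo): periodic progress lines while writing documents. */
class WriteProgress {

  // Assumption: java.util.logging is used here only to keep the sketch self-contained.
  private static final Logger LOG = Logger.getLogger(WriteProgress.class.getName());

  private final int total;

  private final int logEvery;

  private int written;

  private int skipped;

  WriteProgress(int total, int logEvery) {
    if (logEvery <= 0) {
      throw new IllegalArgumentException("logEvery must be positive");
    }
    this.total = total;
    this.logEvery = logEvery;
  }

  /**
   * Records one processed file. Returns the logged progress line every
   * logEvery files and for the last file, or null if nothing was logged.
   */
  String record(boolean wasWritten) {
    if (wasWritten) {
      written++;
    } else {
      skipped++;
    }
    int processed = written + skipped;
    if (processed % logEvery != 0 && processed != total) {
      return null;
    }
    String line = "Processed " + processed + " of " + total
        + " details documents (" + written + " written, " + skipped + " skipped)";
    LOG.info(line);
    return line;
  }

  public static void main(String[] args) {
    WriteProgress progress = new WriteProgress(10, 4);
    for (int i = 0; i < 10; i++) {
      progress.record(i % 3 != 0); // pretend every third file was skipped
    }
  }
}
```

Counting skipped files separately would also show how many writes an early "no update needed" decision could actually avoid.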

Child Tickets

Ticket    Type          Status    Owner      Summary
#25848    enhancement   closed    karsten    Replace Gson with Jackson in Onionoo

Attachments (1)

onionoo-start-end.png (176.8 KB) - added by karsten 5 months ago.


Change History (9)

Changed 5 months ago by karsten

Attachment: onionoo-start-end.png added

comment:1 Changed 5 months ago by karsten


comment:2 Changed 5 months ago by karsten

Owner: changed from metrics-team to karsten
Status: changed from new to accepted

I'll investigate this some more.

comment:3 Changed 5 months ago by karsten

So, I ran some performance measurements here. In particular, I used a recent backup from one of the public Onionoo hosts and ran an update. By doing so I identified two possible areas for improvement:

  1. While writing details documents, we spend almost 40% of CPU time on (de-)serializing JSON documents. We might want to try out a JSON library other than Gson. A quick search says that Jackson might be a lot faster. Maybe we can save 10% or 20% of the overall time here.
  2. Also while writing details documents, we read almost all details status files and (re-)write a corresponding details document file. That's almost 1M files and takes roughly 55% of the overall CPU time, including the 40% for JSON handling. If we can find a way to exclude files that don't need to be updated, we might be able to save another 10% or 20% of overall time. (The overall savings depend on what we do about JSON handling.)
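The second idea could be sketched as a timestamp-based staleness check, similar to what `make` does. This rests on a strong assumption that may not hold in Onionoo: that a details document depends only on its own status file, so an unchanged status file means an unchanged document. `StaleCheck`, `needsUpdate`, and `tempFile` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

/** Hypothetical helper (not part of Onionoo): decides early whether a document needs rewriting. */
class StaleCheck {

  /**
   * Returns true if the document file is missing or not newer than the
   * status file. Equal timestamps count as stale, to stay on the safe side
   * on file systems with coarse timestamp granularity.
   * Assumption: the document depends only on this one status file.
   */
  static boolean needsUpdate(Path statusFile, Path documentFile) {
    try {
      if (!Files.exists(documentFile)) {
        return true;
      }
      FileTime statusTime = Files.getLastModifiedTime(statusFile);
      FileTime documentTime = Files.getLastModifiedTime(documentFile);
      return statusTime.compareTo(documentTime) >= 0;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Demo helper: creates an empty temp file with the given modification time. */
  static Path tempFile(String prefix, long epochMillis) {
    try {
      Path file = Files.createTempFile(prefix, ".json");
      Files.setLastModifiedTime(file, FileTime.fromMillis(epochMillis));
      file.toFile().deleteOnExit();
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path status = tempFile("status", 1_000_000_060_000L);
    Path document = tempFile("details", 1_000_000_000_000L);
    // The status file is 60 seconds newer than the document, so this is stale.
    System.out.println("needs update: " + needsUpdate(status, document));
  }
}
```

A check like this runs before any file is read or any JSON is parsed, so a skipped file would avoid both the I/O and the (de-)serialization cost. If documents also depend on other inputs, the check would need to cover those as well.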

I'd like to keep this ticket for a while longer and dive deeper into these two ideas. But I thought I'd share these first results to hear early if they're a bad idea for some reason.

comment:4 Changed 5 months ago by iwakeh

A switch to Jackson might bring other benefits as well, though it will cause some work at first because Gson behaves differently. (See an older suggestion.)

comment:5 Changed 5 months ago by karsten

Owner: changed from karsten to metrics-team
Status: changed from accepted to assigned

I tried out switching to Jackson, and it turns out that we can save 30% of overall time (for writing details documents). That's almost 4 minutes every hour. Sounds like a good idea to me.

I uploaded my branch with temporary commits here. I did not verify that all outputs are correct with regard to Unicode character escapes and character encodings in general. I only looked at performance.

For the moment, I'm giving this ticket back to metrics-team. I might grab it back next week, unless somebody else takes it first.

comment:6 Changed 5 months ago by iwakeh

Owner: changed from metrics-team to iwakeh
Status: changed from assigned to accepted

Taking a look, too.

comment:7 in reply to:  5 Changed 5 months ago by iwakeh

Replying to karsten:

> I tried out switching to Jackson, and it turns out that we can save 30% of overall time (for writing details documents). That's almost 4 minutes every hour. Sounds like a good idea to me.

Yes, this is a good idea! I added a child ticket for this task, which can easily be handled independently of any additional changes. All Gson-Jackson discussion can be moved there. It's a first step for sure.

I'll look further and post findings here.

comment:8 Changed 4 months ago by iwakeh

Owner: changed from iwakeh to metrics-team
Status: changed from accepted to assigned