Opened 8 months ago

Closed 4 months ago

#31901 closed defect (fixed)

webstats-tb.html graph too eager to include today's stats

Reported by: arma
Owned by: metrics-team
Priority: High
Milestone:
Component: Metrics/Website
Version:
Severity: Normal
Keywords:
Cc: gk, ggus, metrics-team
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Today the graphs on https://metrics.torproject.org/webstats-tb.html have collapsed: all four graphs fall off abruptly at the most recent data point.

I checked with weasel, and it looks like the data should still be getting from the webservers to metrics, or at least nothing has changed there.

So my next guess is that webstats-tb is willing to graph data before it has the full set of data points for the most recent day. Is that a plausible bug here?

In particular, weasel seemed to think that it was graphing today's data using input from only one of the four webservers.

Attachments (1)

screenshot.png (59.9 KB) - added by arma 8 months ago.
screenshot courtesy weasel

Change History (6)

Changed 8 months ago by arma

Attachment: screenshot.png added

screenshot courtesy weasel

comment:1 Changed 8 months ago by karsten

Cc: metrics-team added
Owner: changed from metrics-team to karsten
Status: changed from new to accepted

Looking into this now.

comment:2 Changed 8 months ago by karsten

Here's the long version of what I think has happened: one log file for dist.tp.o is written at around 12am every day, whereas four others are written at around 7am. CollecTor sanitizes that first log file at 4am and the others at 10am. The metrics-web cronjob runs at 9am, so it only sees that first log file and not the four others. None of this should be an issue, because we're delaying sanitization for three days. We should be looking at the request timestamps contained in the log files, not at the log file timestamps. I'm not sure why we're not doing this. This might be a bug.

For now, I changed the timing of sanitizing web logs from running at 4:21am, 10:21am, etc. to 7:41am, 1:41pm, etc. This should better sync with our metrics-web cronjob. And in theory it shouldn't break anything. Let's see how it works over the next few days.

The real fix is the one described in the first paragraph: look at request timestamps instead of log file timestamps. That is a bit harder to implement, though.
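To illustrate the idea behind the real fix, here is a minimal sketch, not the actual metrics-web code. It assumes the sanitized logs have already been reduced to (server, request date, count) records; the function name, the record layout, and the server names are all hypothetical. Days are keyed by the date of the requests themselves, and a day is only reported once every expected server has contributed to it.

```python
from collections import defaultdict
from datetime import date


def complete_days(records, expected_servers):
    """Sum counts per request date, keeping only days all servers reported.

    records: iterable of (server, request_date, count) tuples, where
    request_date comes from the requests inside the log file, not from
    the log file's own timestamp. (Hypothetical layout for illustration.)
    """
    totals = defaultdict(int)
    seen = defaultdict(set)
    for server, request_date, count in records:
        totals[request_date] += count
        seen[request_date].add(server)

    expected = set(expected_servers)
    # Drop any day that is still missing input from an expected server.
    return {day: totals[day] for day in sorted(totals) if expected <= seen[day]}


# Made-up example data: server-b has not delivered the second day yet.
records = [
    ("server-a", date(2020, 1, 1), 10),
    ("server-b", date(2020, 1, 1), 20),
    ("server-a", date(2020, 1, 2), 5),
]
result = complete_days(records, ["server-a", "server-b"])
```

With this input, only the first day is returned, so a graph built from `result` would not drop off at a partially reported most recent day.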

comment:3 Changed 8 months ago by karsten

Owner: changed from karsten to metrics-team
Status: changed from accepted to assigned

Looks like the quick fix worked. We should still do the real fix, but at least nobody will be confused by the numbers until we get to that. Re-assigning to metrics-team, because I won't be able to look into this over the next week.

comment:4 Changed 4 months ago by karsten

Priority: changed from Medium to High

Changing priority of all defects in Metrics/Website to high to get these resolved soon.

comment:5 Changed 4 months ago by karsten

Resolution: fixed
Status: changed from assigned to closed

I think this is solved as a positive side effect of resolving #32747. I set up a cronjob to fetch that graph once per hour for about a week, and I didn't see any changes to the last data point during that time. Closing as (probably-)fixed.
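For anyone who wants to repeat that kind of check, here is a rough sketch of the comparison step. It assumes each hourly fetch has already been parsed into a list of (day, value) pairs sorted by day; the function name and the data layout are hypothetical, and the fetching and parsing are left out. It reports every case where the last data point of one snapshot shows a different value for the same day in the next snapshot.

```python
def last_point_revisions(snapshots):
    """Find revisions to the most recent data point between fetches.

    snapshots: list of series in fetch order, each a list of
    (day, value) pairs sorted by day. (Hypothetical layout.)
    Returns a list of (day, old_value, new_value) tuples.
    """
    revisions = []
    for previous, current in zip(snapshots, snapshots[1:]):
        if not previous or not current:
            continue
        day, old_value = previous[-1]
        # Look up the same day in the later snapshot.
        new_value = dict(current).get(day)
        if new_value is not None and new_value != old_value:
            revisions.append((day, old_value, new_value))
    return revisions


# Made-up example: the value for day "d2" is revised after the first fetch.
snapshots = [
    [("d1", 10), ("d2", 3)],
    [("d1", 10), ("d2", 12)],
    [("d1", 10), ("d2", 12), ("d3", 11)],
]
revisions = last_point_revisions(snapshots)
```

An empty result over a week of hourly snapshots would match the observation above that the last data point no longer changes after it first appears.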
