Opened 7 years ago

Closed 7 years ago

#6536 closed defect (fixed)

Sanitized weblogs are missing last three days of a month

Reported by: karsten
Owned by: Runa
Priority: Medium
Milestone:
Component: Webpages/Website
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Looks like the last two log tarballs (e.g., https://archive.torproject.org/torproject-weblogs/metrics.torproject.org/metrics.torproject.org-access.log-2012-07.tar.bz2) are missing the last three days of the respective months. The reason is very likely that sanitized logs are only made available with a delay of a few days to ensure they're complete. But tarballs are generated on the 1st of a month at 00:47 and not updated afterwards. Changing the tarball-generating script to be executed on the 5th should solve this problem.
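The date arithmetic behind this can be sketched as follows. This is only an illustration, not the actual tarball script: it assumes a fixed delay of four days (the figure given later in this ticket) and assumes that the sanitized log for day X becomes available exactly on day X + delay. The function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def missing_days(year, month, pack_date, delay_days):
    """Days of (year, month) whose sanitized log is not yet there on pack_date.

    Assumption: the log for day X becomes available on X + delay_days.
    """
    last_day = calendar.monthrange(year, month)[1]
    # Newest day whose sanitized log already exists on pack_date.
    cutoff = pack_date - timedelta(days=delay_days)
    return [d for d in range(1, last_day + 1) if date(year, month, d) > cutoff]


def earliest_complete_pack_date(year, month, delay_days):
    """First date on which every day of the month has a sanitized log."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day) + timedelta(days=delay_days)


if __name__ == "__main__":
    # Packing on the 1st loses the last three days of July 2012.
    print(missing_days(2012, 7, date(2012, 8, 1), 4))   # [29, 30, 31]
    # Packing on the 5th finds the month complete.
    print(missing_days(2012, 7, date(2012, 8, 5), 4))   # []
    print(earliest_complete_pack_date(2012, 7, 4))      # 2012-08-04
```

Under these assumptions the month is complete on the 4th, so running the script on the 5th leaves one day of margin.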

Change History (10)

comment:1 Changed 7 years ago by karsten

If we still have the sanitized files for these six days in non-compressed form, can you please update the tarballs? Thanks!

comment:2 Changed 7 years ago by runa

I recreated the tarballs for 06 and 07 with the remaining raw logs that I could find (28, 29 and 30 for June, and 29 for July). I also updated the script to wait until the 5th day of the month before packing up and archiving logs. Is there any way we can re-sync/re-sanitize the remaining logs for July?
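A recreated tarball could be checked for gaps with something like the sketch below. The naming scheme is an assumption: it supposes that each file inside the tarball carries its day as `YYYY-MM-DD` somewhere in its name, which this ticket does not confirm. The function names are hypothetical.

```python
import calendar
import re
import tarfile

# Assumed naming: member names contain the log's day as "YYYY-MM-DD".
_DAY_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def days_missing_from_names(names, year, month):
    """Return the days of (year, month) that no name in `names` refers to."""
    present = set()
    for name in names:
        for y, m, d in _DAY_RE.findall(name):
            if int(y) == year and int(m) == month:
                present.add(int(d))
    last_day = calendar.monthrange(year, month)[1]
    return [d for d in range(1, last_day + 1) if d not in present]


def days_missing_from_tarball(path, year, month):
    """Same check, reading the member names from a (compressed) tarball."""
    # "r:*" lets tarfile detect the compression (e.g. bz2) on its own.
    with tarfile.open(path, "r:*") as tar:
        return days_missing_from_names(tar.getnames(), year, month)
```

For example, a July tarball holding files for the 1st through the 29th would be reported as missing days 30 and 31.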

comment:3 in reply to:  2 Changed 7 years ago by karsten

Replying to runa:

I recreated the tarballs for 06 and 07 with the remaining raw logs that I could find (28, 29 and 30 for June, and 29 for July). I also updated the script to wait until the 5th day of the month before packing up and archiving logs. Is there any way we can re-sync/re-sanitize the remaining logs for July?

It could be that July 30 and 31 will show up tomorrow and the day after, because of the four-day delay in the sanitizing process. Let's wait until Monday and look again. Thanks!

comment:4 Changed 7 years ago by karsten

The problem didn't fix itself; I had to fix it manually. When the disk ran full, the Java program stopped and refused to run again until an operator had investigated manually. This is actually a feature that I wrote back in the day, not a bug. The real bug is that we let the disk run out of space.

Can you please update the July tarball? The two missing days should now be available. Thanks!
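The "refuse to run again until an operator has looked" behaviour can be illustrated with a marker file. The Java program itself is not shown in this ticket, so this Python sketch is only one common way to get that effect, not a description of how the real program works; `run_guarded`, `NeedsOperator` and the marker path are hypothetical names.

```python
import os


class NeedsOperator(RuntimeError):
    """Raised when a previous run did not finish cleanly."""


def run_guarded(work, marker_path):
    """Run work() unless an earlier run left its marker file behind."""
    if os.path.exists(marker_path):
        raise NeedsOperator(
            "previous run did not finish; investigate, then remove %s"
            % marker_path
        )
    # Write the marker before doing any work, while there is still space.
    with open(marker_path, "w") as fh:
        fh.write("run in progress\n")
    # Any exception here (for example a full disk) leaves the marker in place.
    result = work()
    # Only a clean finish removes the marker.
    os.remove(marker_path)
    return result
```

After a failed run, every later call raises `NeedsOperator` until someone deletes the marker by hand, which forces the manual investigation step.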

comment:5 Changed 7 years ago by runa

Resolution: fixed
Status: new → closed

Done. The July tarball will be synced to archive the next time the cron job runs, and I also updated the webalizer page with the last three logs.

comment:6 in reply to:  5 Changed 7 years ago by karsten

Resolution: fixed
Status: closed → reopened

Replying to runa:

Done. The July tarball will be synced to archive the next time the cron job runs, and I also updated the webalizer page with the last three logs.

Found the new tarballs, but it seems they ended up in the wrong place: https://archive.torproject.org/torproject-weblogs/

comment:7 Changed 7 years ago by runa

That's the correct place. Where did you expect to find them?

comment:9 Changed 7 years ago by runa

Oh, I had to refresh my browser to see the two files listed there. weasel, can you please move the two files into the subdirectory?

comment:10 Changed 7 years ago by runa

Resolution: fixed
Status: reopened → closed