Opened 6 years ago

Closed 6 years ago

#9103 closed enhancement (fixed)

Allow ignoring certain consensus-health warnings

Reported by: karsten
Owned by: atagar
Priority: Medium
Component: Core Tor/DocTor
Cc: atagar

Description

The consensus-health script produces some rather persistent warnings. For example, here's the most recent list of warnings:

WARNING: The following directory authorities are not reporting bandwidth scanner results: turtles
NOTICE: The certificates of the following directory authorities expire within the next two months: Faravahar 2013-08-09 03:46:54, maatuska 2013-08-06 07:58:18
NOTICE: The following directory authorities recommend other client versions than the consensus: moria1 -0.2.4.14-alpha
NOTICE: The following directory authorities recommend other server versions than the consensus: moria1 -0.2.4.14-alpha

People who watch the consensus-health list are well aware of these warnings and know that two certs expire in 1.5 months and that Roger has left Internet land for at least this week. That leaves just the first issue that somebody should get Mike to resolve.

The current approach to reduce noise is as follows: if there are no new warnings or if a warning persists for a given number of hours, don't send an email.

There should be an option to ignore certain warnings until a given timestamp. For example, I'd want to ignore expiring certs until mid-July and moria1 recommending different versions until next Monday. I'd like to edit a text file on yatei that contains the warning text and a timestamp until when to ignore this warning.

This list of ignored warnings should also be added to the consensus-health.html page, so that everyone can look up which warnings are currently ignored. The text file could also contain a comment saying why each warning is ignored, so that readers can decide whether ignoring it is a good idea or not.
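
A minimal sketch of how such an ignore list could work, written in Python. Everything here is an assumption for illustration: the pipe-separated file format, the names `IgnoreRule`, `parse_ignore_rules` and `split_warnings`, the substring matching, and the example dates (the "next Monday" date in particular is hypothetical). It is not how either DocTor implementation behaves.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class IgnoreRule:
    until: datetime  # rule applies while now < until (treated as UTC)
    pattern: str     # substring that must appear in the warning text
    reason: str      # free-form comment explaining why it is ignored


# Hypothetical file content: "until | warning text | reason", one rule per line.
EXAMPLE_RULES = """
# until (UTC) | warning text to match | why it is ignored
2013-07-15 00:00:00 | certificates of the following directory authorities expire | operators know; revisit mid-July
2013-06-24 00:00:00 | moria1 -0.2.4.14-alpha | Roger is away this week
"""

# Warning lines copied from the ticket description.
EXAMPLE_WARNINGS = [
    "WARNING: The following directory authorities are not reporting bandwidth scanner results: turtles",
    "NOTICE: The certificates of the following directory authorities expire within the next two months: Faravahar 2013-08-09 03:46:54, maatuska 2013-08-06 07:58:18",
    "NOTICE: The following directory authorities recommend other client versions than the consensus: moria1 -0.2.4.14-alpha",
    "NOTICE: The following directory authorities recommend other server versions than the consensus: moria1 -0.2.4.14-alpha",
]


def parse_ignore_rules(text: str) -> List[IgnoreRule]:
    """Parse the assumed 'until | pattern | reason' format, skipping comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = [field.strip() for field in line.split("|", 2)]
        if len(parts) < 2:
            raise ValueError("expected 'until | pattern | reason': %r" % line)
        until = datetime.strptime(parts[0], "%Y-%m-%d %H:%M:%S")
        reason = parts[2] if len(parts) == 3 else ""
        rules.append(IgnoreRule(until, parts[1], reason))
    return rules


def matching_rule(warning: str, rules: List[IgnoreRule], now: datetime) -> Optional[IgnoreRule]:
    """Return the first unexpired rule whose pattern occurs in the warning."""
    for rule in rules:
        if now < rule.until and rule.pattern in warning:
            return rule
    return None


def split_warnings(
    warnings: List[str], rules: List[IgnoreRule], now: datetime
) -> Tuple[List[str], List[Tuple[str, IgnoreRule]]]:
    """Separate warnings to send from ignored ones (kept with their rule)."""
    to_send, ignored = [], []
    for warning in warnings:
        rule = matching_rule(warning, rules, now)
        if rule is None:
            to_send.append(warning)
        else:
            ignored.append((warning, rule))
    return to_send, ignored
```

The `ignored` list keeps each warning together with its rule, so a status page could show the reason and expiry time next to every suppressed warning. Once a rule's timestamp passes, the warning is reported again without anyone having to edit the file.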

Hopefully, this reduces noise even more and makes authority operators pay more attention to the consensus-health list again.

Change History (5)

comment:1 Changed 6 years ago by karsten

Cc: atagar added

atagar, is this something you want to do in your Python DocTor? If not, I'd close this ticket, because I'm not working on the Java DocTor anymore. Thanks!

comment:2 Changed 6 years ago by atagar

I'm not especially inclined on this one. In essence the python DocTor *did* do this and I considered it to be a pretty bad bug. The notification emails should enumerate all the present issues authority operators need to take care of, not require folks to comb through old email to figure out the present state.

That said, I also think all DocTor notifications should be for real, actionable issues. If a three or two month cert expiration warning is useless noise then we should drop them.

comment:3 Changed 6 years ago by karsten

Component: changed from Metrics Utilities to DocTor
Owner: set to atagar

comment:4 in reply to:  2 Changed 6 years ago by karsten

Replying to atagar:

> I'm not especially inclined on this one. In essence the python DocTor *did* do this and I considered it to be a pretty bad bug. The notification emails should enumerate all the present issues authority operators need to take care of, not require folks to comb through old email to figure out the present state.

Okay.

> That said, I also think all DocTor notifications should be for real, actionable issues. If a three or two month cert expiration warning is useless noise then we should drop them.

Yes, dropping that particular warning may be a solution, too, and maybe not even a bad one.

Feel free to do whatever makes most sense here. I mostly wanted to avoid closing this ticket if it still has any value for you.

comment:5 Changed 6 years ago by atagar

Resolution: set to fixed
Status: changed from new to closed

Lowered the durations we check for when warning about certs (change). Feel free to reopen if there's anything we should do for this ticket.