Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#13008 closed enhancement (fixed)

Create a Nagios check to ensure that Onionoo is updating correctly

Reported by: karsten
Owned by:
Priority: Medium
Milestone:
Component: Internal Services/Tor Sysadmin Team
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Can we have a Nagios check that fetches https://onionoo.torproject.org/summary?limit=0, say, once per hour, and makes sure that the two contained timestamps are not older than, say, three hours?

And can Nagios send me mail when that warning is triggered?

I can maybe help write that check if I know what language to use and what output to provide.
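
The freshness test requested above could be sketched roughly as follows. This is not the attached check-onionoo-recent script. It is a minimal illustration that assumes the summary document carries its two timestamps in fields named relays_published and bridges_published, formatted as UTC "YYYY-MM-DD HH:MM:SS". The function names check_freshness and fetch_summary are hypothetical, and the code is written for Python 3.

```python
import json
from datetime import datetime, timedelta
from urllib.request import urlopen

# Standard Nagios return codes
OK, WARNING, CRITICAL, UNKNOWN = range(4)

# Assumption: field names and timestamp format of the summary document.
TIMESTAMP_FIELDS = ("relays_published", "bridges_published")
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def check_freshness(summary, now, max_age=timedelta(hours=3)):
    """Return (status, message) for an already parsed summary document.

    `now` is a naive datetime in UTC, matching the assumed timestamp format.
    """
    stale = []
    for field in TIMESTAMP_FIELDS:
        try:
            published = datetime.strptime(summary[field], TIMESTAMP_FORMAT)
        except (KeyError, TypeError, ValueError):
            return UNKNOWN, "missing or unparsable field %s" % field
        if now - published > max_age:
            stale.append("%s is %s old" % (field, now - published))
    if stale:
        return WARNING, ", ".join(stale)
    return OK, "both timestamps are newer than %s" % max_age


def fetch_summary(url="https://onionoo.torproject.org/summary?limit=0"):
    """Download and parse the summary document (performs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A caller would pass the result of fetch_summary() together with the current UTC time to check_freshness(). Keeping the download separate from the comparison means the comparison can be exercised with fixed inputs and no network access.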

Attachments (2)

check-onionoo-recent (5.2 KB) - added by karsten 6 years ago.
Nagios plugin to check that the Onionoo service is running and returns recent data
0001-Tweak-Onionoo-check-script-based-on-atagar-s-input.patch (2.5 KB) - added by karsten 6 years ago.
Patch with atagar's suggestions.

Change History (10)

comment:1 Changed 6 years ago by weasel

Status: new → needs_revision

Please provide a check.

Language doesn't matter. Script languages preferred (shell, python, perl).
Exit codes: 0 for OK, 1 for Warning, 2 for Critical, 3 for Unknown.
1st line output to stdout is the status summary.
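
As an illustration of that output contract, here is a minimal sketch of a wrapper that turns a check function into an exit code plus a one-line summary. The name run_check is hypothetical, the "ONIONOO" prefix is borrowed from the script discussed later in this ticket, and mapping unexpected exceptions to Unknown is an assumption made for this sketch, not a stated requirement.

```python
# Exit codes as listed above: 0 OK, 1 Warning, 2 Critical, 3 Unknown
OK, WARNING, CRITICAL, UNKNOWN = range(4)
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def run_check(check, service="ONIONOO"):
    """Run check() and return (exit_code, first_output_line).

    check() is expected to return a (status, message) pair. Any exception
    or unrecognized status is reported as UNKNOWN (an assumption of this
    sketch).
    """
    try:
        status, message = check()
    except Exception as error:
        status, message = UNKNOWN, "check failed: %s" % error
    if status not in LABELS:
        status = UNKNOWN
    return status, "%s %s: %s" % (service, LABELS[status], message)
```

A plugin would print the returned line to stdout and pass the returned code to sys.exit().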

See https://anonscm.debian.org/cgit/mirror/dsa-nagios.git/tree/dsa-nagios-checks/checks for inspiration.

Or maybe you can write a status file to disk that we can check using dsa-check-statusfile.

comment:2 Changed 6 years ago by karsten

Status: needs_revision → needs_review

I'm attaching a Python script that should match the stated requirements. I'm setting the status to needs_review, because I don't trust my Python skills enough for this script to run on Tor's Nagios installation yet. I hope a nice Python person comes along and does a quick review. Of course, if you have any feedback, please do tell!

Changed 6 years ago by karsten

Attachment: check-onionoo-recent added

Nagios plugin to check that the Onionoo service is running and returns recent data

comment:3 Changed 6 years ago by weasel

Resolution: fixed
Status: needs_review → closed

deployed. No way to check if it actually works.

comment:4 Changed 6 years ago by atagar

Hi Karsten, your script looks real nice!

weasel: karsten emailed a handful of us a few minutes ago asking for feedback


# Standard Nagios return codes
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3

This could also be...

# Standard Nagios return codes

OK, WARNING, CRITICAL, UNKNOWN = range(4)

def end(status, message):
  """Exit the plugin with first arg as the return code and second arg as
     the message to output."""

  if status == OK:
    print "ONIONOO OK: %s" % message
    sys.exit(OK)
  elif status == WARNING:
    print "ONIONOO WARNING: %s" % message
    sys.exit(WARNING)
  elif status == CRITICAL:
    print "ONIONOO CRITICAL: %s" % message
    sys.exit(CRITICAL)
  else:
    print "ONIONOO UNKNOWN: %s" % message
    sys.exit(UNKNOWN)

Minor nitpick but the sys.exit() calls are redundant...

def end(status, message):
  """Exit the plugin with first arg as the return code and second arg as
     the message to output."""

  if status == OK:
    print "ONIONOO OK: %s" % message
  elif status == WARNING:
    print "ONIONOO WARNING: %s" % message
  elif status == CRITICAL:
    print "ONIONOO CRITICAL: %s" % message
  else:
    print "ONIONOO UNKNOWN: %s" % message
    status = UNKNOWN

  sys.exit(status)

def main():
  """Call function to check whether Onionoo service is working."""

  result, message = test_onionoo()
  end(result, message)

if __name__ == "__main__":
  try:
    main()
  except KeyboardInterrupt:
    end(CRITICAL, "Caught Control-C...")

This and end() are brief enough that personally I'd just combine it all.

if __name__ == "__main__":
  result, message = None, None

  try:
    result, message = test_onionoo()
  except KeyboardInterrupt:
    result, message = CRITICAL, "Caught Control-C..."
  finally:
    # Print the status summary line, then exit with the matching code.
    if result == OK:
      print "ONIONOO OK: %s" % message
    elif result == WARNING:
      print "ONIONOO WARNING: %s" % message
    elif result == CRITICAL:
      print "ONIONOO CRITICAL: %s" % message
    else:
      print "ONIONOO UNKNOWN: %s" % message
      result = UNKNOWN

    sys.exit(result)

comment:5 Changed 6 years ago by weasel

We can accept patches against admin/tor-nagios. The script is in tor-../checks/

comment:6 Changed 6 years ago by karsten

Resolution: fixed
Status: closed → reopened

Thanks for the code review, atagar! I'm attaching a patch against admin/tor-nagios. weasel, can you apply that patch, please? Thank you!

Changed 6 years ago by karsten

Attachment: 0001-Tweak-Onionoo-check-script-based-on-atagar-s-input.patch added

Patch with atagar's suggestions.

comment:7 Changed 6 years ago by weasel

Resolution: fixed
Status: reopened → closed

comment:8 Changed 6 years ago by karsten

Thanks! (And sorry for undoing your fixes.)
