Find out why bwscanners break after a few days/weeks of operation
All four bwscanners on gabelmoo broke at different times in the past two weeks. When running cron.sh manually, I get:
WARN[Sun May 15 18:12:49 2011]:Bandwidth scanner scanner.1 stale. Possible dead bwauthority.py. Timestamp: Mon May 2 23:10:20 2011
WARN[Sun May 15 18:12:49 2011]:Bandwidth scanner scanner.3 stale. Possible dead bwauthority.py. Timestamp: Tue May 10 21:26:10 2011
WARN[Sun May 15 18:12:49 2011]:Bandwidth scanner scanner.2 stale. Possible dead bwauthority.py. Timestamp: Wed May 4 18:13:06 2011
WARN[Sun May 15 18:12:49 2011]:Bandwidth scanner scanner.4 stale. Possible dead bwauthority.py. Timestamp: Wed May 11 02:38:38 2011
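To compare how long each scanner has been dead, here is a minimal Python sketch that parses warnings like the ones above and computes the gap between the time of the warning and the scanner's last timestamp. The regular expression and the helper name `stale_ages` are my own assumptions based only on the four lines quoted here, not anything taken from the bwscanner code.

```python
import re
from datetime import datetime

# Pattern inferred from the WARN lines quoted in this ticket (an assumption,
# not taken from the bwscanner source).
WARN_RE = re.compile(
    r"WARN\[(?P<now>[^\]]+)\]:Bandwidth scanner (?P<name>\S+) stale\."
    r".*Timestamp: (?P<last>.+)$"
)

# Matches timestamps such as "Sun May 15 18:12:49 2011" (C/English locale).
TIME_FMT = "%a %b %d %H:%M:%S %Y"


def stale_ages(lines):
    """Map scanner name -> timedelta between the warning and its last timestamp."""
    ages = {}
    for line in lines:
        match = WARN_RE.search(line.strip())
        if match is None:
            continue  # not a "stale" warning, skip it
        now = datetime.strptime(match.group("now"), TIME_FMT)
        last = datetime.strptime(match.group("last"), TIME_FMT)
        ages[match.group("name")] = now - last
    return ages
```

Fed the four warnings above, this shows scanner.1 has been stale for about 12 days and scanner.4 for about 4 days, so the scanners did not all die at the same moment.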
Other than that, the logs look pretty normal to me, but that may be because Tor was only logging at notice level.
After staring at logs for an hour or two, I changed Tor logging to debug and redirected all of bwscanner's cron output to files. Depending on available disk space, I might reduce Tor logging to info soon.
Hopefully the logs will tell us something about the problem here. The last time I ran into this problem, Mike suggested just restarting the scanners, which I did. And Andrew says he restarts his bwscanners every three days to keep them from breaking. I'd like to find out why they are breaking and get that fixed.