#28652 closed defect (implemented)

When sbws stops making progress, log a warning

Reported by: teor Owned by: juga
Priority: Medium Milestone: sbws: 1.1.x-final
Component: Core Tor/sbws Version:
Severity: Normal Keywords: no-changes-version
Cc: juga, teor Actual Points:
Parent ID: #28547 Points: 1
Reviewer: asn Sponsor:


In #28639, sbws stopped making progress, but didn't log anything.

sbws should detect when its progress rate falls beneath some minimum (for example, all relays measured min_result times in data_period days), and log a warning.

Child Tickets

Change History (11)

comment:1 Changed 19 months ago by juga

#28932 does not do what described here, since it logs the stack after 3 minutes not finishing with the results to start a new loop, but it's similar.
I think that sbws stalling (#28663) happens after the best_priority loop (potentially solved in #28897), maybe while checking the destination again (potentially solved in #28868), so this wouldn't be needed.
It still might be useful to implement this ticket.

comment:2 Changed 17 months ago by juga

Keywords: no-changes-version added

Add keyword to help planify releases/milestones.
Tickets that doesn't imply a change of version are tickets which do not affect the code (docs, tests) and some time of refactors.

comment:3 Changed 17 months ago by juga

Points: 1

Add points

comment:4 Changed 17 months ago by juga

Owner: set to juga
Status: newassigned

comment:5 Changed 17 months ago by juga

Status: assignedneeds_revision


(In revision until other tickets are merged).

comment:6 Changed 17 months ago by juga

Status: needs_revisionneeds_review

I thought it would be better to detect there is no progress measuring unique relays in the scanner, since there we can keep track there of all the different relays seen and measured.

Also because if we try to detect it in the bandwidth file, it will continue making progress until the 5th day, but the number of relays reported will be approximately the same after 5 days. We also don't know in the bandwidth file whether the relays reported are different relays from the previous bandwidth files.

Note that the scanner is already either logging that some relays failed to be measured or stop, no matter whether the relays were unique or not.


comment:7 Changed 17 months ago by asn

Reviewer: asn

comment:8 Changed 17 months ago by asn

Status: needs_reviewneeds_revision

Hey juga, did another review and left some comments on the PR. Also travis is failing, not sure what's up with that.

comment:9 Changed 17 months ago by juga

Status: needs_revisionneeds_review

should be fixed now

comment:10 in reply to:  9 Changed 17 months ago by asn

Status: needs_reviewmerge_ready

Replying to juga:

should be fixed now

Hm, better. And the test looks good.

I think there might be more stuff you can hide into the heartbeat module so that they are not exposed in the main loop. The goal is to simplify the main loop and hide the heartbeat details into the heartbeat module.

I did an implementation here: https://github.com/torproject/sbws/pull/354/commits/944bad5561dc92d04569ed06eceadd139315e0d4

However, I was not able to test it because my user file does not work anymore (something about country codes?), and I'm not sure how to run the unittest either.

If you like the code feel free to use it, otherwise, feel free to merge the current state of your branch. It's all good.

comment:11 Changed 17 months ago by juga

Resolution: implemented
Status: merge_readyclosed

i like the code, merged.

Note: See TracTickets for help on using tickets.