Opened 12 months ago

Closed 3 days ago

#30719 closed defect (fixed)

Work out why 90% of sbws measurements fail

Reported by: teor
Owned by: juga
Priority: High
Milestone: sbws: 1.1.x-final
Component: Core Tor/sbws
Version: sbws: 1.1.0
Severity: Major
Keywords: sbws-roadmap
Cc: juga
Actual Points:
Parent ID: #33121
Points: 2
Reviewer:
Sponsor:


longclaw's most recent vote shows that 90% of measurement attempts fail: recent_measurement_failure_count is 299K, and recent_measurement_attempt_count is 327K.

We should work out why sbws is doing so many failing measurements.

bandwidth-file-headers timestamp=1559442886 version=1.4.0 destinations_countries=ZZ earliest_bandwidth=2019-05-28T02:35:11 file_created=2019-06-02T02:35:03 generator_started=2019-05-19T14:04:34 latest_bandwidth=2019-06-02T02:34:46 minimum_number_eligible_relays=3934 minimum_percent_eligible_relays=60 number_consensus_relays=6556 number_eligible_relays=6287 percent_eligible_relays=96 recent_consensus_count=120 recent_measurement_attempt_count=327183 recent_measurement_failure_count=299072 recent_measurements_excluded_error_count=876 recent_measurements_excluded_few_count=678 recent_measurements_excluded_near_count=237 recent_measurements_excluded_old_count=0 recent_priority_list_count=991 recent_priority_relay_count=327183 scanner_country=US software=sbws software_version=1.1.0 time_to_report_half_network=225229
bandwidth-file-digest sha256=UkxK9KS5KZ5hKDiLI3bqGoMvpMW9gBjKGoYbD2bdZVE
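
As a quick check of the failure percentage, the two counters from the header line above can be parsed and divided. This is only a sketch: `parse_headers` is a hypothetical helper written for this example, not part of sbws, and it assumes the headers arrive as space-separated key=value pairs as quoted here.

```python
def parse_headers(line: str) -> dict[str, str]:
    """Split a space-separated run of key=value pairs into a dict.

    Hypothetical helper for illustration; tokens without '=' are skipped.
    """
    pairs = (token.split("=", 1) for token in line.split() if "=" in token)
    return {key: value for key, value in pairs}


# Only the two counters needed here, copied from the header line quoted above.
header_line = (
    "recent_measurement_attempt_count=327183 "
    "recent_measurement_failure_count=299072"
)

headers = parse_headers(header_line)
attempts = int(headers["recent_measurement_attempt_count"])
failures = int(headers["recent_measurement_failure_count"])

# Fraction of recorded attempts that were counted as failures.
failure_rate = failures / attempts
print(f"{failures} of {attempts} attempts counted as failed ({failure_rate:.1%})")
```

With these values the ratio comes out at roughly 91%, in line with the "90%" in the summary. It says nothing about whether the counters themselves are accurate.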

Child Tickets

#30905 | closed | juga | Maybe monitoring values in the state file should be reset when sbws is restarted | Core Tor/sbws
#33570 | closed | juga | Correct the relays to keep after retrieving new consensuses | Core Tor/sbws

Change History (10)

comment:1 Changed 12 months ago by teor

Milestone: sbws: unspecified → sbws: 1.1.x-final
Priority: Medium → Very High
Severity: Normal → Critical
Version: sbws: unspecified → sbws: 1.1.0

comment:2 Changed 12 months ago by teor

We're not seeing very many network errors (4%), so this bug mainly wastes CPU time.

comment:3 Changed 12 months ago by teor

Priority: Very High → High
Severity: Critical → Major

Not a critical bug any more.

comment:4 Changed 12 months ago by teor

We need to re-do these checks after #30905 is fixed, because that bug makes the statistics inaccurate.

comment:5 Changed 11 months ago by gaba

Keywords: sbws-roadmap-october added
Points: 2

comment:6 Changed 4 months ago by gaba

Keywords: sbws-roadmap added

Changing keyword of roadmapped open sbws tickets to a general sbws-roadmap one.

comment:7 Changed 4 months ago by gaba

Keywords: sbws-roadmap-october removed

comment:8 Changed 4 months ago by gaba

Parent ID: #29710 → #33121

The goal is to deploy sbws on all bandwidth authorities. We need to fix critical bugs to do this.

comment:9 Changed 6 weeks ago by juga

Owner: set to juga
Status: new → assigned

comment:10 Changed 3 days ago by juga

Resolution: fixed
Status: assigned → closed

Since longclaw changed to sbws 1.1.0+84.g3033421, and as commented in, the number of failures has been around 1000.
So the high number of failures was not due to actual measurement failures, but to incorrect counting of the errors.
I think we can close this ticket.
