Opened 7 months ago

Last modified 5 days ago

#27346 new defect

Improve sbws bandwidth accuracy

Reported by: teor Owned by:
Priority: Medium Milestone: sbws: unspecified
Component: Core Tor/sbws Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Better designs SHOULD:

  • use at least 4 measurements that are at least 6 hours apart, because:
    • there is a daily cycle
    • each day contains 2 similar points in the cycle (it is an up and down cycle)
    • if all 4 measurements happen within a few hours, they will still be biased
  • use at least 3 days of observed bandwidths, because:
    • a single download at the changeover point can affect 2 days
  • weight bandwidths based on the time since the last bandwidth, because:
    • if we only record bandwidths when they change, bandwidths that are updated soon after the last bandwidth are weighted too high
    • we can either:
      • record the bandwidths every hour, even if they haven't changed
      • weight each bandwidth by the time since the last bandwidth
  • use a decaying average for measured and observed bandwidths, because:
    • recent bandwidths are closer to the relay's current capacity
    • and we want accurate results
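
The rules above can be sketched in code. This is only a minimal illustration, not sbws code: the function names, the `start` argument, and the one-day half-life are assumptions made for the example. It checks the "4 measurements at least 6 hours apart" rule, weights each bandwidth by the time since the previous one, and computes a decaying average.

```python
from datetime import datetime, timedelta


def has_spaced_measurements(timestamps, min_count=4, min_gap=timedelta(hours=6)):
    """Return True if at least min_count timestamps are pairwise >= min_gap apart.

    Greedy scan over the sorted timestamps: keep a timestamp only if it is
    at least min_gap after the previously kept one.
    """
    kept = 0
    last = None
    for ts in sorted(timestamps):
        if last is None or ts - last >= min_gap:
            kept += 1
            last = ts
    return kept >= min_count


def time_weighted_mean(samples, start):
    """Average bandwidths, weighting each by the time since the previous one.

    samples: list of (timestamp, bandwidth) pairs.
    start:   hypothetical reference time that gives the first sample a weight.
    """
    total = 0.0
    weight_sum = 0.0
    prev = start
    for ts, bw in sorted(samples, key=lambda s: s[0]):
        weight = (ts - prev).total_seconds()
        total += weight * bw
        weight_sum += weight
        prev = ts
    if weight_sum <= 0:
        raise ValueError("samples must span a positive amount of time")
    return total / weight_sum


def decaying_average(samples, now, half_life=timedelta(days=1)):
    """Exponentially decaying average: a sample's weight halves every half_life.

    The one-day half-life is an assumed value, not a tuned one.
    """
    num = 0.0
    den = 0.0
    for ts, bw in samples:
        age = (now - ts).total_seconds()
        weight = 0.5 ** (age / half_life.total_seconds())
        num += weight * bw
        den += weight
    if den == 0:
        raise ValueError("no samples")
    return num / den
```

For example, two bandwidths of 100 and 200 recorded 1 hour and 4 hours after `start` get weights of 1 hour and 3 hours, so the second value, which followed a longer gap, counts three times as much as the first.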

Child Tickets

Ticket  Status  Owner  Summary                                                             Component
#27786  new            sbws: use at least 4 measurements that are at least 6 hours apart   Core Tor/sbws
#27787  new            sbws: use at least 3 days of observed bandwidths                    Core Tor/sbws
#27788  new            sbws: weight bandwidths based on the time since the last bandwidth  Core Tor/sbws
#27789  new            sbws: use a decaying average for measured and observed bandwidths   Core Tor/sbws
#27790  new            sbws: design and construct bias curves                              Core Tor/sbws
#27791  new            sbws: compare relays against other similar relays                   Core Tor/sbws

Change History (11)

comment:1 in reply to:  description ; Changed 7 months ago by pastly

Replying to teor:

Better designs SHOULD:

  • use at least 4 measurements that are at least 6 hours apart, because:

Depending on how sbws is configured, it might already accidentally do this. By default it keeps results for 5 days and it roughly (again: depending on configuration) manages to do the entire network in a day.

  • use at least 3 days of observed bandwidths, because:

See above.

  • weight bandwidths based on the time since the last bandwidth, because:

...

  • use a decaying average for measured and observed bandwidths, because:

Both sound like reasonable things to do after the MVP.

comment:2 in reply to:  1 Changed 7 months ago by teor

Replying to pastly:

Replying to teor:

Better designs SHOULD:

  • use at least 4 measurements that are at least 6 hours apart, because:

Depending on how sbws is configured, it might already accidentally do this. By default it keeps results for 5 days and it roughly (again: depending on configuration) manages to do the entire network in a day.

  • use at least 3 days of observed bandwidths, because:

See above.

Ok, that's good enough for a first release, we can tune it later.

comment:3 Changed 7 months ago by teor

Milestone: sbws 1.0 (MVP nice) → sbws 1.1

comment:4 Changed 6 months ago by juga

Parent ID: #27338 → #27107

Change parent to #27107, since #27338 is implemented

comment:5 Changed 4 months ago by teor

Parent ID: #27107

This ticket is not required for the transition.

comment:6 Changed 4 months ago by teor

Milestone: sbws 1.1 → sbws 1.2

Milestone renamed

comment:7 Changed 4 months ago by teor

Milestone: sbws 1.2 → sbws: 1.2.x

Milestone renamed

comment:8 Changed 4 months ago by teor

Milestone: sbws: 1.2.x → sbws: 1.2.x-final

Milestone renamed

comment:9 Changed 4 months ago by teor

Milestone: sbws: 1.2.x-final → sbws: unspecified

Milestone renamed

comment:10 Changed 5 days ago by juga

For the record, I believe that:

  • use at least 4 measurements that are at least 6 hours apart
  • use at least 3 days of observed bandwidth

are hard to achieve with the current design, at least for all relays, since it seems to measure 6500 relays in around 34h, and we still don't know whether those 6500 relays are unique. We will know after #28547 is implemented, or we would have known a while ago if #29783 had been implemented.
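
A rough back-of-the-envelope check of this point, combining the roughly 34h measurement pass with the default 5-day result retention mentioned in comment:1. The sketch assumes an idealized schedule in which every relay is measured exactly once per pass at a fixed period, which may not match how sbws actually schedules measurements. The function name is hypothetical.

```python
import math


def measurements_in_window(window_hours, pass_hours):
    """Return (min, max) measurements per relay inside a retention window.

    Assumes exactly one measurement per relay per pass, at a fixed period.
    Depending on where the window starts relative to the relay's schedule,
    the count is either the floor or the ceiling of window / period.
    """
    ratio = window_hours / pass_hours
    return math.floor(ratio), math.ceil(ratio)


# 5 days of retained results, about 34 hours per pass over the network.
lowest, highest = measurements_in_window(5 * 24, 34)
print(lowest, highest)
```

Under these assumptions 120h / 34h is about 3.5, so some relays would have only 3 retained measurements, which falls short of "at least 4" for every relay.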

I also believe that any optimization of the way measurements are done should only come after we implement another scaling method that is not torflow, since measurements change the consensus bandwidth very little.

I think #28582 should be done first.

comment:11 Changed 5 days ago by juga

Cc: pastly juga@… teor juga removed

Remove unneeded CC and noise
