Opened 4 months ago

Last modified 3 months ago

#25687 new defect

over-report of observed / self-measure bandwidth on fast hardware -- important to torflow / peerflow

Reported by: starlight
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.6.10
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Have observed that on fast hardware the maximum bandwidth estimate reported by rep_hist_bandwidth_assess() and published via ri->bandwidthcapacity is frequently overstated; I have seen it go as high as 160% of true physical bandwidth and stay there for days.
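For reference, a minimal sketch of the general shape of such a self-measure (a maximum average over any 10-second window of per-second byte counts). This is not Tor's actual rep_hist code, just an illustration of where bytes credited to the wrong moment could push the windowed maximum above the physical line rate:

{{{
#include <stdint.h>
#include <stddef.h>

#define WINDOW_SECS 10

/* Highest average rate (bytes/sec) over any WINDOW_SECS consecutive
 * seconds of per-second byte counts.  If writes are credited to the
 * wrong second (e.g. counted when queued rather than when sent on the
 * wire), this maximum can exceed the physical line rate. */
static uint64_t
max_windowed_rate(const uint64_t *bytes_per_sec, size_t n)
{
  uint64_t window_sum = 0, best = 0;
  size_t i;
  for (i = 0; i < n; i++) {
    window_sum += bytes_per_sec[i];
    if (i >= WINDOW_SECS)
      window_sum -= bytes_per_sec[i - WINDOW_SECS];
    if (i + 1 >= WINDOW_SECS) {
      uint64_t avg = window_sum / WINDOW_SECS;
      if (avg > best)
        best = avg;
    }
  }
  return best;
}
}}}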

Connected today in my mind that this may be one of the larger causes of torflow misbehavior. No control system will function correctly with bad input data, and without question +60% qualifies as bad--GIGO.

Problem appears to have worsened with the arrival of KIST scheduler.

Have no idea why it occurs, but have seen it for years, with overrating appearing relative both to absolute physical link speeds and to Linux tc-police rate limits.

Even peerflow when it arrives will require reliable measurement data to function properly.

Child Tickets

Change History (8)

comment:1 in reply to: description Changed 4 months ago by arma

Replying to starlight:

Have observed that on fast hardware the maximum bandwidth estimate reported by rep_hist_bandwidth_assess() and published via ri->bandwidthcapacity is frequently overstated; I have seen it go as high as 160% of true physical bandwidth and stay there for days.

Well, from Tor's perspective I think it really is seeing you get that level of throughput.

That is, it was able to see a sustained 10 second period where it got that average rate.

Part of the explanation might be that the kernel is saying "yep, it's sent" when actually it's just queued inside the kernel. For those cases, in theory KIST should be doing *better*, because it has a chance of saying "ok, actually I checked with the kernel and there's some stuff queued so I'm not going to try to write more quite yet". That said, the way the kernel maintains good throughput is by having a bunch of waiting stuff queued, so it can always be sending the next bit as quickly as possible rather than having to wait for user space to decide to ask it to send some more.
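A hedged sketch of the "check with the kernel" idea, Linux-specific and not the actual KIST code: SIOCOUTQ reports how much data is still held in a TCP socket's send queue, so a scheduler (or a bandwidth self-measure) could treat those bytes as not-yet-delivered rather than already sent.

{{{
#include <sys/ioctl.h>
#include <linux/sockios.h>   /* SIOCOUTQ */

/* Bytes still sitting in the kernel send queue for this TCP socket,
 * or -1 on error.  Bytes reported here have been accepted by the
 * kernel but not necessarily put on the wire yet. */
static int
kernel_sendq_bytes(int fd)
{
  int queued = 0;
  if (ioctl(fd, SIOCOUTQ, &queued) < 0)
    return -1;
  return queued;
}
}}}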

comment:2 Changed 4 months ago by teor

Milestone: Tor: unspecified
Version: Tor: 0.3.3.3-alpha

comment:3 Changed 3 months ago by starlight

Version: Tor: 0.3.3.3-alpha

In my view the problem constitutes a critical measurement error. The most important consumer of this value is torflow, and overstating true available bandwidth distorts consensus allocations. There is no chance the overstatements are correct, especially in the traffic-control scenario where Linux caps bandwidth with fine-grained, reliable precision. Some ISPs permit traffic to burst briefly before enforcing bandwidth restrictions, but in the cases I've seen the bandwidth self-measure is well in excess of the line burst rate, and in any case reporting burst-rate maximum capacity to torflow is counterproductive. The impact of the issue may be acute due to torflow's control-signal reliance on measurements of marginal unused bandwidth rather than actual capacity.

To venture a guess, likely culprits include kernel queuing (as suggested), chunky event-loop processing delays, coarse time precision, or misalignment of time determination vs. data accounting.
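As a worked example of the time-precision guess (hypothetical numbers): if 1.5 seconds of traffic at a true 100 Mbit/s gets credited to a single 1-second accounting bucket, that bucket reads as 150 Mbit/s.

{{{
#include <stdio.h>

int main(void)
{
  const double true_rate_mbit = 100.0;  /* hypothetical real link rate */
  const double elapsed_secs   = 1.5;    /* wall-clock time the bytes actually took */
  const double credited_secs  = 1.0;    /* time the accounting attributed them to */

  double mbits_moved   = true_rate_mbit * elapsed_secs;  /* 150 Mbit */
  double measured_rate = mbits_moved / credited_secs;    /* 150 Mbit/s */

  printf("measured %.0f Mbit/s vs. true %.0f Mbit/s (+%.0f%%)\n",
         measured_rate, true_rate_mbit,
         100.0 * (measured_rate - true_rate_mbit) / true_rate_mbit);
  return 0;
}
}}}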

Setting the milestone was unintended; I was thinking of the latest version that reproduces the problem.

comment:4 Changed 3 months ago by starlight

Summary: extreme over-report of observed / self-measure bandwidth on fast hardware -- critical to torflow / peerflow → over-report of observed / self-measure bandwidth on fast hardware -- important to torflow / peerflow

Realized the +60% number I threw down is flawed and went back over the data and configurations carefully. Nonetheless the problem is, or was, significant.

Found an example where, with a tc-police rate limit, the reported self-measure was 19% in excess of the limit. Daemon version was 0.2.6.10. At the time of the observation the relay was overrated in the consensus by 50% and flooded with ingress data at about 150% of the cap. More typically I have seen continuous overrating at +5% with tc and a reasonable consensus weight.

I retract the assertion that KIST increased the problem--admit that I do not know.

Also stand by the recommendation that self-measure should not include peaks resulting from adaptive ISP rate limits and am sure that I've seen this.

comment:5 Changed 3 months ago by starlight

Version: Tor: 0.3.3.3-alpha → Tor: 0.2.6.10

Have to walk back the reported version for now; will run an experiment with 0.3.3.

comment:6 Changed 3 months ago by teor

The current variance between bandwidth authorities is 50%.
Perhaps this issue is a contributing factor.

We're designing a new bandwidth authority codebase that doesn't rely on self-reported bandwidths.
We'll be able to prioritise this issue better once it is deployed.

comment:7 Changed 3 months ago by starlight

thank you for the update; bwscanner, I presume?

I won't spend time quantifying this issue with 0.3.3, at least for now, though I did come up with a plan. If peerflow happens then measurement quality is a concern; could spend some time on the aforementioned analysis.

comment:8 in reply to: 7 Changed 3 months ago by teor

Replying to starlight:

thank you for the update; bwscanner, I presume?

No, simple bandwidth scanner.
bwscanner doesn't generate files that authorities can use yet, so I don't know if it will use self-reported bandwidths for scanning.

I won't spend time quantifying this issue with 0.3.3, at least for now, though I did come up with a plan. If peerflow happens then measurement quality is a concern; could spend some time on the aforementioned analysis.

I think there are some inaccuracies we will just have to live with, and this might be one of them.

Edit: alternatively, we can measure inbound data from peers, or measure acknowledged sent data, rather than raw sent data. There are plenty of options.
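A sketch of the "acknowledged sent data" option, assuming a reasonably recent Linux kernel that exposes tcpi_bytes_acked in struct tcp_info. This is an illustration of the idea, not a proposal for the actual implementation:

{{{
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <linux/tcp.h>    /* TCP_INFO, struct tcp_info */

/* Cumulative bytes ACKed by the peer on this socket, or 0 on error.
 * Sampling this once per second and differencing gives a rate based
 * on delivered data rather than data merely handed to the kernel.
 * Assumes tcpi_bytes_acked is present (recent Linux kernels). */
static uint64_t
bytes_acked_so_far(int fd)
{
  struct tcp_info info;
  socklen_t len = sizeof(info);
  memset(&info, 0, sizeof(info));
  if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
    return 0;
  return info.tcpi_bytes_acked;
}
}}}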

Last edited 3 months ago by teor