Version 2 (modified by teor, 8 months ago)

First draft

So You Want to Fix the Tor Network: Episode Three

- or -

Does a Bandwidth Authority Really Matter?

So you've set up a bandwidth authority.

Now what happens to the Tor network?

What do Bandwidth Authorities Measure?

Tor bandwidth scanners measure download speed by downloading files. The scanner's Tor client connects to a remote HTTPS bandwidth server via a two-hop path. This path has one Guard/Middle node and one Exit, both selected from a partition of relays with similar bandwidths. The scanner uses larger files for partitions with larger capacity.
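The path selection described above can be sketched roughly as follows. This is only an illustration: the relay records, the `is_exit` field, the slice size, and both function names are hypothetical, and a real scanner's partitioning rules may differ.

```python
import random

def partition_by_bandwidth(relays, slice_size):
    """Sort relays by bandwidth (highest first) and cut them into slices,
    so that each slice holds relays with similar bandwidths."""
    ordered = sorted(relays, key=lambda r: r["bandwidth"], reverse=True)
    return [ordered[i:i + slice_size] for i in range(0, len(ordered), slice_size)]

def pick_two_hop_path(partition, rng=random):
    """Pick one Guard/Middle (assumed here: any non-exit relay) and one Exit
    from the same partition. Returns None if the partition lacks either kind."""
    entries = [r for r in partition if not r["is_exit"]]
    exits = [r for r in partition if r["is_exit"]]
    if not entries or not exits:
        return None
    return rng.choice(entries), rng.choice(exits)

# Hypothetical relay data, for illustration only.
relays = [
    {"name": "a", "bandwidth": 900, "is_exit": False},
    {"name": "b", "bandwidth": 850, "is_exit": True},
    {"name": "c", "bandwidth": 120, "is_exit": True},
    {"name": "d", "bandwidth": 100, "is_exit": False},
]

slices = partition_by_bandwidth(relays, slice_size=2)
for relay_slice in slices:
    entry, exit_relay = pick_two_hop_path(relay_slice)
    print(entry["name"], "->", exit_relay["name"])
```

Each measurement circuit pairs relays of similar capacity, so a slow relay is less likely to mask the speed of a fast one.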

Scanners measure the time it takes to download the file. To accurately measure the typical client experience, this must include both latency and throughput.

The throughput of the circuit is the minimum of the spare throughput of both relays in the circuit. The relays may also limit the per-circuit throughput. (TODO: what is this limit?)

Circuit latency affects the connection setup time, and the time taken to confirm receipt of data at the client. The time it takes to build the circuit depends on the latency of the network links between the scanner, the entry, the exit, and the bandwidth server. Congested relays may also experience delays writing data to the network, or dropped packets, which need to be re-sent from the end of the circuit. Even if there is no congestion, tor clients still need to confirm that data has been received, although data is sent optimistically up to a certain limit. (TODO: what is this limit?)
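As a rough way to see how throughput and latency combine into a measured download time, here is a minimal sketch. It is a simplified model, not how any scanner computes its results: the function names, the `setup_round_trips` parameter, and all the numbers are assumptions, and it ignores congestion, retransmissions, and the optimistic-data limit mentioned above. The per-circuit limit is left as an optional parameter because its real value is not given here.

```python
def circuit_throughput(entry_spare, exit_spare, per_circuit_limit=None):
    """Circuit throughput in bytes/s: the minimum of the spare throughput
    of the two relays, optionally capped by a per-circuit limit."""
    rate = min(entry_spare, exit_spare)
    if per_circuit_limit is not None:
        rate = min(rate, per_circuit_limit)
    return rate

def estimated_download_time(file_bytes, rate, setup_round_trips, rtt_seconds):
    """Very rough download time: setup latency plus transfer time."""
    return setup_round_trips * rtt_seconds + file_bytes / rate

# Hypothetical values: the Exit is the bottleneck at 500 kB/s.
rate = circuit_throughput(entry_spare=2_000_000, exit_spare=500_000)
total = estimated_download_time(
    file_bytes=1_000_000, rate=rate, setup_round_trips=4, rtt_seconds=0.25
)
print(rate, total)  # 500000 3.0
```

In this example a third of the measured time is latency rather than transfer, which is why the location of the scanner and server relative to the relays affects the result.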

Overall Stability Improvements

An additional bandwidth authority will cause all the bandwidth measurements for the Tor network to become more stable.

If there are N bandwidth authorities, each authority should directly affect the measurements for about 1/N of the relays. A bandwidth authority will also decrease the variance for about 3/N of the relays. This is because we take the median of all measurements.
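To illustrate why the median makes a single authority decisive for only some relays, here is a small sketch. The authority names and measurement values are made up, and the helper assumes an odd number of authorities so that the median is a single authority's value. How ties or even counts are handled is not covered.

```python
def median_with_source(measurements):
    """Return (authority, value) for the median measurement of one relay.
    Assumes an odd, non-zero number of authorities."""
    if len(measurements) % 2 == 0:
        raise ValueError("this sketch only handles an odd number of authorities")
    ranked = sorted(measurements.items(), key=lambda item: item[1])
    return ranked[len(ranked) // 2]

# Hypothetical measurements of a single relay by five authorities.
relay_x = {"auth1": 80, "auth2": 95, "auth3": 100, "auth4": 120, "auth5": 400}

print(median_with_source(relay_x))  # ('auth3', 100)
```

Here the outlier from `auth5` has no effect on the result: only the authority whose value lands in the middle determines the relay's bandwidth, and which authority that is varies from relay to relay.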

A graph of the relays that are directly determined by each bandwidth authority is in #21882.

Does Bandwidth Authority Location Matter?

Yes! A bandwidth authority allocates Guard and Middle bandwidth to relays close to its bandwidth scanner, and Exit bandwidth to relays close to its bandwidth server.

Current Bandwidth Authority Locations

All of the current bandwidth scanners are located in North America or Europe, and most of the bandwidth servers are located in North America or Europe, with one in Asia.

We're working to change this, by placing bandwidth servers (and maybe scanners) on other continents.

(TODO: add a table with specific locations, if the operators are ok with that)

How does Location Impact the Client Experience?

The current bandwidth authority locations mean that relays in North America and Europe handle more traffic:

  • the Tor network is faster for all clients, because they are more likely to choose a path containing relays that are near each other. This affects hidden services in particular, because they have 6-hop paths,
  • Tor clients in North America and Europe are also faster, because their Guard is closer (on average),
  • websites with servers in North America or Europe have lower latency to Tor Exits,
  • websites that use a CDN are faster if the CDN's DNS resolves to a nearby data center, and if the CDN has many servers in North America or Europe,
  • Exits in North America and Europe can easily become congested, slowing down the overall network speed. Guards are less likely to become congested, because there are more guards in the network. (TODO: measure Exit congestion)

Bandwidth Authorities Outside North America / Europe

Adding or moving bandwidth authorities will change relay measurements: relays closer to the new location will be measured higher. A bandwidth scanner affects Guard and Middle measurements. A bandwidth server affects Exit measurements.

A Worked Example

Let's say we wanted to add a bandwidth server in South America to a scanner with an existing server elsewhere.

Adding a bandwidth server in South America will shift Exit bandwidth away from Europe, and possibly North America.

Websites in South America will become faster through Tor. (Tor is mainly used for web traffic.)

The average latency between middles and exits will increase, so tor might become slightly slower. This will be offset by decreased load on European Exits, decreasing congestion. (TODO: work out which effect wins?)

There will also be a second-order effect, where Guards and Middles that are closer to Exits that are closer to South America will measure higher. But this is offset by the scanner bias for Guards/Middles towards North America and Europe.

There will also be a third-order effect, where people put more Exits (or relays in general) in South America, because they measure better.

Adding 1 of 2 bandwidth servers on 1 of 5 bandwidth authorities will move the median on ~20% of relays, by ~50% of the difference between the other scanner location and South America. But the system isn't linear, so the actual change will likely be much smaller. You'd have to move a majority of the scanners and servers to see a significant effect.
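The arithmetic behind the ~20% and ~50% figures can be written out as a small sketch. It is a linear estimate only, so, as noted above, it should be read as an upper bound on the real change. The function name and the example measurement values are hypothetical.

```python
def linear_estimate(n_authorities, servers_on_scanner, new_servers=1):
    """Linear estimate of the effect of adding bandwidth servers to one scanner.

    Returns (fraction of relays whose median moves,
             fraction of the location difference by which it moves)."""
    relays_affected = 1 / n_authorities
    shift_fraction = new_servers / servers_on_scanner
    return relays_affected, shift_fraction

# 1 new server out of 2 on the scanner, 1 of 5 bandwidth authorities.
relays_affected, shift_fraction = linear_estimate(n_authorities=5, servers_on_scanner=2)
print(relays_affected, shift_fraction)  # 0.2 0.5

# Hypothetical Exit: measures 1000 via the existing server location
# and 1400 via the South American server.
old_location, south_america = 1000, 1400
new_measurement = old_location + shift_fraction * (south_america - old_location)
print(new_measurement)  # 1200.0
```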