Opened 2 years ago

Closed 20 months ago

#19875 closed enhancement (implemented)

shuffle our bridges when we load them from config

Reported by: arma Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: easy, tor-03-unspecified-201612
Cc: brade, mcs, isis, sirmatt@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Right now when the Tor client is configured to use bridges, it adds each bridge to its state file, and then it selects which bridge to use just like it selects a guard. That means when we changed to the 1-guard design, Tor clients by default use whichever bridge is listed first in their bridge list.

When we started shipping a hard-coded list of bridges with Tor Browser, that meant that every Tor Browser client who used default bridges would use the first bridge in the list, they would all fall back to the second if the first went down, etc. This is poor for load balancing and probably poor for other properties as well.

The Tor Browser team hacked around this issue in #18113, by shuffling the bridge lines for each user, and recording the shuffle outcome somewhere in the prefs so each user can have a persistent ordering.

But Tor itself does almost all of what we want: if Tor itself shuffled the list of bridges each time it reads its config, then the very first time, whatever order is picked will get written into the state file that way, and then for future times, the shuffle outcome won't matter because we'll go with whatever order we find them in the state file.

This fix, on the Tor side, will also help in non Tor Browser cases, e.g. when the user pulls down a list of 3 bridges from bridgedb and adds them to her torrc file.

Child Tickets

Change History (16)

comment:1 Changed 2 years ago by mcs

Cc: brade mcs added

comment:2 Changed 2 years ago by arma

Keywords: easy added

I believe we can accomplish this goal by doing a

smartlist_shuffle(bridge_list);

right after where we call sweep_bridge_list() in config.c.

The clean way to do it is probably to make a new function shuffle_bridge_list() that we call from that same place in config.c, with a comment about why we're doing it, and then that new function lives in entrynodes.c right under the sweep_bridge_list() function.

I am tagging as keyword easy because it now just remains to write the code that way and write up a changelog entry. :)

comment:3 Changed 2 years ago by hdevalence

Hi, a first patch is here: https://code.ciph.re/hdevalence/tor/commit/4fda92e51bc5c69f5ba4c19c22184a1e0b87d9c1

I'm not really sure what the right way to write a test for it is -- the description of the intended behaviour above relies on a bunch of other parts of tor behaving in a certain way (e.g., interaction with the statefile). Testing just that shuffle_bridge_list() actually shuffles the list doesn't ensure the behaviour in the top of the ticket -- is that OK?

Also, should this behaviour be configurable, instead of always-on? Do people rely on bridge ordering?

comment:4 Changed 2 years ago by isis

Cc: isis added

comment:5 in reply to:  3 ; Changed 2 years ago by isis

Replying to hdevalence:

Hi, a first patch is here: https://code.ciph.re/hdevalence/tor/commit/4fda92e51bc5c69f5ba4c19c22184a1e0b87d9c1

I'm not really sure what the right way to write a test for it is -- the description of the intended behaviour above relies on a bunch of other parts of tor behaving in a certain way (e.g., interaction with the statefile). Testing just that shuffle_bridge_list() actually shuffles the list doesn't ensure the behaviour in the top of the ticket -- is that OK?


Tor's testing/mocking system takes a tiny bit of learning, but not too much. I'm happy to help you write tests, or write them for you if you'd prefer to move on to working on something else.

Also, should this behaviour be configurable, instead of always-on? Do people rely on bridge ordering?


It seems like a good idea to make it configurable, but off-by-default. This way, we can preserve the currently expected behaviour without surprising users. We can revert my patch in #18113 and just flip the config variable on for TB users, and, likewise, non-TB users who really want this can opt-in to enabling it.

(I'm willing to be talked out of this viewpoint if there's good reason to do otherwise.)

Regardless of whether it should be on or off by default, it seems like it's probably a good idea to have a config variable.

comment:6 Changed 2 years ago by hdevalence

Hi. I got some time to learn how to use Chutney for testing. I tweaked the "bridges" network configuration to have 3 bridges instead of 1, hoping to see that (without the patch) the bridge client always selected the first one as its guard. However, this doesn't happen -- the bridge client sometimes uses other bridges than the first one.

entrynodes.c says:

/** A list of configured bridges. Whenever we actually get a descriptor
 * for one, we add it as an entry guard.  Note that the order of bridges
 * in this list does not necessarily correspond to the order of bridges
 * in the torrc. */
static smartlist_t *bridge_list = NULL;

which is different from the behaviour described at the top of the ticket. So, if the order already does not correspond to the order in the torrc, I'm not sure how to test that it's shuffled.

comment:7 Changed 2 years ago by hdevalence

This version of the patch uses a config option, ShuffleBridges: https://github.com/hdevalence/tor/compare/trac/19875 . It seems to work.

Last edited 2 years ago by hdevalence (previous) (diff)

comment:8 in reply to:  5 Changed 2 years ago by pastly

This is interesting to me. I'm going to see if I have time to test hdevalence's branch and write tests.

Replying to isis:

Replying to hdevalence:

Also, should this behaviour be configurable, instead of always-on? Do people rely on bridge ordering?


It seems like a good idea to make it configurable, but off-by-default. This way, we can preserve the currently expected behaviour without surprising users. We can revert my patch in #18113 and just flip the config variable on for TB users, and, likewise, non-TB users who really want this can opt-in to enabling it.

(I'm willing to be talked out of this viewpoint if there's good reason to do otherwise.)

How much do we care about people who use non-default torrcs for the TBB? The only concern I want to raise is that if enough people remove the ShuffleBridges option and #18113 is reverted, then there could be unfair load on the first few bridges again.

comment:9 Changed 2 years ago by arma

I can see having a ShuffleBridges option during testing the patch, but I think the actual patch we merge should be much simpler -- ideally just the one line change that I proposed long ago, assuming that works. To be clearer, I think any user who is relying on the "one at a time, in that order" accidental behavior of bridge lines in torrc is doing it wrong.

We would be wise to write a more explanatory comment for the function though -- e.g. about how Tor is going to use the order of these bridges in its state file, so shuffling will only matter for bridges that aren't already in the state file, for example on first run.

I'm also a big fan of pastly's notion of tests -- it might take quite a bit of testing infrastructure to test this one well.

comment:10 in reply to:  9 Changed 2 years ago by teor

Replying to arma:

I can see having a ShuffleBridges option during testing the patch, but I think the actual patch we merge should be much simpler -- ideally just the one line change that I proposed long ago, assuming that works. To be clearer, I think any user who is relying on the "one at a time, in that order" accidental behavior of bridge lines in torrc is doing it wrong.

We would be wise to write a more explanatory comment for the function though -- e.g. about how Tor is going to use the order of these bridges in its state file, so shuffling will only matter for bridges that aren't already in the state file, for example on first run.

I'm also a big fan of pastly's notion of tests -- it might take quite a bit of testing infrastructure to test this one well.

Typically, we implement tests with random outcomes by working out what the probability of a hardware bit error is, and then run the test enough times that the failure rate is less than the hardware error rate.

(Hmm, we should standardise this, see #20617.)

For current examples of code that tests random outcomes, see test_crypto_rng_range and crypto_rand_check_failure_mode_zero.

As a concrete example, let's say you have a unit test with two bridges, A and B.
Let's also say that the current code always chooses A.
The new code is expected to choose A or B, uniformly at random.
To keep the failure rate below 2^2048 (which is what crypto_rand_check_failure_mode_zero uses), run the test 2048 times. If it comes up A each time, it is almost certain that the code is broken.

Edit: clarify hardware errors

Last edited 2 years ago by teor (previous) (diff)

comment:11 Changed 2 years ago by arma

If I were wanting to test this, I would want to verify my expectations that whatever order it chose on the first run, on the *second* run, i.e. after it's committed to an ordering in the state file, it sticks with that ordering.

Also I might want to verify my expectations that when I have a pile of bridge addresses, and then I try to use a bridge, Tor will hand me the first 'up' one in the state file.

Verifying that the shuffle actually shuffles the list is pretty far down on my list of things I'd expect a unit test to test here.

But take all of this with a grain of salt since I am not skilled at unit tests. :)

comment:12 Changed 2 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:13 Changed 22 months ago by pastly

Cc: sirmatt@… added

comment:14 Changed 22 months ago by nickm

I think that my work on #19877 probably supersedes this ticket?

comment:15 Changed 22 months ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:16 Changed 20 months ago by nickm

Resolution: implemented
Status: newclosed

Prop271 supersedes this.

Note: See TracTickets for help on using tickets.