Right now when the Tor client is configured to use bridges, it adds each bridge to its state file, and then it selects which bridge to use just like it selects a guard. That means when we changed to the 1-guard design, Tor clients by default use whichever bridge is listed first in their bridge list.
When we started shipping a hard-coded list of bridges with Tor Browser, that meant that every Tor Browser client who used default bridges would use the first bridge in the list, they would all fall back to the second if the first went down, etc. This is poor for load balancing and probably poor for other properties as well.
The Tor Browser team hacked around this issue in #18113 (moved), by shuffling the bridge lines for each user, and recording the shuffle outcome somewhere in the prefs so each user can have a persistent ordering.
But Tor itself does almost all of what we want: if Tor itself shuffled the list of bridges each time it reads its config, then the very first time, whatever order is picked will get written into the state file that way, and then for future times, the shuffle outcome won't matter because we'll go with whatever order we find them in the state file.
This fix, on the Tor side, will also help in non Tor Browser cases, e.g. when the user pulls down a list of 3 bridges from bridgedb and adds them to her torrc file.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
right after where we call sweep_bridge_list() in config.c.
The clean way to do it is probably to make a new function shuffle_bridge_list() that we call from that same place in config.c, with a comment about why we're doing it, and then that new function lives in entrynodes.c right under the sweep_bridge_list() function.
I am tagging as keyword easy because it now just remains to write the code that way and write up a changelog entry. :)
I'm not really sure what the right way to write a test for it is -- the description of the intended behaviour above relies on a bunch of other parts of tor behaving in a certain way (e.g., interaction with the statefile). Testing just that shuffle_bridge_list() actually shuffles the list doesn't ensure the behaviour in the top of the ticket -- is that OK?
Also, should this behaviour be configurable, instead of always-on? Do people rely on bridge ordering?
I'm not really sure what the right way to write a test for it is -- the description of the intended behaviour above relies on a bunch of other parts of tor behaving in a certain way (e.g., interaction with the statefile). Testing just that shuffle_bridge_list() actually shuffles the list doesn't ensure the behaviour in the top of the ticket -- is that OK?
Tor's testing/mocking system takes a tiny bit of learning, but not too much. I'm happy to help you write tests, or write them for you if you'd prefer to move on to working on something else.
Also, should this behaviour be configurable, instead of always-on? Do people rely on bridge ordering?
It seems like a good idea to make it configurable, but off-by-default. This way, we can preserve the currently expected behaviour without surprising users. We can revert my patch in #18113 (moved) and just flip the config variable on for TB users, and, likewise, non-TB users who really want this can opt-in to enabling it.
(I'm willing to be talked out of this viewpoint if there's good reason to do otherwise.)
Regardless of whether it should be on or off by default, it seems like it's probably a good idea to have a config variable.
Hi. I got some time to learn how to use Chutney for testing. I tweaked the "bridges" network configuration to have 3 bridges instead of 1, hoping to see that (without the patch) the bridge client always selected the first one as its guard. However, this doesn't happen -- the bridge client sometimes uses other bridges than the first one.
entrynodes.c says:
/** A list of configured bridges. Whenever we actually get a descriptor * for one, we add it as an entry guard. Note that the order of bridges * in this list does not necessarily correspond to the order of bridges * in the torrc. */static smartlist_t *bridge_list = NULL;
which is different from the behaviour described at the top of the ticket. So, if the order already does not correspond to the order in the torrc, I'm not sure how to test that it's shuffled.
Also, should this behaviour be configurable, instead of always-on? Do people rely on bridge ordering?
It seems like a good idea to make it configurable, but off-by-default. This way, we can preserve the currently expected behaviour without surprising users. We can revert my patch in #18113 (moved) and just flip the config variable on for TB users, and, likewise, non-TB users who really want this can opt-in to enabling it.
(I'm willing to be talked out of this viewpoint if there's good reason to do otherwise.)
How much do we care about people who use non-default torrcs for the TBB? The only concern I want to raise is that if enough people remove the ShuffleBridges option and #18113 (moved) is reverted, then there could be unfair load on the first few bridges again.
I can see having a ShuffleBridges option during testing the patch, but I think the actual patch we merge should be much simpler -- ideally just the one line change that I proposed long ago, assuming that works. To be clearer, I think any user who is relying on the "one at a time, in that order" accidental behavior of bridge lines in torrc is doing it wrong.
We would be wise to write a more explanatory comment for the function though -- e.g. about how Tor is going to use the order of these bridges in its state file, so shuffling will only matter for bridges that aren't already in the state file, for example on first run.
I'm also a big fan of pastly's notion of tests -- it might take quite a bit of testing infrastructure to test this one well.
I can see having a ShuffleBridges option during testing the patch, but I think the actual patch we merge should be much simpler -- ideally just the one line change that I proposed long ago, assuming that works. To be clearer, I think any user who is relying on the "one at a time, in that order" accidental behavior of bridge lines in torrc is doing it wrong.
We would be wise to write a more explanatory comment for the function though -- e.g. about how Tor is going to use the order of these bridges in its state file, so shuffling will only matter for bridges that aren't already in the state file, for example on first run.
I'm also a big fan of pastly's notion of tests -- it might take quite a bit of testing infrastructure to test this one well.
Typically, we implement tests with random outcomes by working out what the probability of a hardware bit error is, and then run the test enough times that the failure rate is less than the hardware error rate.
For current examples of code that tests random outcomes, see test_crypto_rng_range and crypto_rand_check_failure_mode_zero.
As a concrete example, let's say you have a unit test with two bridges, A and B.
Let's also say that the current code always chooses A.
The new code is expected to choose A or B, uniformly at random.
To keep the failure rate below 2^2048 (which is what crypto_rand_check_failure_mode_zero uses), run the test 2048 times. If it comes up A each time, it is almost certain that the code is broken.
If I were wanting to test this, I would want to verify my expectations that whatever order it chose on the first run, on the second run, i.e. after it's committed to an ordering in the state file, it sticks with that ordering.
Also I might want to verify my expectations that when I have a pile of bridge addresses, and then I try to use a bridge, Tor will hand me the first 'up' one in the state file.
Verifying that the shuffle actually shuffles the list is pretty far down on my list of things I'd expect a unit test to test here.
But take all of this with a grain of salt since I am not skilled at unit tests. :)