Trying to switch from a direct connection to using a bridge is not working anymore with tor on master (found while using commit f2f1cab2b3c6a56f93862c424663f083b79c7bc3) on my Linux box.
Steps to reproduce:
Take a recent Tor Browser (e.g. an alpha version)
Make sure you replace tor shipped in your Tor Browser instance with one compiled from a recent master commit
Start Tor Browser and choose direct connection
Shut Tor Browser down after you are greeted with the about:tor page
Restart Tor Browser and press the "Open Settings" button before the bootstrapping is finished
Select e.g. the recommended obfs4 default bridge option
The start-up stalls (I quit Tor Browser after waiting 5 minutes)
The first bad commit is c21cfd28f43a969229ede02e20c6b554c1b88aae which fixed #17750 (moved). Without that one Tor Browser resumes bootstrapping after a couple of seconds and is using an obfs4 bridge.
(I am a bit confused about the Milestone usage: #17750 (moved) has "Tor: 0.3.1.x-final" even though the code is not on maint-0.3.1, and reading the comments of this ticket, this might not even happen. I chose "Tor: 0.3.2.x-final" as the problematic code is only on master right now. If that's wrong please readjust.)
#17750 (moved) made download schedules use the specified initial delay, rather than 0.
The two schedules with a non-zero delay are the fallback authority schedule and the bridge schedule.
So my guess is that the bridge schedule's initial delay of 1 hour is wrong, and tor needs to download a bridge descriptor to bootstrap. It is possible that this bug was introduced in Tor 0.3.0 with the guard algorithm changes.
I'll work out how to reproduce this on the command-line, and submit a schedule patch.
This will probably involve re-thinking the entire schedule. (Because if we really need the bridge descriptor, we shouldn't wait 2 hours if we fail the first time.)
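If that guess is right, the failure can be illustrated with a minimal Python sketch (this is not tor's C code). It only shows the difference between starting a schedule at delay 0 and honouring its first entry. The function name, the list representation, and every schedule value except the 1-hour initial delay are assumptions made for illustration.

```python
def first_attempt_delay(schedule, honor_initial_delay):
    """Return the seconds to wait before the first download attempt.

    schedule: list of delays in seconds (hypothetical representation).
    honor_initial_delay: False models the behaviour before #17750 (the
    first attempt happens immediately); True models the behaviour after it.
    """
    if not schedule:
        raise ValueError("schedule must contain at least one delay")
    return schedule[0] if honor_initial_delay else 0


# Illustrative bridge schedule: only the 1-hour first entry comes from
# the discussion above; the remaining entries are made up.
bridge_schedule = [3600, 900, 900, 3600]

before = first_attempt_delay(bridge_schedule, honor_initial_delay=False)
after = first_attempt_delay(bridge_schedule, honor_initial_delay=True)
print(before, after)  # 0 3600
```

With the first entry honoured, a client that needs the bridge descriptor to bootstrap would sit idle for an hour, which is consistent with the stall reported above.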
I have closed #17750 (moved), because we really shouldn't backport it.
My branch fixes this issue, makes bridge bootstraps more reliable by trying each bridge 3 times (rather than once), and brings the man page up to date with the latest schedules.
Trac: Summary: Switching from direct connection to a pluggable transport is not working anymore with tor on master to Using bridges is not working anymore with tor on master Priority: Medium to Very High Actualpoints: N/A to 0.5
(As an aside, we didn't catch this with chutney, because the first scheduled download in chutney was 30s, and chutney waits 60s. Now the chutney schedules are consistent with each other, too.)
I pushed a fixup to this branch to fix a bug that arma identified on IRC: bridges reset their download schedule when they're successfully downloaded, and then want to do the next download after an hour. (Maybe the way I fixed it isn't the best design, but it does maintain the pre-#17750 behaviour.)
I also added a commit that refactors bridge downloads to use the "increment on attempt" functions. (They were using the "increment on failure" functions to increment on each attempt, and never incrementing on failure, which was confusing.)
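The distinction between the two kinds of functions can be sketched roughly as follows. This is a toy Python model, not tor's implementation: the class, its method names, and the rule that the position stays on the last entry once the schedule runs out are all assumptions for illustration.

```python
class DownloadStatus:
    """Toy stand-in for a position in a download schedule (hypothetical)."""

    def __init__(self, schedule, increment_on):
        if increment_on not in ("attempt", "failure"):
            raise ValueError("increment_on must be 'attempt' or 'failure'")
        self.schedule = list(schedule)
        self.increment_on = increment_on
        self.position = 0

    def _advance(self):
        # Assumption: stay on the final entry once the schedule is exhausted.
        self.position = min(self.position + 1, len(self.schedule) - 1)

    def note_attempt(self):
        # Only schedules that count attempts move forward here.
        if self.increment_on == "attempt":
            self._advance()

    def note_failure(self):
        # Only schedules that count failures move forward here.
        if self.increment_on == "failure":
            self._advance()

    def next_delay(self):
        return self.schedule[self.position]


status = DownloadStatus([0, 60, 30, 30, 60], increment_on="attempt")
status.note_attempt()   # advances to the second entry
status.note_failure()   # ignored for an attempt-counting schedule
print(status.next_delay())  # 60
```

Keeping the mode explicit in the status object avoids the confusing pattern described above, where a failure-counting function was called on every attempt.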
FWIW, clang builds are broken, but that's not your fault; it appears to be from commit 6eb9de1b8c, where there are now two typedef struct response_handler_args_t definitions in src/or/directory.h. I made #23358 (moved) for this.
I'm wondering whether we shouldn't extract the magic pair of calls to download_status_implement() and turn them into some other "adjust bridge download schedule" function? Or use two separate schedules?
I think we should use two separate schedules, like we do for bootstrapping consensuses and regular consensuses. This makes the different behavior explicit, rather than relying on magic numbers.
It also allows us to have more fine-grained control over how often we retry missing bridge descriptors.
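A rough Python sketch of the two-schedule idea, assuming the schedule is chosen by whether we currently hold a descriptor for the bridge. The function names and the "repeat the last entry" rule are hypothetical; the values are the test-network ones proposed later in this comment.

```python
# Test-network values from the "old behaviour" proposal in this comment.
MISSING_BRIDGE_SCHEDULE = [0, 60, 30, 30, 60]
BRIDGE_SCHEDULE = [60, 30, 30, 60]


def pick_schedule(have_descriptor):
    """Choose the schedule explicitly instead of relying on magic numbers."""
    return BRIDGE_SCHEDULE if have_descriptor else MISSING_BRIDGE_SCHEDULE


def delay_for(have_descriptor, attempt_index):
    """Delay in seconds before the given attempt (0-based).

    Assumption: once the schedule is exhausted, the last entry repeats.
    """
    schedule = pick_schedule(have_descriptor)
    return schedule[min(attempt_index, len(schedule) - 1)]


print(delay_for(False, 0))  # 0: fetch a missing descriptor immediately
print(delay_for(True, 0))   # 60: wait before refreshing one we already have
```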
We could use schedules that match the old behaviour:
TestingMissingBridgeDownloadSchedule 0, 1200, 900, 900, 3600
TestingBridgeDownloadSchedule 1200, 900, 900, 3600
/* And in a test network */
TestingMissingBridgeDownloadSchedule 0, 60, 30, 30, 60
TestingBridgeDownloadSchedule 60, 30, 30, 60
But we probably want something more like this:
/* If we can't get a bridge descriptor, backoff exponentially, just like authority consensus downloads */
TestingMissingBridgeDownloadSchedule 0, 3, 7, 3600, 10800, 25200, 54000, 111600, 262800
/* If the bridge keeps giving us a valid descriptor, it's ok to keep asking for one every 6 hours (this gives a bridge client 4 attempts per day to refresh each bridge descriptor) */
TestingBridgeDownloadSchedule 21600, 21600
/* And in a test network, match authority consensus downloads */
TestingMissingBridgeDownloadSchedule 0, 0, 5, 10, 15, 20, 30, 60
TestingBridgeDownloadSchedule 30, 30
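The arithmetic behind these numbers can be checked with a short Python sketch. It assumes each schedule entry is the delay in seconds since the previous attempt; the helper names are made up for illustration and are not part of tor.

```python
from itertools import accumulate

SECONDS_PER_DAY = 24 * 60 * 60


def parse_schedule(text):
    """Turn a comma-separated schedule value into a list of integers."""
    return [int(field) for field in text.split(",")]


def attempt_times(schedule):
    """Seconds after startup at which each attempt would happen.

    Assumption: each entry is the delay since the previous attempt.
    """
    return list(accumulate(schedule))


def refreshes_per_day(interval):
    """How many refresh attempts fit in a day at a fixed interval."""
    return SECONDS_PER_DAY // interval


missing = parse_schedule("0, 3, 7, 3600, 10800, 25200, 54000, 111600, 262800")
print(attempt_times(missing)[:4])  # [0, 3, 10, 3610]
print(refreshes_per_day(21600))    # 4
```

Under that assumption, a client missing a descriptor would try three times in the first ten seconds before backing off to an hour, and a 6-hour refresh interval gives the 4 attempts per day mentioned in the comment.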
Is there any reason that a bridge client with a valid bridge descriptor should re-download it every hour?
I'm happy to make this change, but it will probably be towards the end of the week. Feel free to grab this ticket if you want to do it before then.
I ran into a crash bug here: if you have Bridge lines in your configuration, but UseBridges is 0, then the download_status_reset() in 1b5e34badb06bb1a844a6e70164fc5c894d95d0a will fail. I commented it out for now, in 63af663b8c83d771ed8fd29802e9a4c5cb074c70, so that my Tor works.
Trac: Status: closed to reopened Resolution: fixed to N/A
Trac: Summary: Switching from direct to bridges is not working anymore with tor on master to Using bridges or switching to bridges sometimes does not work with tor 0.3.2