#23347 closed defect (fixed)
Using bridges or switching to bridges sometimes does not work with tor 0.3.2
Reported by: | gk | Owned by: | teor |
---|---|---|---|
Priority: | Very High | Milestone: | Tor: 0.3.2.x-final |
Component: | Core Tor/Tor | Version: | |
Severity: | Normal | Keywords: | tor-bootstrap, tor-bridge-client |
Cc: | Actual Points: | 1 | |
Parent ID: | Points: | 0.5 | |
Reviewer: | isis | Sponsor: |
Description
Trying to switch from a direct connection to using a bridge is not working anymore with tor on master
(found while using commit f2f1cab2b3c6a56f93862c424663f083b79c7bc3) on my Linux box.
Steps to reproduce:
1) Take a recent Tor Browser (e.g. an alpha version)
2) Make sure you replace tor
shipped in your Tor Browser instance with one compiled from a recent master commit
3) Start Tor Browser and choose direct connection
4) Shut Tor Browser down after you got greeted with the about:tor
page
5) Restart Tor Browser and press the "Open Settings" button before the bootstrapping is finished
6) Select e.g. the recommended obfs4
default bridge option
7) The start-up stalls (I quit Tor Browser after 5 minutes waiting)
The first bad commit is c21cfd28f43a969229ede02e20c6b554c1b88aae which fixed #17750. Without that one Tor Browser resumes bootstrapping after a couple of seconds and is using an obfs4
bridge.
Child Tickets
Change History (24)
comment:1 Changed 2 years ago by
comment:2 Changed 2 years ago by
#17750 made download schedules use the specified initial delay, rather than 0.
The two schedules with a non-zero delay are the fallback authority schedule and the bridge schedule.
So my guess is that the bridge schedule's initial delay of 1 hour is wrong, and tor needs to download a bridge descriptor to bootstrap. It is possible that this bug was introduced in Tor 0.3.0 with the guard algorithm changes.
I'll work out how to reproduce this on the command-line, and submit a schedule patch.
This will probably involve re-thinking the entire schedule. (Because if we really need the bridge descriptor, we shouldn't wait 2 hours if we fail the first time.)
I have closed #17750, because we really shouldn't backport it.
comment:3 Changed 2 years ago by
To reproduce this, you can use:
src/or/tor DataDirectory `mktemp -d` UseBridges 1 Bridge ...
where ... is any bridge on any transport.
(I used a bridge from bridges.torproject.org, but any bridge or relay should work.)
I have confirmed that this works on 0.3.0.10, but fails on master with:
Bootstrapped 0%: Starting Delaying directory fetches: No running bridges
comment:4 Changed 2 years ago by
Keywords: | tor-bootstrap tor-bridge-client added |
---|---|
Points: | → 0.5 |
comment:5 Changed 2 years ago by
Actual Points: | → 0.5 |
---|---|
Priority: | Medium → Very High |
Summary: | Switching from direct connection to a pluggable transport is not working anymore with tor on master → Using bridges is not working anymore with tor on master |
Please see my branch bug23347 on https://github.com/teor2345
It fixes this issue, makes bridge bootstraps more reliable by trying each bridge 3 times (rather than once), and brings the man page up to date with the latest schedules.
comment:6 Changed 2 years ago by
Status: | new → needs_review |
---|
(As an aside, we didn't catch this with chutney, because the first scheduled download in chutney was 30s, and chutney waits 60s. Now the chutney schedules are consistent with each other, too.)
comment:7 Changed 2 years ago by
Owner: | set to teor |
---|---|
Status: | needs_review → assigned |
comment:8 Changed 2 years ago by
Status: | assigned → needs_review |
---|
comment:9 Changed 2 years ago by
I pushed a fixup to this branch to fix a bug that arma identified on IRC: bridges reset their download schedule when they're successfully downloaded, and then want to do the next download after an hour. (Maybe the way I fixed it isn't the best design, but it does maintain the pre-#17750 behaviour.)
I also added a commit that refactors bridge downloads to use the "increment on attempt" functions. (They were using the "increment on failure" functions to increment on each attempt, and never incrementing on failure, which was confusing.)
comment:10 Changed 2 years ago by
Reviewer: | → isis |
---|
comment:11 Changed 2 years ago by
435952538
LGTM61227b7b0
LGTMea662f00d
LGTM836cc60d4
LGTM
FWIW, clang builds are broken, but that's not your fault, it appears to be from commit 6eb9de1b8c
where there's now two typedef struct response_handler_args_t
s in src/or/directory.h
. I made #23358 for this.
comment:12 Changed 2 years ago by
Status: | needs_review → merge_ready |
---|
Forgot to change state after review.
comment:13 follow-up: 14 Changed 2 years ago by
I'm wondering whether we shouldn't extract the magic pair of calls to download_status_implement() and turn them into some other "adjust bridge download schedule" function? Or use two separate schedules?
comment:14 Changed 2 years ago by
Status: | merge_ready → needs_revision |
---|
Replying to nickm:
I'm wondering whether we shouldn't extract the magic pair of calls to download_status_implement() and turn them into some other "adjust bridge download schedule" function? Or use two separate schedules?
I think we should use two separate schedules, like we do for bootstrapping consensuses and regular consensuses. This makes the different behavior explicit, rather than relying on magic numbers.
It also allows us to have more fine-grained control over how often we retry missing bridge descriptors.
We could use schedules that match the old behaviour:
TestingMissingBridgeDownloadSchedule 0, 1200, 900, 900, 3600 TestingBridgeDownloadSchedule 1200, 900, 900, 3600 /* And in a test network */ TestingMissingBridgeDownloadSchedule 0, 60, 30, 30, 60 TestingBridgeDownloadSchedule 60, 30, 30, 60
But we probably want something more like this:
/* If we can't get a bridge descriptor, backoff exponentially, just like authority consensus downloads */ TestingMissingBridgeDownloadSchedule 0, 3, 7, 3600, 10800, 25200, 54000, 111600, 262800 /* If the bridge keeps giving us a valid descriptor, it's ok to keep asking for one every 6 hours (this gives a bridge client 4 attempts per day to refresh each bridge descriptor) */ TestingBridgeDownloadSchedule 21600, 21600 /* And in a test network, match authority consensus downloads */ TestingMissingBridgeDownloadSchedule 0, 0, 5, 10, 15, 20, 30, 60 TestingBridgeDownloadSchedule 30, 30
Is there any reason that a bridge client with a valid bridge descriptor should re-download it every hour?
I'm happy to make this change, but it will probably be towards the end of the week. Feel free to grab this ticket if you want to do it before then.
comment:15 Changed 2 years ago by
Status: | needs_revision → needs_review |
---|
I updated the branch with a new design that:
- runs quick checks when we don't know any bridge descriptors
- when we get a (new/uncached) bridge descriptor, don't check that descriptor again for a few hours
This splits the standard bridge schedule into a Bridge and a BridgeBootstrap schedule.
comment:16 Changed 2 years ago by
Actual Points: | 0.5 → 1 |
---|
comment:17 Changed 2 years ago by
Resolution: | → fixed |
---|---|
Status: | needs_review → closed |
Nice! This looks much cleaner this way. I'm testing and merging.
comment:18 Changed 2 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
I ran into a crash bug here: if you have Bridge lines in your configuration, but UseBridges is 0, then the download_status_reset() in 1b5e34badb06bb1a844a6e70164fc5c894d95d0a will fail. I'm going to comment it out for now so that my Tor works. I commented it out in 63af663b8c83d771ed8fd29802e9a4c5cb074c70
comment:19 Changed 2 years ago by
Status: | reopened → needs_review |
---|
Ok, so there are a few issues here:
- the reset is too early: it should only happen when we actually go to use bridges
- there's no way to check if a bridge download status has already been reset
So it's ok to rely on the existing code to do this.
I added comments explaining this in bug23347-extra.
comment:21 Changed 2 years ago by
Resolution: | → fixed |
---|---|
Status: | needs_review → closed |
I've merged the patches for the child tickets, so I'm closing this one too. Thanks for the fast response!
comment:22 Changed 2 years ago by
Summary: | Using bridges is not working anymore with tor on master → Switching from direct to bridges is not working anymore with tor on master |
---|
comment:23 Changed 2 years ago by
Summary: | Switching from direct to bridges is not working anymore with tor on master → Using bridges or switching to bridges sometimes does not work with tor 0.3.2 |
---|
(I am a bit confused about the Milestone usage: #17750 has "Tor: 0.3.1.x-final" even though the code is not on
maint-0.3.1
, and reading the comments of this ticket, this might not even happen. I chose "Tor: 0.3.2.x-final" as the problematic code is only onmaster
right now. If that's wrong please readjust.)