Opened 4 months ago
Last modified 4 hours ago
#28027 needs_information defect
Tor keeps opening circuits while waiting for bridge descriptors
Reported by: | dgoulet | Owned by: | neel |
---|---|---|---|
Priority: | Medium | Milestone: | Tor: unspecified |
Component: | Core Tor/Tor | Version: | Tor: 0.3.0.1-alpha |
Severity: | Normal | Keywords: | tor-hs, regression, tor-guard, 033-backport, 034-backport, 035-backport, postfreeze-ok, 040-deferred-20190220 |
Cc: | neel, catalyst | Actual Points: | |
Parent ID: | Points: | ||
Reviewer: | teor | Sponsor: |
Description
From someone in #tor-dev (Tovok7
), they migrated an HS from 029 to 034 and some feature broke.
Tor, configured with an HS, starts fine, bootstraps and all is good.
Then, through the control port, setconf UseBridge=1 Bridge=...
created these logs:
2018-10-12 16:18:54.440 I/TorPlugin: Enabling network, using bridges 2018-10-12 16:18:54.456 I/TorPlugin: NOTICE Switching to guard context "bridges" (was using "default") 2018-10-12 16:18:54.608 I/TorPlugin: NOTICE Delaying directory fetches: No running bridges 2018-10-12 16:18:55.031 I/TorPlugin: WARN Error launching circuit to node [scrubbed] for service [scrubbed]. 2018-10-12 16:18:55.032 I/TorPlugin: WARN Error launching circuit to node [scrubbed] for service [scrubbed]. 2018-10-12 16:18:55.616 I/TorPlugin: OR connection LAUNCHED $02069A3C5362476936B62BA6F5ACC41ABD573A9B 2018-10-12 16:18:55.616 I/TorPlugin: OR connection LAUNCHED $5A2D2F4158D0453E00C7C176978D3F41D69C45DB 2018-10-12 16:18:55.616 I/TorPlugin: OR connection LAUNCHED $B31A7DAD9AACEDDB9915A16617BB8F06BA429D6B 2018-10-12 16:18:55.642 I/TorPlugin: WARN Hidden service [scrubbed] exceeded launch limit with 13 intro points in the last 13 seconds. Intro circuit launches are limited to 10 per 300 seconds. 2018-10-12 16:18:55.642 I/TorPlugin: WARN Service configured in [EPHEMERAL]: 2018-10-12 16:18:55.642 I/TorPlugin: WARN Intro point 0 at [scrubbed]: no circuit 2018-10-12 16:18:55.643 I/TorPlugin: WARN Intro point 1 at [scrubbed]: no circuit 2018-10-12 16:18:55.643 I/TorPlugin: WARN Intro point 2 at [scrubbed]: no circuit 2018-10-12 16:18:55.643 I/TorPlugin: WARN Intro point 3 at [scrubbed]: no circuit 2018-10-12 16:18:55.643 I/TorPlugin: WARN Intro point 4 at [scrubbed]: no circuit 2018-10-12 16:18:56.652 I/TorPlugin: OR connection CONNECTED $02069A3C5362476936B62BA6F5ACC41ABD573A9B 2018-10-12 16:18:57.013 I/TorPlugin: NOTICE new bridge descriptor 'pointingRespighi' (fresh): [scrubbed] 2018-10-12 16:18:57.014 I/TorPlugin: NOTICE Our directory information is no longer up-to-date enough to build circuits: We're missing descriptors for 1/2 of our primary entry guards (total microdescriptors: 6457/6457). 2018-10-12 16:18:57.080 I/TorPlugin: OR connection CONNECTED $5A2D2F4158D0453E00C7C176978D3F41D69C45DB 2018-10-12 16:18:57.184 I/TorPlugin: OR connection CONNECTED $B31A7DAD9AACEDDB9915A16617BB8F06BA429D6B 2018-10-12 16:18:57.586 I/TorPlugin: NOTICE new bridge descriptor 'AlliumGermanicus' (fresh): [scrubbed] 2018-10-12 16:18:57.641 I/TorPlugin: NOTICE Bridge 'nonLinearGeometry' has both an IPv4 and an IPv6 address. Will prefer using its IPv4 address ([scrubbed]) based on the configured Bridge address. 2018-10-12 16:18:57.641 I/TorPlugin: NOTICE new bridge descriptor 'nonLinearGeometry' (fresh): [scrubbed] 2018-10-12 16:18:57.641 I/TorPlugin: NOTICE We now have enough directory information to build circuits. 2018-10-12 16:19:00.201 I/TorPlugin: NOTICE Tor has successfully opened a circuit. Looks like client functionality is working.
It appears that the HS tried to open intro points even though tor didn't have bridge descriptors (guard state got switched).
The HS subsystem is safeguarded by this check (for circuit events):
if (router_have_consensus_path() == CONSENSUS_PATH_UNKNOWN || !have_completed_a_circuit()) { return; }
In other words, if we can't open circuits, tor will never proceed with HS service circuits.
The main theory, discussed with armadev, can be deduced with the three first log line:
2018-10-12 16:18:54.456 I/TorPlugin: NOTICE Switching to guard context "bridges" (was using "default") 2018-10-12 16:18:54.608 I/TorPlugin: NOTICE Delaying directory fetches: No running bridges 2018-10-12 16:18:55.031 I/TorPlugin: WARN Error launching circuit to node [scrubbed] for service [scrubbed].
First line: Guard context switched to bridges. All is good.
Second line: router_have_minimum_dir_info()
is called from, actually wherever... It is used quite often in many places including our mainloop. The point is that within that function, we do look at should_delay_dir_fetches()
which is the one creating that notice. However, because 1 was returned, we never went to update_router_have_minimum_dir_info()
which would have mark that we can't complete circuits (with note_that_we_maybe_cant_complete_circuits()
).
Third line: Circuit launch failure.
Once the guard context was switched, all circuits were marked as unusable (normal) so the HS service has to rebuild all its intro points but the have_completed_a_circuit()
was still returning true.
Whatever is the cause, there is a clear issue that when we switch guard context, we should _always_ stop circuit creation until the guard state is usable.
Child Tickets
Change History (14)
comment:1 Changed 4 months ago by
comment:2 Changed 4 months ago by
Keywords: | tor-guard added |
---|
comment:3 Changed 3 months ago by
Cc: | neel added |
---|---|
Owner: | set to neel |
Status: | new → assigned |
comment:4 Changed 3 months ago by
For this patch, I am thinking about the following:
- If
UseBridges
was originally set to0
, but then set to1
later on, keep a flag infeature/client/entrynodes.c
- Make
should_delay_dir_fetches()
check for this proposed flag ifUseBridges
is1
to fetch the bridge descriptors, and if both are set, return0
- In
options_act()
, initialize the proposed flag to 0 if we don't haveold_options
- In
options_act()
, if we haveold_options
, andold_options->UseBridges
andoptions->UseBridges
differ, set this flag to1
comment:5 Changed 3 months ago by
Cc: | catalyst added |
---|
comment:6 Changed 3 months ago by
Hmmmm I think the problem lies in the fact that "tor" thinks it can complete a circuit but in fact it can't because it has no running bridges as in it just switched to using one and is waiting for it to be usable.
That being said, I think the avenue to resolving this is rather to signal tor that we can NOT complete circuits when our Guard state just changed.
To do that, simply call note_that_we_maybe_cant_complete_circuits()
. Quick look at this, probably it should be done around this in config.c
:
if (transition_affects_guards) { if (guards_update_all()) { abandon_circuits = 1; } }
The tricky part will be to confirm that tor does at some point realizes it can complete circuits once the bridge is usable and thus the HS subsystem can start building circuits.
comment:7 Changed 3 months ago by
Thanks for the clarification.
Also, if a bridge is usable, is it okay to call note_that_we_completed_a_circuit()
in learned_bridge_descriptor()
if we get a bridge descriptor?
comment:8 follow-up: 12 Changed 3 months ago by
I have a PR here: https://github.com/torproject/tor/pull/558
I do have one concern, since note_that_we_maybe_cant_complete_circuits()
is called in learned_bridge_descriptor()
, I am worried that my patch may have race conditions. This claim may be hypothetical as I did not experience any from my tests.
Also, this patch works from bridge to non-bridge, and non-bridge to bridge with a onion service.
comment:9 Changed 3 months ago by
Keywords: | 033-backport 034-backport 035-backport added |
---|---|
Milestone: | Tor: 0.3.5.x-final → Tor: 0.4.0.x-final |
Status: | assigned → needs_review |
Version: | Tor: 0.3.4.8 → Tor: 0.3.0.1-alpha |
comment:10 Changed 3 months ago by
Reviewer: | → teor |
---|---|
Status: | needs_review → needs_revision |
Do we need the extra call to note_that_we_completed_a_circuit()?
The existing call happens when tor completes a multi-hop circuit:
https://github.com/torproject/tor/blob/586c3a7c902dfb6fdf52390ecb2a08d6fceef77a/src/core/or/circuitbuild.c#L1055
So calling note_that_we_completed_a_circuit() when we make a one-hop circuit to a bridge might not be a good idea.
comment:11 Changed 3 months ago by
Status: | needs_revision → needs_review |
---|
Removed that line and pushed it. Same PR.
comment:12 Changed 3 months ago by
Status: | needs_review → needs_information |
---|
Hi,
I just need to check that you've tested this case with the latest code:
The tricky part will be to confirm that tor does at some point realizes it can complete circuits once the bridge is usable and thus the HS subsystem can start building circuits.
Replying to neel:
Also, this patch works from bridge to non-bridge, and non-bridge to bridge with a onion service.
When you've tested this case, please put the ticket into merge_ready?
comment:13 Changed 5 weeks ago by
Keywords: | postfreeze-ok added |
---|
Mark some tickets as postfreeze-ok, to indicate that I think they are okay to accept in 0.4.0 post-freeze. Does not indicate that they are all necessary to do postfreeze.
comment:14 Changed 4 hours ago by
Keywords: | 040-deferred-20190220 added |
---|---|
Milestone: | Tor: 0.4.0.x-final → Tor: unspecified |
Deferring 51 tickets from 0.4.0.x-final. Tagging them with 040-deferred-20190220 for visibility. These are the tickets that did not get 040-must, 040-can, or tor-ci.
For context, I think tovok7 is from briar.