Opened 6 years ago

Closed 2 years ago

#9972 closed defect (fixed)

Failed to find node for hop 0 of our path. Discarding this circuit.

Reported by: mr-4
Owned by: nickm
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.4.17-rc
Severity: Normal
Keywords: tor-client
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

When I introduce EntryNodes restrictions in my torrc file (also having StrictNodes 1) and then start tor, I get the following rather bizarre sequence going:

[notice] {DIR} We now have enough directory information to build circuits.
[notice] {CONTROL} Bootstrapped 80%: Connecting to the Tor network.
[warn] {CIRC} Failed to find node for hop 0 of our path. Discarding this circuit.
[...ad nauseam...]

If, at this point, I shut down tor and then start it again, without changing anything at all, the bootstrap completes (100% done) and I have no further problems.

My EntryNodes statement isn't very restrictive (something like {DE},{SE},{AT},{EU}), but even if it were, I don't think it should prevent tor from bootstrapping properly.

Attachments (4)

tor-9972.zip (22.5 KB) - added by mr-4 6 years ago.
configuration, log files and readme file with instructions on how to reproduce this bug
tor_browser_log (2.1 KB) - added by gatorchomps 6 years ago.
Example of warnings in log file
gchomps_torrc (480 bytes) - added by gatorchomps 6 years ago.
gatorchomps torrc
torrc_sample.txt (1.4 KB) - added by Toweri 5 years ago.
torrc, commenting out EntryNodes fixed the issue.


Change History (44)

comment:1 Changed 6 years ago by mr-4

Apologies, forgot to add two things: I also have quite extensive "ReachableAddresses reject xx.xx.xx.xx/yy, [...], *:443, *:9001, *:9090-9091" as well as "ExcludeNodes" statements.

The former was introduced at the same time I added my "EntryNodes" statement as described above, the latter has been in my "default-torrc" file for ages.

My "ReachableAddresses reject xx.xx.xx.xx/yy" and ExcludeNodes statements are auto-generated and contain subnets I have completely banned from my network (I have IP firewall rules doing the same), so tor did manage to bootstrap before I included these in torrc.

The reason I put them in my torrc file is to simply let tor know what is banned from my net and what isn't, so that it doesn't waste time trying to connect to nodes/IP addresses which are banned by my firewall.

comment:2 Changed 6 years ago by mr-4

Any luck with this one?

comment:3 Changed 6 years ago by mr-4

Is anyone actively looking into this???

comment:4 Changed 6 years ago by nickm

This is another one that is going to need a torrc file that can reproduce the behavior before there's much chance of tracking it down.

comment:5 Changed 6 years ago by mr-4

I'll upgrade to the new tor version (0.2.5.2-alpha) as there was something in the changelogs that suggested this might have been fixed. If not, I'll prepare a set of config & log files and post them here as I did with #10461 and #10722

comment:6 Changed 6 years ago by mr-4

Nope! Nothing has changed and I get exactly the same thing.

What I am going to attach next is an archive file, which includes my configuration files (defaults-torrc and tor), log files and a small readme file with instructions on how to reproduce this bug.

Changed 6 years ago by mr-4

Attachment: tor-9972.zip added

configuration, log files and readme file with instructions on how to reproduce this bug

comment:7 Changed 6 years ago by nickm

Keywords: tor-client added
Milestone: Tor: 0.2.5.x-final

comment:8 Changed 6 years ago by nickm

Okay, I think I have a fix, but I need to poke it for a while longer to see why the code was doing what it was doing, and whether my fix will break something else.

The first part of this fix is that we need to change the part of choose_random_entry_impl() that does pick_entry_guards(), so that it says:

  if (! options->UseBridges &&
      smartlist_len(entry_guards) < num_needed)
    pick_entry_guards(options, for_directory);

(But if we do this, what other cases of entry_list_is_constrained() need to change? And does this actually do the right thing?)

(And if we do that, do we need to adjust entry_guards_set_from_config? The *10 there seems pointless and silly now.)

comment:9 Changed 6 years ago by nickm

(I've pushed the updated code to branch 'bug9972' for reference, but no guarantees that it's right)

comment:10 Changed 6 years ago by nickm

Keywords: 025-triaged added

comment:11 Changed 6 years ago by mr-4

Nick, is this stable enough to test? Am I likely to break something on my machine?

comment:12 Changed 6 years ago by nickm

It might be good to test this, but I'm not sure it's right. Back up your state file, keep an eye on what's happening to your guard nodes, and don't count on the set of guard nodes not getting completely trashed while testing this patch.

comment:13 Changed 6 years ago by gatorchomps

Hey, I just wanted to add some information regarding this issue. I see these warnings in my log quite often, and I have no restrictions on EntryNodes or anything else. I do run a bridge relay (so BridgeRelay = 1 in my torrc file). Otherwise, my torrc file has been unchanged from the default settings.

Changed 6 years ago by gatorchomps

Attachment: tor_browser_log added

Example of warnings in log file

comment:14 Changed 6 years ago by gatorchomps

Also forgot to mention: I run Debian Linux. I'm currently using the 3.6-beta-2 version of TBB, but I received the same warnings when I ran the stable version of the browser, as well.

comment:15 Changed 6 years ago by nickm

gatorchomps: If possible, could you post as much of your torrc as you can? (I know, you say it's unchanged from a default, but my experience is that sometimes there's one or two little things that people forgot about.)

Also, are you able to test the patch in the branch I mentioned above?

comment:16 Changed 6 years ago by nickm

Owner: set to nickm
Status: new → assigned

marking as assigned since I've started coding on these.

Changed 6 years ago by gatorchomps

Attachment: gchomps_torrc added

gatorchomps torrc

comment:17 Changed 6 years ago by gatorchomps

Okay, I've attached my torrc. Not sure how to implement that code fix, though. What file would I need to edit? I'm pretty new to this whole thing :)

comment:18 Changed 6 years ago by nickm

If you are used to building the Tor program yourself from source, I can let you know how to get the updated source code...but if that's not something you know how to do, don't worry; I'll find some way to test this.

comment:19 Changed 6 years ago by nickm

Current plan on these is to investigate the bug, try to finish writing the patch on each, and then evaluate whether the patch is 0.2.5 or 0.2.6 material in terms of simplicity and importance.

comment:20 Changed 5 years ago by asn

Hm, I still don't see what causes this bug. Especially in the case of a bridge with no restrictions on its EntryNodes. I will look into this more.

Some notes on branch bug9972:

This looks like a forgotten format string:

-  log_notice(LD_GENERAL, "%d entries in guards", smartlist_len(entry_guards));
+  log_notice(LD_GENERAL, "%d entries in guards after adding enough EntryNodes");

I'm also a bit confused by these changes:

-  if (!entry_list_is_constrained(options) &&
+  if (! options->UseBridges &&

It seems to me that those checks were there, so that if the user has configured EntryGuards we make sure that they are strictly enforced. With the new checks, it seems that even if the user has configured EntryGuards, if they are less than num_needed we will go ahead and add random entry guards from the consensus. Is this what we want, or am I misreading the code? I'd say that if a user's configured EntryGuards are not sufficient to bootstrap Tor, Tor should fail closed and abort and ask the user to tune their EntryGuards.
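The difference between the two checks in the diff above can be sketched as a pair of predicates. This is a toy model, not Tor's code: the struct and the `sketch_` names are hypothetical, and it assumes that entry_list_is_constrained() reports a constrained list whenever EntryNodes or UseBridges is set, which the ticket does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the relevant option fields. */
struct sketch_options {
  bool use_bridges;     /* UseBridges 1 */
  bool has_entry_nodes; /* an EntryNodes line is present */
};

/* Assumption: the list counts as constrained if either option is set. */
bool sketch_entry_list_is_constrained(const struct sketch_options *o)
{
  assert(o != NULL);
  return o->has_entry_nodes || o->use_bridges;
}

/* Check before the change: never top up a constrained guard list. */
bool sketch_old_should_pick(const struct sketch_options *o,
                            int n_guards, int num_needed)
{
  return !sketch_entry_list_is_constrained(o) && n_guards < num_needed;
}

/* Check in branch bug9972: only UseBridges prevents topping up. */
bool sketch_new_should_pick(const struct sketch_options *o,
                            int n_guards, int num_needed)
{
  assert(o != NULL);
  return !o->use_bridges && n_guards < num_needed;
}
```

With EntryNodes set, UseBridges unset and an empty guard list, the first predicate returns false and the second returns true, which is the behavioural change questioned above.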

Turning this to needs_revision for the format string, and I plan to review more soon.

I'd like to try to reproduce this. I'm still puzzled by the case of the single bridge with no entry guard restrictions.

comment:21 Changed 5 years ago by asn

Status: assigned → needs_revision

comment:22 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.6.x-final

When I tried, it was pretty easy to reproduce, and the fix _did_ appear to fix the issue. But yeah, this isn't tested enough or solid enough for 0.2.5. More review invited; putting in 0.2.6. I think this is not going to get done right until it can have solid unit tests.

comment:23 Changed 5 years ago by asn

FWIW, this message can also appear naturally, if EntryNodes are set, and the network is down. Tor will try to add and connect to all the EntryNodes and fail (mark them unreachable), and then when there are no more EntryNodes to add to our guard list, Tor will spit out that message.
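A minimal sketch of that benign case, assuming a toy list of configured entries with an "unreachable" mark. The struct and function names are hypothetical, and choosing the first usable entry is a simplification of how a first hop would really be selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy record for one configured entry node. */
struct sketch_entry {
  const char *nickname;
  bool unreachable; /* set after a failed connection attempt */
};

/* Return the first entry still usable as hop 0, or NULL if every
 * configured entry has been marked unreachable. A NULL result is the
 * situation in which the warning would be logged. */
const struct sketch_entry *sketch_choose_hop0(
    const struct sketch_entry *entries, size_t n)
{
  assert(entries != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (!entries[i].unreachable)
      return &entries[i];
  }
  return NULL;
}
```

If the network is down and every entry ends up marked unreachable, the function returns NULL even though the configuration itself is fine.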

comment:24 Changed 5 years ago by asn

I did some debugging using mr-4's config. Here is at least part of the problem:

If we start with no state, we have 0 entry guards. When we call choose_random_entry_impl() with EntryNodes {US} set, Tor is supposed to use entry_guards_set_from_config() to add all entry nodes from the US to the entry_guards list.
Then the entry_guards list is supposed to be filtered, so that in the end we pick a single entry guard.

However, it seems that entry_guards_set_from_config() is called quite early in the bootstrap process, and when it calls routerset_get_all_nodes() (which is supposed to add all those US entry guards) it gets to:

  } else {
    /* We need to iterate over the routerlist to get all the ones of the
     * right kind. */
    smartlist_t *nodes = nodelist_get_list();
    SMARTLIST_FOREACH(nodes, const node_t *, node, {

but so early in the bootstrap process we don't have a consensus and nodelist_get_list() just returns an empty smartlist. So no looping happens, and no entry guards are added to our entry_guards smartlist.

To test this hypothesis, I added:

  if (smartlist_len(entry_guards) > 1)
    should_add_entry_nodes = 0;

at the end of entry_guards_set_from_config(), instead of zeroing should_add_entry_nodes at the beginning. I just did this to test what would happen if we keep on trying to add entry nodes until we succeed in adding at least one. With the above change, Tor managed to bootstrap normally!

We should think of a smarter way to decide when should_add_entry_nodes should be toggled. entry_nodes_should_be_added() is not smart enough.
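The sequence described in this comment can be modelled as a small simulation. This is only a sketch under stated assumptions: the struct and the `sketch_` functions are hypothetical, the node list is reduced to a count of matching nodes, and the real functions do much more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; not Tor's real data structures. */
struct sketch_state {
  int matching_nodes;          /* nodes matching EntryNodes in the node list */
  int entry_guards;            /* current length of the guard list */
  bool should_add_entry_nodes; /* one-shot flag set when the config is read */
};

/* Behaviour described above: the flag is cleared at the beginning, so a
 * call made while the node list is still empty uses up the only attempt. */
void sketch_set_from_config_clear_first(struct sketch_state *s)
{
  assert(s != NULL);
  if (!s->should_add_entry_nodes)
    return;
  s->should_add_entry_nodes = false;
  s->entry_guards += s->matching_nodes;
}

/* Experimental variant: keep the flag until guards were actually added.
 * The "> 1" mirrors the check quoted in the comment. */
void sketch_set_from_config_clear_on_success(struct sketch_state *s)
{
  assert(s != NULL);
  if (!s->should_add_entry_nodes)
    return;
  s->entry_guards += s->matching_nodes;
  if (s->entry_guards > 1)
    s->should_add_entry_nodes = false;
}

/* Call once before the consensus arrives and once after it; return the
 * resulting number of entry guards. */
int sketch_bootstrap(void (*set_from_config)(struct sketch_state *),
                     int nodes_after_consensus)
{
  struct sketch_state s = { 0, 0, true };
  set_from_config(&s);                      /* early: node list is empty */
  s.matching_nodes = nodes_after_consensus; /* consensus has now arrived */
  set_from_config(&s);                      /* later call */
  return s.entry_guards;
}
```

In this model the clear-first variant ends with no guards however many nodes the consensus later provides, while the clear-on-success variant picks them up on the second call. That would also be consistent with the original report that a second start succeeds, if a cached consensus is already available by then; the ticket does not confirm this.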

comment:25 Changed 5 years ago by nickm

Keywords: 026-triaged-1 026-deferrable added; 025-triaged removed

comment:26 Changed 5 years ago by cypherpunks

I have this same issue.

Changed 5 years ago by Toweri

Attachment: torrc_sample.txt added

torrc, commenting out EntryNodes fixed the issue.

comment:27 Changed 5 years ago by Toweri

Had the same issue, was corrected by editing out EntryNodes in 'torrc'. See attached "torrc_sample.txt"

comment:28 Changed 5 years ago by nickm

Milestone: Tor: 0.2.6.x-final → Tor: 0.2.7.x-final

comment:29 Changed 5 years ago by nickm

Keywords: 027-triaged-1-out added

Marking triaged-out items from first round of 0.2.7 triage.

comment:30 Changed 5 years ago by nickm

Milestone: Tor: 0.2.7.x-final → Tor: 0.2.???

Move *most* 0.2.7-triaged-1-out needs_revision items into 0.2.???. Keep a few based on my sense of the sensible.

comment:31 Changed 3 years ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:32 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:33 Changed 3 years ago by arma

Severity: Normal

The title of this ticket, "Failed to find node for hop 0 of our path", simply means that Tor tried to choose a path, and it didn't like any of the available choices for the first hop. That situation can happen for all sorts of reasons, some of them normal and expected. Did somebody figure out, above, what the situation is in this ticket? If so, maybe we should rename the ticket? If not, maybe we should close it and wait for somebody to open a more specific ticket?

comment:34 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:35 Changed 2 years ago by nickm

Keywords: 027-triaged-in added

comment:36 Changed 2 years ago by nickm

Keywords: 027-triaged-in removed

comment:37 Changed 2 years ago by nickm

Keywords: 027-triaged-1-out removed

comment:38 Changed 2 years ago by nickm

Keywords: 026-triaged-1 removed

comment:39 Changed 2 years ago by nickm

Keywords: 026-deferrable removed

comment:40 Changed 2 years ago by nickm

Resolution: fixed
Status: needs_revision → closed

Prop271 completely rewrote the logic for applying restrictions to guard node selection. Closing this as fixed. There are probably other ways to trigger that message above, but they are different bugs than this one.
