We use entry_list_is_constrained() in a few places to find out if our list of entry points is highly limited (e.g., to a few bridges or a few EntryNodes). But it doesn't do that very well: instead, it looks to see if EntryNodes is set or UseBridges is set.
We have better ways: we should be looking at the size of the guard sample, or something.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items 0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items 0
Link issues together to show that they're related.
Learn more.
to find out if our list of entry points is highly limited
At least originally, the definition of constrained was more like 'fixed'. The question is whether the entry list is growable using the default algorithm of "go to the consensus and pick a new one". So it isn't about whether it's limited to a few or a lot, it's about whether it's growable.
(I haven't looked at the code changes lately to see how much we've gotten away from that definition, but I'd posit that wherever we have, it's a potential bug. :)
I think Roger's comment:2 is key here: we use entry_list_is_constrained() to check whether we only have a hardcoded set of already-sampled nodes to choose from. We use it to aggressively retry guards (since we can't sample new ones), or to ignore certain reachability failures (so that we can retry dirfetches).
The proposed patch does the following:
+entry_list_is_constrained(const or_options_t *options, guard_selection_t *gs) {- // XXXX #21425 look at the current selection.- if (options->EntryNodes)- return 1;- if (options->UseBridges)+ if (gs->primary_guards_up_to_date) return 1; return 0;
which seems kinda random (what do primary guards have to do with this?), and definitely not equivalent to the previous functionality of that function.
I think the current version of this function is much better than the proposed alternative, and I can't come up with an obvious improvement here.
I'm somewhere between WONTFIXing this, and waiting some more to see if problems come up with the current code.
Otherwise we should try to definite more precisely what entry_list_is_constrained() is supposed to do and see if we are missing any obvious cases here.
Sorry for the delay, but I decided to come back to the patch and start over from scratch.
I have two questions:
Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?
Checking the current size of sampled entry guards is ok.
(The capacity of a smartlist is the size it can grow to without allocating additional memory.)
If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?
If we want to consider Tor Browser bridge users constrained, the answer is around 10 or 20.
If not, the answer is around 3.
two comments on the patch so that I understand a bit better what's going on because it's been a while and I'm a bit confuse:
What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)
Why are we using the capacity field of a smartlist? We should (almost) never dig into the guts of smartlists. If we want to check the current number of sampled guards we should use smartlist_len(). Is that we are trying to do here?
I'm marking this as needs_revision so that we get a good answer to (1) and address (2).
What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)
We check guards more aggressively when entry list is constrained,
There's no point checking guards more aggressively if we have a lot of potential guards.
We chose 50, because we want to make sure Tor Browser's obfs4 guard set is considered "constrained".
Neel, please explain why we chose 50 in the comments in the patch.
We use entry_list_is_constrained() in a few places to find out if our list of entry points is highly limited (e.g., to a few bridges or a few EntryNodes). But it doesn't do that very well: instead, it looks to see if EntryNodes is set or UseBridges is set.
We have better ways: we should be looking at the size of the guard sample, or something.
Why are we using the capacity field of a smartlist? We should (almost) never dig into the guts of smartlists. If we want to check the current number of sampled guards we should use smartlist_len(). Is that we are trying to do here?
I asked for this change already, but I forgot to check that it had happened:
Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?
Checking the current size of sampled entry guards is ok.
(The capacity of a smartlist is the size it can grow to without allocating additional memory.)
What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)
We check guards more aggressively when entry list is constrained,
There's no point checking guards more aggressively if we have a lot of potential guards.
Hmm, as arma said in comment:2, this function has been used in the past to query whether our guard list is fixed, and not whether it contains lots of elements. Said otherwise, whether for each circuit we go back to the consensus and get a new random guard (not constrained), or we will just keep on trying the same ones all the time (constrained).
Furthermore, the general approach in this ticket has been to look at the guard sample size (which is always between 20 and 60: see get_max_sample_size()) and denotes how many guards we have sampled in the lifetime of the Tor state file. It does not denote how many of those guards are currently reachable or usable. That's bad since any Tor state file has lived long enough will have accumulated 60 sampled guards (e.g. all of my state files) and considered not constrained. However, those Tors are definitely constrained since we don't know how many of those 60 sampled guards are currently reachable (in prop#271 terms belong to USABLE_FILTERED_GUARDS set): it could be that only 2-3 of them are currently usable. And even if many of them are reachable, Tor will constrain itself by persistently trying the primaries and confirmed guards before falling back to the rest of its list.
In any case, we should try to understand what we are trying to gain with this ticket because it's not clear to me currently. Perhaps this ticket is a red herring and we should WONTFIX this, or perhaps switch from checking the sampled set to the usable filtered set, but we should think some more here.
That's not helped by the fact that we have two uses for entry_list_is_constrained() and they are both kinda arbitrary:
In circuit_get_open_circ_or_launch(), if we lack dirinfo and the guards are constrained then we retry all our guards again. I guess that's needed in this case so that we don't run out of guards to try if our list is constrained.
In connection_dir_request_failed(), where if our guards are not constrained and we just failed a dir request due to network error, we mark that guard as down.
Maybe Nick remembers what we wanted to gain with this ticket?