entry_list_is_constrained() should look at the guard_selection_t object

changed milestone to %Tor: unspecified

added 031-deferred-20170425 033-removed-20180320 033-removed-20180403 033-triage-20180320 component::core tor/tor guards milestone::Tor: unspecified parent::20822 points::.5 priority::medium review-group-18 reviewer::asn severity::normal status::needs-information type::defect labels

Trac:
Parent: N/A to #20822 (moved)

Replying to nickm:

to find out if our list of entry points is highly limited

At least originally, the definition of constrained was more like 'fixed'. The question is whether the entry list is growable using the default algorithm of "go to the consensus and pick a new one". So it isn't about whether it's limited to a few or a lot, it's about whether it's growable.

(I haven't looked at the code changes lately to see how much we've gotten away from that definition, but I'd posit that wherever we have, it's a potential bug. :)

Trac:
Points: N/A to .5

Trac:
tor_guard_selection_t.patch

Patch: Make entry_list_is_constrained() use guard_selection_t

I have a created a patch for this (tor_guard_selection_t.patch). Please tell me your opinions on it.

Triage: batch-defer unowned items of priority Medium or lower to 0.3.2.

Trac:
Milestone: Tor: 0.3.1.x-final to Tor: 0.3.2.x-final
Keywords: N/A deleted, 031-deferred-20170425 added

Moving to needs-review, because neel very quietly submitted a patch to it.

Trac:
Status: new to needs_review

Trac:
Cc: N/A to neel@neelc.org

Trac:
Keywords: N/A deleted, review-group-18 added

Trac:
Reviewer: N/A to asn

Hello, I'm not a huge fan of the attached patch.

I think Roger's comment:2 is key here: we use entry_list_is_constrained() to check whether we only have a hardcoded set of already-sampled nodes to choose from. We use it to aggressively retry guards (since we can't sample new ones), or to ignore certain reachability failures (so that we can retry dirfetches).

The proposed patch does the following:

+entry_list_is_constrained(const or_options_t *options, guard_selection_t *gs)
 {
-  // XXXX #21425 look at the current selection.
-  if (options->EntryNodes)
-    return 1;
-  if (options->UseBridges)
+  if (gs->primary_guards_up_to_date)
     return 1;
   return 0;

which seems kinda random (what do primary guards have to do with this?), and definitely not equivalent to the previous functionality of that function.

I think the current version of this function is much better than the proposed alternative, and I can't come up with an obvious improvement here.

I'm somewhere between WONTFIXing this, and waiting some more to see if problems come up with the current code.

Otherwise we should try to definite more precisely what entry_list_is_constrained() is supposed to do and see if we are missing any obvious cases here.

Trac:
Status: needs_review to needs_revision

Trac:
Sponsor: N/A to SponsorV-can

Defer more needs_revision items to 0.3.3

Trac:
Milestone: Tor: 0.3.2.x-final to Tor: 0.3.3.x-final

Sorry for the delay, but I decided to come back to the patch and start over from scratch.

I have two questions:

Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?
If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?

Replying to neel:

Sorry for the delay, but I decided to come back to the patch and start over from scratch.

I have two questions:

Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?

Checking the current size of sampled entry guards is ok. (The capacity of a smartlist is the size it can grow to without allocating additional memory.)

If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?

If we want to consider Tor Browser bridge users constrained, the answer is around 10 or 20. If not, the answer is around 3.

There are 27 obfs4 bridges in Tor Browser at the moment: https://gitweb.torproject.org/builders/tor-browser-build.git/tree/projects/tor-browser/Bundle-Data/PTConfigs/bridge_prefs.js The other bridge types have 1-5 bridges.

So let's set the limit to 50 or 100?

Replying to teor:

So let's set the limit to 50 or 100?

For just bridge users or also relay users?

I don't see any reason to use different numbers for bridge and relay users

Trac:
b21425-p002.diff

Make entry_list_is_constrained() use guard_selection_t (Revision 2)

I have a updated patch. The filename is b21425-p002.diff.

I chose to use 50 guards as the sample size from sampled_entry_guards's capacity.

This patch looks good.

Please make the "50" into a #define constant with an appropriate name. We usually use names like MAX_CONSTRAINED_ENTRY_LIST_COUNT.

This line doesn't need the "?:", the condition is already boolean:

return (gs->sampled_entry_guards->capacity > 50) ? 1 : 0;

Trac:
b21425-p003.diff

Make entry_list_is_constrained() use guard_selection_t (Revision 3)

I have a new patch with the changes you have requested. The filename is b21425-p003.diff.

Trac:
Status: needs_revision to needs_review

Hello people,

two comments on the patch so that I understand a bit better what's going on because it's been a while and I'm a bit confuse:

What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)
Why are we using the capacity field of a smartlist? We should (almost) never dig into the guts of smartlists. If we want to check the current number of sampled guards we should use smartlist_len(). Is that we are trying to do here?

I'm marking this as needs_revision so that we get a good answer to (1) and address (2).

Thanks for the code <3

Trac:
Status: needs_review to needs_revision

Replying to asn:

What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)

We check guards more aggressively when entry list is constrained, There's no point checking guards more aggressively if we have a lot of potential guards. We chose 50, because we want to make sure Tor Browser's obfs4 guard set is considered "constrained".

Neel, please explain why we chose 50 in the comments in the patch.

Here's the background:

Replying to nickm:

We use entry_list_is_constrained() in a few places to find out if our list of entry points is highly limited (e.g., to a few bridges or a few EntryNodes). But it doesn't do that very well: instead, it looks to see if EntryNodes is set or UseBridges is set.

We have better ways: we should be looking at the size of the guard sample, or something.

Replying to teor:

Replying to neel:

If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?

If we want to consider Tor Browser bridge users constrained, the answer is around 10 or 20. If not, the answer is around 3.

Replying to teor:

There are 27 obfs4 bridges in Tor Browser at the moment: https://gitweb.torproject.org/builders/tor-browser-build.git/tree/projects/tor-browser/Bundle-Data/PTConfigs/bridge_prefs.js The other bridge types have 1-5 bridges.

So let's set the limit to 50 or 100?

Replying to asn:

Why are we using the capacity field of a smartlist? We should (almost) never dig into the guts of smartlists. If we want to check the current number of sampled guards we should use smartlist_len(). Is that we are trying to do here?

I asked for this change already, but I forgot to check that it had happened:

Replying to teor:

Replying to neel:

Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?

Checking the current size of sampled entry guards is ok. (The capacity of a smartlist is the size it can grow to without allocating additional memory.)

Edit: fix formatting

Replying to teor:

Replying to asn:

What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)

We check guards more aggressively when entry list is constrained, There's no point checking guards more aggressively if we have a lot of potential guards.

Hmm, as arma said in comment:2, this function has been used in the past to query whether our guard list is fixed, and not whether it contains lots of elements. Said otherwise, whether for each circuit we go back to the consensus and get a new random guard (not constrained), or we will just keep on trying the same ones all the time (constrained).

Furthermore, the general approach in this ticket has been to look at the guard sample size (which is always between 20 and 60: see get_max_sample_size()) and denotes how many guards we have sampled in the lifetime of the Tor state file. It does not denote how many of those guards are currently reachable or usable. That's bad since any Tor state file has lived long enough will have accumulated 60 sampled guards (e.g. all of my state files) and considered not constrained. However, those Tors are definitely constrained since we don't know how many of those 60 sampled guards are currently reachable (in prop#271 terms belong to USABLE_FILTERED_GUARDS set): it could be that only 2-3 of them are currently usable. And even if many of them are reachable, Tor will constrain itself by persistently trying the primaries and confirmed guards before falling back to the rest of its list.

In any case, we should try to understand what we are trying to gain with this ticket because it's not clear to me currently. Perhaps this ticket is a red herring and we should WONTFIX this, or perhaps switch from checking the sampled set to the usable filtered set, but we should think some more here.

That's not helped by the fact that we have two uses for entry_list_is_constrained() and they are both kinda arbitrary:

In circuit_get_open_circ_or_launch(), if we lack dirinfo and the guards are constrained then we retry all our guards again. I guess that's needed in this case so that we don't run out of guards to try if our list is constrained.
In connection_dir_request_failed(), where if our guards are not constrained and we just failed a dir request due to network error, we mark that guard as down.

Maybe Nick remembers what we wanted to gain with this ticket?

Trac:
Status: needs_revision to needs_information

Marking all tickets reached by current round of 033 triage.

Trac:
Keywords: N/A deleted, 033-triage-20180320 added

Mark all not-already-included tickets as pending review for removal from 0.3.3 milestone.

Trac:
Keywords: N/A deleted, 033-removed-20180320 added

Trac:
Keywords: N/A deleted, 033-removed-20180403 added

Trac:
Milestone: Tor: 0.3.3.x-final to Tor: unspecified

Removing sponsor V as we do not have more time to include this tickets in the sponsor.

Removing sponsor from tickets that we do not have time to fit in the remain of this sponsorship.

Trac:
Sponsor: SponsorV-can to N/A

changed time estimate to 4h

moved to tpo/core/tor#21425 (closed)

entry_list_is_constrained() should look at the guard_selection_t object

Child items 0

Activity