Opened 2 years ago

Last modified 13 months ago

#21425 needs_information defect

entry_list_is_constrained() should look at the guard_selection_t object

Reported by: nickm Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: guards, 031-deferred-20170425, review-group-18, 033-triage-20180320, 033-removed-20180320, 033-removed-20180403
Cc: neel@… Actual Points:
Parent ID: #20822 Points: .5
Reviewer: asn Sponsor: SponsorV-can

Description

We use entry_list_is_constrained() in a few places to find out if our list of entry points is highly limited (e.g., to a few bridges or a few EntryNodes). But it doesn't do that very well: instead, it looks to see if EntryNodes is set or UseBridges is set.

We have better ways: we should be looking at the size of the guard sample, or something.

Child Tickets

Attachments (3)

tor_guard_selection_t.patch (3.3 KB) - added by neel 2 years ago.
Patch: Make entry_list_is_constrained() use guard_selection_t
b21425-p002.diff (4.0 KB) - added by neel 14 months ago.
Make entry_list_is_constrained() use guard_selection_t (Revision 2)
b21425-p003.diff (4.4 KB) - added by neel 14 months ago.
Make entry_list_is_constrained() use guard_selection_t (Revision 3)

Download all attachments as: .zip

Change History (32)

comment:1 Changed 2 years ago by nickm

Parent ID: #20822

comment:2 in reply to:  description Changed 2 years ago by arma

Replying to nickm:

to find out if our list of entry points is highly limited

At least originally, the definition of constrained was more like 'fixed'. The question is whether the entry list is growable using the default algorithm of "go to the consensus and pick a new one". So it isn't about whether it's limited to a few or a lot, it's about whether it's growable.

(I haven't looked at the code changes lately to see how much we've gotten away from that definition, but I'd posit that wherever we have, it's a potential bug. :)

comment:3 Changed 2 years ago by nickm

Points: .5

Changed 2 years ago by neel

Attachment: tor_guard_selection_t.patch added

Patch: Make entry_list_is_constrained() use guard_selection_t

comment:4 Changed 2 years ago by neel

I have a created a patch for this (tor_guard_selection_t.patch). Please tell me your opinions on it.

comment:5 Changed 2 years ago by nickm

Keywords: 031-deferred-20170425 added
Milestone: Tor: 0.3.1.x-finalTor: 0.3.2.x-final

Triage: batch-defer unowned items of priority Medium or lower to 0.3.2.

comment:6 Changed 2 years ago by arma

Status: newneeds_review

Moving to needs-review, because neel very quietly submitted a patch to it.

comment:7 Changed 23 months ago by neel

Cc: neel@… added

comment:8 Changed 22 months ago by nickm

Keywords: review-group-18 added

comment:9 Changed 22 months ago by asn

Reviewer: asn

comment:10 Changed 22 months ago by asn

Status: needs_reviewneeds_revision

Hello, I'm not a huge fan of the attached patch.

I think Roger's comment:2 is key here: we use entry_list_is_constrained() to check whether we only have a hardcoded set of already-sampled nodes to choose from. We use it to aggressively retry guards (since we can't sample new ones), or to ignore certain reachability failures (so that we can retry dirfetches).

The proposed patch does the following:

+entry_list_is_constrained(const or_options_t *options, guard_selection_t *gs)
 {
-  // XXXX #21425 look at the current selection.
-  if (options->EntryNodes)
-    return 1;
-  if (options->UseBridges)
+  if (gs->primary_guards_up_to_date)
     return 1;
   return 0;

which seems kinda random (what do primary guards have to do with this?), and definitely not equivalent to the previous functionality of that function.

I think the current version of this function is much better than the proposed alternative, and I can't come up with an obvious improvement here.

I'm somewhere between WONTFIXing this, and waiting some more to see if problems come up with the current code.

Otherwise we should try to definite more precisely what entry_list_is_constrained() is supposed to do and see if we are missing any obvious cases here.

comment:11 Changed 22 months ago by nickm

Sponsor: SponsorV-can

comment:12 Changed 19 months ago by nickm

Defer more needs_revision items to 0.3.3

comment:13 Changed 19 months ago by nickm

Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final

comment:14 Changed 14 months ago by neel

Sorry for the delay, but I decided to come back to the patch and start over from scratch.

I have two questions:

  1. Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?
  2. If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?

comment:15 in reply to:  14 Changed 14 months ago by teor

Replying to neel:

Sorry for the delay, but I decided to come back to the patch and start over from scratch.

I have two questions:

  1. Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?

Checking the current size of sampled entry guards is ok.
(The capacity of a smartlist is the size it can grow to without allocating additional memory.)

  1. If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?

If we want to consider Tor Browser bridge users constrained, the answer is around 10 or 20.
If not, the answer is around 3.

comment:16 Changed 14 months ago by teor

There are 27 obfs4 bridges in Tor Browser at the moment:
https://gitweb.torproject.org/builders/tor-browser-build.git/tree/projects/tor-browser/Bundle-Data/PTConfigs/bridge_prefs.js
The other bridge types have 1-5 bridges.

So let's set the limit to 50 or 100?

comment:17 in reply to:  16 Changed 14 months ago by neel

Replying to teor:

So let's set the limit to 50 or 100?

For just bridge users or also relay users?

comment:18 Changed 14 months ago by teor

I don't see any reason to use different numbers for bridge and relay users

Changed 14 months ago by neel

Attachment: b21425-p002.diff added

Make entry_list_is_constrained() use guard_selection_t (Revision 2)

comment:19 Changed 14 months ago by neel

I have a updated patch. The filename is b21425-p002.diff.

I chose to use 50 guards as the sample size from sampled_entry_guards's capacity.

comment:20 Changed 14 months ago by teor

This patch looks good.

Please make the "50" into a #define constant with an appropriate name.
We usually use names like MAX_CONSTRAINED_ENTRY_LIST_COUNT.

This line doesn't need the "?:", the condition is already boolean:

return (gs->sampled_entry_guards->capacity > 50) ? 1 : 0;

Changed 14 months ago by neel

Attachment: b21425-p003.diff added

Make entry_list_is_constrained() use guard_selection_t (Revision 3)

comment:21 Changed 14 months ago by neel

I have a new patch with the changes you have requested. The filename is b21425-p003.diff​.

comment:22 Changed 14 months ago by teor

Status: needs_revisionneeds_review

comment:23 Changed 14 months ago by asn

Status: needs_reviewneeds_revision

Hello people,

two comments on the patch so that I understand a bit better what's going on because it's been a while and I'm a bit confuse:

1) What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)

2) Why are we using the capacity field of a smartlist? We should (almost) never dig into the guts of smartlists. If we want to check the current number of sampled guards we should use smartlist_len(). Is that we are trying to do here?

I'm marking this as needs_revision so that we get a good answer to (1) and address (2).

Thanks for the code <3

comment:24 in reply to:  23 ; Changed 14 months ago by teor

Replying to asn:

1) What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)

We check guards more aggressively when entry list is constrained,
There's no point checking guards more aggressively if we have a lot of potential guards.
We chose 50, because we want to make sure Tor Browser's obfs4 guard set is considered "constrained".

Neel, please explain why we chose 50 in the comments in the patch.

Here's the background:

Replying to nickm:

We use entry_list_is_constrained() in a few places to find out if our list of entry points is highly limited (e.g., to a few bridges or a few EntryNodes). But it doesn't do that very well: instead, it looks to see if EntryNodes is set or UseBridges is set.

We have better ways: we should be looking at the size of the guard sample, or something.

Replying to teor:

Replying to neel:

  1. If #1 is okay, which capacity size should sampled_entry_guards be to return 1? Right now, I am guessing 3 guards. Should it be more? Less?

If we want to consider Tor Browser bridge users constrained, the answer is around 10 or 20.
If not, the answer is around 3.

Replying to teor:

There are 27 obfs4 bridges in Tor Browser at the moment:
https://gitweb.torproject.org/builders/tor-browser-build.git/tree/projects/tor-browser/Bundle-Data/PTConfigs/bridge_prefs.js
The other bridge types have 1-5 bridges.

So let's set the limit to 50 or 100?


Replying to asn:

2) Why are we using the capacity field of a smartlist? We should (almost) never dig into the guts of smartlists. If we want to check the current number of sampled guards we should use smartlist_len(). Is that we are trying to do here?

I asked for this change already, but I forgot to check that it had happened:

Replying to teor:

Replying to neel:

  1. Is checking the capacity of sampled_entry_guards from the guard_selection_t object okay for this patch?

Checking the current size of sampled entry guards is ok.
(The capacity of a smartlist is the size it can grow to without allocating additional memory.)

Edit: fix formatting

Last edited 14 months ago by teor (previous) (diff)

comment:25 in reply to:  24 Changed 14 months ago by asn

Status: needs_revisionneeds_information

Replying to teor:

Replying to asn:

1) What are we trying to achieve with this new entry_list_is_constrained() logic? Can someone explain to me why we consider the entry list constrained if it contains more than 50 guards, and why this logic makes sense given the way entry_list_is_constrained() is used in the rest of the code? I think this is a very important question and ideally we should have a clear and concise answer :)

We check guards more aggressively when entry list is constrained,
There's no point checking guards more aggressively if we have a lot of potential guards.

Hmm, as arma said in comment:2, this function has been used in the past to query whether our guard list is fixed, and not whether it contains lots of elements. Said otherwise, whether for each circuit we go back to the consensus and get a new random guard (not constrained), or we will just keep on trying the same ones all the time (constrained).

Furthermore, the general approach in this ticket has been to look at the guard sample size (which is always between 20 and 60: see get_max_sample_size()) and denotes how many guards we have sampled in the lifetime of the Tor state file. It does not denote how many of those guards are currently reachable or usable. That's bad since any Tor state file has lived long enough will have accumulated 60 sampled guards (e.g. all of my state files) and considered not constrained. However, those Tors are definitely constrained since we don't know how many of those 60 sampled guards are currently reachable (in prop#271 terms belong to USABLE_FILTERED_GUARDS set): it could be that only 2-3 of them are currently usable. And even if many of them are reachable, Tor will constrain itself by persistently trying the primaries and confirmed guards before falling back to the rest of its list.

In any case, we should try to understand what we are trying to gain with this ticket because it's not clear to me currently. Perhaps this ticket is a red herring and we should WONTFIX this, or perhaps switch from checking the sampled set to the usable filtered set, but we should think some more here.

That's not helped by the fact that we have two uses for entry_list_is_constrained() and they are both kinda arbitrary:

  • In circuit_get_open_circ_or_launch(), if we lack dirinfo and the guards are constrained then we retry all our guards again. I guess that's needed in this case so that we don't run out of guards to try if our list is constrained.
  • In connection_dir_request_failed(), where if our guards are not constrained and we just failed a dir request due to network error, we mark that guard as down.

Maybe Nick remembers what we wanted to gain with this ticket?

comment:26 Changed 13 months ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:27 Changed 13 months ago by nickm

Keywords: 033-removed-20180320 added

Mark all not-already-included tickets as pending review for removal from 0.3.3 milestone.

comment:28 Changed 13 months ago by asn

Keywords: 033-removed-20180403 added

comment:29 Changed 13 months ago by nickm

Milestone: Tor: 0.3.3.x-finalTor: unspecified
Note: See TracTickets for help on using tickets.