#19877 closed task (fixed)

Implement new guard selection algorithm of prop 271

Reported by: andrea Owned by: nickm
Priority: Medium Milestone: Tor: 0.3.0.x-final
Component: Core Tor/Tor Version: Tor: unspecified
Severity: Normal Keywords: isaremoved, nickwants029, tor-guards-revamp, review-group-13, actual-review-points=5, TorCoreTeam201612
Cc: asn Actual Points:
Parent ID: Points:
Reviewer: chelseakomlo, asn Sponsor: SponsorU-must

Description

Parent ticket for tasks implementing the new guard selection algorithm described in https://lists.torproject.org/pipermail/tor-dev/2016-July/011234.html

Child Tickets

TicketStatusOwnerSummaryComponent
#19858closedandreaMove guard state out of globals per new guard planCore Tor/Tor
#19878closednickmSample SAMPLED_GUARDS from GUARDS per new guard planCore Tor/Tor
#19879closednickmDerive FILTERED_GUARDS / USABLE_FILTERED_GUARDS from SAMPLED_GUARDS per new guard planCore Tor/Tor
#19880closednickmMaintain set of PRIMARY_GUARDS per new guard planCore Tor/Tor
#19881closednickmNew guard plan - guard selection for circuitsCore Tor/Tor
#19882closednickmNew guard plan - update guard state when a circuit fails/succeedsCore Tor/Tor
#19883closednickmMaintain CONFIRMED_GUARDS per new guard planCore Tor/Tor
#19884closednickmRetry schedule for guards per new guard planCore Tor/Tor
#19885closednickmNew guard plan - update list of waiting circuitsCore Tor/Tor
#19886closednickmNew guard plan - update state whenever we get a new consensusCore Tor/Tor
#19887closednickmNew guard plan - bridge supportCore Tor/Tor
#19888closednickmNew guard plan - separate state instances when EntryNodes/ExcludeNodes/etc are usedCore Tor/Tor
#19889closednickmNew guard plan - Glue things together!Core Tor/Tor
#20720closednickmprop271 -- Save persistent data in state file.Core Tor/Tor
#20829closednickmDocument state format for prop271 guardsCore Tor/Tor
#20834closednickmWrite patches for all spec-deviations in prop271Core Tor/Tor
#20920closednickmLower MAX_SAMPLE_THRESHOLD by a lotCore Tor/Tor
#20922closedConsider faster retry schedule for primary guardsCore Tor/Tor

Change History (30)

comment:1 Changed 14 months ago by isabela

Keywords: isaremoved added
Milestone: Tor: 0.2.9.x-finalTor: 0.2.???

comment:2 Changed 14 months ago by nickm

Keywords: nickwants029 tor-guards-revamp added

Batch modification: I believe these are all (or nearly all!) required for the successful completion of the SponorU guards task.

comment:3 Changed 14 months ago by asn

Cc: asn added

comment:4 Changed 14 months ago by isabela

Milestone: Tor: 0.2.???Tor: 0.2.9.x-final

comment:5 Changed 14 months ago by isabela

Sponsor: SponsorUSponsorU-must

comment:6 Changed 13 months ago by nickm

Keywords: nickm-deferred-20161005 added
Milestone: Tor: 0.2.9.x-finalTor: 0.3.0.x-final

Deferring more things -- even ones I love -- to 0.3.0. Please argue if I'm wrong.

comment:7 Changed 11 months ago by nickm

Keywords: TorCoreTeam201611 added; nickm-deferred-20161005 removed
Owner: set to nickm
Status: newassigned

Status: My prop271-wip branch (which I will rebase very much) has most of the common backend for prop271 implemented and tested. Next I finish the "waiting circuits list" magic, and tie it into Tor.

comment:8 Changed 11 months ago by nickm

Status: It's tied into tor, and it seems to work for the cases I can test. Coverage is high. Remaining must-do items are bridge support and entrynodes support. After that, we should take stock as we review; I will have some further tweaks to the algorithm to suggest.

comment:9 Changed 11 months ago by asn

Hello, I did a very rough initial review based on the WIP branch prop271-wip. Haven't run it yet. Great work so far, very excited about splitting bridge code to another file as well.

Some of these comments might be too low level for this stage of implementation, so feel free to ignore them. Also, I have not re-loaded the whole proposal in my brain yet, so I didn't go too deep in the nitty gritty details.

  • There are a few big scary functions that could be splitted into smaller functions to make them more manageable. e.g. sampled_guards_update_from_consensus() and entry_guards_update_primary.

Also the fundamental function select_entry_guard_for_circuit() could be split into three smaller functions, each for one spec case.

  • Agreed that the entry_guard_succeeded() tristate is ugly and makes the circuit_send_next_onion_skin() logic harder to read. Perhaps the retval could be turned into an enum?
  • If add_bridge_as_entry_guard() is a pub function of the entrynodes API, should it have a prefix? Or we are not at that level of tidyness yet?
  • Why do we need the spaceless ISO time functions? Can't we use the spaceful ones for state? Or is it to maintain backwards compat?
  • Parsing guards from state seems a bit of a pain now. For example, entry_guard_parse_from_state() does ad-hoc string parsing. Would it be possible to use the config parsing API for that?
  • Could smartlist_remove_keeporder() be implemented with smartlist_pos() and smartlist_del_keeporder() to avoid writing more smartlist code?

Finally, how should one robustly test this branch? Do you use iptable scripts or something?

comment:10 in reply to:  9 Changed 11 months ago by nickm

Replying to asn:

Hello, I did a very rough initial review based on the WIP branch prop271-wip. Haven't run it yet. Great work so far, very excited about splitting bridge code to another file as well.

Some of these comments might be too low level for this stage of implementation, so feel free to ignore them. Also, I have not re-loaded the whole proposal in my brain yet, so I didn't go too deep in the nitty gritty details.

  • There are a few big scary functions that could be splitted into smaller functions to make them more manageable. e.g. sampled_guards_update_from_consensus() and entry_guards_update_primary.

Also the fundamental function select_entry_guard_for_circuit() could be split into three smaller functions, each for one spec case.

Ack. I'll add a "XXXX can this split up" comment to those.

  • Agreed that the entry_guard_succeeded() tristate is ugly and makes the circuit_send_next_onion_skin() logic harder to read. Perhaps the retval could be turned into an enum?

Ack, will fix.

  • If add_bridge_as_entry_guard() is a pub function of the entrynodes API, should it have a prefix? Or we are not at that level of tidyness yet?

The entire bridge->entry interface is in flux; I hope we clean it up by the end.

  • Why do we need the spaceless ISO time functions? Can't we use the spaceful ones for state? Or is it to maintain backwards compat?

For the state format I chose, everything is spaceless. That makes it much much easier.

  • Parsing guards from state seems a bit of a pain now. For example, entry_guard_parse_from_state() does ad-hoc string parsing. Would it be possible to use the config parsing API for that?

Oooh. Interesting. We'd have to make it recursive, so it can handle the values within a line rather than just handling lines. Could be neat though. Maybe as another ticket?

  • Could smartlist_remove_keeporder() be implemented with smartlist_pos() and smartlist_del_keeporder() to avoid writing more smartlist code?

Only in quadratic-time: smartlist_remove_keeporder() needs to remove every occurrence of the target element. So to emulate it with pos and del_keeporder, you'd need to say:

    int idx;
    while ( (idx = smartlist_pos(sl, element)) >= 0) {
      smartlist_del_keeporder(sl, idx);
    }

Finally, how should one robustly test this branch? Do you use iptable scripts or something?

I've been doing ad-hoc iptable stuff, doing some of my hacking from nasty coffeeshop ISPs, etc. I haven't figured out how to do a really thorough hostile-network integration test though.

comment:11 Changed 11 months ago by nickm

Update: bridges are now supported and working.

comment:12 Changed 11 months ago by nickm

Status: assignedneeds_review

Okay, here we go.

 46 files changed, 8004 insertions(+), 1356 deletions(-)

The code is pretty solid, and I think it's worth reviewing now. I've made a new branch, "prop271_030_v1". This branch will _not_ get rebased. I've uploaded it to gitlab for review at https://gitlab.com/nickm_tor/tor/merge_requests/11

There is a corresponding torspec branch in "prop271-changes" in my public torspec repository.

For follow-up items for this work, see #20822.

comment:13 Changed 11 months ago by nickm

Keywords: review-group-13 added

comment:14 Changed 11 months ago by nickm

Reviewer: chelseakomlo, asn

comment:15 Changed 11 months ago by asn

Status: needs_reviewneeds_revision

Hello,

I did some initial testing and a first pass of manual code review.

You can see the first testing results here: https://trac.torproject.org/projects/tor/wiki/doc/NewGuardAlgorithmTesting

I also left a few comments on the gitlab review with things that should be discussed. Some of those comments might be follow-up material (#20822).

Also, please check my prop271_030_v1 branch for some misc fixes.

BTW, starting tomorrow and until Monday I will be travelling and will have reduced connectivity and time for review. I might manage to do a few stuff, but I will continue in full force starting next Tuesday. Next week I can do more testing, and also some refactoring, and perhaps add some unittests (e.g. for the flappiness feature).

comment:16 Changed 11 months ago by asn

Keywords: actual-review-points=2 added

comment:17 Changed 11 months ago by chelseakomlo

Hey Nick,

Here are some general thoughts, feel free to take/leave what is useful. I added notes to the gitlab review as well.

  • Naming conventions are a bit confusing, as we use entrynodes, entryguards, and guards. If this naming signifies pre and post prop271, maybe we can add in legacy namespacing to make it more clear (see below). Also, the bridge-specific module looks really great- if there is guard-specific code that is distinct from entrynodes, maybe this belongs in a separate guard module.
  • Making a clear distinction between pre and post prop271 code would definitely make reviewing easier, as well as eventually migrating away from legacy code... I liked asn's recommendation of namespacing, we could also move legacy code to it's own module, etc.
  • Some functions are quite large, such as sampled_guards_update_from_consensus and entry_guards_upgrade_waiting_circuits. I see some todos about splitting these up, which is great.
  • Would it make sense to put parsing code into a parser.c?
  • bridges.c doesn't have test coverage :(
  • It looks like there are several remaining todos that need to be completed. Which of these should be completed as part of this patch? For example, in entrynodes.c, update_guard_selection_choice has a comment about needing to expire existing circuits (line 653).

comment:18 in reply to:  15 Changed 11 months ago by nickm

Replying to asn:

Hello,

I did some initial testing and a first pass of manual code review.

Thank you!

You can see the first testing results here: https://trac.torproject.org/projects/tor/wiki/doc/NewGuardAlgorithmTesting

I also left a few comments on the gitlab review with things that should be discussed. Some of those comments might be follow-up material (#20822).

Okay. I'll look over all the comments on the gitlab pull request next.

Also, please check my prop271_030_v1 branch for some misc fixes.

I've merged that into my branch; thak you!

BTW, starting tomorrow and until Monday I will be travelling and will have reduced connectivity and time for review. I might manage to do a few stuff, but I will continue in full force starting next Tuesday. Next week I can do more testing, and also some refactoring, and perhaps add some unittests (e.g. for the flappiness feature).

Thank you!

comment:19 in reply to:  17 Changed 11 months ago by nickm

Replying to chelseakomlo:

Hey Nick,

Here are some general thoughts, feel free to take/leave what is useful. I added notes to the gitlab review as well.

  • Naming conventions are a bit confusing, as we use entrynodes, entryguards, and guards. If this naming signifies pre and post prop271, maybe we can add in legacy namespacing to make it more clear (see below). Also, the bridge-specific module looks really great- if there is guard-specific code that is distinct from entrynodes, maybe this belongs in a separate guard module.
  • Making a clear distinction between pre and post prop271 code would definitely make reviewing easier, as well as eventually migrating away from legacy code... I liked asn's recommendation of namespacing, we could also move legacy code to it's own module, etc.

I've tried adding #ifdefs around the old code, in 472a621ccc8a90bf90d8394210519974ad235848.

For namespacing, I think everything should wind up in one entryguards_* namespace once the legacy things are removed, and the rest of the code is cleaned up. The guards_* functions are there as generic frontends to the old and new code.

  • Some functions are quite large, such as sampled_guards_update_from_consensus and entry_guards_upgrade_waiting_circuits. I see some todos about splitting these up, which is great.

Agreed.

  • Would it make sense to put parsing code into a parser.c?

I'd like to extract it into something more general; for now it's pretty specific, though it could become more generally useful. I've added it #20919.

I think we discussed this on IRC, and decided you needed to hit "expand" on gitlab, and then things got better.

  • bridges.c doesn't have test coverage :(

Right. At least, that code is no less covered than it was before I moved it. :/

  • It looks like there are several remaining todos that need to be completed. Which of these should be completed as part of this patch? For example, in entrynodes.c, update_guard_selection_choice has a comment about needing to expire existing circuits (line 653).

That's taken care of, so I removed the comment and updated the doxygen in 6a6625c632248c. Resolving todos in in general is #20718 under #20822

comment:20 Changed 11 months ago by nickm

Okay, I think I've answered all the comments here and on gitlab, most of them with code fixes. Do we think this is ready for merge?

comment:21 in reply to:  20 Changed 10 months ago by asn

Replying to nickm:

Okay, I think I've answered all the comments here and on gitlab, most of them with code fixes. Do we think this is ready for merge?

Thanks for fixing all the comments above. I'd like to do another round of review and testing next week before merging this. ETA for review results is Thursday.

If you feel like merging this, you can do this, and I'll follow up with comments anyhow.

comment:22 Changed 10 months ago by nickm

I'm not in a hurry; Thursday would be fine. I've started trying to fix up some of the followup issues in other tickets under #20822 ; I can keep doing that while you do the second round. Thank you!

comment:23 Changed 10 months ago by nickm

Status: needs_revisionneeds_review

comment:24 Changed 10 months ago by dgoulet

Keywords: TorCoreTeam201612 added; TorCoreTeam201611 removed

Going to December!

comment:25 Changed 10 months ago by asn

Keywords: actual-review-points=5 added; actual-review-points=2 removed

Some more comments were added to gitlab. Getting closer to merge-ready I think!

comment:26 Changed 10 months ago by nickm

Ok; I don't think I see any comments I didn't answer there. Did I miss some?

comment:27 Changed 10 months ago by nickm

(asn says he believes this is ready)

comment:28 Changed 10 months ago by nickm

Status: needs_reviewmerge_ready

comment:29 Changed 10 months ago by nickm

prop271_030_v1 is now squashed as prop271_030_v1_squashed. The HEADs of the branches are identical.

Merging to master; resolving conflicts.

comment:30 Changed 10 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

MERGED.

Note: See TracTickets for help on using tickets.