Opened 9 months ago

Closed 4 months ago

#19877 closed task (fixed)

Implement new guard selection algorithm of prop 271

Reported by: andrea Owned by: nickm
Priority: Medium Milestone: Tor: 0.3.0.x-final
Component: Core Tor/Tor Version: Tor: unspecified
Severity: Normal Keywords: isaremoved, nickwants029, tor-guards-revamp, review-group-13, actual-review-points=5, TorCoreTeam201612
Cc: asn Actual Points:
Parent ID: Points:
Reviewer: chelseakomlo, asn Sponsor: SponsorU-must

Description

Parent ticket for tasks implementing the new guard selection algorithm described in https://lists.torproject.org/pipermail/tor-dev/2016-July/011234.html

Child Tickets

TicketSummaryOwner
#19858Move guard state out of globals per new guard planandrea
#19878Sample SAMPLED_GUARDS from GUARDS per new guard plannickm
#19879Derive FILTERED_GUARDS / USABLE_FILTERED_GUARDS from SAMPLED_GUARDS per new guard plannickm
#19880Maintain set of PRIMARY_GUARDS per new guard plannickm
#19881New guard plan - guard selection for circuitsnickm
#19882New guard plan - update guard state when a circuit fails/succeedsnickm
#19883Maintain CONFIRMED_GUARDS per new guard plannickm
#19884Retry schedule for guards per new guard plannickm
#19885New guard plan - update list of waiting circuitsnickm
#19886New guard plan - update state whenever we get a new consensusnickm
#19887New guard plan - bridge supportnickm
#19888New guard plan - separate state instances when EntryNodes/ExcludeNodes/etc are usednickm
#19889New guard plan - Glue things together!nickm
#20720prop271 -- Save persistent data in state file.nickm
#20829Document state format for prop271 guardsnickm
#20834Write patches for all spec-deviations in prop271nickm
#20920Lower MAX_SAMPLE_THRESHOLD by a lotnickm
#20922Consider faster retry schedule for primary guards

Change History (30)

comment:1 Changed 9 months ago by isabela

  • Keywords isaremoved added
  • Milestone changed from Tor: 0.2.9.x-final to Tor: 0.2.???

comment:2 Changed 9 months ago by nickm

  • Keywords nickwants029 tor-guards-revamp added

Batch modification: I believe these are all (or nearly all!) required for the successful completion of the SponorU guards task.

comment:3 Changed 9 months ago by asn

  • Cc asn added

comment:4 Changed 8 months ago by isabela

  • Milestone changed from Tor: 0.2.??? to Tor: 0.2.9.x-final

comment:5 Changed 8 months ago by isabela

  • Sponsor changed from SponsorU to SponsorU-must

comment:6 Changed 7 months ago by nickm

  • Keywords nickm-deferred-20161005 added
  • Milestone changed from Tor: 0.2.9.x-final to Tor: 0.3.0.x-final

Deferring more things -- even ones I love -- to 0.3.0. Please argue if I'm wrong.

comment:7 Changed 5 months ago by nickm

  • Keywords TorCoreTeam201611 added; nickm-deferred-20161005 removed
  • Owner set to nickm
  • Status changed from new to assigned

Status: My prop271-wip branch (which I will rebase very much) has most of the common backend for prop271 implemented and tested. Next I finish the "waiting circuits list" magic, and tie it into Tor.

comment:8 Changed 5 months ago by nickm

Status: It's tied into tor, and it seems to work for the cases I can test. Coverage is high. Remaining must-do items are bridge support and entrynodes support. After that, we should take stock as we review; I will have some further tweaks to the algorithm to suggest.

comment:9 follow-up: Changed 5 months ago by asn

Hello, I did a very rough initial review based on the WIP branch prop271-wip. Haven't run it yet. Great work so far, very excited about splitting bridge code to another file as well.

Some of these comments might be too low level for this stage of implementation, so feel free to ignore them. Also, I have not re-loaded the whole proposal in my brain yet, so I didn't go too deep in the nitty gritty details.

  • There are a few big scary functions that could be splitted into smaller functions to make them more manageable. e.g. sampled_guards_update_from_consensus() and entry_guards_update_primary.

Also the fundamental function select_entry_guard_for_circuit() could be split into three smaller functions, each for one spec case.

  • Agreed that the entry_guard_succeeded() tristate is ugly and makes the circuit_send_next_onion_skin() logic harder to read. Perhaps the retval could be turned into an enum?
  • If add_bridge_as_entry_guard() is a pub function of the entrynodes API, should it have a prefix? Or we are not at that level of tidyness yet?
  • Why do we need the spaceless ISO time functions? Can't we use the spaceful ones for state? Or is it to maintain backwards compat?
  • Parsing guards from state seems a bit of a pain now. For example, entry_guard_parse_from_state() does ad-hoc string parsing. Would it be possible to use the config parsing API for that?
  • Could smartlist_remove_keeporder() be implemented with smartlist_pos() and smartlist_del_keeporder() to avoid writing more smartlist code?

Finally, how should one robustly test this branch? Do you use iptable scripts or something?

comment:10 in reply to: ↑ 9 Changed 5 months ago by nickm

Replying to asn:

Hello, I did a very rough initial review based on the WIP branch prop271-wip. Haven't run it yet. Great work so far, very excited about splitting bridge code to another file as well.

Some of these comments might be too low level for this stage of implementation, so feel free to ignore them. Also, I have not re-loaded the whole proposal in my brain yet, so I didn't go too deep in the nitty gritty details.

  • There are a few big scary functions that could be splitted into smaller functions to make them more manageable. e.g. sampled_guards_update_from_consensus() and entry_guards_update_primary.

Also the fundamental function select_entry_guard_for_circuit() could be split into three smaller functions, each for one spec case.

Ack. I'll add a "XXXX can this split up" comment to those.

  • Agreed that the entry_guard_succeeded() tristate is ugly and makes the circuit_send_next_onion_skin() logic harder to read. Perhaps the retval could be turned into an enum?

Ack, will fix.

  • If add_bridge_as_entry_guard() is a pub function of the entrynodes API, should it have a prefix? Or we are not at that level of tidyness yet?

The entire bridge->entry interface is in flux; I hope we clean it up by the end.

  • Why do we need the spaceless ISO time functions? Can't we use the spaceful ones for state? Or is it to maintain backwards compat?

For the state format I chose, everything is spaceless. That makes it much much easier.

  • Parsing guards from state seems a bit of a pain now. For example, entry_guard_parse_from_state() does ad-hoc string parsing. Would it be possible to use the config parsing API for that?

Oooh. Interesting. We'd have to make it recursive, so it can handle the values within a line rather than just handling lines. Could be neat though. Maybe as another ticket?

  • Could smartlist_remove_keeporder() be implemented with smartlist_pos() and smartlist_del_keeporder() to avoid writing more smartlist code?

Only in quadratic-time: smartlist_remove_keeporder() needs to remove every occurrence of the target element. So to emulate it with pos and del_keeporder, you'd need to say:

    int idx;
    while ( (idx = smartlist_pos(sl, element)) >= 0) {
      smartlist_del_keeporder(sl, idx);
    }

Finally, how should one robustly test this branch? Do you use iptable scripts or something?

I've been doing ad-hoc iptable stuff, doing some of my hacking from nasty coffeeshop ISPs, etc. I haven't figured out how to do a really thorough hostile-network integration test though.

comment:11 Changed 5 months ago by nickm

Update: bridges are now supported and working.

comment:12 Changed 5 months ago by nickm

  • Status changed from assigned to needs_review

Okay, here we go.

 46 files changed, 8004 insertions(+), 1356 deletions(-)

The code is pretty solid, and I think it's worth reviewing now. I've made a new branch, "prop271_030_v1". This branch will _not_ get rebased. I've uploaded it to gitlab for review at https://gitlab.com/nickm_tor/tor/merge_requests/11

There is a corresponding torspec branch in "prop271-changes" in my public torspec repository.

For follow-up items for this work, see #20822.

comment:13 Changed 5 months ago by nickm

  • Keywords review-group-13 added

comment:14 Changed 5 months ago by nickm

  • Reviewer set to chelseakomlo, asn

comment:15 follow-up: Changed 5 months ago by asn

  • Status changed from needs_review to needs_revision

Hello,

I did some initial testing and a first pass of manual code review.

You can see the first testing results here: https://trac.torproject.org/projects/tor/wiki/doc/NewGuardAlgorithmTesting

I also left a few comments on the gitlab review with things that should be discussed. Some of those comments might be follow-up material (#20822).

Also, please check my prop271_030_v1 branch for some misc fixes.

BTW, starting tomorrow and until Monday I will be travelling and will have reduced connectivity and time for review. I might manage to do a few stuff, but I will continue in full force starting next Tuesday. Next week I can do more testing, and also some refactoring, and perhaps add some unittests (e.g. for the flappiness feature).

comment:16 Changed 5 months ago by asn

  • Keywords actual-review-points=2 added

comment:17 follow-up: Changed 5 months ago by chelseakomlo

Hey Nick,

Here are some general thoughts, feel free to take/leave what is useful. I added notes to the gitlab review as well.

  • Naming conventions are a bit confusing, as we use entrynodes, entryguards, and guards. If this naming signifies pre and post prop271, maybe we can add in legacy namespacing to make it more clear (see below). Also, the bridge-specific module looks really great- if there is guard-specific code that is distinct from entrynodes, maybe this belongs in a separate guard module.
  • Making a clear distinction between pre and post prop271 code would definitely make reviewing easier, as well as eventually migrating away from legacy code... I liked asn's recommendation of namespacing, we could also move legacy code to it's own module, etc.
  • Some functions are quite large, such as sampled_guards_update_from_consensus and entry_guards_upgrade_waiting_circuits. I see some todos about splitting these up, which is great.
  • Would it make sense to put parsing code into a parser.c?
  • bridges.c doesn't have test coverage :(
  • It looks like there are several remaining todos that need to be completed. Which of these should be completed as part of this patch? For example, in entrynodes.c, update_guard_selection_choice has a comment about needing to expire existing circuits (line 653).

comment:18 in reply to: ↑ 15 Changed 5 months ago by nickm

Replying to asn:

Hello,

I did some initial testing and a first pass of manual code review.

Thank you!

You can see the first testing results here: https://trac.torproject.org/projects/tor/wiki/doc/NewGuardAlgorithmTesting

I also left a few comments on the gitlab review with things that should be discussed. Some of those comments might be follow-up material (#20822).

Okay. I'll look over all the comments on the gitlab pull request next.

Also, please check my prop271_030_v1 branch for some misc fixes.

I've merged that into my branch; thak you!

BTW, starting tomorrow and until Monday I will be travelling and will have reduced connectivity and time for review. I might manage to do a few stuff, but I will continue in full force starting next Tuesday. Next week I can do more testing, and also some refactoring, and perhaps add some unittests (e.g. for the flappiness feature).

Thank you!

comment:19 in reply to: ↑ 17 Changed 5 months ago by nickm

Replying to chelseakomlo:

Hey Nick,

Here are some general thoughts, feel free to take/leave what is useful. I added notes to the gitlab review as well.

  • Naming conventions are a bit confusing, as we use entrynodes, entryguards, and guards. If this naming signifies pre and post prop271, maybe we can add in legacy namespacing to make it more clear (see below). Also, the bridge-specific module looks really great- if there is guard-specific code that is distinct from entrynodes, maybe this belongs in a separate guard module.
  • Making a clear distinction between pre and post prop271 code would definitely make reviewing easier, as well as eventually migrating away from legacy code... I liked asn's recommendation of namespacing, we could also move legacy code to it's own module, etc.

I've tried adding #ifdefs around the old code, in 472a621ccc8a90bf90d8394210519974ad235848.

For namespacing, I think everything should wind up in one entryguards_* namespace once the legacy things are removed, and the rest of the code is cleaned up. The guards_* functions are there as generic frontends to the old and new code.

  • Some functions are quite large, such as sampled_guards_update_from_consensus and entry_guards_upgrade_waiting_circuits. I see some todos about splitting these up, which is great.

Agreed.

  • Would it make sense to put parsing code into a parser.c?

I'd like to extract it into something more general; for now it's pretty specific, though it could become more generally useful. I've added it #20919.

I think we discussed this on IRC, and decided you needed to hit "expand" on gitlab, and then things got better.

  • bridges.c doesn't have test coverage :(

Right. At least, that code is no less covered than it was before I moved it. :/

  • It looks like there are several remaining todos that need to be completed. Which of these should be completed as part of this patch? For example, in entrynodes.c, update_guard_selection_choice has a comment about needing to expire existing circuits (line 653).

That's taken care of, so I removed the comment and updated the doxygen in 6a6625c632248c. Resolving todos in in general is #20718 under #20822

comment:20 follow-up: Changed 5 months ago by nickm

Okay, I think I've answered all the comments here and on gitlab, most of them with code fixes. Do we think this is ready for merge?

comment:21 in reply to: ↑ 20 Changed 5 months ago by asn

Replying to nickm:

Okay, I think I've answered all the comments here and on gitlab, most of them with code fixes. Do we think this is ready for merge?

Thanks for fixing all the comments above. I'd like to do another round of review and testing next week before merging this. ETA for review results is Thursday.

If you feel like merging this, you can do this, and I'll follow up with comments anyhow.

comment:22 Changed 5 months ago by nickm

I'm not in a hurry; Thursday would be fine. I've started trying to fix up some of the followup issues in other tickets under #20822 ; I can keep doing that while you do the second round. Thank you!

comment:23 Changed 5 months ago by nickm

  • Status changed from needs_revision to needs_review

comment:24 Changed 5 months ago by dgoulet

  • Keywords TorCoreTeam201612 added; TorCoreTeam201611 removed

Going to December!

comment:25 Changed 5 months ago by asn

  • Keywords actual-review-points=5 added; actual-review-points=2 removed

Some more comments were added to gitlab. Getting closer to merge-ready I think!

comment:26 Changed 5 months ago by nickm

Ok; I don't think I see any comments I didn't answer there. Did I miss some?

comment:27 Changed 4 months ago by nickm

(asn says he believes this is ready)

comment:28 Changed 4 months ago by nickm

  • Status changed from needs_review to merge_ready

comment:29 Changed 4 months ago by nickm

prop271_030_v1 is now squashed as prop271_030_v1_squashed. The HEADs of the branches are identical.

Merging to master; resolving conflicts.

comment:30 Changed 4 months ago by nickm

  • Resolution set to fixed
  • Status changed from merge_ready to closed

MERGED.

Note: See TracTickets for help on using tickets.