Predict and build specific HS purpose circuits (rather than GENERAL)

changed milestone to %Tor: 0.3.3.x-final

added component::core tor/tor guard-discovery-prop247-controller milestone::Tor: 0.3.3.x-final owner::mikeperry parent::13837 priority::medium resolution::implemented review-group-27 review-group-30 reviewer::asn severity::normal status::closed tor-guard tor-hs type::enhancement labels

One approach here is to indeed define new circuit purposes for HS circuits. My main fear here is that these circuit purposes are used for tons of things in the codebase (from circuit timeouts, to cannibalization, to guard stuff, to stats stuff, to basic circuit functionality state machine), and hence adding new purposes means that we need to find all the right places to consider our new purposes which I imagine is gonna be a PITA and perhaps error-prone...

Another approach would be to use the hs_ident and rend_data elements of the circuit_t to distinguish HS circuits from normal circuits. That's what most of the code is doing to detect HS rend circuits.

Trac:
Keywords: guard-discovery deleted, guard-discovery-prop247-controller added
Milestone: Tor: unspecified to Tor: 0.3.3.x-final

Replying to asn:

One approach here is to indeed define new circuit purposes for HS circuits. My main fear here is that these circuit purposes are used for tons of things in the codebase (from circuit timeouts, to cannibalization, to guard stuff, to stats stuff, to basic circuit functionality state machine), and hence adding new purposes means that we need to find all the right places to consider our new purposes which I imagine is gonna be a PITA and perhaps error-prone...

Yes, it will be error-prone. I wonder how we can make this state machine easier to modify in future?

Another approach would be to use the hs_ident and rend_data elements of the circuit_t to distinguish HS circuits from normal circuits. That's what most of the code is doing to detect HS rend circuits.

No, these fields are not populated until the circuit is used for rendezvous. So they can't be used to identify preemptive circuits that are built with vanguards.

Re-parenting, because I think this prediction should be fixed for #13837 (moved) (and for our experiments using #13837 (moved) w/ https://github.com/mikeperry-tor/vanguards).

Trac:
Parent: #9001 (moved) to #13837 (moved)

Asn - you are right to worry about trying to alter how we deal with existing purposes, especially in a way that changes the HS state machine. However, adding a whole new circuit purpose is quite easy. I have done it several times before (including in my draft patch for parent ticket #13837 (moved)). I know how to do it without issue. Aside from a couple asserts and checks here and there, new purposes get ignored by existing code. That's what circuit purposes are for.

How about this: instead of messing with the existing HS purpose state machine(s), we create a new purpose (say CIRCUIT_PURPOSE_HS_GENERAL), and then in circuit_predict_and_launch_new() we launch new circuits of that type when we predict we need any hidden service circuits. Then we should be able to make just a few changes to circuit_get_open_circ_or_launch(), circuit_launch_by_extend_info(), and circuit_find_to_cannibalize() to use this new circuit type for hidden service circuits (instead of using CIRCUIT_PURPOSE_C_GENERAL for them).

That way circuits with this new purpose will otherwise be ignored unless we are looking to build a new one. And that mechanism for reusing them will be very similar as it is today, just drawing from CIRCUIT_PURPOSE_HS_GENERAL instead of CIRCUIT_PURPOSE_C_GENERAL, and we also don't have to change our current rephist prediction code.

Sound like a plan?

Replying to mikeperry:

Asn - you are right to worry about trying to alter how we deal with existing purposes, especially in a way that changes the HS state machine. However, adding a whole new circuit purpose is quite easy. I have done it several times before (including in my draft patch for parent ticket #13837 (moved)). I know how to do it without issue. Aside from a couple asserts and checks here and there, new purposes get ignored by existing code. That's what circuit purposes are for.

How about this: instead of messing with the existing HS purpose state machine(s), we create a new purpose (say CIRCUIT_PURPOSE_HS_GENERAL), and then in circuit_predict_and_launch_new() we launch new circuits of that type when we predict we need any hidden service circuits. Then we should be able to make just a few changes to circuit_get_open_circ_or_launch(), circuit_launch_by_extend_info(), and circuit_find_to_cannibalize() to use this new circuit type for hidden service circuits (instead of using CIRCUIT_PURPOSE_C_GENERAL for them).

That way circuits with this new purpose will otherwise be ignored unless we are looking to build a new one. And that mechanism for reusing them will be very similar as it is today, just drawing from CIRCUIT_PURPOSE_HS_GENERAL instead of CIRCUIT_PURPOSE_C_GENERAL, and we also don't have to change our current rephist prediction code.

Sound like a plan?

Plausible plan. I'm still a bit concerned but I don't have a better plan tbh.

What would be the interaction between CIRCUIT_PURPOSE_C_GENERAL and CIRCUIT_PURPOSE_HS_GENERAL? e.g. would you be able to cannibalize a C_GENERAL circuit to an HS_GENERAL circuit if we find that we have idle general circuits and we need HS ones (or the opposite)?

Replying to asn:

Replying to mikeperry:

Asn - you are right to worry about trying to alter how we deal with existing purposes, especially in a way that changes the HS state machine. However, adding a whole new circuit purpose is quite easy. I have done it several times before (including in my draft patch for parent ticket #13837 (moved)). I know how to do it without issue. Aside from a couple asserts and checks here and there, new purposes get ignored by existing code. That's what circuit purposes are for.

How about this: instead of messing with the existing HS purpose state machine(s), we create a new purpose (say CIRCUIT_PURPOSE_HS_GENERAL), and then in circuit_predict_and_launch_new() we launch new circuits of that type when we predict we need any hidden service circuits. Then we should be able to make just a few changes to circuit_get_open_circ_or_launch(), circuit_launch_by_extend_info(), and circuit_find_to_cannibalize() to use this new circuit type for hidden service circuits (instead of using CIRCUIT_PURPOSE_C_GENERAL for them).

That way circuits with this new purpose will otherwise be ignored unless we are looking to build a new one. And that mechanism for reusing them will be very similar as it is today, just drawing from CIRCUIT_PURPOSE_HS_GENERAL instead of CIRCUIT_PURPOSE_C_GENERAL, and we also don't have to change our current rephist prediction code.

Sound like a plan?

Plausible plan. I'm still a bit concerned but I don't have a better plan tbh.

What would be the interaction between CIRCUIT_PURPOSE_C_GENERAL and CIRCUIT_PURPOSE_HS_GENERAL? e.g. would you be able to cannibalize a C_GENERAL circuit to an HS_GENERAL circuit if we find that we have idle general circuits and we need HS ones (or the opposite)?

The desired behavior is that this should not be allowed. The idea is that these two types of general circuits will have different path construction mechanisms for the first three hops (vanguards/pinned middles vs normal), and should not be mixed.

In terms of how this will be done - I am going to alter circuit_find_to_cannibalize() so that you must also specify a circuit type to cannibalize from in addition to the one you want to create, and you can't switch between these classes of circuits (in the case where there is nothing to cannibalize in this way, that function can find nothing and return NULL, and the calling code in circuit_launch_by_extend_info() will build a fresh circuit instead). Since circuit_find_to_cannibalize() already behaves as if you specified canibalize_from == CIRCUIT_PURPOSE_C_GENERAL, I don't anticipate that the change to draw from CIRCUIT_PURPOSE_HS_GENERAL will be very complicated.

Trac:
Status: new to assigned
Owner: N/A to mikeperry

Replying to mikeperry:

Replying to asn:

Replying to mikeperry:

Asn - you are right to worry about trying to alter how we deal with existing purposes, especially in a way that changes the HS state machine. However, adding a whole new circuit purpose is quite easy. I have done it several times before (including in my draft patch for parent ticket #13837 (moved)). I know how to do it without issue. Aside from a couple asserts and checks here and there, new purposes get ignored by existing code. That's what circuit purposes are for.

How about this: instead of messing with the existing HS purpose state machine(s), we create a new purpose (say CIRCUIT_PURPOSE_HS_GENERAL), and then in circuit_predict_and_launch_new() we launch new circuits of that type when we predict we need any hidden service circuits. Then we should be able to make just a few changes to circuit_get_open_circ_or_launch(), circuit_launch_by_extend_info(), and circuit_find_to_cannibalize() to use this new circuit type for hidden service circuits (instead of using CIRCUIT_PURPOSE_C_GENERAL for them).

That way circuits with this new purpose will otherwise be ignored unless we are looking to build a new one. And that mechanism for reusing them will be very similar as it is today, just drawing from CIRCUIT_PURPOSE_HS_GENERAL instead of CIRCUIT_PURPOSE_C_GENERAL, and we also don't have to change our current rephist prediction code.

Sound like a plan?

Plausible plan. I'm still a bit concerned but I don't have a better plan tbh.

What would be the interaction between CIRCUIT_PURPOSE_C_GENERAL and CIRCUIT_PURPOSE_HS_GENERAL? e.g. would you be able to cannibalize a C_GENERAL circuit to an HS_GENERAL circuit if we find that we have idle general circuits and we need HS ones (or the opposite)?

The desired behavior is that this should not be allowed. The idea is that these two types of general circuits will have different path construction mechanisms for the first three hops (vanguards/pinned middles vs normal), and should not be mixed.

Interesting and indeed makes sense for the cases where vanguard-ish designs are enabled (e.g. hidden services in the beginning, and maybe even HS clients in the future).

I think we are reaching the point where the cannibalization logic requires a small spec of its own to document this stuff (+ various other intricacies).

Another behavior worth documenting: I wonder how the predictive circuit launching should work after this change. Should clients create predictive general circuits and also create predictive HS circuits? What about HSes?

Ok, I implemented what I described above in https://oniongit.eu/mikeperry/tor/commits/bug23101. It only does anything if #13837 (moved) is enabled, but is written such that when we implement vanguards, it will work the same for them, too.

Because https://github.com/mikeperry-tor/vanguards needs all of this code to work properly, this branch is based on #13837 (moved), which is based on #23114 (moved).

So consider this early preview review stage, I guess, while we slowly work up the merge chain on all of this.

Also asn, to your point about specs: all of our path selection has an incredible amount of cruft, technical dept, specification deficiency, and complete lack of test coverage. I don't think it's a good idea to make paying back all of that debt a blocker to fixing the guard discovery problem.

In particular, I think it is especially important to get these options merged by the 0.3.3 deadline in January, so that hidden service operators at least have the option of using them while we work on a more complete solution for 0.3.4 in May, and for later tor releases.

Trac:
Status: assigned to needs_review

Replying to mikeperry:

... Also asn, to your point about specs: all of our path selection has an incredible amount of cruft, technical dept, specification deficiency, and complete lack of test coverage. I don't think it's a good idea to make paying back all of that debt a blocker to fixing the guard discovery problem.

In particular, I think it is especially important to get these options merged by the 0.3.3 deadline in January, so that hidden service operators at least have the option of using them while we work on a more complete solution for 0.3.4 in May, and for later tor releases.

I agree that we can't fix all the code structure, missing documentation and missing tests in the short term.

But we can work out a plan for paying back this debt, so it doesn't cause the same issues the next time we want to modify this code. And we can mark this ticket with the "technical-debt" keyword, so we find it when we review our processes.

Trac:
Keywords: N/A deleted, review-group-27 added

Trac:
Reviewer: N/A to asn

Initial review performed here on oniongit. This one is a hard and complicated feature to do well and clean. Let's do it!

I'm a bit anxious about the new purpose added here, and how well it's handled by the whole codebase. I'd really like a way to test this new purpose, either unittest or integration test (currently all the integration tests fail make test-network-all because of #13837 (moved) issues). After a few #13837 (moved)/#23101 (moved) fixes get in I'll also give the vanguard project a try.

Trac:
Status: needs_review to needs_revision

Replying to teor:

I agree that we can't fix all the code structure, missing documentation and missing tests in the short term.

But we can work out a plan for paying back this debt, so it doesn't cause the same issues the next time we want to modify this code. And we can mark this ticket with the "technical-debt" keyword, so we find it when we review our processes.

Wrt a plan: An important thing to remember for this code is that it is also the first round of changes for vanguards. We will have an additional chance to clean old crufty stuff up more when we implement the full vanguard algorithm in Tor (which will involve much more refactoring of the node selection code anyhow).

Replying to asn:

Initial review performed here on oniongit. This one is a hard and complicated feature to do well and clean. Let's do it!

I'm a bit anxious about the new purpose added here, and how well it's handled by the whole codebase. I'd really like a way to test this new purpose, either unittest or integration test (currently all the integration tests fail make test-network-all because of #13837 (moved) issues). After a few #13837 (moved)/#23101 (moved) fixes get in I'll also give the vanguard project a try.

Ok, I rebased this on top of the fixups from mikeperry/bug13837, and then wrote some more fixups for your code review. They are in mikeperry/bug23101-rebased, and each fixup is linked in a comment on your oniongit review.

The main outstanding issue from both bugs is what to do about choose_good_middle_server(). Since we don't like various things about how it is changed in each bug, we should probably get together on IRC and talk about how we want to refactor it to satisfy both sets of concerns, otherwise it will be a git conflict nightmare.

Wrt checks, the vanguard code is basically meant to self-verify all of this. It verifies all of the generated paths are as expected using control port events. If there is no scaffolding in the current network tests to do things like that verify paths are as expected, then I'm leaning towards making a separate ticket for that work for all of our circuit building stuff, so that it does not block this.

OK, thanks for fixups! I reviewed the fixups and did another round of review. I still haven't had time to actually test the branch tho.

In general, this branch was hard to review IMO mainly due to the surrounding code being extremely hairy, the lack of concise high-level comments of what we are trying to achieve, and everything being done in a single commit (adding circ purpose && handling cannibalization && launching preemptive circs). Also the lack of tests is troubling, and the vanguards repo that is supposed to help testing had no README and was very hard to understand, but thankfuly mikeperry helped me over IRC. It took me 3 days of review to get a basic understanding of the subsystem and grasp the code changes that were done, and I still haven't actually tested this branch. I feel that no strong attempt was done to make the feature pleasurable for review and that's why it took me so long.

I think the next steps here are to squash the mikeperry/bug23101-rebased branch, then apply my fixes (bug23101-dev) on top of it, and then start handling the latest review comments. Then we would need a fresh merge request on gitlab because the current one is quite dirty.

In the next days, I'll try to test the branch on the real network to understand more what's going on and provide further review points, or a branch.

You're right, I could have separated the pre-building of HS_GENERAL from the cannibalization of them (similar to how I did in #13837 (moved)). I forgot to do that in the rush to get this done. I'll keep it in mind for next time.

As for the next steps, I took your dev patches and squashed them into mikeperry/bug23101-rebased, and the result is mikeperry/bug23101-squashed, which is rebased onto the freshly squashed+cleaned mikeperry/bug13837-squashed. https://oniongit.eu/mikeperry/tor/commits/bug23101-squashed if you want to comment there some more.

I'm now out until ~Jan 2. Leaving this in needs_revision because I am wary of the issue you found in #13837 (moved).

Predict and build specific HS purpose circuits (rather than GENERAL)

Child items ...

Activity