Opened 4 months ago

Last modified 2 hours ago

#23100 needs_review enhancement

Circuit Build Timeout needs to count hidden service circuits

Reported by: mikeperry Owned by: mikeperry
Priority: Medium Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Normal Keywords: tor-hs, path-bias, guard-discovery-prop247-controller, needs-proposal, mike-can, prop247, tor-guard, review-group-25
Cc: asn Actual Points:
Parent ID: #9001 Points:
Reviewer: asn Sponsor:

Description

The Circuit Build Timeout code currently ignores all circuits with a desired_path_len other than 3. This causes it not to count hidden service circuits of longer path lengths.

When we change the path selection for Proposal247/#9001, we are going to want to count those circuits, because we want the circuit build timeout to prune 20% of Prop247 circuits, too. If we omit those circuits from the CBT calculation, we risk timing out either too few, too many, or even *all* of them.

The simplest way to fix this is to change circuit_timeout_want_to_count_circ() to decide to count every circuit once it has completed 3 hops, even if it plans to build more.

We may also want to alter the timeout application in circuit_expire_building(), depending on the prop247 implementation we choose. In some versions of the proposal, we can end up creating what are technically 5 hop paths (though these topologies were not very popular in Wilmington).

Child Tickets

TicketTypeStatusOwnerSummary
#23114enhancementneeds_reviewmikeperryCircuit Build Timeout should apply at circuit completion

Change History (24)

comment:1 Changed 4 months ago by mikeperry

Ok, I have an initial patch for this in mikeperry/prop247_torrc-rebased (d730d6337d24f20980469e6bb331b093afec643b). However, along the way I also noticed #23114, which we should also fix so that the results from the #13837 experiments make sense.

comment:2 Changed 8 weeks ago by dgoulet

Status: newneeds_review

This should have been in needs_review a while back!

comment:3 Changed 8 weeks ago by asn

Status: needs_reviewneeds_revision

Thanks for the patch! This seems important! A few comments:

  • I don't understand how the mods to circuit_timeout_want_to_count_circ() helps us achieve the goal of this ticket. It kinda seems like we are restricting that function even more (doc change implies that too), when we actually want it to be more accepting. I think a nice comment is required for people like me to understand how that function works wrt HS circs.
  • I also don't understand why the circuit_build_times_decide_to_count_circ() logic was moved from one function to another. I think we need some help here to understand, since I'm not familiar with the CBT system at all.
  • I think circuit_get_cpath_len() should be moded instead of adding another func circuit_get_cpath_opened_len(). We can add an arg only_count_opened which does that. I think that'd be cleaner.
  • circuit_build_times_decide_to_count_circ() is undocumented and I'd actually like to know what it does.

comment:4 Changed 4 weeks ago by nickm

Owner: set to mikeperry
Status: needs_revisionassigned

setting owner

comment:5 Changed 4 weeks ago by nickm

Status: assignedneeds_revision

comment:6 Changed 4 weeks ago by mikeperry

Keywords: guard-discovery-prop247-controller added; guard-discovery removed
Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final

comment:7 in reply to:  3 Changed 12 days ago by mikeperry

Replying to asn:

Thanks for the patch! This seems important! A few comments:

  • I don't understand how the mods to circuit_timeout_want_to_count_circ() helps us achieve the goal of this ticket. It kinda seems like we are restricting that function even more (doc change implies that too), when we actually want it to be more accepting. I think a nice comment is required for people like me to understand how that function works wrt HS circs.

Ok, I will add to the comment to explain this. Basically we're not making it more restrictive. We are changing it from considering only *exactly* three hop circuits to considering any circuit that has currently less than or equal to 3 hops built (and plans to build at least 3 hops, but maybe more).

  • I also don't understand why the circuit_build_times_decide_to_count_circ() logic was moved from one function to another. I think we need some help here to understand, since I'm not familiar with the CBT system at all.

Actually, what happened here was that a bunch of code that was dangling off of circuit_build_no_more_hops() got its own function that does only one thing (decides to count the circuit). circuit_send_next_onion_skin() now calls that function before we decide if we're done adding hops (so we can calculate timeout info at the point where the circuit reaches 3 hops).

  • I think circuit_get_cpath_len() should be moded instead of adding another func circuit_get_cpath_opened_len(). We can add an arg only_count_opened which does that. I think that'd be cleaner.

In terms of code readability, code quality, and testability, I really disagree here. Combining these two would make for a really ugly conditional in the loop, which will take much more effort to verify for correctness than two separate functions. Also, it's not like binary size matters for us for the extra copy of the loop...

  • circuit_build_times_decide_to_count_circ() is undocumented and I'd actually like to know what it does.

Ok I will add a comment here, too.

Last edited 12 days ago by mikeperry (previous) (diff)

comment:8 Changed 12 days ago by mikeperry

Status: needs_revisionneeds_review

Ok, patches for this and #23114 are in mikeperry/bugs23100+23114-cleanup-squashed (head cf6c5cc04c3f4e8024409f9a02d4285a804b6c4e). I updated the comments and wrote tests. This and the child are ready for review.

comment:9 Changed 12 days ago by dgoulet

Thanks Mike! You have an oniongit.eu account because this ain't small amount of code and would be much better to do it there? If not, we can push it there and start the review but you'll need eventually to get an account so you can push your fixups.

comment:10 Changed 12 days ago by nickm

Keywords: review-group-25 added

comment:11 Changed 11 days ago by asn

Reviewer: asn

comment:12 Changed 11 days ago by mikeperry

Ok I made a merge request on oniongit at https://oniongit.eu/network/tor/merge_requests/5. I hope that's the right way to do this.

comment:13 Changed 10 days ago by asn

Initial review on oniongit. I'm mainly concerned about the lack of testability and some parts of the code which I don't understand.

While testing this I found #24228 which seems to be quite important right now.

comment:14 Changed 10 days ago by asn

Status: needs_reviewneeds_revision

comment:15 Changed 10 days ago by mikeperry

Replied on gitlab. As for testability, there is a separate testing commit that tests the new functionality. Based on your gitlab comments, it sounded like you didn't see it? It tests one of the functions you asked for tests for.

comment:16 in reply to:  15 Changed 10 days ago by asn

Replying to mikeperry:

Replied on gitlab. As for testability, there is a separate testing commit that tests the new functionality. Based on your gitlab comments, it sounded like you didn't see it? It tests one of the functions you asked for tests for.

Thanks for the quick replies. I only looked at 42e4b3e18 since I thought that this was the commit for this ticket. As I said in #23114, I'm not used to a single branch for multiple tickets. I will review the extra commit and also reply to the gitlab comments next week.

comment:17 Changed 8 days ago by asn

OK addressed all comments on gitlab. I think with a few small improvements, and a clean squashed branch, we can move to merge_ready this.

comment:18 Changed 5 days ago by mikeperry

Status: needs_revisionneeds_review

Ok. I pushed fixups for all of asn's comments. We also had some discussion about refactoring, so I did that in the commit for #23114.

I prefer this and the child ticket (#23114) to be merged together, since the tests were written to cover both. The new merge request is https://oniongit.eu/network/tor/merge_requests/6.

The old branch with all of the fixups is still bugs23100+23114-cleanup-squashed. The new one is bugs23100+23114-cleanup-squashed2. Both heads are identical.

comment:19 Changed 5 days ago by asn

Status: needs_reviewneeds_revision

Hey Mike,

I don't like the way this and #23114 are being developed. I'm trying to review your code and I feel like you are not respecting my efforts. I have asked you three times to decouple this from the #23114 branch and you have still not done it, and not only that but you are now merging fixes of #23100 into the #23114 commits... I think you are underestimating the time it takes me to review your code and you are making it much harder for me to ACK it. It took me more than a full day to work out the details of #23100 and it's gonna take me more time to do #23114, so by having those two branches together you are delaying the whole thing. I'd prefer to see #23100 get merged, before I jump into reviewing #23114, or maybe I'd even prefer if someone else reviewed #23114 since it's non-trivial amount of reviewing.

I'm not familiar with the CBT code and reviewing it sure ain't easy, so please try to make it easy for me. Also decoupling tickets and branches from each other is standard procedure and it's how Tor development is being done.

Or maybe I'm having a bad day :)

Thanks!

comment:20 Changed 4 days ago by mikeperry

asn - I am trying to honor your many requests for things that were either already done, redundant to other code, and/or difficult to deal with wrt git history.

Since these two tickets touch the very same block of code, I see them as a unit. Maybe I should have simply filed only one ticket... Separating them is very time consuming, especially wrt tests (which did not exist for any of this code before, so I ended up having to write tests that covered a lot of timeout functionality - which was also changing in *both* of these tickets). If you want me to separate the tests, and refactor separately, and do the copy-code-first thing (even though in my eyes, that is what #213100 mostly was, aside from one extra conditional and a relocation), that will take me another *two days* of git juggling, test re-writing, and verification, at least.

I spent all day yesterday trying to merge the fixups for #23100 down without conflicts for #23114. Other than the refactoring you asked for, in fact all of those fixups are cleanly merged into #23100. I even provided you two versions of the branch (pre-fixup and post-fixup) to verify this.

Wrt refactoring: because you requested that the initial commit just be a move of code with minimal changes, I merged the refactoring that you asked for into the following ticket, #23114. I think the refactoring was a good idea. But to avoid making you review a completely different #23100, I did it in #23114 instead... It is frustrating to me that you are now upset by this effort.

I think we are both underestimating what we're asking of the other, and communicating poorly on top of it. :/

I can make a separate commit for this, but I think it is way too much for you to also ask for me to separate the tests for this and #23114 just for git history. And I'm not sure if you are even asking for this. That is still not clear.

If you want to merge this without tests, that's fine. I will make a branch for the #23100 commit in a second, and I will also make a reference branch with only the fixups it contains. But if you want tests with this code, again, I would rather simply wait for someone to review #23114. As I said, the tests cover a lot of timeout functionality that is being changed by these tickets. Writing tests for each ticket means changing the tests after each ticket, which is a lot of busy work for no gain in the end.

comment:21 Changed 4 days ago by mikeperry

Ok, I separated bug #23100 without the tests. mikeperry/bug23100-squashed is just that (and is the same commit hash for #23100 from bugs23100+23114-cleanup-squashed2). mikeperry/bug23100-presquashed-fixups contains just the fixups in that commit.

I think I also see where you got confused with the rebase. I accidentally labeled the refactoring commits as fixups to #23100, since you asked for them in that review. I simply removed those commits from mikeperry/bug23100-presquashed-fixups now. They will be part of #23114.

I will annotate the gitlab review with the fixup commit hashes for your comments.

comment:22 Changed 4 days ago by mikeperry

As I said on IRC, for further background: The reason why these tickets are functionally together (and touch the same block of code) is that both changes are needed for the rate of circuit timeouts for hidden service circuits (and regular circuits) to be at the target 20%. If we only merge this ticket, the resulting tor still will timeout circuits at some other rate. This is why in my mind they are a package deal. Tor won't exhibit the designed behavior without both fixes together.

I filed the tickets separately as I began noticing problems while writing the prop247 simulator. In retrospect, I should have filed a single ticket: "Tor does not time out circuits at the target 20% rate".

comment:23 Changed 3 days ago by mikeperry

There is now an extra fixup commit on top of both mikeperry/bug23100-squashed and mikeperry/bug23100-presquashed-fixups that refactors the function that both of these tickets are changing. This fixup commit does not change any functionality, it only moves code. #23114 will be rebased on top of this fixup.

comment:24 Changed 2 hours ago by mikeperry

Status: needs_revisionneeds_review
Note: See TracTickets for help on using tickets.