Opened 4 years ago
Last modified 14 months ago
#13928 new defect
Tor Authorities reachability testing is predictable and sequential
Reported by: | teor | Owned by: | |
---|---|---|---|
Priority: | Medium | Milestone: | Tor: unspecified |
Component: | Core Tor/Tor | Version: | Tor: 0.2.6.1-alpha |
Severity: | Normal | Keywords: | tor-dirauth reachable testing rotation |
Cc: | nickm | Actual Points: | |
Parent ID: | Points: | ||
Reviewer: | Sponsor: |
Description
In the tor network, all tor authorities test reachability in the same, predictable sequence. Each authority uses the same sequence, and, if started at similar times (a 10 second window ever 1280 seconds), they will start at the same point. (This is a particular issue with test networks.)
I'd like to randomise the start point and progression of the sequence, while keeping the property that each 1280 second cycle tests all routers.
Child Tickets
Change History (28)
comment:1 Changed 4 years ago by
comment:2 Changed 4 years ago by
Owner: | set to teor |
---|---|
Status: | new → assigned |
comment:3 Changed 4 years ago by
Status: | assigned → needs_review |
---|
These changes to tor are included in commits in:
Bugs: #13718, #13814, maybe #13787, #13839, #13924, #13823, #13929, #13963, #13928
Branch: bug13718-bug13928-randomise-reachability
Note: There are 5 branches that start with bug13718, please choose the right one.
Repository: https://github.com/teor2345/tor.git
This is a highly experimental change, and requires proper unit testing to ensure it is functioning as designed. I recommend most people use the change in #13718 instead:
Bugs: #13718, #13814, maybe #13787, #13839, #13924, #13823, #13929, #13963
Branch: bug13718-fast-bootstrap
Note: There are 5 branches that start with bug13718, please choose the right one.
Repository: https://github.com/teor2345/tor.git
comment:4 Changed 4 years ago by
Status: | needs_review → needs_revision |
---|
I am now revising #13929 & #13928 separately from #13718, as this is particularly authority-related.
See #13929 for more extensive comments.
This needs some comprehensive tests (check each unique sequence covers all relays, before repeating any relays), and therefore probably needs some refactoring to be more testable.
The WIP branch for tor is:
Bugs: #13929, #13928 (in that order)
Branch: faster-permuted-reachability
Repository: https://github.com/teor2345/tor.git
comment:5 Changed 4 years ago by
Parent ID: | #13718 → #14034 |
---|
We will now track this as part of #14034 Make TestingDirAuthVoteGuard, TestingDirAuthVoteExit and AssumeReachable less essential in test networks.
comment:6 Changed 4 years ago by
Milestone: | → Tor: 0.2.6.x-final |
---|
Let's have a look at this in 0.2.6, maybe.
comment:7 Changed 4 years ago by
Status: | needs_revision → needs_review |
---|
Nick, I've created an experimental branch for the changes in #13929 & #13928, controlled using the macros:
Using these macros makes it easy for me to tell the difference between refactored code and new code. They will need to be permantently activated, or turned into torrc options, in the final version.
Also, in order to do unit testing, I have created functions that can be set to a fixed value for testing. These functions can be mocked by passing them any value other than DSV_STD
(DSV_STD
produces the standard, non-testing behaviour).
This probably needs to be converted to the tinytest way, once I get my head around it.
I'm not expecting feedback any time soon - merging and alpha-testing #13718 is a higher priority, and I also need to polish the fixes in #13192 before going another round with tor authorities.
Commits: Refactoring, Unit Testing, #13929, #13928 (in that order)
Branch: experimental-auth-reachability
Repository: https://github.com/teor2345/tor.git
comment:8 Changed 4 years ago by
Note: passing DSV_STD (-1) as an int, checking the value, then assigning it to a uint8_t is error-prone. I need to mock these functions properly using MOCK_DECL from tinytest.
comment:9 Changed 4 years ago by
Status: | needs_review → needs_revision |
---|
reviewing...
065bb7be0309535ae298eb99a5452b3620d4c018:
- In dirserv_reachability_modulo_per_test and dirserv_reachability_initial_group, I'm not thrilled with having this be a uint8_t but take an int, and having most possible values of int getting silently turned into something out of range.
- I'd prefer separate getter and setter functions; that's what we do in most other places.
- Maybe "n_relay_testing_groups" or something would be a better name than "modulo_per_test"? Modulo as a noun doesn't mean much here.
4e6789aef56e71a9ab5ba6c983ffd757147d8dd7:
- Should probably get tests to ensure that nothing is missed or double-hit.
- This isn't exactly a random permutation. It's predictable and observable once you see a couple of values. Would using smartlist_shuffle() to get an actual random permutation be of any use here? (If not, we should probably use another word besides "permutation")
comment:10 Changed 4 years ago by
Yes, I think smartlist_shuffle() would be a much better alternative.
The same tests and speedup code would still apply, so only 4e6789aef56e71a9ab5ba6c983ffd757147d8dd7 would need to be modified.
This would make me very happy, because it would cut out much of the far-too-complicated random start/step code in favour of a static array of 128 (or fewer if scaled) uint8 values.
comment:11 Changed 4 years ago by
Nick, did you mean shuffle the entire list of relays?
Otherwise, the relay testing would still be predictable inasmuch that relays with the same 7-bits in the first byte would be tested together. I'm not sure if that matters that much.
comment:13 Changed 4 years ago by
Keywords: | 027-deferrable added |
---|
This didn't pass triage, but since it's in needs_revision, I'm just marking it deferrable and leaving it in the milestone. It can't hurt, can it?
My rationale for redlining it during triage was that the best place to do unpredictable-order testing is probably in a successor to torflow.
comment:14 Changed 4 years ago by
Keywords: | 027-triaged-1-out added |
---|
Marking triaged-out items from first round of 0.2.7 triage.
comment:15 Changed 4 years ago by
Milestone: | Tor: 0.2.7.x-final → Tor: 0.2.??? |
---|
Move *most* 0.2.7-triaged-1-out needs_revision items into 0.2.???. Keep a few based on my sense of the sensible.
comment:16 Changed 4 years ago by
Status: | needs_revision → needs_information |
---|
#13928 is a solution to a problem I don't think exists (authority reachability testing being predictable). And making it random could cause other hard-to-diagnose bugs.
Deferring it for later for now.
comment:19 Changed 2 years ago by
Keywords: | tor-03-unspecified-201612 added |
---|---|
Milestone: | Tor: 0.3.??? → Tor: unspecified |
Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.
comment:20 Changed 21 months ago by
Keywords: | tor-03-unspecified-201612 removed |
---|
Remove an old triaging keyword.
comment:21 Changed 21 months ago by
Keywords: | 027-triaged-in added |
---|
comment:22 Changed 21 months ago by
Keywords: | 027-triaged-in removed |
---|
comment:23 Changed 21 months ago by
Keywords: | 027-triaged-1-out removed |
---|
comment:24 Changed 21 months ago by
Keywords: | tor-dirauth added; tor-auth removed |
---|
Turns out that tor-auth is for directory authority so make it clearer with tor-dirauth
comment:25 Changed 21 months ago by
Keywords: | 027-deferrable removed |
---|
comment:26 Changed 21 months ago by
Keywords: | reachable testing rotation added |
---|---|
Severity: | → Normal |
comment:27 Changed 14 months ago by
Owner: | teor deleted |
---|---|
Status: | needs_information → assigned |
Disowning tickets I don't intend to work on in the next 6 months.
comment:28 Changed 14 months ago by
Status: | assigned → new |
---|
Mark all tickets that are assigned to nobody as "new".
Fixed as part of #13718. Composing commits over the next week.