Opened 4 years ago

Last modified 10 months ago

#13928 new defect

Tor Authorities reachability testing is predictable and sequential

Reported by: teor Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.2.6.1-alpha
Severity: Normal Keywords: tor-dirauth reachable testing rotation
Cc: nickm Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

In the tor network, all tor authorities test reachability in the same, predictable sequence. Each authority uses the same sequence, and, if started at similar times (a 10 second window ever 1280 seconds), they will start at the same point. (This is a particular issue with test networks.)

I'd like to randomise the start point and progression of the sequence, while keeping the property that each 1280 second cycle tests all routers.

Child Tickets

Change History (28)

comment:1 Changed 4 years ago by teor

Fixed as part of #13718. Composing commits over the next week.

comment:2 Changed 4 years ago by teor

Owner: set to teor
Status: newassigned

comment:3 Changed 4 years ago by teor

Status: assignedneeds_review

These changes to tor are included in commits in:

Bugs: #13718, #13814, maybe #13787, #13839, #13924, #13823, #13929, #13963, #13928
Branch: bug13718-bug13928-randomise-reachability
Note: There are 5 branches that start with bug13718, please choose the right one.
Repository: ​​​​​​​https://github.com/teor2345/tor.git

This is a highly experimental change, and requires proper unit testing to ensure it is functioning as designed. I recommend most people use the change in #13718 instead:

Bugs: #13718, #13814, maybe #13787, #13839, #13924, #13823, #13929, #13963
Branch: bug13718-fast-bootstrap
Note: There are 5 branches that start with bug13718, please choose the right one.
Repository: ​​​​​​​https://github.com/teor2345/tor.git

comment:4 Changed 4 years ago by teor

Status: needs_reviewneeds_revision

I am now revising #13929 & #13928 separately from #13718, as this is particularly authority-related.

See #13929 for more extensive comments.

This needs some comprehensive tests (check each unique sequence covers all relays, before repeating any relays), and therefore probably needs some refactoring to be more testable.

The WIP branch for tor is:

Bugs: #13929, #13928 (in that order)
Branch: faster-permuted-reachability
Repository: ​​​​​​​https://github.com/teor2345/tor.git

comment:5 Changed 4 years ago by teor

Parent ID: #13718#14034

We will now track this as part of #14034 Make TestingDirAuthVoteGuard, TestingDirAuthVoteExit and AssumeReachable less essential in test networks.

comment:6 Changed 4 years ago by nickm

Milestone: Tor: 0.2.6.x-final

Let's have a look at this in 0.2.6, maybe.

comment:7 Changed 4 years ago by teor

Status: needs_revisionneeds_review

Nick, I've created an experimental branch for the changes in #13929 & #13928, controlled using the macros:

  • DIRSERV_SCALE_REACHABILITY - #13929
  • DIRSERV_PERMUTE_REACHABILITY - #13928

Using these macros makes it easy for me to tell the difference between refactored code and new code. They will need to be permantently activated, or turned into torrc options, in the final version.

Also, in order to do unit testing, I have created functions that can be set to a fixed value for testing. These functions can be mocked by passing them any value other than DSV_STD (DSV_STD produces the standard, non-testing behaviour).

This probably needs to be converted to the tinytest way, once I get my head around it.

I'm not expecting feedback any time soon - merging and alpha-testing #13718 is a higher priority, and I also need to polish the fixes in #13192 before going another round with tor authorities.

Commits: Refactoring, Unit Testing, #13929, #13928 (in that order)
Branch: experimental-auth-reachability
Repository: ​​​​​​​​https://github.com/teor2345/tor.git

Last edited 4 years ago by teor (previous) (diff)

comment:8 Changed 4 years ago by teor

Note: passing DSV_STD (-1) as an int, checking the value, then assigning it to a uint8_t is error-prone. I need to mock these functions properly using MOCK_DECL from tinytest.

comment:9 Changed 4 years ago by nickm

Status: needs_reviewneeds_revision

reviewing...

065bb7be0309535ae298eb99a5452b3620d4c018:

  • In dirserv_reachability_modulo_per_test and dirserv_reachability_initial_group, I'm not thrilled with having this be a uint8_t but take an int, and having most possible values of int getting silently turned into something out of range.
  • I'd prefer separate getter and setter functions; that's what we do in most other places.
  • Maybe "n_relay_testing_groups" or something would be a better name than "modulo_per_test"? Modulo as a noun doesn't mean much here.

4e6789aef56e71a9ab5ba6c983ffd757147d8dd7:

  • Should probably get tests to ensure that nothing is missed or double-hit.
  • This isn't exactly a random permutation. It's predictable and observable once you see a couple of values. Would using smartlist_shuffle() to get an actual random permutation be of any use here? (If not, we should probably use another word besides "permutation")

comment:10 Changed 4 years ago by teor

Yes, I think smartlist_shuffle() would be a much better alternative.
The same tests and speedup code would still apply, so only 4e6789aef56e71a9ab5ba6c983ffd757147d8dd7 would need to be modified.

This would make me very happy, because it would cut out much of the far-too-complicated random start/step code in favour of a static array of 128 (or fewer if scaled) uint8 values.

Last edited 4 years ago by teor (previous) (diff)

comment:11 Changed 4 years ago by teor

Nick, did you mean shuffle the entire list of relays?
Otherwise, the relay testing would still be predictable inasmuch that relays with the same 7-bits in the first byte would be tested together. I'm not sure if that matters that much.

comment:12 Changed 4 years ago by teor

Milestone: Tor: 0.2.6.x-finalTor: 0.2.7.x-final

Maybe 0.2.7

comment:13 Changed 4 years ago by nickm

Keywords: 027-deferrable added

This didn't pass triage, but since it's in needs_revision, I'm just marking it deferrable and leaving it in the milestone. It can't hurt, can it?

My rationale for redlining it during triage was that the best place to do unpredictable-order testing is probably in a successor to torflow.

comment:14 Changed 4 years ago by nickm

Keywords: 027-triaged-1-out added

Marking triaged-out items from first round of 0.2.7 triage.

comment:15 Changed 4 years ago by nickm

Milestone: Tor: 0.2.7.x-finalTor: 0.2.???

Move *most* 0.2.7-triaged-1-out needs_revision items into 0.2.???. Keep a few based on my sense of the sensible.

comment:16 Changed 3 years ago by teor

Status: needs_revisionneeds_information

#13928 is a solution to a problem I don't think exists (authority reachability testing being predictable). And making it random could cause other hard-to-diagnose bugs.

Deferring it for later for now.

comment:17 Changed 3 years ago by teor

Parent ID: #14034

No longer depends on the parent

comment:18 Changed 2 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:19 Changed 23 months ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:20 Changed 17 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:21 Changed 17 months ago by nickm

Keywords: 027-triaged-in added

comment:22 Changed 17 months ago by nickm

Keywords: 027-triaged-in removed

comment:23 Changed 17 months ago by nickm

Keywords: 027-triaged-1-out removed

comment:24 Changed 17 months ago by dgoulet

Keywords: tor-dirauth added; tor-auth removed

Turns out that tor-auth is for directory authority so make it clearer with tor-dirauth

comment:25 Changed 17 months ago by nickm

Keywords: 027-deferrable removed

comment:26 Changed 17 months ago by nickm

Keywords: reachable testing rotation added
Severity: Normal

comment:27 Changed 10 months ago by teor

Owner: teor deleted
Status: needs_informationassigned

Disowning tickets I don't intend to work on in the next 6 months.

comment:28 Changed 10 months ago by teor

Status: assignednew

Mark all tickets that are assigned to nobody as "new".

Note: See TracTickets for help on using tickets.