Opened 3 years ago

Closed 2 years ago

#16943 closed enhancement (fixed)

Implement prop250 (Random Number Generation During Tor Voting)

Reported by: asn Owned by: dgoulet
Priority: High Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, review-group-3, review-group-4
Cc: teor Actual Points:
Parent ID: #8244 Points: 6
Reviewer: nickm Sponsor: SponsorR-must

Description

This is the ticket for implementing proposal 250: the proposal for having the Tor dirauths collectively generate a fresh random value every day.

Child Tickets

Ticket   Type    Status  Owner   Summary
#17349   defect  closed          Create an ed25519 shared randomness key for dirauths
#17435   defect  closed  asn     Patch dir-spec with the shared randomness info
#19006   defect  closed          [prop250] Pointer corruption and other failures in master and maint-0.2.8
#19012   defect  closed  nickm   Refactor code that looks at voted-on parameters during voting
#19124   defect  closed          Shared Random and Half-Hour Consensuses
#19125   defect  closed  teor    [prop250] Fix a time parsing error on platforms with 32 bit time_t
#19126   defect  closed          Consistently use uint64_t for integers in shared random structs
#19127   defect  closed          Don't crash authorities with more than 254 shared random reveals
#19132   defect  closed          shared random: missing field 'fetch_missing_votes' initializer
#19134   defect  closed          Shared Random: INT_8 means 8 bytes, not 8 bits
#19195   defect  closed          [prop250] Remove unnecessary assertion in sr_compute_srv
#19196   defect  closed          [prop250] Shared random version parsing fails on 32 bit

Change History (87)

comment:1 Changed 3 years ago by asn

OK. Some very initial work can be found at shared-random-refactor1 in my repo.

The main contribution here is refactoring dirvote_compute_consensuses(). This is the function that will eventually create the SR doc, and it was a big clusterfuck. The first three commits of my branch try to refactor it in a way that will allow us to add code to it without complicating it even more.

The final commit (80bd3da) adds in some of the SR stuff that we wrote during camp, plus a few more things. These are just ideas and scaffolding, so don't take it too seriously.
However, I didn't manage to proceed with this too much because the consensus signature code is very thick. It seems to me that more refactoring is needed.

The main problem is that when dirvote.c thinks consensus, it actually thinks of networkstatus. So all consensus-related functions are created with networkstatus documents in mind, which is totally different from the SR doc. You can see this by looking at how pending_consensus_t is used around the code, and also by taking a look at the following functions:

networkstatus_add_detached_signatures()
dirvote_fetch_missing_signatures()
dirvote_add_signatures_to_pending_consensus()
dirvote_add_signatures_to_all_pending_consensuses()
dirvote_add_signatures()
networkstatus_check_consensus_signature()

Also, grep for N_CONSENSUS_FLAVORS to see the various places where tor loops over all the flavors and expects to find networkstatus documents underneath. All the above functions will need to be refactored, or their callers will need to handle the SR doc totally differently.

We will need to find a nice solution to this. We could, for example, introduce a new data structure called consensus_t that is a union hosting either a networkstatus or an SR doc, and then write various methods for interacting with it. I made a start on this, but it required shitloads of changes all around the codebase, so I dropped it.
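For concreteness, here is a minimal sketch of what that tagged-union consensus_t could look like. All type, field, and function names here are illustrative, not actual Tor code:

```c
#include <stddef.h>

/* Hypothetical stand-ins for Tor's real document types. */
typedef struct networkstatus_t { int n_routers; } networkstatus_t;
typedef struct shared_random_doc_t { int n_commits; } shared_random_doc_t;

typedef enum { FLAV_NS, FLAV_SR } consensus_flavor_t;

/* A consensus_t wrapping either document, as suggested above. */
typedef struct consensus_t {
  consensus_flavor_t flavor;
  union {
    networkstatus_t *ns;
    shared_random_doc_t *sr;
  } doc;
} consensus_t;

/* Accessors that dispatch on the flavor; NULL on a flavor mismatch. */
static networkstatus_t *
consensus_get_ns(const consensus_t *c)
{
  return (c->flavor == FLAV_NS) ? c->doc.ns : NULL;
}

static shared_random_doc_t *
consensus_get_sr(const consensus_t *c)
{
  return (c->flavor == FLAV_SR) ? c->doc.sr : NULL;
}
```

Callers that don't care about the flavor pass consensus_t around; flavor-specific code calls the accessor and handles NULL.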

A lame fix here would be to just stuff all the SR info into a networkstatus_t, which might solve a few problems, but it's terrible.

Seems like the "Ah, let's just use a new consensus flavor. What can go wrong." idea is starting to show its complexity :)

comment:2 Changed 3 years ago by asn

So David and I discussed a possible refactoring here that would allow us to introduce SR doc logic to functions like networkstatus_add_detached_signatures().

The idea is that we will make a new data structure called consensus_t which will be used in places where we don't care about the specific consensus flavor. This data structure will be able to contain a networkstatus_t or a shared_random_doc_t or any other weird flavor we come up with in the future.

Then we can have some methods for consensus_t which, based on the flavor, return the appropriate data structure.

Then we will need to refactor all the functions mentioned in the previous comment (and a few more), to be able to do their thing on a networkstatus_t but then also do a similar thing for shared_random_doc_t. This will require lots of hairy code to be refactored which is scary, but doable.

Any opinions on this strategy?

comment:3 Changed 3 years ago by asn

OK. I did some work on figuring out the current phase during voting, and on generating commitments/reveals. You can find it here:
https://gitweb.torproject.org/user/asn/tor.git/log/?h=prop250-voting-day1

comment:4 Changed 3 years ago by asn

OK, second day progress at branch prop250-voting-day2:
https://gitweb.torproject.org/user/asn/tor.git/log/?h=prop250-voting-day2

This branch can do the following things:

  • Authorities can keep track of the protocol run. Authorities know which protocol phase they are in at any given time, both in the real network and in a testing network. (Unit tests included for the real network.)
  • Authorities know when a protocol run starts and can do some basic housekeeping, like wiping counters and generating fresh commitment values for themselves.
  • Authorities can generate actual reveal values. Authorities can generate some dummy commitment values.
  • Initial work has started on encoding commits/reveals to votes. Scaffolding work has also started on parsing commits/reveals from votes.
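The phase tracking in the first bullet could, as a rough sketch, boil down to something like the following. This is a simplification that assumes an hourly voting interval on the real network and a run starting at midnight, and ignores any testing-network start offset; the names mimic but are not the branch's actual code:

```c
#include <time.h>

typedef enum { SR_PHASE_COMMIT, SR_PHASE_REVEAL } sr_phase_t;

#define VOTING_INTERVAL 3600   /* one round per hour on the live network */
#define N_ROUNDS_PER_PHASE 12  /* 12 commit rounds, then 12 reveal rounds */

/* Return the phase for a given valid-after time (seconds since epoch).
 * The run starts at 00:00 UTC: commits for 12 hours, then reveals. */
static sr_phase_t
get_sr_protocol_phase(time_t valid_after)
{
  int round = (int)((valid_after / VOTING_INTERVAL) %
                    (2 * N_ROUNDS_PER_PHASE));
  return (round < N_ROUNDS_PER_PHASE) ? SR_PHASE_COMMIT : SR_PHASE_REVEAL;
}
```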

I also looked at your branch, David. I see you have code for parsing the binary representation of commitments and reveals, and also code for parsing the state file.

I noticed that the state file parsing code is quite similar to the code that we will have to write for parsing commits/reveals from votes, but with statefile code patterns instead of routerparse code patterns.

I'm talking about this type of code:

+  alg = crypto_digest_algorithm_parse_name(value);
+  if (alg == -1 || alg != SR_DIGEST_ALG) {
+  }
+  value = smartlist_get(args, 1);
+  if (digest256_from_base64(identity, value) < 0) {

It would be nice to be able to write utility parsing functions that can be used
by both subsystems so that we don't have duplicate code. I wrote some skeleton
code for the routerparse.c code in case you want to think more of how we could
merge those two logics. Check out commit 39cb7ab.
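A minimal sketch of what such a shared helper could look like, assuming both subsystems hand it the already-tokenized algorithm name and base64-encoded identity. The function name, constant names, and lengths here are illustrative, not Tor's actual API:

```c
#include <string.h>

/* Hypothetical shared helper: the state-file parser and the vote
 * parser both need "algorithm name + encoded digest" validation,
 * so factor the common checks out of both subsystems. */

#define SR_DIGEST_ALG_NAME "sha3-256"
#define SR_B64_IDENTITY_LEN 43  /* unpadded base64 of a 32-byte digest */

/* Return 0 if the algorithm name and encoded identity look valid,
 * -1 otherwise. Same checks whichever subsystem calls us. */
static int
sr_parse_alg_and_identity(const char *alg_name, const char *b64_identity)
{
  if (!alg_name || !b64_identity)
    return -1;
  if (strcmp(alg_name, SR_DIGEST_ALG_NAME) != 0)
    return -1;
  if (strlen(b64_identity) != SR_B64_IDENTITY_LEN)
    return -1;
  return 0;
}
```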

Cheers!

comment:5 Changed 3 years ago by nickm

Milestone: Tor: 0.2.??? → Tor: 0.2.8.x-final

comment:6 Changed 3 years ago by nickm

Sponsor: SponsorR?

comment:7 Changed 3 years ago by dgoulet

Sponsor: SponsorR? → SponsorR

comment:8 Changed 3 years ago by dgoulet

Points: large

comment:9 Changed 3 years ago by asn

Hello,

please see my branch prop250-voting-day5 for further work on
generating commitments and reveals. This is the first part of
switching from the dummy integer commitments/reveals of
prop250-voting-day4, to the signature-based commitments of prop250.

This is still work in progress, but commit e3b6f4f introduces a very
important unit test that shows a skeleton of how commitments and
reveals are supposed to be generated, parsed and verified. Let me know
if this logic seems acceptable to you, or we should change it.
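The commit/reveal relationship being exercised by that test can be sketched as follows. Note this toy uses 64-bit FNV-1a purely as a stand-in for the real hash, and all names are made up for illustration:

```c
#include <stdint.h>
#include <stddef.h>

/* Toy FNV-1a hash, a stand-in for the real H() -- for illustration
 * only; the actual scheme uses a cryptographic hash. */
static uint64_t
toy_hash(const uint8_t *buf, size_t len)
{
  uint64_t h = 0xcbf29ce484222325ULL;
  for (size_t i = 0; i < len; i++) {
    h ^= buf[i];
    h *= 0x100000001b3ULL;
  }
  return h;
}

typedef struct sr_commit_t {
  uint64_t hashed_reveal;  /* H(reveal), published in the commit phase */
  uint64_t reveal;         /* random value, published in the reveal phase */
} sr_commit_t;

/* Commit phase: only H(reveal) is made public. */
static void
generate_sr_commitment(sr_commit_t *commit, uint64_t random_value)
{
  commit->reveal = random_value;
  commit->hashed_reveal =
    toy_hash((const uint8_t *)&commit->reveal, sizeof(commit->reveal));
}

/* Reveal phase: check the revealed value against the earlier commit. */
static int
verify_commit_and_reveal(const sr_commit_t *commit)
{
  return toy_hash((const uint8_t *)&commit->reveal,
                  sizeof(commit->reveal)) == commit->hashed_reveal;
}
```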

To move forward now, we need to introduce the SR-specific ed25519 auth
key and include its certificate in votes, so that we stop using the
ancient RSA fingerprints and can start verifying signatures.

Tasks that need to be done next:

  • Create and start using the ed25519 SR key and certificate chains. Needs proposal!
  • Include the shared secret in the consensus.
  • Use weasel's algorithm in get_testing_network_protocol_phase().
  • Implement the conflict line (see add_conflict_to_sr_state()).
  • Implement the maintenance function and has_majority logic (see state_phase_transition()).
  • Plug ALL the memory leaks (many!)
  • Fix the disk state (but let's do this later so that all the data structures have stabilized).

[FWIW I left the stupid integer-based commits/reveals as
generate_sr_commitment_stupid() in case we want to do some chutney
testing. Eventually, generate_sr_commitment_stupid() will be replaced
by generate_sr_commitment().]

Cheers!

comment:10 in reply to:  9 Changed 3 years ago by asn

Severity: Blocker

Replying to asn:

Hello,

Tasks that need to be done next:

  • Create and start using the ed25519 SR key and certificate chains. Needs proposal!

FWIW, I made ticket #17349 specifically for this task.

comment:11 Changed 3 years ago by asn

I worked a bit on the outdated spec! Please see branch prop250-asn in my torspec repo!

Some major mods are:

  • I described how to get the ed25519 shared randomness keys from the votes.
  • I changed the protocol to start at midnight and last the whole day!
  • I fixed the majority section and the agreement section.

Please see and add your own changes!

comment:12 Changed 3 years ago by asn

Hello,

please see prop250-general-day1 for the latest branch here.

Changes since last time:

  • Calculates shared random value normally and puts it in consensus.
  • Uses ed25519 shared randomness keys.
  • Improved (but not yet ready) disk state logic.
  • Fixed the broken unittests.
  • Uses weasel's algorithm.

comment:13 Changed 3 years ago by asn

OK, here is the latest state of the game here.

In my torspec branch prop250-nosrkeys-v2 you can find prop250 without the SR keys as discussed here: https://lists.torproject.org/pipermail/tor-dev/2015-November/009875.html

In my tor branch prop250-nosrkeys-day1 you can find the corresponding code, where we use the ed25519 master key for referencing, and the RSA identity key to detect multiple commits by a single authority. I also fixed the get_state_valid_until_time() function and wrote some unittests.

I think the next step in our simplification process here is to refactor the decide() step which is not really useful anymore now that we don't do majority or conflicts. I imagine that we could make it so that there is no decide step or voted_commits, and we just move valid authoritative commits directly to the sr state during vote parsing.

comment:14 Changed 3 years ago by asn

Proposal: Please find a more refined version of the proposal at my torspec branch prop250-final-v2:
https://gitweb.torproject.org/user/asn/torspec.git/log/?h=prop250-final-v2

Code: Branch prop250-nosrkeys-day2 in my repo contains the latest code. It fixes some annoying bugs that were making testing harder.

Last edited 3 years ago by asn (previous) (diff)

comment:15 Changed 3 years ago by dgoulet

Latest branch with code rebased on master and improvement: prop250_v4.

This branch has few new things.

  • A random number is now always hashed H(RN) so we avoid leaking bytes from our PRNG to the network.
  • Add SR flag to the vote indicating if the vote participates in the protocol
  • Parse votes and populate an object in the networkstatus_t object. Once done, we only handle those commits once we are sure a vote is valid and from a known authority (we weren't doing that before).
  • Refactor some functions to be able to support the last bullet above and improve naming on some functions.

A few things are still missing that I'll get to:

  • Consensus params for the number of participants.
  • Torrc option to disable SR support (basically making the dirauth not add the SR flag to the vote).
  • Some issues with the disk state, also because we now rely on the RSA fingerprint of a dirauth.

comment:16 in reply to:  15 ; Changed 3 years ago by asn

Replying to dgoulet:

Latest branch with code rebased on master and improvement: prop250_v4.

Great! Please also check my prop250_v4 branch. Based on yours with a few mods.

Specifically:

  • Some small code tweaks and variable/function renames.
  • I removed the disaster formula. It's not useful anymore. Please see commit message for more info.
  • Made the vote parsing logic a bit more robust, I hope.
  • I renamed and changed the semantics of NumSRParticipants. Now, instead of using it just to count the number of participants, we use it to reject SRVs that have been voted by less than NumSRParticipants dirauths. This makes us more robust against edge cases like the one where the 5th SR dirauth appears during the reveal phase and calculates a different SRV than the other 4.
  • Also, now on the majority calculation, I use the total number of dirauths instead of the number of voters. I did this to mirror the way the n_required works in networkstatus_check_consensus_signature().
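Those last two bullets together could be sketched like this: count agreements against the total number of dirauths (mirroring n_required in networkstatus_check_consensus_signature()), using a majority threshold as the example default. Names are made up for illustration:

```c
/* Majority over the total number of dirauths, not just the voters,
 * mirroring how n_required works for consensus signatures. */
static int
get_n_voters_required(int total_authorities)
{
  return (total_authorities / 2) + 1;
}

/* Keep an SRV only if enough authorities voted for it. This rejects
 * e.g. the SRV a late 5th SR dirauth computes on its own during the
 * reveal phase, differing from the other 4. */
static int
should_keep_srv(int num_agreements, int total_authorities)
{
  return num_agreements >= get_n_voters_required(total_authorities);
}
```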

Let me know if you find these things reasonable!!!

More future things:

  • We should start actually voting on the NumSRVAgreements parameter. And maybe for the real network the default value should be ((n_auths / 2) + 1), instead of n_auths as it is now.
  • Also, I've been getting these interesting log messages for a while now:
    [SR] You may not provide a value for virtual option 'SharedRandCurrentValue'
    
    but I'm not sure what they mean. I think they started appearing when we switched from LINELIST_S to LINELIST_V for SRVs.
Last edited 3 years ago by asn (previous) (diff)

comment:17 in reply to:  16 Changed 3 years ago by dgoulet

Replying to asn:

Great! Please also check my prop250_v4 branch. Based on yours with a few mods.

Specifically:

  • Some small code tweaks and variable/function renames.

Commit: d653f8cc17d6e278e16acb8a3101a821f9f42f6e

- Macroify previous/current SRV strings.

Not sure I agree with that. I much prefer typed variables here instead of macros, for strings that we only use locally in this file. We get compiler type-checking that way, whereas macros are pretty yolo in terms of compiler warnings. Do you see a win for macros?

More future things:

  • We should start actually voting on the NumSRVAgreements parameter. And maybe for the real network the default value should be ((n_auths / 2) + 1), instead of n_auths as it is now.

So you mean that before merging 250 we should make an ad hoc patch right now? Most dirauths won't upgrade until a stable release, and since we plan to have this in 028-stable, we can make this a big block.

Hrm, consensus params are options in a file for dirauths, so we have to decide on a "hard" number, which in this case would be 5. And we can see how the SR behaves over time and increase it if needed?

  • Also, I've been getting these interesting log messages for a while now:
    [SR] You may not provide a value for virtual option 'SharedRandCurrentValue'
    
    but I'm not sure what they mean. I think they started appearing when we switched from LINELIST_S to LINELIST_V for SRVs.

Yeah, I have yet to understand this one, but I think that when the state file is read that option can be missing, and thus we get this warning...

comment:18 Changed 2 years ago by asn

Latest spec at prop250-final-v4. Should include everything we have discussed up to and including CCC.

comment:19 Changed 2 years ago by dgoulet

Priority: Medium → High
Severity: Blocker → Normal
Status: new → needs_review
Type: defect → enhancement

Hello! asn and I are very happy to present to you wonderful reviewers the implementation for proposal 250 along with the final specification:

Both branches, in tor.git and torspec.git: dgoulet/prop250_final_v1

Some notes: we've separated this into 7 commits prefixed with prop250:, except the first one, which adds a needed tor_htonll/ntohll function to the tor utils. This code is mostly contained in two *new* files (with their headers): shared-random.{c|h} and shared-random-state.{c|h}.

Our unit test code coverage:

    shared-random-state.c   92.8 %      376 / 405      65.3 %      109 / 167
    shared-random.c         85.1 %      326 / 383      66.9 %      101 / 151

Also, there are attacks on this protocol that we are well aware of, but all are easily detectable, so we've written a DocTor specification that atagar will help us deploy once this is merged.

https://storm.torproject.org/shared/kXO7N2oatC9RzRcZvfLNVaA1yNkZ6m5rODGUYEt08H6

Finally, we expect this code to run for a long time before the shared random values generated by the authorities are used; thus, for now, you will NOT find anything using them.

Please ask questions! This won't be that trivial to review :).

comment:20 Changed 2 years ago by teor

Cc: teor added

comment:21 Changed 2 years ago by dgoulet

Cc: dgoulet removed

After the proposal review that happened a few days ago on #tor-dev, we made some adjustments to the specification and code.

Please find the latest code here: prop250_final_v3

Proposal 250 has been updated with the latest in torspec.git

Last edited 2 years ago by dgoulet (previous) (diff)

comment:22 in reply to:  21 ; Changed 2 years ago by teor

Replying to dgoulet:

After the proposal review that happened a few days ago on #tor-dev, we made some adjustments to the specification and code.

Please find the latest code here: prop250_final_v3

Proposal 250 has been updated with the latest in torspec.git

Code review:

General comments:

Every other multi-word file name in src/or uses underscores.

If a function moves data from (dtype* dst, const stype* src), can the arguments be in that order, and can the src argument be const?

There are a lot of missed opportunities for const pointer arguments.

e282227ebc03ae71f1cd6490a2adf2058140c66e:

If you use #ifdef WORDS_BIGENDIAN (from orconfig.h), rather than a runtime check against htonl, some compilers will produce better code.
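As an aside, a fully portable alternative is to serialize byte-by-byte in network order, which needs neither the runtime htonl test nor the #ifdef. A sketch (this is not Tor's actual tor_htonll implementation; names are illustrative):

```c
#include <stdint.h>

/* Write a 64-bit value in network (big-endian) byte order, one byte
 * at a time. Correct on either endianness with no feature test. */
static void
set_uint64_be(uint8_t *out, uint64_t v)
{
  for (int i = 0; i < 8; i++)
    out[i] = (uint8_t)(v >> (8 * (7 - i)));
}

/* Read a 64-bit big-endian value back out. */
static uint64_t
get_uint64_be(const uint8_t *in)
{
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v = (v << 8) | in[i];
  return v;
}
```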

72baec6805aaab8d62f2deed8fbc658c82a7d086:

shared-random-state.c:

If we use an asymmetric magic number, we can detect when the two halves are swapped.

#define SR_DISK_STATE_MAGIC 42424242

A hard-coded SHARED_RANDOM_N_ROUNDS is going to make it really hard to test hidden services quickly using chutney. (We'll always be testing them using the default initial shared random value.) Can we make this configurable in test networks?

#define SHARED_RANDOM_N_ROUNDS 12

get_start_time_of_current_round() doesn't take TestingV3AuthVotingStartOffset into account. Why not use:

time_t next_start = voting_schedule.interval_starts;
time_t curr_start = dirvote_get_start_of_next_interval(next_start - interval - 1,
                                                       interval,
                                                       options->TestingV3AuthVotingStartOffset);

get_next_valid_after_time() duplicates code in dirvote_recalculate_timing().
Why not use voting_schedule.interval_starts?
(You'd need to make sure dirvote_recalculate_timing() had been called before either of these functions used voting_schedule.interval_starts.)

Nitpick: why 2? Can it be a named constant like SHARED_RANDOM_N_PHASES?
get_state_valid_until_time, get_sr_protocol_phase

int total_rounds = SHARED_RANDOM_N_ROUNDS * 2;

get_state_valid_until_time() also ignores TestingV3AuthVotingStartOffset. I think this should fix it:

current_round = (beginning_of_current_round / voting_interval) % total_rounds;

get_sr_protocol_phase() also ignores TestingV3AuthVotingStartOffset. Perhaps passing around the time isn't the best way to do this? It seems like if we can make this same mistake in multiple places, then we should refactor the round number calculation so it's in a single function.
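The refactor being suggested could be as small as one shared helper that every caller goes through, so the offset can only be forgotten in one place. A sketch, with constant and option names following the comments above (not the branch's actual code):

```c
#include <time.h>

#define SHARED_RANDOM_N_ROUNDS 12
#define SHARED_RANDOM_N_PHASES 2

/* Single place to compute the round number from a valid-after time,
 * a voting interval, and the TestingV3AuthVotingStartOffset value,
 * so the offset can't be dropped by individual callers. */
static int
get_round_num(time_t valid_after, int voting_interval, int start_offset)
{
  int total_rounds = SHARED_RANDOM_N_ROUNDS * SHARED_RANDOM_N_PHASES;
  return (int)(((valid_after - start_offset) / voting_interval)
               % total_rounds);
}
```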

Does get_sr_protocol_phase() need to check it's getting an actual valid_after time, rather than now?

Why do we have disk_state_validate() and disk_state_validate_cb()?
Should one call the other? Should they check _magic?
Should they check ValidAfter < ValidUntil?

disk_state_parse_commits() fails to SMARTLIST_FOREACH(args, char *, cp, tor_free(cp)); if commit is NULL.

disk_state_parse_sr_values() contains duplicate code in the if/else block. The function comment should say it parses both SharedRandPreviousValue and SharedRandCurrentValue.

I think disk_state_parse_sr_values() might be able to leak memory if disk_state_parse_srv() fails. At the very least, it leaves an allocated but zero pointer, which could confuse the rest of the code.

disk_state_parse_sr_values() can assert its arguments are not NULL.

disk_state_put_commit_line() can have commit as a const pointer, and in the opposite order for consistency.

In disk_state_put_commit_line(), do we need to memwipe reveal_str?
Or are we about to make it public anyway?

disk_state_put_srv_line() can have srv as a const pointer, and in the opposite order for consistency.

I think disk_state_reset() is the moral equivalent of disk_state_free() followed by disk_state_new(). Why are they different functions? (Apart from the memory thing.)

We don't seem to check for errors in disk_state_update(). I think this is ok, but I'm not sure exactly why.

In disk_state_save_to_disk() and children (and other functions), should we memwipe any temporary memory that contains the reveal value, or the encoded reveal value? (before it is revealed)
We do this some places, but not others.

Can data in state_query_get_() be a const pointer?
Some of the sr_state_set_* functions can take const pointers.

shared-random-state.h:

I think this comment is in the wrong place, shouldn't it correspond to SHARED_RANDOM_STATE_PRIVATE?

/* Private methods (only used by shared-random.c): */

shared-random.c:

char b64_decoded[SR_COMMIT_LEN + 2];

Can we either fix #17868 as part of this branch, or add a #define for the rounding-up that has to occur when the base64 value isn't padded?

#define B64_DECODE_PAD(len) ((len) + ((len) % 3))
char b64_decoded[B64_DECODE_PAD(SR_COMMIT_LEN)];

Is TIMESTAMP first, or H(REVEAL)?
commit_decode says one thing in the comments, and another in the code.

Can we assert that decoded_len == SR_COMMIT_LEN? Is that useful?

reveal_decode() duplicates commit_decode(). Can we refactor to avoid that?

In sr_parse_srv(), why do we use base16 for the SRV, but base64 for the commits?

In sr_parse_commit(), we ignore extra arguments after the fourth. Is this intentional?
A comment either way would help.

I have 5 more commits to review. (Please add fixups to the end of the branch so I don't lose my place.)

comment:23 in reply to:  22 ; Changed 2 years ago by dgoulet

Replying to teor:

Thanks for that teor!

New branch in prop250_final_v4.

Every other multi-word file name in src/or uses underscores.

Fixed. Extra commit here since this applies to almost all commits.

If a function moves data from (dtype* dst, const stype* src), can the arguments be in that order, and can the src argument be const?

There are a lot of missed opportunities for const pointer arguments.

Fixed most of those. See "fixup" commits

e282227ebc03ae71f1cd6490a2adf2058140c66e:

If you use #ifdef WORDS_BIGENDIAN (from orconfig.h), rather than a runtime check against htonl, some compilers will produce better code.

Fixed!

72baec6805aaab8d62f2deed8fbc658c82a7d086:

shared-random-state.c:

If we use an asymmetric magic number, we can detect when the two halves are swapped.

#define SR_DISK_STATE_MAGIC 42424242

Fixed

A hard-coded SHARED_RANDOM_N_ROUNDS is going to make it really hard to test hidden services quickly using chutney. (We'll always be testing them using the default initial shared random value.) Can we make this configurable in test networks?

#define SHARED_RANDOM_N_ROUNDS 12

The part I don't like about changing this value for a testing network is that we do NOT get the real behavior of the protocol... I'm not against a testing value, but I would do that after merge, in a separate ticket.

get_start_time_of_current_round() doesn't take TestingV3AuthVotingStartOffset into account. Why not use:

time_t next_start = voting_schedule.interval_starts;
time_t curr_start = dirvote_get_start_of_next_interval(next_start - interval - 1,interval,
                      options->TestingV3AuthVotingStartOffset);

get_next_valid_after_time() duplicates code in dirvote_recalculate_timing().
Why not use voting_schedule.interval_starts?
(You'd need to make sure dirvote_recalculate_timing() had been called before either of these functions used voting_schedule.interval_starts.)

Changing this part of the code after the extensive testing we did could be very delicate. asn is looking at a possible solution.

Nitpick: why 2? Can it be a named constant like SHARED_RANDOM_N_PHASES?
get_state_valid_until_time, get_sr_protocol_phase

int total_rounds = SHARED_RANDOM_N_ROUNDS * 2;

Definitely better.

get_state_valid_until_time() also ignores TestingV3AuthVotingStartOffset. I think this should fix it:

current_round = (beginning_of_current_round / voting_interval) % total_rounds;

get_sr_protocol_phase() also ignores TestingV3AuthVotingStartOffset. Perhaps passing around the time isn't the best way to do this? It seems like if we can make this same mistake in multiple places, then we should refactor the round number calculation so it's in a single function.

Does get_sr_protocol_phase() need to check it's getting an actual valid_after time, rather than now?

No; a valid_after time that is now is valid, and that is what we set at startup, since we have no valid_after time until the first consensus.

Why do we have disk_state_validate() and disk_state_validate_cb()?

We must have a callback for the "state file" API; NULL is not accepted. Furthermore, if the callback returns an error, right now tor does a wonderful assert(0)... so it has been ignored.

Should one call the other? Should they check _magic?

This is done with CONFIG_CHECK(), which is called multiple times when you use config_assign().

Should they check ValidAfter < ValidUntil?

Fixed! (But the correct check is >= here :)

disk_state_parse_commits() fails to SMARTLIST_FOREACH(args, char *, cp, tor_free(cp)); if commit is NULL.

Fixed

disk_state_parse_sr_values() contains duplicate code in the if/else block. The function comment should say it parses both SharedRandPreviousValue and SharedRandCurrentValue.

Fixed

I think disk_state_parse_sr_values() might be able to leak memory if disk_state_parse_srv() fails. At the very least, it leaves an allocated but zero pointer, which could confuse the rest of the code.

Fixed with the commit above.

disk_state_parse_sr_values() can assert its arguments are not NULL.

Fixed with commit above

disk_state_put_commit_line() can have commit as a const pointer, and in the opposite order for consistency.

Fixed.

In disk_state_put_commit_line(), do we need to memwipe reveal_str?
Or are we about to make it public anyway?

Indeed, I think you are right. We do put _our_ commit on disk during the commit phase, so better to clean up memory here to be safe.

disk_state_put_srv_line() can have srv as a const pointer, and in the opposite order for consistency.

Fixed.

I think disk_state_reset() is the moral equivalent of disk_state_free() followed by disk_state_new(). Why are they different functions? (Apart from the memory thing.)

Not much. We avoid an alloc/free every time we update the state, which happens often during a voting period. (Probably not enough to be noticeable, but still.)

We don't seem to check for errors in disk_state_update(). I think this is ok, but I'm not sure exactly why.

We assume that everything in our memory state is coherent, so the disk update can only use values that are usable. I think it's fine: if the memory state has crazy stuff in it, things will go bad whether or not we put it on disk.

In disk_state_save_to_disk() and children (and other functions), should we memwipe any temporary memory that contains the reveal value, or the encoded reveal value? (before it is revealed)
We do this some places, but not others.

I've added a needed memwipe in reveal_encode, since we do encode _our_ commitment during the commit phase, so the reveal value should only be in the commit object's memory and not on the stack.

Can data in state_query_get_() be a const pointer?
Some of the sr_state_set_* functions can take const pointers.

For the get_, yes, I fixed it, but not for the set_, since digestmap doesn't handle a const.

shared-random-state.h:

I think this comment is in the wrong place, shouldn't it correspond to SHARED_RANDOM_STATE_PRIVATE?

/* Private methods (only used by shared-random.c): */

Ah, by that we mean methods that are _ONLY_ used by shared_random.c, so "private" to the subsystem, but not static because they need to be visible.

shared-random.c:

char b64_decoded[SR_COMMIT_LEN + 2];

Can we either fix #17868 as part of this branch, or add a #define for the rounding-up that has to occur when the base64 value isn't padded?

#define B64_DECODE_PAD(len) ((len) + ((len) % 3))
char b64_decoded[B64_DECODE_PAD(SR_COMMIT_LEN)];

I would really rather fix #17868 than add a temporary workaround macro. You'll notice that there is still an XXX: at that place; it's there to flag the obvious badness of the +2.

Is TIMESTAMP first, or H(REVEAL)?
commit_decode says one thing in the comments, and another in the code.

Hrm, both the code and the comment say the same thing, or am I missing something?

Can we assert that decoded_len == SR_COMMIT_LEN? Is that useful?

We prefer not to, since this is data from the network and so potentially untrusted.

reveal_decode() duplicates commit_decode(). Can we refactor to avoid that?

Plausible. I would do that in a later iteration, though. Having both separated gives us more room for better logging, and if one of them changes in some way, we can adapt easily.

In sr_parse_srv(), why do we use base16 for the SRV, but base64 for the commits?

No reason... We'll make it base64.

In sr_parse_commit(), we ignore extra arguments after the fourth. Is this intentional?
A comment either way would help.

We do, and the top comment of the function details the order of the args. What extra arguments did you have in mind?

I have 5 more commits to review. (Please add fixups to the end of the branch so I don't lose my place.)

comment:24 in reply to:  22 ; Changed 2 years ago by asn

Replying to teor:

Replying to dgoulet:

After the proposal review that happened a few days ago on #tor-dev, we made some adjustments to the specification and code.

Please find the latest code here: prop250_final_v3

Proposal 250 has been updated with the latest in torspec.git

Code review:

General comments:

get_start_time_of_current_round() doesn't take TestingV3AuthVotingStartOffset into account. Why not use:

time_t next_start = voting_schedule.interval_starts;
time_t curr_start = dirvote_get_start_of_next_interval(next_start - interval - 1,interval,
                      options->TestingV3AuthVotingStartOffset);

get_next_valid_after_time() duplicates code in dirvote_recalculate_timing().
Why not use voting_schedule.interval_starts?
(You'd need to make sure dirvote_recalculate_timing() had been called before either of these functions used voting_schedule.interval_starts.)

Hmm, I would like to do that, but it doesn't seem so easy. dirvote_recalculate_timing() has not been called by the time we call state_new(), since we call that at boot, and we haven't voted yet at that point.

I then called dirvote_recalculate_timing() myself and everything worked fine, but I'm not sure it's actually correct. That function updates the global voting_schedule structure, which is then used by other parts of the codebase, and I'm not sure that initializing it with 'now' set to the next 'valid_after' would be correct here :x

comment:25 in reply to:  23 ; Changed 2 years ago by teor

Replying to dgoulet:

Replying to teor:

Thanks for that teor!

New branch in prop250_final_v4.

A hard-coded SHARED_RANDOM_N_ROUNDS is going to make it really hard to test hidden services quickly using chutney. (We'll always be testing them using the default initial shared random value.) Can we make this configurable in test networks?

#define SHARED_RANDOM_N_ROUNDS 12

The part I don't like about changing this value for a testing network is that we do NOT get the real behavior of the protocol... I'm not against a testing value, but I would do that after merge, in a separate ticket.

Not necessarily - 12 can still be the default number of rounds.
(I'm pretty sure we haven't hard-coded this macro value anywhere. So it should be easy to replace with a function that reads an option for the number of rounds.)

Opened #18295 for this.

get_state_valid_until_time() also ignores TestingV3AuthVotingStartOffset. I think this should fix it:

current_round = (beginning_of_current_round / voting_interval) % total_rounds;

get_sr_protocol_phase() also ignores TestingV3AuthVotingStartOffset. Perhaps passing around the time isn't the best way to do this? It seems like if we can make this same mistake in multiple places, then we should refactor the round number calculation so it's in a single function.

Can we refactor the round number / time calculations so they're in one place?
I feel nervous we'll break them if we try to fix them later, and I'd rather do that now when we're not live.

shared-random.c:

char b64_decoded[SR_COMMIT_LEN + 2];

Can we either fix #17868 as part of this branch, or add a #define for the rounding-up that has to occur when the base64 value isn't padded?

#define B64_DECODE_PAD(len) ((len) + ((len) % 3))
char b64_decoded[B64_DECODE_PAD(SR_COMMIT_LEN)];

I would really rather fix #17868 than add a temporary workaround macro. You'll notice there are still XXX: comments at that spot, flagging the obviously bad +2.

Made #17868 a child ticket of this ticket.

Is TIMESTAMP first, or H(REVEAL)?
commit_decode says one thing in the comments, and another in the code.

Hrm, both the code and the comment say the same thing, or am I missing something?

The commit_decode() function header comment says:

base64-encode( H(REVEAL) || TIMESTAMP )

The code says:

  /* First is the timestamp (8 bytes). */
  commit->commit_ts = (time_t) tor_ntohll(get_uint64(b64_decoded));
  offset += sizeof(uint64_t);
  /* Next is hashed reveal. */
  memcpy(commit->hashed_reveal, b64_decoded + offset,
         sizeof(commit->hashed_reveal));

Can we assert that decoded_len == SR_COMMIT_LEN? Is that useful?

We prefer not to, since this is data from the network and therefore potentially untrusted.

OK, then we're deliberately ignoring any characters after SR_COMMIT_LEN?
Can we make that explicit in a comment?

reveal_decode() duplicates commit_decode(). Can we refactor to avoid that?

Plausible. I would do that in a later iteration, though. Having them separate gives us more room for better logging, and if one of them changes in some way, we can adapt easily.

The risk of leaving them duplicated is that bugs get fixed in one and not the other.
Or they fall out of sync accidentally. This is a real mess to clean up later - see router_pick_directory_server_impl() and router_pick_trusteddirserver_impl() for an example.

This has already happened to these two functions:

  • commit_decode() uses offset, reveal_decode() uses a hard-coded 8.
  • the "too small" warning in reveal_decode() is less detailed, but could be identical to commit_decode()

So I have a more radical suggestion for you:

  • you have two groups of 3 struct fields that are the same types, and have the same operators performed on them;
  • abstract these 3 fields into their own struct type (ts, hashed, encoded)
  • add operations for encoding and decoding this type (and perhaps, turning this authority's reveal into its commit)
  • refactor the existing code to call these operations

That way, the interface stays the same, you can add any extra logging you need in the commit- or reveal-specific functions, but you get the benefit of using the same code for encoding/decoding commits and reveals.

In sr_parse_commit(), we ignore extra arguments after the fourth. Is this intentional?
A comment either way would help.

We do, and the comment at the top of the function details the order of the args. What extra argument do you have in mind?

If an authority supplies 5 arguments in its vote, we will ignore the fifth.
There's no comment documenting whether this is intentional or not.
And should we ignore the entire line instead?

comment:26 in reply to:  24 ; Changed 2 years ago by teor

Replying to asn:

Replying to teor:

Replying to dgoulet:

After the proposal review that happened a few days ago on #tor-dev, we made some adjustments to the specification and code.

Please find the latest code here: prop250_final_v3

Proposal 250 has been updated with the latest in torspec.git

Code review:

General comments:

get_start_time_of_current_round() doesn't take TestingV3AuthVotingStartOffset into account. Why not use:

time_t next_start = voting_schedule.interval_starts;
time_t curr_start = dirvote_get_start_of_next_interval(next_start - interval - 1, interval,
                      options->TestingV3AuthVotingStartOffset);

get_next_valid_after_time() duplicates code in dirvote_recalculate_timing().
Why not use voting_schedule.interval_starts?
(You'd need to make sure dirvote_recalculate_timing() had been called before either of these functions used voting_schedule.interval_starts.)

Hmm, I would like to do that but it doesn't seem so easy. dirvote_recalculate_timing() has not been called when we call state_new(), since we call it when we boot up and we haven't voted yet at that point.

I then called dirvote_recalculate_timing() myself and everything worked fine, but I'm not sure if it's actually correct. That function updates the global voting_schedule structure which is then used by other parts of the codebase and I'm not sure if initializing it with 'now' set to the next 'valid_after' would be correct here :x

Can we refactor the relevant calculations out of dirvote_recalculate_timing(), and call them from dirvote_recalculate_timing() and get_next_valid_after_time() and get_start_time_of_current_round()?

My concern is that we'll break test networks that use TestingV3AuthVotingStartOffset by ignoring it during shared random calculations. And that these 3 functions should produce the same results.

Do get_next_valid_after_time() and get_start_time_of_current_round() belong in the core dirserv file instead? They don't seem to be specific to shared random.

comment:27 Changed 2 years ago by teor

This is my 3rd comment in a row. Please read all 3 comments.

Reviewing 3406980ed700cc7fcd95b9b3774256493805057b:

Nitpick: why add a phase_str for "unknown" when you assert rather than use it?

commit_log() should probably use safe_str() on the reveal value, as it is sensitive.
Other debug messages also have this issue.

It also logs commits and reveals in hex, but votes use base64. This will make them difficult to compare. generate_srv() also does this with the SRV.

I see you fixed the commit_decode() function comment to base64-encode( TIMESTAMP || H(REVEAL) ).

reveal_encode() and commit_encode() are duplicate code, with similar code drift issues to the *decode() functions. See comment 25 for a suggestion on this.

sr_generate_our_commit() can use crypto_strongest_rand() rather than doing the hash itself.
It can also use a const pointer.

sr_compute_srv() places a hard limit of 254 (SR-voting) authorities in a tor network.

tor_assert(reveal_num < UINT8_MAX);

Is this OK?
It should be documented somewhere.

In sr_generate_our_commit() and sr_compute_srv(), crypto_digest256 never returns < 0. It returns 1 on failure.
In generate_srv(), you don't check the return value of crypto_digest256.
You might want to check how you handle the return values of all your crypto_* calls.

In sr_compute_srv, should we assert on crypto_digest256 failure? Or should we do something with the state?

47ade62b1811c16c28d73ebe59f47daa16cd4522:

AuthDirSharedRandomness needs a tor manual page entry.

There are missed opportunities for const pointers in this commit.

srv_to_ns_string() and get_majority_srv_from_votes() also encode the SRV as hex rather than base64.

get_majority_srv_from_votes() could be significantly simplified by adding to container.c:

/** Return the most frequent member of the sorted list of DIGEST256_LEN
 * digests in <b>sl</b>. If count_out is non-NULL, set it to the count
 * of the most frequent member. */
const uint8_t *
smartlist_get_most_frequent_digest256_(smartlist_t *sl, int *count_out)
{
  return smartlist_get_most_frequent_(sl, compare_digests256_, count_out);
}

Then using:

  int count = 0;
  value = smartlist_get_most_frequent_digest256_(sr_digests, &count);
  /* Was this SRV voted by enough auths for us to keep it? */
  if (!should_keep_srv(count)) {
    goto end;
  }

Nitpick: get_majority_srv_from_votes() uses unsigned int current for a boolean value. Tor typically uses int for boolean values.

8e180aa061f0ea39968f05c85796b26f10a81744:

extract_shared_random_commits() and sr_state_copy_reveal_info() and should_keep_commit() and possibly other functions could use const pointers.

Why pick the microdesc consensus to update the state? Any particular reason?
Should we fall back to the NS consensus if there's no microdesc consensus or SR values?

  /* A new consensus has been created: pass it to the shared random subsystem
     to update the SR state. */
  sr_act_post_consensus(pending[FLAV_MICRODESC].consensus);

extract_shared_random_commits() does tor_assert(tok->n_args >= 3); on untrusted input. That lets a single authority crash all the other authorities by including a line with only 2 arguments. Let's skip bad commits instead.

Let's also make sure we don't assert() on untrusted input in other parsing functions.

extract_shared_random_srvs() can error out on the current SRV, and leave the previous SRV allocated and initialised. What happens then?

verify_commit_and_reveal() also expects crypto_digest256() to return -1. It never will.

In should_keep_commit(), we warn using LD_BUG when another authority gets the protocol wrong. This could be misleading, and leave authority operators thinking they have a tor version with a bug. (Or do we do this elsewhere?)

8f3f5667754037cccd758eb8631636e2e78fed93:

Looks ok.

3f2014d7aabe2e10163f4f44a8ca46fd5a91f04a:

I'm not going to review the unit tests in detail. (We generally accept unit tests if they work, right?)

src/test/sr_srv_calc_ref.py looks like it uses SHA256, and not SHA3-256.

I get the following crashes when I run the unit tests on OS X 10.11.3, Apple LLVM version 7.0.2 (clang-700.1.81), x86_64.

Five times:

shared-random/utils: [forking] Feb 10 16:13:46.122 [err] tor_assertion_failed_: Bug: src/or/shared-random.c:854: sr_generate_our_commit: Assertion my_rsa_cert failed; aborting. (on Tor 0.2.8.1-alpha-dev )
2   libsystem_c.dylib             	0x00007fff975b06e7 abort + 129
3   test                          	0x0000000101a8af29 sr_generate_our_commit + 825 (shared-random.c:854)
4   test                          	0x0000000101a8cf11 sr_state_update + 177 (shared-random-state.c:1056)
5   test                          	0x0000000101a8d56d sr_state_init + 349 (shared-random-state.c:1231)
6   test                          	0x00000001018e43c3 test_a_networkstatus + 339 (test_dir.c:1800)
7   test                          	0x00000001019dfadd testcase_run_one + 845 (tinytest.c:106)
8   test                          	0x00000001019dffd3 tinytest_main + 499 (tinytest.c:432)
9   test                          	0x00000001019df43f main + 639 (testing_common.c:298)

One time:

shared-random/state_transition: [forking] Feb 10 16:13:46.389 [err] tor_assertion_failed_: Bug: src/or/shared-random-state.c:210: commit_add_to_state: Assertion saved_commit == NULL failed; aborting. (on Tor 0.2.8.1-alpha-dev )
2   libsystem_c.dylib             	0x00007fff975b06e7 abort + 129
3   test                          	0x0000000101a8e00f commit_add_to_state + 175 (shared-random-state.c:210)
4   test                          	0x0000000101a8c970 sr_state_add_commit + 48 (shared-random-state.c:941)
5   test                          	0x0000000101982dfe test_state_transition + 574 (test_shared_random.c:976)
6   test                          	0x00000001019dfadd testcase_run_one + 845 (tinytest.c:106)
7   test                          	0x00000001019dffd3 tinytest_main + 499 (tinytest.c:432)
8   test                          	0x00000001019df43f main + 639 (testing_common.c:298)

And I also see 12/720 TESTS FAILED. (1 skipped).

The dgoulet/prop250_final_v4 branch also has similar unit test issues.

Now on to the extra commits in dgoulet/prop250_final_v4:
There are twelve extra commits in this branch.

625e08167d9d8e855ddefa45f8d9902f65122f1f:
0308e7bf92c3699fa7e88140ae5037d8bc0d480b:
7bb6eac9fb6af2c13f61e8ddb9e9c741fa4736d1:
fe710d15af7709f9d88ae9d8b5e451e5b849983b:

I trust the compiler to warn about consts in the wrong places.

18771227176e6f634540d483b8c1b1a6501aec14:
75a243b9aad489929d8ed8273920fcf7fc295ebb:
50ab3ae82b8248a45203c4ee9fc6be5d18522ac8:
c0401326224a994dfe35e39da4aa8849ac053ee3:

Look good.

e5613fad328e78b2477d8e848bb163aa8dcb1e8b:

Thanks for removing the duplicate code here.

08beead18ed2bfdc8f09b17e7628ac9997222e3a:
10ccf50d599896feaf48bf0aecf4a1c2da1fea45:

Better safe than sorry.

c219b23d47e9c742532354eedae10ca0509a1d78:

One I didn't pick up, thanks for fixing it.

I've finished reviewing the code in dgoulet/prop250_final_v4.
Over to you to respond to my comments 25, 26 & 27.

I'd still like to run the SRV code in chutney, is there a chutney template for that?
(I think I'll need to wait for the unit tests to be fixed first.)

comment:28 in reply to:  25 Changed 2 years ago by dgoulet

Replying to teor:

Replying to dgoulet:

Replying to teor:

Thanks for that teor!

New branch in prop250_final_v4.

A hard-coded SHARED_RANDOM_N_ROUNDS is going to make it really hard to test hidden services quickly using chutney. (We'll always be testing them using the default initial shared random value.) Can we make this configurable in test networks?

#define SHARED_RANDOM_N_ROUNDS 12

The part I do not like about changing this value for test networks is that we do NOT get the real behavior of the protocol... I'm not against a testing value, but I would do that after merge, in a separate ticket.

Not necessarily - 12 can still be the default number of rounds.
(I'm pretty sure we haven't hard-coded this macro value anywhere. So it should be easy to replace with a function that reads an option for the number of rounds.)

Opened #18295 for this.

I'll comment on that ticket from now on. Thanks! :)

get_state_valid_until_time() also ignores TestingV3AuthVotingStartOffset. I think this should fix it:

current_round = (beginning_of_current_round / voting_interval) % total_rounds;

get_sr_protocol_phase() also ignores TestingV3AuthVotingStartOffset. Perhaps passing around the time isn't the best way to do this? It seems like if we can make this same mistake in multiple places, then we should refactor the round number calculation so it's in a single function.

Can we refactor the round number / time calculations so they're in one place?
I feel nervous we'll break them if we try to fix them later, and I'd rather do that now when we're not live.

I'll wait on asn's work on this before refactoring anything.

Can we assert that decoded_len == SR_COMMIT_LEN? Is that useful?

We prefer not to, since this is data from the network and therefore potentially untrusted.

OK, then we're deliberately ignoring any characters after SR_COMMIT_LEN?
Can we make that explicit in a comment?

I think this check at the start of the function should catch it. Past that, there can't be anything past SR_COMMIT_LEN, I think:

if (strlen(encoded) > SR_COMMIT_BASE64_LEN)

reveal_decode() duplicates commit_decode(). Can we refactor to avoid that?

Plausible. I would do that in a later iteration, though. Having them separate gives us more room for better logging, and if one of them changes in some way, we can adapt easily.

The risk of leaving them duplicated is that bugs get fixed in one and not the other.
Or they fall out of sync accidentally. This is a real mess to clean up later - see router_pick_directory_server_impl() and router_pick_trusteddirserver_impl() for an example.

This has already happened to these two functions:

  • commit_decode() uses offset, reveal_decode() uses a hard-coded 8.

The hardcoded 8 is _BAD_... that is a good find.
[snip]
I'll work on something here.

In sr_parse_commit(), we ignore extra arguments after the fourth. Is this intentional?
A comment either way would help.

We do, and the comment at the top of the function details the order of the args. What extra argument do you have in mind?

If an authority supplies 5 arguments in its vote, we will ignore the fifth.
There's no comment documenting whether this is intentional or not.
And should we ignore the entire line instead?

True. We should check whether we exceed the accepted number of arguments and, if so, reject the line.

I'll apply the fixes above and submit them in the next comment. I would like to avoid multiple threads in this ticket :).

comment:29 in reply to:  26 Changed 2 years ago by asn

Replying to teor:

Replying to asn:

Replying to teor:

Replying to dgoulet:

After the proposal review that happened a few days ago on #tor-dev, we made some adjustments to the specification and code.

Please find the latest code here: prop250_final_v3

Proposal 250 has been updated with the latest in torspec.git

Code review:

<snip>

get_next_valid_after_time() duplicates code in dirvote_recalculate_timing().
Why not use voting_schedule.interval_starts?
(You'd need to make sure dirvote_recalculate_timing() had been called before either of these functions used voting_schedule.interval_starts.)

<snip>

Can we refactor the relevant calculations out of dirvote_recalculate_timing(), and call them from dirvote_recalculate_timing() and get_next_valid_after_time() and get_start_time_of_current_round()?

My concern is that we'll break test networks that use TestingV3AuthVotingStartOffset by ignoring it during shared random calculations. And that these 3 functions should produce the same results.

Do get_next_valid_after_time() and get_start_time_of_current_round() belong in the core dirserv file instead? They don't seem to be specific to shared random.

OK, I implemented the above logic in my branch prop250_final_v4:
https://gitweb.torproject.org/user/asn/tor.git/log/?h=prop250_final_v4

The refactoring was a bit messier than I expected, but the branch seems to work in chutney. The tests still pass without any changes, so that's good.

I also moved get_next_valid_after_time() into dirvote.c, as you suggested.

comment:30 Changed 2 years ago by teor

Status: needs_review → needs_revision

I've reviewed asn's prop250_final_v4 up to ab5a2ffed4fa02d3ac8a2e46b8f3c38a318a7d3a.
Thanks for the refactoring, I am much more comfortable that the new timing code uses the same values as the existing timing code.

Still waiting to hear about the unit tests failures and chutney template.

comment:31 Changed 2 years ago by dgoulet

Status: needs_revision → needs_review

Replying to teor:

This is my 3rd comment in a row. Please read all 3 comments.

OK, FYI I've autosquashed the fixups from the last comment. It took me a while
to get it done cleanly, but now it's done. You now have a series of fixups for
this comment.

See branch: prop250_final_v5

Reviewing 3406980ed700cc7fcd95b9b3774256493805057b:

Nitpick: why add a phase_str for "unknown" when you assert rather than use it?

get_phase_str() uses the enum value as the index in the phase_str array.
"unknown" is there as a taint if we ever use an uninit phase value.

commit_log() should probably use safe_str() on the reveal value, as it is sensitive.
Other debug messages also have this issue.

I've refactored that a bit to print the encoded value instead, so we can
match it against the strings in votes or other logs that print the encoded form.

It also logs commits and reveals in hex, but votes use base64. This will make them difficult to compare. generate_srv() also does this with the SRV.

Fixed (for both this and the commit_log)

I see you fixed the commit_decode() function comment to base64-encode( TIMESTAMP || H(REVEAL) ).

reveal_encode() and commit_encode() are duplicate code, with similar code drift issues to the *decode() functions. See comment 25 for a suggestion on this.

I've discussed this with asn, and after coding it two different ways, we don't
feel comfortable with either, for two reasons. First, neither option makes the
code very clean, tbh. Second, reveal and commit values are two logically
different objects that just happen to share an encoding format, so we are fine
leaving them as they are: it's only a little code duplication, and they are
simply different things...

sr_generate_our_commit() can use crypto_strongest_rand() rather than doing the hash itself.
It can also use a const pointer.

Fixed! (3 different commits to be squashed on different master commit)

sr_compute_srv() places a hard limit of 254 (SR-voting) authorities in a tor network.

tor_assert(reveal_num < UINT8_MAX);

Is this OK?
It should be documented somewhere.

This is indeed a limitation. Proposal 250 does force INT_1(REVEAL_NUM). I've
pushed an update to torspec mentioning the limitation.

In sr_generate_our_commit() and sr_compute_srv(), crypto_digest256 never returns < 0. It returns 1 on failure.
In generate_srv(), you don't check the return value of crypto_digest256.
You might want to check how you handle the return values of all your crypto_* calls.

Good catch! Fixed! (two fixup commits)

In sr_compute_srv, should we assert on crypto_digest256 failure? Or should we do something with the state?

Hrm... an assert sounds a bit scary, since it would just kill an authority.
Most of the tor code simply ignores the return value. So I'm guessing an empty
SRV is valid here in case of an epic failure like that, and the authority
should recover after a while.

47ade62b1811c16c28d73ebe59f47daa16cd4522:

AuthDirSharedRandomness needs a tor manual page entry.

Fixed in b37214db72d42af8b25f7f87a5d3c6894c032115.

There are missed opportunities for const pointers in this commit.

All fixed (hopefully).

srv_to_ns_string() and get_majority_srv_from_votes() also encode the SRV as hex rather than base64.

Fixed.

get_majority_srv_from_votes() could be significantly simplified by adding to container.c:

/** Return the most frequent member of the sorted list of DIGEST256_LEN
 * digests in <b>sl</b>. If count_out is non-NULL, set it to the count
 * of the most frequent member. */
const uint8_t *
smartlist_get_most_frequent_digest256_(smartlist_t *sl, int *count_out)
{
  return smartlist_get_most_frequent_(sl, compare_digests256_, count_out);
}

Then using:

  int count = 0;
  value = smartlist_get_most_frequent_digest256_(sr_digests, &count);
  /* Was this SRV voted by enough auths for us to keep it? */
  if (!should_keep_srv(count)) {
    goto end;
  }

Indeed. I've created smartlist_get_most_frequent_digest256_ and reworked
that function accordingly. Fixed.

Nitpick: get_majority_srv_from_votes() uses unsigned int current for a boolean value. Tor typically uses int for boolean values.

Should we really continue that "tradition"? :)

8e180aa061f0ea39968f05c85796b26f10a81744:

extract_shared_random_commits() and sr_state_copy_reveal_info() and should_keep_commit() and possibly other functions could use const pointers.

Fixed.

Why pick the microdesc consensus to update the state? Any particular reason?
Should we fall back to the NS consensus if there's no microdesc consensus or SR values?

  /* A new consensus has been created: pass it to the shared random subsystem
     to update the SR state. */
  sr_act_post_consensus(pending[FLAV_MICRODESC].consensus);

It's the consensus that clients use, so we use it to set our SRV value
post-consensus, giving us the same view as clients. Does that make sense?
Do you think we should use FLAV_NS?

extract_shared_random_commits() does tor_assert(tok->n_args >= 3); on untrusted input. That lets a single authority crash all the other authorities by including a line with only 2 arguments. Let's skip bad commits instead.

Let's also make sure we don't assert() on untrusted input in other parsing functions.

Indeed, very bad. Fixed!

extract_shared_random_srvs() can error out on the current SRV, and leave the previous SRV allocated and initialised. What happens then?

sr_info in the networkstatus_t structure is allocated with the ns, so on
error the parsing stops and the object disappears on its own.

    if (extract_shared_random_srvs(ns, tokens) < 0) {
      log_warn(LD_DIR, "SR: Unable to parse SRV(s)");
      goto err;
    }

verify_commit_and_reveal() also expects crypto_digest256() to return -1. It never will.

Fixed.

In should_keep_commit(), we warn using LD_BUG when another authority gets the protocol wrong. This could be misleading, and leave authority operators thinking they have a tor version with a bug. (Or do we do this elsewhere?)

True, it should be LD_DIR here. Fixed.

src/test/sr_srv_calc_ref.py looks like it uses SHA256, and not SHA3-256.

Yeah, that Python script is out of date, and SHA3 is not in Python's hashlib
as far as I know, so a fix is pending until we can do it in Python...

I get the following crashes when I run the unit tests on OS X 10.11.3, Apple LLVM version 7.0.2 (clang-700.1.81), x86_64.

Five times:

shared-random/utils: [forking] Feb 10 16:13:46.122 [err] tor_assertion_failed_: Bug: src/or/shared-random.c:854: sr_generate_our_commit: Assertion my_rsa_cert failed; aborting. (on Tor 0.2.8.1-alpha-dev )
2   libsystem_c.dylib                 0x00007fff975b06e7 abort + 129
3   test                              0x0000000101a8af29 sr_generate_our_commit + 825 (shared-random.c:854)
4   test                              0x0000000101a8cf11 sr_state_update + 177 (shared-random-state.c:1056)
5   test                              0x0000000101a8d56d sr_state_init + 349 (shared-random-state.c:1231)
6   test                              0x00000001018e43c3 test_a_networkstatus + 339 (test_dir.c:1800)
7   test                              0x00000001019dfadd testcase_run_one + 845 (tinytest.c:106)
8   test                              0x00000001019dffd3 tinytest_main + 499 (tinytest.c:432)
9   test                              0x00000001019df43f main + 639 (testing_common.c:298)

I think it's fixed. Since I can't reproduce, I hope you can confirm.

One time:

shared-random/state_transition: [forking] Feb 10 16:13:46.389 [err] tor_assertion_failed_: Bug: src/or/shared-random-state.c:210: commit_add_to_state: Assertion saved_commit == NULL failed; aborting. (on Tor 0.2.8.1-alpha-dev )
2   libsystem_c.dylib                 0x00007fff975b06e7 abort + 129
3   test                              0x0000000101a8e00f commit_add_to_state + 175 (shared-random-state.c:210)
4   test                              0x0000000101a8c970 sr_state_add_commit + 48 (shared-random-state.c:941)
5   test                              0x0000000101982dfe test_state_transition + 574 (test_shared_random.c:976)
6   test                              0x00000001019dfadd testcase_run_one + 845 (tinytest.c:106)
7   test                              0x00000001019dffd3 tinytest_main + 499 (tinytest.c:432)
8   test                              0x00000001019df43f main + 639 (testing_common.c:298)

This one is a bit more confusing to me. I don't see any possible way that
assert can be triggered, apart from _already_ having a commit in our state when
adding it. You'll have to tell me if you hit it again after these fixes; then
we'll probably have to dig deeper.

And I also see 12/720 TESTS FAILED. (1 skipped).

[snip]

I'd still like to run the SRV code in chutney, is there a chutney template for that?
(I think I'll need to wait for the unit tests to be fixed first.)

You can with that branch. We'll simply use the default values for the
supermajority. Grep the logs for the "SR:" prefix, and look at the sr-state file.

comment:32 Changed 2 years ago by dgoulet

Owner: set to dgoulet
Status: needs_review → assigned

comment:33 Changed 2 years ago by nickm

Status: assigned → needs_review

comment:34 in reply to:  31 ; Changed 2 years ago by teor

Status: needs_review → needs_revision

Replying to dgoulet:

Replying to teor:

This is my 3rd comment in a row. Please read all 3 comments.

OK, FYI I've autosquashed the fixups from the last comment. It took me a while
to get it done cleanly, but now it's done. You now have a series of fixups for
this comment.

See branch: prop250_final_v5

The last commit I reviewed was c219b23d47e9c742532354eedae10ca0509a1d78 in prop250_final_v4.
This appears to correspond to 4f5e212cbd34c8cfb701b25a007e88eacbac1f1f "Rename shared random files to use underscore" in prop250_final_v5, which includes an autosquash and a rebase to master.

I'm only going to comment on new commits, and any issues I still have with the old code.
(I'll remove any questions, fixes, and explanations I'm ok with.)

There are missed opportunities for const pointers in this commit.

All fixed (hopefully).

Except for the new smartlist_get_most_frequent_srv, which can take const smartlist_t *sl.
(It was added in 3b04ebdb94e29c0b2569ff8c64256f37f1389180.)

Nitpick: get_majority_srv_from_votes() uses unsigned int current for a boolean value. Tor typically uses int for boolean values.

Should we really continue that "tradition"? :)

I'm in favour of consistently using int, but it doesn't matter that much.
(We could decide to move to stdbool.h, and assert that our booleans are actually 0 or 1, but I think that's a topic for another ticket.)

Why pick the microdesc consensus to update the state? Any particular reason?
Should we fall back to the NS consensus if there's no microdesc consensus or SR values?

  /* A new consensus has been created: pass it to the shared random subsystem
     to update the SR state. */
  sr_act_post_consensus(pending[FLAV_MICRODESC].consensus);

It's the consensus that client uses so we use that consensus to set our SRV
value post consensus thus having the same view as client. Does that make sense?
You think we should use FLAV_NS?

Most clients use the microdesc consensus, some still use NS.

I don't mind which consensus we use, but:

  • I think we should have a comment explaining the arbitrary choice, and
  • I wonder what happens if one consensus fails, but the other doesn't (can this even happen?).

I get the following crashes when I run the unit tests on OS X 10.11.3, Apple LLVM version 7.0.2 (clang-700.1.81), x86_64.

Five times:

shared-random/utils: [forking] Feb 10 16:13:46.122 [err] tor_assertion_failed_: Bug: src/or/shared-random.c:854: sr_generate_our_commit: Assertion my_rsa_cert failed; aborting. (on Tor 0.2.8.1-alpha-dev )
2   libsystem_c.dylib                 0x00007fff975b06e7 abort + 129
3   test                              0x0000000101a8af29 sr_generate_our_commit + 825 (shared-random.c:854)
4   test                              0x0000000101a8cf11 sr_state_update + 177 (shared-random-state.c:1056)
5   test                              0x0000000101a8d56d sr_state_init + 349 (shared-random-state.c:1231)
6   test                              0x00000001018e43c3 test_a_networkstatus + 339 (test_dir.c:1800)
7   test                              0x00000001019dfadd testcase_run_one + 845 (tinytest.c:106)
8   test                              0x00000001019dffd3 tinytest_main + 499 (tinytest.c:432)
9   test                              0x00000001019df43f main + 639 (testing_common.c:298)

I think it's fixed. Since I can't reproduce, I hope you can confirm.

I still see 5 of these. One example is:

3   test                          	0x00000001053d4045 sr_generate_our_commit + 709 (shared_random.c:870)
4   test                          	0x00000001053d60f1 sr_state_update + 177 (shared_random_state.c:1068)
5   test                          	0x00000001053d67d3 sr_state_init + 483 (shared_random_state.c:1243)
6   test                          	0x00000001052c9573 test_sr_get_majority_srv_from_votes + 51 (test_shared_random.c:811)

One time:

shared-random/state_transition: [forking] Feb 10 16:13:46.389 [err] tor_assertion_failed_: Bug: src/or/shared-random-state.c:210: commit_add_to_state: Assertion saved_commit == NULL failed; aborting. (on Tor 0.2.8.1-alpha-dev )
2   libsystem_c.dylib                 0x00007fff975b06e7 abort + 129
3   test                              0x0000000101a8e00f commit_add_to_state + 175 (shared-random-state.c:210)
4   test                              0x0000000101a8c970 sr_state_add_commit + 48 (shared-random-state.c:941)
5   test                              0x0000000101982dfe test_state_transition + 574 (test_shared_random.c:976)
6   test                              0x00000001019dfadd testcase_run_one + 845 (tinytest.c:106)
7   test                              0x00000001019dffd3 tinytest_main + 499 (tinytest.c:432)
8   test                              0x00000001019df43f main + 639 (testing_common.c:298)

This one is a bit more confusing to me. I don't see any way that assert can be
triggered other than _already_ having the commit in our state when we add it.
You'll have to tell me if you hit it again after the fixes; if so, we'll
probably have to dig deeper.

I still see this one, same location, line numbers have changed slightly:

3   test                          	0x00000001053d6f4f commit_add_to_state + 175 (shared_random_state.c:212)
4   test                          	0x00000001053d5b50 sr_state_add_commit + 48 (shared_random_state.c:953)
5   test                          	0x00000001052cac6e test_state_transition + 574 (test_shared_random.c:1000)

And I also see 12/720 TESTS FAILED. (1 skipped).

Now I see 7/724 TESTS FAILED. (1 skipped).

6 of these are the crashes above, the final one is:

shared-random/state_update: [forking] 
  FAIL src/test/test_shared_random.c:1190: assert(state->n_commit_rounds == 1): 2 vs 1
  [state_update FAILED]

I'd still like to run the SRV code in chutney, is there a chutney template for that?
(I think I'll need to wait for the unit tests to be fixed first.)

You can with that branch. We'll simply use the default values for the
supermajority. Grep the logs for the "SR:" prefix and check the sr-state file.

OK, I'll wait to use chutney until the unit tests pass with no crashes.

I'm going to look into those crashes myself. If I find anything, I'll post another comment.

comment:35 in reply to:  34 Changed 2 years ago by dgoulet

Status: needs_revisionneeds_review

Replying to teor:

There are missed opportunities for const pointers in this commit.

All fixed (hopefully).

Except for the new smartlist_get_most_frequent_srv, which can take const smartlist_t *sl.
(It was added in 3b04ebdb94e29c0b2569ff8c64256f37f1389180.)

Fixed 136da34.

Nitpick: get_majority_srv_from_votes() uses unsigned int current for a boolean value. Tor typically uses int for boolean values.

Should we really continue that "tradition"? :)

I'm in favour of consistently using int, but it doesn't matter that much.
(We could decide to move to stdbool.h, and assert that our booleans are actually 0 or 1, but I think that's a topic for another ticket.)

Fair enough. Fixed 279764a

Why pick the microdesc consensus to update the state? Any particular reason?
Should we fall back to the NS consensus if there's no microdesc consensus or SR values?

  /* A new consensus has been created: pass it to the shared random subsystem
     to update the SR state. */
  sr_act_post_consensus(pending[FLAV_MICRODESC].consensus);

It's the consensus that clients use, so we use it to set our SRV value
post-consensus, giving us the same view as clients. Does that make sense?
Do you think we should use FLAV_NS instead?

Most clients use the microdesc consensus, some still use NS.

I don't mind which consensus we use, but:

  • I think we should have a comment explaining the arbitrary choice, and

I actually changed it to use FLAV_NS, which is the main flavor: if we can't
create it, we have a problem! I also now check that the consensus pointer
exists before using it.

See 63abee2

  • I wonder what happens if one consensus fails, but the other doesn't (can this even happen?).

Yeah, quite unsure but plausible from what I can see.

[snip]

Now I see 7/724 TESTS FAILED. (1 skipped).

6 of these are the crashes above, the final one is:

Thanks to asn, we found out why the tests were failing! See 30edde4.

See branch: prop250_final_v5 with the fixups.

comment:36 Changed 2 years ago by teor

I've reviewed all the changes up to 30edde4bce7c5f34abb673604cc6e08c43195425.
The code changes are trivially correct, the unit test changes are proven by execution.

All unit tests pass ok with no failures, as does make test-network-all.

I would be ok with this being merged, but I'd like to run it through coverity or clang-scan eventually.

comment:37 Changed 2 years ago by teor

(This is my second comment. Please read both.)

Compiling with clang (with some extra warnings) yields the following integer size complaints:

Fixed in 185189a in my branch prop250_final_v5 in https://github.com/teor2345/tor.git
(You might want to split this into a series of fixups.)

next_valid_after_time should be time_t in:

time_t
get_next_valid_after_time(time_t now)
{
  int next_valid_after_time;

sr_parse_srv has the only remaining implicit integer conversion warning - it would be nice to silence it by casting the result of tor_parse_long() to int:

sr_srv_t *
sr_parse_srv(const smartlist_t *args)
{
  char *value;
  int num_reveals, ok, ret;
  sr_srv_t *srv = NULL;

  tor_assert(args);

  if (smartlist_len(args) < 2) {
    goto end;
  }

  /* First argument is the number of reveal values */
  num_reveals = tor_parse_long(smartlist_get(args, 0),
                               10, 0, INT32_MAX, &ok, NULL);

ts should be time_t:

static void
test_encoding(void *arg)
{
  (void) arg;
  int ret, duper_rand = 42;
  /* Random number is 32 bytes. */
  char raw_rand[32];
  uint64_t ts = 1454333590;

These should be tt_u64_op:

  • tt_int_op(state->n_protocol_runs, ==, 1);
  • tt_int_op(state->n_protocol_runs, ==, 45);
  • tt_int_op(state->n_protocol_runs, ==, 46);

comment:38 Changed 2 years ago by teor

(This is my third comment.)

I also get an assertion failure:

shared-random/utils: [forking] 
  FAIL /Users/twilsonb/tor/tor-sr/src/test/test_shared_random.c:957: assert(commit_is_authoritative(&commit, fp) == 0): 1 vs 0
  [utils FAILED]

But only when compiling under certain clang options (OS X):

  • -fstack-protector -fstrict-aliasing -DPARANOIA -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error -fno-sanitize-recover
  • No optimisation (-O0)
  • 64 bit

I see undefined behaviour under 32 bit, possibly in tor_gmtime (which isn't mentioned below, but is in the tor backtrace):

2   libsystem_c.dylib             	0x9d7f5c34 abort + 156
3   test                          	0x0057e5bf crash_handler + 383
4   libsystem_platform.dylib      	0x990fb01b _sigtramp + 43
5   ???                           	0xffffffff 0 + 4294967295
6   test                          	0x0057e440 0x1000 + 5755968
7   test                          	0x007d20eb parse_iso_time_ + 1739 (util.c:1713)
8   test                          	0x007d2134 parse_iso_time + 52 (util.c:1723)
9   test                          	0x00702157 config_assign_value + 7239 (confparse.c:346)
10  test                          	0x006fbba7 config_assign_line + 4663 (confparse.c:529)
11  test                          	0x006fa435 config_assign + 2133 (confparse.c:828)
12  test                          	0x009be851 disk_state_load_from_disk_impl + 417 (shared_random_state.c:651)
13  test                          	0x0073b9f5 test_state_load_from_disk + 1525 (test_shared_random.c:617)

comment:39 Changed 2 years ago by teor

And finally, from clang-scan:

In sr_act_post_consensus, this code can be executed when consensus is NULL - it should be surrounded by if (consensus) like some of the code above it:

  /* Update our state with the valid_after time of the next consensus so once
   * the next voting period start we are ready to receive votes. */
  time_t next_consensus_valid_after =
    get_next_valid_after_time(consensus->valid_after);
  sr_state_update(next_consensus_valid_after);

comment:40 in reply to:  38 ; Changed 2 years ago by dgoulet

Replying to teor:

(This is my third comment.)

I also get an assertion failure:

shared-random/utils: [forking] 
  FAIL /Users/twilsonb/tor/tor-sr/src/test/test_shared_random.c:957: assert(commit_is_authoritative(&commit, fp) == 0): 1 vs 0
  [utils FAILED]

But only when compiling under certain clang options (OS X):

  • -fstack-protector -fstrict-aliasing -DPARANOIA -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error -fno-sanitize-recover
  • No optimisation (-O0)
  • 64 bit

I wonder if one of those options somehow removes the memset(0) just above the assert, which would explain why 1 is returned instead of 0; otherwise I can't see why this would return 1!

I see undefined behaviour under 32 bit, possibly in tor_gmtime (which isn't mentioned below, but is in the tor backtrace):

2   libsystem_c.dylib             	0x9d7f5c34 abort + 156
3   test                          	0x0057e5bf crash_handler + 383
4   libsystem_platform.dylib      	0x990fb01b _sigtramp + 43
5   ???                           	0xffffffff 0 + 4294967295
6   test                          	0x0057e440 0x1000 + 5755968
7   test                          	0x007d20eb parse_iso_time_ + 1739 (util.c:1713)
8   test                          	0x007d2134 parse_iso_time + 52 (util.c:1723)
9   test                          	0x00702157 config_assign_value + 7239 (confparse.c:346)
10  test                          	0x006fbba7 config_assign_line + 4663 (confparse.c:529)
11  test                          	0x006fa435 config_assign + 2133 (confparse.c:828)
12  test                          	0x009be851 disk_state_load_from_disk_impl + 417 (shared_random_state.c:651)
13  test                          	0x0073b9f5 test_state_load_from_disk + 1525 (test_shared_random.c:617)

I'm not sure what to do about this one. I wonder if this also happens with our current state file, since this is plain ISO time parsing. There's nothing different between this state file and the one we already have in terms of timestamp parsing...

comment:41 Changed 2 years ago by dgoulet

I've fixed the rest, i.e. I took your commit 185189a. sr_act_post_consensus() now has a check on the pointer. The function can't currently be called with a NULL consensus, but better safe for the future.

Since the changes above are easy to confirm without fixup commits, I've created a new branch with everything squashed. You can still find the fixup commits in prop250_final_v5.

Fully squashed and rebased on git master "ready for merge" branch is: prop250_final_v6

comment:42 in reply to:  40 Changed 2 years ago by teor

Status: needs_reviewneeds_revision

Replying to dgoulet:

Replying to teor:

I also get an assertion failure:

shared-random/utils: [forking] 
  FAIL /Users/twilsonb/tor/tor-sr/src/test/test_shared_random.c:957: assert(commit_is_authoritative(&commit, fp) == 0): 1 vs 0
  [utils FAILED]

I wonder if one of those options somehow removes the memset(0) just above the assert, which would explain why 1 is returned instead of 0; otherwise I can't see why this would return 1!

The memcpy(fp, commit.rsa_identity_fpr, sizeof(fp)); copies the uninitialised commit.rsa_identity_fpr into fp. Therefore, clang can assume that commit.rsa_identity_fpr is all zero (or, alternatively, that it uses memory that is initialised to zero). As a result, you get the same result for both commit_is_authoritative calls.

I see undefined behaviour under 32 bit, possibly in tor_gmtime (which isn't mentioned below, but is in the tor backtrace):

2   libsystem_c.dylib             	0x9d7f5c34 abort + 156
3   test                          	0x0057e5bf crash_handler + 383
4   libsystem_platform.dylib      	0x990fb01b _sigtramp + 43
5   ???                           	0xffffffff 0 + 4294967295
6   test                          	0x0057e440 0x1000 + 5755968
7   test                          	0x007d20eb parse_iso_time_ + 1739 (util.c:1713)
8   test                          	0x007d2134 parse_iso_time + 52 (util.c:1723)
9   test                          	0x00702157 config_assign_value + 7239 (confparse.c:346)
10  test                          	0x006fbba7 config_assign_line + 4663 (confparse.c:529)
11  test                          	0x006fa435 config_assign + 2133 (confparse.c:828)
12  test                          	0x009be851 disk_state_load_from_disk_impl + 417 (shared_random_state.c:651)
13  test                          	0x0073b9f5 test_state_load_from_disk + 1525 (test_shared_random.c:617)

I'm not sure what to do about this one. I wonder if this actually also happens with our current state file since this is plain ISO TIME parsing. Nothing different between this state file and the one we have in terms of parsing timestamp...

You pass 2666 as the year. Any years after 2037 overflow a 32 bit time_t.

See also #18383 for a related issue in the dirauth code that was causing some assertion failures.

comment:43 Changed 2 years ago by teor

Status: needs_revisionneeds_review

My branch prop250_final_v6 has fixes for these.

comment:44 Changed 2 years ago by asn

Please find the latest branch for review at dgoulet's repo at prop250_final_v6.

comment:45 Changed 2 years ago by dgoulet

Milestone: Tor: 0.2.8.x-finalTor: 0.2.9.x-final

comment:46 Changed 2 years ago by dgoulet

Sponsor: SponsorRSponsorR-must

comment:47 Changed 2 years ago by nickm

Keywords: TorCoreTeam201604 added

Every postponed needs_review ticket should get a review in April

comment:48 Changed 2 years ago by nickm

Reviewer: nickm

comment:49 Changed 2 years ago by dgoulet

Status: needs_reviewmerge_ready

comment:50 Changed 2 years ago by dgoulet

Since this one is humongous, here is an experiment for code review on Gitlab using the merge request feature:

https://gitlab.com/dgoulet/tor/merge_requests/1

comment:51 Changed 2 years ago by nickm

Status: merge_readyneeds_revision

dgoulet is revising based on early review. More review to follow.

comment:52 Changed 2 years ago by asn

Keywords: TorCoreTeam201605 added; TorCoreTeam201604 removed

comment:53 Changed 2 years ago by dgoulet

Status: needs_revisionneeds_review

Ok new and clean branch for this in ticket16943_029_01

Gitlab merge request is here: https://gitlab.com/dgoulet/tor/merge_requests/3

The current branch contains the fixups from nickm's first review pass. Since the first commit was a bit of a mess, the fixups now apply to multiple commits (instead of only the first one).

comment:54 Changed 2 years ago by dgoulet

Squashed version with new merge request. There was an issue with one fixup commit that didn't apply at the right commit.

https://gitlab.com/dgoulet/tor/merge_requests/4

See branch: ticket16943_029_02

comment:55 Changed 2 years ago by nickm

Status: needs_reviewneeds_revision

I am done reviewing the branch!

comment:56 Changed 2 years ago by teor

With clang on OS X:

src/or/shared_random.c:855:28: warning: size argument in 'strlcpy' call appears
      to be size of the source; expected the size of the destination
      [-Wstrlcpy-strlcat-size]
  strlcpy(dst, buf, sizeof(buf));

comment:57 Changed 2 years ago by teor

#19006 records the following crash on dgoulet's ticket16943_029_02:

Application Specific Information:
crashed on child side of fork pre-exec
*** error for object 0x7fb203c032c0: pointer being freed was not allocated
 

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x00007fff9d1a6f06 __pthread_kill + 10
1   libsystem_pthread.dylib       	0x00007fff8daa84ec pthread_kill + 90
2   libsystem_c.dylib             	0x00007fff903bb6e7 abort + 129
3   libsystem_malloc.dylib        	0x00007fff8b468041 free + 425
4   tor                           	0x00000001060086b1 config_free_all + 321 (config.c:841)
5   tor                           	0x000000010606e233 tor_free_all + 179 (main.c:3170)
6   tor                           	0x000000010606e479 tor_cleanup + 313 (main.c:3239)
7   tor                           	0x0000000106067c31 consider_hibernation + 129 (hibernate.c:971)
8   tor                           	0x000000010606c7ce second_elapsed_callback + 590 (main.c:1467)
9   libevent-2.0.5.dylib          	0x0000000106261aa2 event_base_loop + 1871
10  tor                           	0x000000010606c245 do_main_loop + 1525 (main.c:2560)
11  tor                           	0x000000010606e586 tor_main + 230 (main.c:3660)
12  tor                           	0x0000000105fd8ce9 main + 25 (tor_main.c:30)
13  libdyld.dylib                 	0x00007fff982e35ad start + 1

comment:58 Changed 2 years ago by teor

Let's call the last two comments T1 and T2.

T3: This log message is confusing, please fix it:

sr_state_update(): SR: State prepared for new voting 
period (2016-05-10 00:00:00). Current phase is commit (1/0).

I'd prefer commit phase 1, reveal phase 0.

comment:59 Changed 2 years ago by teor

T4: please sort the shared-rand-commit lines in each vote in a stable order (by fingerprint).

comment:60 Changed 2 years ago by teor

T5: please cherry-pick 37655c1 from my branch sr-commit-debug on https://github.com/teor2345/tor.git

I added some logging to track down the i386 issue. It appears that i386 tors can't parse the shared random commits correctly. The commit 37655c1 adds some extra logging, and escapes all the strings that are logged.

comment:61 Changed 2 years ago by teor

You probably want tor_parse_ulong() here, because on i386, min is 1, and max is effectively (long)UINT32_MAX, which is -1.

long tor_parse_long(const char *s, int base, long min,
                    long max, int *ok, char **next);
...
uint32_t version;
...
version = (uint32_t) tor_parse_long(value, 10, 1, UINT32_MAX, NULL, NULL);

I think we could warn in the tor_parse_* functions when this happens, see #19063.

comment:62 Changed 2 years ago by teor

Let's call the bug above T6.

T7: In sr_parse_srv() you parse a string as a long between 0 and INT32_MAX, then cast and assign it to an int. It's good you limit the value to INT32_MAX rather than INT_MAX, which varies by architecture. Can you also use an int32_t for num_reveals and sr_srv_t.num_reveals to make this limit clearer?

int num_reveals, ok, ret;
sr_srv_t *srv = NULL;
...
num_reveals = (int)tor_parse_long(smartlist_get(args, 0),
                               10, 0, INT32_MAX, &ok, NULL);
...
srv->num_reveals = num_reveals;

comment:63 Changed 2 years ago by teor

T8: In sr_compute_srv(), we limit the number of SR authorities to 255, and crash if there are more. This seems brittle: why not warn instead?

tor_assert(reveal_num < UINT8_MAX);

comment:64 Changed 2 years ago by teor

T8: In networkstatus_parse_vote_from_string(), we check if it's a vote, then check K_SR_FLAG, then parse commits and reveals. But we don't do the same for SRVs. What if buggy authorities include SRVs in their vote without the K_SR_FLAG? What if a consensus includes SRVs without the SR flag?

Can we be consistent, and check the SR flag, then parse commits and SRVs as appropriate?

  /* If this is a vote document, check if information about the shared
     randomness protocol is included, and extract it. */
  if (ns->type == NS_TYPE_VOTE) {
    /* Does this authority participate in the SR protocol? */
    tok = find_opt_by_keyword(tokens, K_SR_FLAG);
    if (tok) {
      ns->sr_info.participate = 1;
      /* Get the SR commitments and reveals from the vote. */
      if (extract_shared_random_commits(ns, tokens) < 0) {
        log_warn(LD_DIR, "SR: Unable to parse commits in vote.");
        goto err;
      }
    }
  }
  /* For both a vote and consensus, extract the shared random values. */
  if (ns->type != NS_TYPE_OPINION) {
    if (extract_shared_random_srvs(ns, tokens) < 0) {
      log_warn(LD_DIR, "SR: Unable to parse SRV(s)");
      goto err;
    }
  }

comment:65 Changed 2 years ago by asn

Here is some TODO from my meeting notes:

  • Commit prop250: Only trust known authority when computing SRV seems to reject votes that contain commits by unknown dirauths. This can happen naturally in the network. Instead of handling unknown commits during parsing in extract_shared_random_commits(), maybe we should handle them in should_keep_commit().
  • I got this message on my notice level logs: [notice] SR: State loaded successfully from file /home/user/data_directory/sr-state. I think this message should be demoted to a lesser severity.
  • It would be nice for debugging, to order commits in votes based on authority fpr, so that they are all in the same order in different votes. If it's too hard, let's not do that.
  • We discussed treating SRVs in votes the same way we treat consensus parameters. That is, not needing a real majority, and if at least 3 dirauths agree on the same SRV on their votes, it is copied on the consensus. This is controversial security-wise, so it needs some thinking.
  • Make a state diagram of the SRV protocol for greater understanding. We already have one in the bottom of the notes page, but we might be able to make a better one.

comment:66 Changed 2 years ago by teor

Please see my branch sr-commit-fixes on https://github.com/teor2345/tor.git

37655c1 Improve log messages for SR commit lines (T5)
520e14f Fix a bug in shared random version parsing on 32-bit platforms (T6)
2fd8a21 Consistently use uint32_t for integers in shared random structs (T7)
d6487a3 Silence a clang warning about shared random buffer sizes (T1)
2fd8a21 Make a shared random log message easier to understand (T3)
add23bd Move the SR flag to the same place as the other SR ns string constants
3731984 Don't crash authorities when they have 254 or more shared random reveals (T8, T7)
8a84070 Sort commits in lexicographical order in votes (T4)

T2 was fixed by dgoulet.

comment:67 Changed 2 years ago by isabela

Points: large6

comment:68 Changed 2 years ago by teor

T9: The unit test fixes in ticket16943_029_02 have a bug on platforms where time_t is 32 bit.

When running src/test/test --debug shared-random/state_load_from_disk on 32 bit OS X, I see:

[warn] tor_timegm: Bug: Result does not fit in tor_timegm (on Tor 0.2.9.0-alpha-dev )
[warn] SR: Reading state error: Invalid time '2666-04-19 07:16:00' for keyword 'ValidAfter'

This was introduced with the unit tests in commit e931fef2; let's use 2037 instead.

comment:69 Changed 2 years ago by teor

I've split off everything that hasn't been included in dgoulet/ticket16943_029_02 into child tickets.

Here's what has happened to T1-T9 and my branch sr-commit-fixes:

37655c1 Improve log messages for SR commit lines (T5)
has been fixup'd into another commit in dgoulet/ticket16943_029_02

520e14f Fix a bug in shared random version parsing on 32-bit platforms (T6)
has been fixup'd into another commit in dgoulet/ticket16943_029_02

2fd8a21 Consistently use uint32_t for integers in shared random structs (T7)
Split off into #19126, rebased on the latest dgoulet/ticket16943_029_02

d6487a3 Silence a clang warning about shared random buffer sizes (T1)
issue was fixed by b9b9022 prop250: Pass the dst length to sr_srv_encode()

bc7160c Make a shared random log message easier to understand (T3)
has been fixup'd into another commit in dgoulet/ticket16943_029_02

add23bd Move the SR flag to the same place as the other SR ns string constants
has been fixup'd into another commit in dgoulet/ticket16943_029_02

3731984 Don't crash authorities when they have 254 or more shared random reveals
Split off into #19126

8a84070 Sort commits in lexicographical order in votes (T4)
cherry-picked as 4d956a6 prop250: Sort commits in lexicographical order in votes

The unit test fixes in ticket16943_029_02 have a bug on platforms where time_t is 32 bit (T9)
Split off into #19125

comment:70 Changed 2 years ago by teor

Logged #19132 for a missing initialiser error in clang.

comment:71 Changed 2 years ago by teor

I just split off #19134 because we're doing the SRV hash wrong: we're meant to use 8 bytes of reveal_num and version in network order, not 8 bits. This obsoletes #19127, and modifies #19126.


comment:72 Changed 2 years ago by asn

OK, did another review of dgoulet/ticket16943_029_02 up to b9b9022. I like the new changes!

Please see my branch ticket16943_029_04, which addresses some more issues. Specifically, it fixes the first two points from comment:65, and it also folds in teor's patch from #19126 (which fixes tests on platforms where time_t is 32 bits). It also fixes #19132.

What else remains to be done?


comment:73 Changed 2 years ago by dgoulet

Latest branch with all the fixes discussed: ticket16943_029_05

I still have to incorporate #19012 and push it on the gitlab for code review. Should be coming soon.

comment:74 Changed 2 years ago by dgoulet

Status: needs_revisionneeds_review

Branch ready for review at: https://gitlab.com/dgoulet/tor/merge_requests/5

See description for some extra info about last review since this one has been squashed and rebased on master.

comment:75 Changed 2 years ago by dgoulet

Points: 6

comment:76 Changed 2 years ago by dgoulet

Points: 6

Hrm wait, putting points back on this one since this is related to a huge chunk of code.

comment:77 Changed 2 years ago by nickm

Keywords: TorCoreTeam201605 removed

Remove "TorCoreTeam201605" keyword. The time machine is broken.

comment:78 Changed 2 years ago by nickm

Keywords: review-group-3 added

Move some tickets into review-group-3: they are in 0.2.9, and they are needs_review.

comment:79 Changed 2 years ago by nickm

Keywords: review-group-4 added

comment:80 Changed 2 years ago by nickm

Okay, I made a few comments. Only found a couple things that look like possible bugs

comment:81 Changed 2 years ago by nickm

Status: needs_reviewneeds_revision

comment:82 Changed 2 years ago by dgoulet

Status: needs_revisionneeds_review

Everything has been addressed except one detail; please see the merge request comments.

Fixup commits in ticket16943_029_05 (also pushed on the gitlab).

comment:83 Changed 2 years ago by nickm

Status: needs_reviewneeds_revision

added more comments.

comment:84 Changed 2 years ago by dgoulet

Status: needs_revisionneeds_review

Please see discussion on the merge request.

Fixup commits in ticket16943_029_05 (also pushed on the gitlab).

comment:85 Changed 2 years ago by dgoulet

Status: needs_reviewmerge_ready

Ok, the fully squashed branch is in ticket16943_029_05-squashed. The following diff between the current branch under review (with fixup commits) and the squashed branch is empty for me:

$ git diff ticket16943_029_05..ticket16943_029_05-squashed
$

Unit tests pass, chutney works, and our test network is currently running this branch on my dirauth. Once it's merged upstream, I'll ask the testnet operators to update to master so we can track any bugs from now on.

@nickm, finally, you won't be able to close this ticket because of open child tickets. Just say so here once it's merged; I'll take care of moving the tickets to another parent and closing the appropriate children.

comment:86 Changed 2 years ago by nickm

m

e

r

g

e

d

!

comment:87 Changed 2 years ago by dgoulet

Resolution: fixed
Status: merge_readyclosed

Wow code, Much feature, Such closing!

Note: See TracTickets for help on using tickets.