Opened 15 months ago

Closed 14 months ago

Last modified 14 months ago

#18318 closed defect (fixed)

Make sure keys and IP:Ports are unique in a consensus

Reported by: teor Owned by: nickm
Priority: Very High Milestone: Tor: 0.2.8.x-final
Component: Core Tor/Tor Version:
Severity: Blocker Keywords: TorCoreTeam201602, must-fix-before-028-rc, 027-backport, 2016-bug-retrospective
Cc: Actual Points:
Parent ID: #17668 Points:
Reviewer: Sponsor: None

Description

When voting for RSA / Ed25519 key combinations, each RSA key must be unique in the vote, and and each Ed key must be unique in the vote:

  • authorities must vote using the most recent descriptor signed by a RSA key, as the signature proves ownership of that key, and ignore earlier descriptors signed by that key;
  • authorities must vote using the most recent descriptor signed by both a RSA and an Ed key, as the signatures prove ownership of both keys, and ignore earlier descriptors signed by either key.

When voting for RSA key / IPv4:Port combinations, there can only be one Running instance of each RSA key, and one Running instance of each IPv4:Port, in the vote:

  • authorities must vote Running for only one IPv4:Port per RSA key, and only one RSA key per IPv4:Port. The IPv4:Port and RSA key must be the latest RSA key proven by reachability test to that IPv4:Port from that authority;
  • authorities may vote for RSA keys that have signed a descriptor specifying an IPv4:Port, but which haven't been reachability tested, or which have been superseded by a later reachability test. (This helps us deduce internal authority state from votes.) But authorities must not vote Running for these additional RSA key or IPv4:Port instances.

Whether or not authorities can perform IPv6 reachability tests, there can only be one Running instance of each IPv6:Port in the vote:

  • authorities must vote Running for at most one IPv6:Port per RSA key, and only one RSA key per IPv6:Port. The IPv6:Port and RSA key must be the latest RSA key proven by reachability test to that IPv6:Port from that authority;
  • authorities that aren't on IPv6 must vote Running for at most one IPv6:Port per RSA key, and only one RSA key per IPv6:Port. If multiple RSA keys claim an IPv6:Port, the RSA key voted Running must be the one with the latest reachable IPv4:Port.
  • authorities may include additional IPv6:Port instances, but must not vote them running.

When we transition to Ed25519 proofs via authenticate cells in reachability tests, similar uniqueness constraints will apply. But that's out of scope for this ticket.

Since consensuses only include Running relays, and the Running flag is assigned by a majority vote, each RSA key, Ed key, IPv4:Port, and IPv6:Port must be unique in the consensus.

Child Tickets

Change History (22)

comment:1 Changed 15 months ago by teor

This resolves this part of #17668:

When a relay changes its RSA key, we'll include it in our vote twice. If both RSA keys map to the same ed25519 key, this bug triggers. So far we just never noticed that this is happening because we never cared that two things are on the same IP:port combination.

comment:2 Changed 15 months ago by teor

#17107 was a duplicate of this, and included this reference:
https://lists.torproject.org/pipermail/tor-dev/2015-September/009523.html

comment:3 Changed 15 months ago by teor

https://trac.torproject.org/projects/tor/ticket/17668#comment:15

In #17668 (comment 15), moria1 votes for two microdescriptors with:

  • (I can't compare RSA keys easily because they're microdescriptors)
  • the same Ed key,
  • the same IPv4 address and Port,
  • but only one of which is Running.

This means that the current implementation breaks the above rules because:

  • it possibly allows duplicates of the same RSA key in votes, but I can't tell for sure,
  • it produces duplicates of the same Ed key in votes,
  • it doesn't parse its own votes, probably because they have duplicates of the same Ed key.

It seems to respect the following rules:

  • if there are duplicate IPv4:Port entries, only one is marked Running.

I don't know what it does for IPv6.

The rules I wrote are ambiguous - I think these are more concise:

  • if an older descriptor with the same RSA/Ed/IPv4/Port is verified reachable, and a new descriptor arrives with the same RSA/Ed/IPv4/Port, authorities must vote the new details, and mark the relay Running;
  • but if one of the keys or the IPv4 or Port changes, authorities must discard the old descriptor and reset the reachability check, and only vote Running on the new descriptor if the authority has checked its reachability by voting time;
  • if both keys change but the IPv4/Port stays the same, authorities should vote for all keys claiming to own that IP:Port. Authorities must vote for and assign Running to whichever RSA/Ed key was most recently found reachable at that IP:Port.

(And we should do the same for IPv6.)

comment:4 Changed 15 months ago by nickm

  • Owner set to nickm
  • Status changed from new to accepted

comment:5 Changed 15 months ago by arma

I think it might be useful to have some proposal-like thing that walks us through what went wrong, and why the above behavior will resolve the thing that went wrong.

At least for my current state, we have a bug that I don't fully understand, and a set of guidelines, and there's still some space in the middle.

comment:6 Changed 15 months ago by nickm

Only the "duplicate ed" part of this is actually causing #17668.

comment:7 Changed 15 months ago by nickm

  • Status changed from accepted to needs_review

I have a patch for the "duplicate ed" part of this in my branch ed25519_voting_fixes. I have run it through chutney, but should probably test it more thoroughly. I should also document it all in dir-spec.

comment:8 Changed 15 months ago by nickm

There is now an ed25519_voting_fixes branch in torspec too.

comment:9 follow-up: Changed 15 months ago by dgoulet

  • Status changed from needs_review to needs_revision

There is now an ed25519_voting_fixes branch in torspec too.

This branch diff with master is _empty_. Maybe you forgot to commit/push? :)

Code review:

routers_make_ed_keys_unique(): the digestmap by_ed_key is not freed.

routers_make_ed_keys_unique(): can't you use SMARTLIST_DEL_CURRENT directly in the first loop instead of flagging it then removing it? Actually, maybe I'm not understanding the whole picture here, but that's called in dirserv_generate_networkstatus_vote_obj() then few lines below rl is used with dirserv_count_measured_bws(); and not the list with unique entries routers. Is it OK here to use a list of non unique entries? (same goes for dirserv_compute_performance_thresholds)

if (! vrs->has_ed25519_listing), has extra space.

Below, we just warn and we continue to try to create a valid consensus even though there is an obvious issue? If that bug is hit, will it create an invalid consensus?

+      if (ed_consensus > 0) {
+        tor_assert(consensus_method >= MIN_METHOD_FOR_ED25519_ID_VOTING);
+        if (ed_consensus <= total_authorities / 2) {
+          log_warn(LD_BUG, "Not enough entries had ed_consensus set; how "
+                   "can we have a consensus of %d?", ed_consensus );
+        }
+      }

comment:10 follow-up: Changed 15 months ago by nickm

Whoops: the torspec branch is ed25519_voting_again. I'll look at the other comments asap.

comment:11 in reply to: ↑ 10 Changed 15 months ago by cypherpunks

Replying to nickm:

Whoops: the torspec branch is ed25519_voting_again. I'll look at the other comments asap.

Hi, seen this happen on a few tickets so far. Is there a way to auto-check the comments of each ticket (for example last 5000) and each final comment with 'fixed' against the branches just to make sure code has been committed/pushed and not left accidentally while thinking it has been committed?

comment:12 Changed 15 months ago by nickm

  • Keywords must-fix-before-028-rc added
  • Milestone Tor: 0.2.7.x-final deleted

These are backportable, but they are tied to the 0.2.8 rc.

comment:13 in reply to: ↑ 9 Changed 15 months ago by nickm

Replying to dgoulet:

There is now an ed25519_voting_fixes branch in torspec too.

This branch diff with master is _empty_. Maybe you forgot to commit/push? :)

As above actually see the torspec branch ed25519_voting_again.

Code review:

routers_make_ed_keys_unique(): the digestmap by_ed_key is not freed.

Calling this DG1. Fix in babd9df586312f

routers_make_ed_keys_unique(): can't you use SMARTLIST_DEL_CURRENT directly in the first loop instead of flagging it then removing it?

No because sometimes we flag ri and sometimes we flag ri2. Calling this DG2. Adding comments for this in babd9df586312f

Actually, maybe I'm not understanding the whole picture here, but that's called in dirserv_generate_networkstatus_vote_obj() then few lines below rl is used with dirserv_count_measured_bws(); and not the list with unique entries routers. Is it OK here to use a list of non unique entries? (same goes for dirserv_compute_performance_thresholds)

Oh, very interesting. Calling this DG3. Will work on a fix.

if (! vrs->has_ed25519_listing), has extra space.

If you mean the one between the ! and the vrs, I don't mind a single space there.

Below, we just warn and we continue to try to create a valid consensus even though there is an obvious issue? If that bug is hit, will it create an invalid consensus?

+      if (ed_consensus > 0) {
+        tor_assert(consensus_method >= MIN_METHOD_FOR_ED25519_ID_VOTING);
+        if (ed_consensus <= total_authorities / 2) {
+          log_warn(LD_BUG, "Not enough entries had ed_consensus set; how "
+                   "can we have a consensus of %d?", ed_consensus );
+        }
+      }

IMO this bug can't actually be hit. But it wouldn't mean that the consensus was invalid; it would mean that the algorithm in dircollate.h would be broken.

comment:14 Changed 15 months ago by nickm

comment:15 Changed 15 months ago by nickm

  • Keywords 027-backport added
  • Status changed from needs_revision to needs_review

comment:16 Changed 14 months ago by dgoulet

  • Sponsor set to None

DG1 and DG3 lgtm! The whole branch also lgtm now.

comment:17 Changed 14 months ago by nickm

  • Milestone set to Tor: 0.2.8.x-final

comment:18 Changed 14 months ago by teor

Can we do the tie-breaker the same we we did in #17688?
Or is that not an issue for IP addresses and ports?
(I haven't had a chance to review this code yet, but I'd like to.)

From comments 38/44 in #17688, labelled T3 in that issue:

This check preserves the existing digestmap entry if the published times are equal. But different authorities could choose different descriptors-with-identical-published-times, depending on the order they arrived at the authority.
So let's find a consistent way of selecting a descriptor to omit if their published times are identical - how about comparing signed_descriptor_digest? (And if the digests are identical, it really doesn't matter which descriptor we omit, because they have the same content.)

Good idea; Calling it T3.

We should do the same tie-breaking in the IP:Port checks too.

+    if ((ri2 = digest256map_get(by_ed_key, pk))) {
+      /* Duplicate; must omit one.  Omit the earlier. */
+      if (ri2->cache_info.published_on < ri->cache_info.published_on) {
+        digest256map_set(by_ed_key, pk, ri);
+        ri2->omit_from_vote = 1;
+      } else {
+        ri->omit_from_vote = 1;
+      }

comment:19 Changed 14 months ago by nickm

(This isn't the same kind of an issue for IP addresses and ports. If two relays appear in a vote with the same IP:Port, one will be unreachable, and that's okay. But if they appear with the same ID, then the vote is unparseable.)

comment:20 Changed 14 months ago by nickm

(I haven't had a chance to review this code yet, but I'd like to.)

I'd like to try to get it merged RSN, if you can look at it soon?

comment:21 Changed 14 months ago by nickm

  • Resolution set to fixed
  • Status changed from needs_review to closed

These are now merged into maint-0.2.7 and forwards.

comment:22 Changed 14 months ago by nickm

  • Keywords 2016-bug-retrospective added

Marking these tickets (based on severity and hand-review) for inclusion in 2016 bug retrospective

Note: See TracTickets for help on using tickets.