Opened 21 months ago

Closed 17 months ago

Last modified 16 months ago

#17668 closed defect (fixed)

moria1, with updated v3 cert: Bug: Generated a networkstatus consensus we couldn't parse.

Reported by: arma Owned by: nickm
Priority: High Milestone: Tor: 0.2.8.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Blocker Keywords: must-fix-before-028-rc, 027-backport, needs-merge, 2016-bug-retrospective
Cc: weasel Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

On moria1, my v3 cert was going to expire soon, so I made a new one. At the same time, I decided to upgrade to git master (0.2.8.0-alpha-dev (git-d939a83f5bc8ccf3)).

Now, each hour when moria1 tries to vote, it instead says

Nov 24 01:55:01.628 [notice] Time to compute a consensus.
Nov 24 01:55:01.687 [info] networkstatus_compute_consensus(): Generating consensus using method 20.
Nov 24 01:55:01.807 [notice] Computed bandwidth weights for Case 3a (E scarce) with v10: G=20194094 M=3719276 E=1504624 D=7084460 T=32502454
Nov 24 01:55:01.849 [warn] ID on signature on network-status vote does not match any declared directory source.
Nov 24 01:55:01.853 [info] dump_desc(): Unable to parse descriptor of type v3 networkstatus. See file unparseable-desc in data directory for details.
Nov 24 01:55:01.853 [err] networkstatus_compute_consensus(): Bug: Generated a networkstatus consensus we couldn't parse. (on Tor 0.2.8.0-alpha-dev d939a83f5bc8ccf3)
Nov 24 01:55:01.857 [warn] Couldn't generate a ns consensus at all!
Nov 24 01:55:01.858 [info] networkstatus_compute_consensus(): Generating consensus using method 20.
Nov 24 01:55:01.970 [notice] Computed bandwidth weights for Case 3a (E scarce) with v10: G=20194094 M=3719276 E=1504624 D=7084460 T=32502454
Nov 24 01:55:02.020 [warn] ID on signature on network-status vote does not match any declared directory source.
Nov 24 01:55:02.021 [err] networkstatus_compute_consensus(): Bug: Generated a networkstatus consensus we couldn't parse. (on Tor 0.2.8.0-alpha-dev d939a83f5bc8ccf3)
Nov 24 01:55:02.027 [warn] Couldn't generate a microdesc consensus at all!
Nov 24 01:55:02.027 [warn] Couldn't generate any consensus flavors at all.

Now I have one of these v3 consensus documents sitting in my $datadir/unparseable-desc. It looks pretty normal.

I wonder if we broke "moving to a new certificate" somewhere in there?

Child Tickets

TicketStatusOwnerSummaryComponent
#17107closeddir auths should not include more than one (running,valid) relay per IP:port in their voteCore Tor/Tor
#18318closednickmMake sure keys and IP:Ports are unique in a consensusCore Tor/Tor
#18368closednickmDon't call a consensus a "vote" in log messagesCore Tor/Tor

Attachments (1)

unparseable-desc (1.4 MB) - added by arma 21 months ago.

Download all attachments as: .zip

Change History (54)

comment:1 Changed 21 months ago by arma

Milestone: Tor: 0.2.8.x-finalTor: 0.2.7.x-final

The plot thickens! I downgraded to "Tor version 0.2.7.4-rc (git-47649a558de24961)." and the problem remains.

comment:2 Changed 21 months ago by nickm

What's in the 'unparseable-desc' file it's telling you to look at?

comment:3 Changed 21 months ago by micah

On longclaw, I'm seeing these, oddly they are all related to dannenburg, and I see no problems with moria1:

Nov 24 08:50:21.000 [warn] Got a vote from an authority (nickname dannenberg, address dannenberg.torauth.de) with authority key ID 0232AF901C31A04EE9848595AF9BB7620D4C5B2E. This key ID is not recognized.  Known v3 key IDs are: D586D18309DED4CD6D57C18FDB97EFA96D330566, 14C131DFC5C6F93646BE72FA1401C02A8DF2E8B4, E8A9C45EDE6D711294FADF8E7951F4DE6CA56B58, ED03BB616EB2F60BEC80151114BB25CEF515B226, 585769C78764D58426B8B52B6651A5A71137189A, 80550987E1D626E3EBA5E5E75A458DE0626D088C, 49015F787433103580E3B66A1707A00E60F2D15B, EFCBE720AB3A82B99F9E953CD5BF50F7EEFC7B97, 23D15D965BC35114467363C165C4F724B64B4F66
Nov 24 08:50:21.000 [warn] Rejected vote from 193.23.244.244 ("Vote not from a recognized v3 authority").
Nov 24 08:52:31.000 [notice] Time to fetch any votes that we're missing.
Nov 24 08:52:31.000 [notice] We're missing votes from 2 authorities (D586D18309DED4CD6D57C18FDB97EFA96D330566 585769C78764D58426B8B52B6651A5A71137189A). Asking every other authority for a copy.
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '208.83.223.34:443' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '128.31.0.39:9131' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '194.109.206.212:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '86.59.21.38:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '193.23.244.244:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '131.188.40.189:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '171.25.193.9:443' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:34.000 [warn] Received http status code 404 ("Not found") from server '154.35.175.225:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z
Nov 24 08:55:03.000 [warn] http status 400 ("Mismatched digest.") response after uploading signatures to dirserver '193.23.244.244:80'. Please correct.
Nov 24 08:55:03.000 [notice] Got a signature from 193.23.244.244. Adding it to the pending consensus.
Nov 24 08:55:03.000 [warn] Unable to store signatures posted by 193.23.244.244: Mismatched digest.

Changed 21 months ago by arma

Attachment: unparseable-desc added

comment:4 in reply to:  2 Changed 21 months ago by arma

Replying to nickm:

What's in the 'unparseable-desc' file it's telling you to look at?

Attached.

comment:5 Changed 21 months ago by nickm

Priority: MediumHigh
Severity: NormalBlocker

comment:6 in reply to:  3 Changed 21 months ago by arma

Replying to micah:

On longclaw, I'm seeing these, oddly they are all related to dannenburg, and I see no problems with moria1:

Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '208.83.223.34:443' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '128.31.0.39:9131' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '194.109.206.212:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '86.59.21.38:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '193.23.244.244:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '131.188.40.189:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:31.000 [warn] Received http status code 404 ("Not found") from server '171.25.193.9:443' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z".
Nov 24 08:52:34.000 [warn] Received http status code 404 ("Not found") from server '154.35.175.225:80' while fetching "/tor/status-vote/next/D586D18309DED4CD6D57C18FDB97EFA96D330566+585769C78764D58426B8B52B6651A5A71137189A.z

D586D... is moria1. You're asking for moria1's vote and nobody has it -- including moria1.

comment:7 in reply to:  1 Changed 21 months ago by arma

Replying to arma:

The plot thickens! I downgraded to "Tor version 0.2.7.4-rc (git-47649a558de24961)." and the problem remains.

Ok, I downgraded to Tor 0.2.6.10-dev, and it looks ok for now. I guess I'll keep moria1 on oldstable until this ticket is resolved.

comment:8 Changed 21 months ago by weasel

Cc: weasel added

comment:9 Changed 21 months ago by nickm

Hm. When I try to parse it I get ...

Nov 26 12:47:31.000 [warn] ID on signature on network-status vote does not match any declared directory source.
Nov 26 12:47:31.000 [warn] Unable to parse networkstatus consensus
Nov 26 12:47:31.000 [warn] Couldn't load consensus ns networkstatus from "/Users/nickm/.tor/cached-consensus"

That comes from:

    if (!(v = networkstatus_get_voter_by_id(ns, declared_identity))) {
      log_warn(LD_DIR, "ID on signature on network-status vote does not match "
               "any declared directory source.");
      goto err;
    }

So apparently it's complaining because it got a signature on this document that didn't have a corresponding dir-source block. That's pretty weird -- there's only one signature on it, and that signature should be from moria. But apparently moria is omitting its own dir-source block in the consensus:

dir-source tor26 14C131DFC5C6F93646BE72FA1401C02A8DF2E8B4 86.59.21.38 86.59.21.38 80 443
contact Peter Palfrader
vote-digest F3B5BED9DE82C3811CD32A832243B6EBEF1EBF63
dir-source longclaw 23D15D965BC35114467363C165C4F724B64B4F66 199.254.238.53 199.254.238.53 80 443
contact Riseup Networks <collective at riseup dot net> - 1nNzekuHGGzBYRzyjfjFEfeisNvxkn4RT
vote-digest 65C7D8171ABDAF345B91BDAA43824519EC2B2740
dir-source maatuska 49015F787433103580E3B66A1707A00E60F2D15B 171.25.193.9 171.25.193.9 443 80
contact 4096R/23291265 Linus Nordberg <linus@nordberg.se>
vote-digest 06643D9FEA0027E3E0147B2A92724525713BFBB2
dir-source urras 80550987E1D626E3EBA5E5E75A458DE0626D088C 208.83.223.34 208.83.223.34 443 80
contact 4096R/D255D3F5C868227F Jacob Appelbaum <jacob@appelbaum.net>
vote-digest EB58639545D77BC67F0DB11EB4141BF97F5E7D9E
dir-source dizum E8A9C45EDE6D711294FADF8E7951F4DE6CA56B58 194.109.206.212 194.109.206.212 80 443
contact 1024R/8D56913D Alex de Joode <adejoode@sabotage.org>
vote-digest E08A26FEB7C50DDF03D63730DCDBB5E864319D3D
dir-source gabelmoo ED03BB616EB2F60BEC80151114BB25CEF515B226 131.188.40.189 131.188.40.189 80 443
contact 4096R/261C5FBE77285F88FB0C343266C8C2D7C5AA446D Sebastian Hahn <tor@sebastianhahn.net> - 12NbRAjAG5U3LLWETSF7fSTcdaz32Mu5CN
vote-digest 9C0AE8149701964BB123ED6E3EDFDFD69A57451A
dir-source Faravahar EFCBE720AB3A82B99F9E953CD5BF50F7EEFC7B97 154.35.175.225 154.35.175.225 80 443
contact 0x0B47D56D Sina Rabbani (inf0) <sina redteam net>
vote-digest 5283DEC2B07E20AE334AFA7CB7EA837BA83AB38B

I think that's the problem. But now to find out why it happens!

comment:10 Changed 21 months ago by nickm

(afk thanksgiving; if anybody else can track that down, that might rock.)

comment:11 Changed 21 months ago by nickm

Three possibilities I can think of here: moria1 is not generating a vote, moria1 is not considering its own vote, or moria1 is not including a dir-source line for its own vote.

comment:12 Changed 21 months ago by nickm

I would really love to see the vote that moria1 produced. I don't suppose you archived it...? Or were there any messages a bit before the ones you pasted above about moria1 failing to vote...?

comment:13 Changed 21 months ago by Sebastian

Ok, I think I found a few bugs (some might be different tickets, some might be this thing):

  • In the key-pinning-journal file, we accumulate duplicates over time as relays switch back and forth between two different ed25519 keys with the same rsa key. This has both the issue of exhausting dirauth disk space as well as making the involved data structure slow, if it happens a lot. We should prune duplicates when updating our latest view on a relay's identity mapping.
  • We're creating a vote that is invalid, but try to make a consensus anyway like nothing's wrong. Then we fail doing that as described above.
  • When a relay changes its RSA key, we'll include it in our vote twice. If both RSA keys map to the same ed25519 key, this bug triggers. So far we just never noticed that this is happening because we never cared that two things are on the same IP:port combination.
  • When we log unparseable desc stuff for our vote, we proceed to overwrite it with the invalid consensus we produced. The vote gets logged at log level notice, but only in truncated form not allowing one to analyze this bug.

comment:14 Changed 21 months ago by Sebastian

Looks like we should add a notion of a "canonical entry" for each distinct IP:port and ed25519 key, so that we have duplicates in neither. This is probably whatever we found to be reachable last at that IP:port / the rsa key that we talked to when we were last saw that ed25519 key. Yuck.

But, good news: This could be one of the oldest bugs in tor :p

comment:15 Changed 21 months ago by Sebastian

Confirmed. I finally managed to get an unparseable vote, and it contains:

r BeverlyHills 5pAi3sAPJZsYsvbqYWevH/S1SPI 0Ag3GcNJ/1Qt2AihzUTRUs1vgjQ 2015-11-28 21:07:46 76.92.225.216 9001 9030
s Fast Stable V2Dir Valid
v Tor 0.2.7.5
w Bandwidth=0 Measured=339
p reject 1-65535
id ed25519 ZQ+UKBiQgAc5j0KW/gCUAPE1FbdfvOHLmBGzIxjYPsQ
m 13,14,15 sha256=07L4SK+sGXc+BnJM+vh3the7VpII1eM6iagXhLPe0YY
m 16,17 sha256=31TfRY5Q22N+y1z8WNV+X3eKj19CnKTtUhV4f/U0Iis
m 18,19,20 sha256=WQEaDxDWgAoNT11wDlZKQ1pZ2D0CZF2/SmhL9O9w9iY
m 21 sha256=VuNfXpQLGf2po+WwfJem20sb0UTb+iezHhH+L8IHHpg
r BeverlyHills 8cs3O1hTmae5zHdVCAyvgnjDT/k qa5dkxBI75WPZUnWHVCDYNbYcH8 2015-11-29 00:38:17 76.92.225.216 9001 9030
s Running V2Dir Valid
v Tor 0.2.7.5
w Bandwidth=54
p reject 1-65535
id ed25519 ZQ+UKBiQgAc5j0KW/gCUAPE1FbdfvOHLmBGzIxjYPsQ
m 13,14,15 sha256=07L4SK+sGXc+BnJM+vh3the7VpII1eM6iagXhLPe0YY
m 16,17 sha256=31TfRY5Q22N+y1z8WNV+X3eKj19CnKTtUhV4f/U0Iis
m 18,19,20 sha256=xhpwKVpslPeYktN0S1s8KdDJ//TI3FEHenpSCDrtvqM
m 21 sha256=VuNfXpQLGf2po+WwfJem20sb0UTb+iezHhH+L8IHHpg

This is from tor26 (0.2.6.10):

r BeverlyHills 5pAi3sAPJZsYsvbqYWevH/S1SPI 0Ag3GcNJ/1Qt2AihzUTRUs1vgjQ 2015-11-2
s V2Dir Valid
v Tor 0.2.7.5
w Bandwidth=0
p reject 1-65535
m 13,14,15 sha256=07L4SK+sGXc+BnJM+vh3the7VpII1eM6iagXhLPe0YY
m 16,17 sha256=31TfRY5Q22N+y1z8WNV+X3eKj19CnKTtUhV4f/U0Iis
m 18,19,20 sha256=WQEaDxDWgAoNT11wDlZKQ1pZ2D0CZF2/SmhL9O9w9iY
r BeverlyHills 8cs3O1hTmae5zHdVCAyvgnjDT/k qa5dkxBI75WPZUnWHVCDYNbYcH8 2015-11-2
s Running V2Dir Valid
v Tor 0.2.7.5
w Bandwidth=54
p reject 1-65535
m 13,14,15 sha256=07L4SK+sGXc+BnJM+vh3the7VpII1eM6iagXhLPe0YY
m 16,17 sha256=31TfRY5Q22N+y1z8WNV+X3eKj19CnKTtUhV4f/U0Iis
m 18,19,20 sha256=xhpwKVpslPeYktN0S1s8KdDJ//TI3FEHenpSCDrtvqM

comment:16 Changed 21 months ago by nickm

Hmm. I'll try to look at this over the upcoming week if I can.

In the meantime, has somebody told the remaining authorities to stay on 0.2.6 for now? If not, could somebody please do so?

comment:17 Changed 21 months ago by Sebastian

I did, yes

comment:18 Changed 21 months ago by nickm

Keywords: TorCoreTeam201512 added

comment:19 Changed 21 months ago by atagar

Sorry if this has been mentioned already (long backlog) but micah cited the following in his logs...

Nov 24 08:50:21.000 [warn] Got a vote from an authority (nickname dannenberg, address dannenberg.torauth.de) with authority key ID 0232AF901C31A04EE9848595AF9BB7620D4C5B2E. This key ID is not recognized.  Known v3 key IDs are: D586D18309DED4CD6D57C18FDB97EFA96D330566, 14C131DFC5C6F93646BE72FA1401C02A8DF2E8B4, E8A9C45EDE6D711294FADF8E7951F4DE6CA56B58, ED03BB616EB2F60BEC80151114BB25CEF515B226, 585769C78764D58426B8B52B6651A5A71137189A, 80550987E1D626E3EBA5E5E75A458DE0626D088C, 49015F787433103580E3B66A1707A00E60F2D15B, EFCBE720AB3A82B99F9E953CD5BF50F7EEFC7B97, 23D15D965BC35114467363C165C4F724B64B4F66

On November 18th Andreas sent a notice that dannenberg's v3ident was changing to 0232AF901C31A04EE9848595AF9BB7620D4C5B2E. This isn't yet reflected in config.c...

https://gitweb.torproject.org/tor.git/tree/src/or/config.c#n860

Seems config.c should definitely be updated, but perhaps this is surfacing the current woes?

comment:20 Changed 21 months ago by nickm

Owner: set to nickm
Status: newaccepted

comment:21 in reply to:  19 Changed 20 months ago by teor

Replying to atagar:

Sorry if this has been mentioned already (long backlog) but micah cited the following in his logs...

Nov 24 08:50:21.000 [warn] Got a vote from an authority (nickname dannenberg, address dannenberg.torauth.de) with authority key ID 0232AF901C31A04EE9848595AF9BB7620D4C5B2E. This key ID is not recognized.  Known v3 key IDs are: D586D18309DED4CD6D57C18FDB97EFA96D330566, 14C131DFC5C6F93646BE72FA1401C02A8DF2E8B4, E8A9C45EDE6D711294FADF8E7951F4DE6CA56B58, ED03BB616EB2F60BEC80151114BB25CEF515B226, 585769C78764D58426B8B52B6651A5A71137189A, 80550987E1D626E3EBA5E5E75A458DE0626D088C, 49015F787433103580E3B66A1707A00E60F2D15B, EFCBE720AB3A82B99F9E953CD5BF50F7EEFC7B97, 23D15D965BC35114467363C165C4F724B64B4F66

On November 18th Andreas sent a notice that dannenberg's v3ident was changing to 0232AF901C31A04EE9848595AF9BB7620D4C5B2E. This isn't yet reflected in config.c...

https://gitweb.torproject.org/tor.git/tree/src/or/config.c#n860

Seems config.c should definitely be updated, but perhaps this is surfacing the current woes?

Split off into #17906, remaining work on this ticket is to fix the "unparseable consensus" issue

comment:22 Changed 20 months ago by nickm

Keywords: TorCoreTeam201601 201512-deferred added; TorCoreTeam201512 removed

Perhaps in January?

comment:23 Changed 19 months ago by nickm

Bulk-modify: It is February 2016, and no longer possible that anything else will get done in January 2016. Time's arrow and all that.

comment:24 Changed 19 months ago by nickm

Keywords: TorCoreTeam201602 added; TorCoreTeam201601 removed

comment:25 Changed 18 months ago by teor

I've just spoken to Sebastian and arma in #tor-dev, and I'm going to try and to describe the desired authority behaviour, then split it into different tickets:

I think these need to go in 0.2.7, I'm not sure if some could wait:

  • #18318 - Make sure keys and IP:Ports are unique in a consensus
  • #18319 - Exclude relays that don't match pinned RSA/Ed key pairs
    • this works already, but we need to test for regressions
  • #18320 - Clear old entries from the key-pinning journal file
  • #18321 - Exclude our own vote from the consensus if we think our own vote is invalid
  • #18322 - Log unparseable votes so they can be analysed

comment:26 Changed 18 months ago by teor

(I've left #18318 in 0.2.7, shifted #18319 into 0.2.8, and shifted all the others to 0.2.9. We can push them back further if we need to.)

comment:27 Changed 18 months ago by nickm

Thanks for the dividing-up, teor!

comment:28 Changed 18 months ago by nickm

I'm very curious why nobody reported this thing:

     if (digest256map_get(ed_id_map, vrs->ed25519_id) != NULL) {
       log_warn(LD_DIR, "Vote networkstatus ed25519 identities were not "
                "unique");
       digest256map_free(ed_id_map, NULL);
       goto err;
     }

That was supposed to have triggered in this case.

comment:29 Changed 18 months ago by nickm

I'm also curious why arma didn't see this one:

    if (!(v = networkstatus_parse_vote_from_string(status, NULL,
                                                   v3_ns->type))) {
      log_err(LD_BUG,"Generated a networkstatus %s we couldn't parse: "
              "<<%s>>",
              v3_ns->type == NS_TYPE_VOTE ? "vote" : "opinion", status);
      goto err;
    }

It should have happened after arma generated the unparseable vote, right?

comment:30 Changed 18 months ago by Sebastian

I see both those. I didn't think to paste them here because I included in my analysis that this is what's happening

comment:31 Changed 18 months ago by nickm

ok, cool! I thought that your analysis was right, but I was wondering, "How come I didn't see those warnings?"

comment:32 Changed 18 months ago by nickm

My ed25519_voting_fixes branch should fix this; see comments on #18318.

comment:33 Changed 18 months ago by Sebastian

60efce445b17d4b4153e91527887873812f5599f looks fine. Needs a tor-spec patch to go with it, tho.

42750ab63c2a5dc2b5c0fd96335fd043b6beef32:

General note: The style for opening and closing comments is inconsistent, sometimes they go on their own line, sometimes not. Do we care? I don't necessarily.

+/** Helper: add a single vote_routerstatus_t <b>vrs</b> to the collator
+    <b>dc</b>, indexing it by  */

this wants expansion

+/** Sort the entries in <b>dc</b> according to <b>consensus_method</b>, so
+ * that the consensus process can iterate over them with
+ * <b>dircollator_n_routers</b> and dircollator_get_votes_for_router</b>. */

this wants a <b> before the dircollator_get_votes_for_router

+/**
+ * Collation function for ed25519 consensuses: collate the votes for each
+ * entry in <b>dc</b> by ed25519 key and by RSA key.
+ *
+ * The rule is:
+ *    If an RSA identity key is listed by more than half of the authorities,
+ *    include it, and treat all routers with that identity as the same.
+ */

This should probably explain what this means for conflicts of the ed25519 keys?

+    /* If not enough authroties listed this exact <ed,rsa> pair,
+     * don't include it. */

typo authorities

Should we add asserts to the functions that say "this may only be called after dircollator_collate"? seems like we have the is_collated field for that purpose.

+  /** Map from <ed, RSA-SHA1> pair to an array similar to that used in
+   * by_rsa_sha1 above. */

what happens for duplicate rsa keys? Should spell it out I think

7cd091220c35c71025ab90cfa0bd18d206c29e8e:
the function name is networkstatus_parse_vote_from_string() - that seems misleading if we're treating consensuses?
I don't get it especially for stuff like this:

     if (ns->type != NS_TYPE_CONSENSUS) {
       if (check_signature_token(ns_digests.d[DIGEST_SHA1], DIGEST_LEN,
                                 tok, ns->cert->signing_key, 0,
-                                "network-status vote")) {
+                                "network-status document")) {

3941fa64ccf10dda5ac224bb1160564581c0e213:
I'm unsure this is sufficient. Can't we still construct a consensus from a situation where (consider attacking relays):
one dirauth publishes two rsa keys without ed keys, four dirauths publish one of the rsa keys with an ed key, the other four publish the other rsa key with the same ed key? Once I'm convinced this cannot happen I'll review the logic of this commit, too

comment:34 Changed 18 months ago by nickm

Hi, sebastian! I'll go over the other stuff the next time I'm working on the code, but I hope I can answer the questions you're blocking on so you can review the rest.

the function name is networkstatus_parse_vote_from_string() - that seems misleading if we're treating consensuses?

Yes, but that commit is only fixing log messages. I'd be happy to rename that to networkstatus_parse_from_string, but that seemed like a separate issue to me.

I'm unsure this is sufficient. Can't we still construct a consensus from a situation where (consider attacking relays): [...]

Ah, you found a bug. Great!

The situation you describe is supposed to be handled by dircollator_collate_by_ed25519(). The change in 3941fa64ccf10dda5ac224bb1160564581c0e213 only makes sure that the individual votes are conformant (by not having duplicate ed keys).

I'll walk through what's supposed to happens. The voters say:

V1 says: <RSA_1, nil> <RSA_2, nil>
V2, V3, V4, V5 all say: <RSA_1, Ed_1>
V6, V7, V8, V9 all say: <RSA_2, Ed_2>

First, we have already walked through the list of votes and passed each one to dircollator_add_vote. Let's assume we added them in the order V1...V9. So the internal state of the dircollator is:

  by_rsa_sha1 =
      RSA_1: { V1:<RSA_1, nil>, V2....V5:<RSA_1, Ed_1>, V6...V9: NULL }
      RSA_2: { V1:<RSA_2, nil>, V2....V5:NULL, V6.....V9:<RSA_2, Ed_1l> }
  by_both_ids =
     <RSA_1, Ed_1>: { V1: NULL, V2...V4:<RSA_1, Ed_1> V6...V9:NULL }
     <RSA_2, Ed_1>: { V1...V5: NULL, V6...V9:<RSA_2, Ed_1 }

In the first loop in dircollator_collate_by_ed25519(), we will find that no entry in the by_both_ids map has >5 entries. So we will do nothing in that loop.

In the second loop, we will add all of the entries for RSA1, and all of the entries for RSA2. So it looks like we have added two separate routers with the same Ed key, right?

Well, note that the ed25519_reflects_consensus field wasn't set in these cases, so we're supposed to leave off the "ed25519 id" when we make the microdescriptors, and possibly take it into account in compute_routerstatus_consensus() .

comment:35 Changed 18 months ago by nickm

Hrm, this turned out to be just a little tricker than I thought.

You see, first I had thought I'd just omit the Ed key when it didn't reflect consensus. Unfortunately, that would involve stripping it from a microdescriptor, and producing a microdescriptor that no authority had actually voted for. Doing that isn't such a good idea.

Instead, I am adding a "NoEdConsensus" flag that the authorities should produce whenever the consensus for a node does not enable the authorities to make a statement about its Ed25519 key. See branch (and the corresponding torspec branch, 'ed25519_voting_again') for full details.

comment:36 Changed 18 months ago by Sebastian

typos:

+ /* If not enough authorties listed this exact <ed,rsa> pair,

+ * by_rsa_sha1 above. We an include <NULL,RSA-SHA1> entries for votes that

dir-spec: Looks fine, but I think we should include a note about needing to look at the consensus to verify that an ed key listed in a descriptor is actually valid, at least until the ed key becomes the canonical key.

Comment from above that didn't get a reply, but I think it should:

"Should we add asserts to the functions that say "this may only be called after dircollator_collate"? seems like we have the is_collated field for that purpose."

lgtm otherwise!

comment:37 Changed 18 months ago by nickm

Thanks; I'll fix those. Also , begging for even more reviews.

comment:38 Changed 18 months ago by teor

Spec / Design Review:

This addition to dir-spec.txt only makes sense if you've read the NoEdConsensus description, can we say "do not produce a majority consensus about a relay's Ed25519 key"?

+        * If consensus method 22 or later is used, and the votes do not
+          produce a majority consensus about Ed25519 key (see 3.8.0.1 below), the
+          consensus must include a NoEdConsensus flag on the "s" line.

I'm not seeing how the NoEdConsensus flag actually affects tor's behaviour. So I assume the client/relay/authority modifications for using the NoEdConsensus flag to avoid certain ed25519 keya will be handled in a separate ticket?

Code review:

Things I think are wrong or incomplete:

routers_make_ed_keys_unique claims to check IP:ORPort, but I don't see where we check it in that function.
That comment is ambiguous, we should be checking IPv4:ORPort, IPv4:DirPort, IPv6:ORPort, and IPv6:DirPort for uniqueness. But I think we only want to do that check among routerstatuses with the Running flag (those we;ve checked reachable). Which means IPv4 ORPorts only for the moment, as we don't check anything else.
(Otherwise, this opens up a denial of service opportunity where a relay claims to have a popular relay's IPv4:ORPort, IPv4:DirPort, IPv6:ORPort, or IPv6:DirPort.)
Maybe this is best tackled by a different function in a different ticket. So let's fix the comment in this one.

+/** If there are entries in <b>routers</b> with exactly the same IP:OrPort or
+ * exactly the same ed25519 keys, remove the older one.  May alter the order
+ * of the list. */

This check preserves the existing digestmap entry if the published times are equal. But different authorities could choose different descriptors-with-identical-published-times, depending on the order they arrived at the authority.
So let's find a consistent way of selecting a descriptor to omit if their published times are identical - how about comparing signed_descriptor_digest? (And if the digests are identical, it really doesn't matter which descriptor we omit, because they have the same content.)
We should do the same tie-breaking in the IP:Port checks too.

+    if ((ri2 = digest256map_get(by_ed_key, pk))) {
+      /* Duplicate; must omit one.  Omit the earlier. */
+      if (ri2->cache_info.published_on < ri->cache_info.published_on) {
+        digest256map_set(by_ed_key, pk, ri);
+        ri2->omit_from_vote = 1;
+      } else {
+        ri->omit_from_vote = 1;
+      }

Things I don't understand:

I have no idea what "treat all routers with that identity as the same" means here - does it mean: "a new descriptor with this id replaces any older descriptors with this id"?

+ * The rule is:
+ *    If an RSA identity key is listed by more than half of the authorities,
+ *    include it, and treat all routers with that identity as the same.
+ */

Should this read "no <ed,rsa> has been listed by more than half the authorities"?
Should it read "include <NULL, rsa>"?

+ *    Otherwise, if an <*,rsa> or <rsa> identity is listed by more than
+ *    half of the authorities, and no <ed,rsa> has been listed, include
+ *    it.

Typos:

I think you mean n'th and n'th here (to match dircollator_get_votes_for_router), not i'th and n'th:

+  /* The i'th member of this array corresponds to the vote_routerstatus_t (if
+   * any) received for this digest pair from the n'th voter. */

Can we decide whether to use i'th or n'th in all these added comments, or do they mean different things?

comment:39 Changed 18 months ago by ln5

Looking at 60efce4..03d484f.

NOTE: I'm not competent to verify the logic.

In routers_make_ed_keys_unique(), is by_ed_key ever freed?

That's all I've got.

comment:40 Changed 18 months ago by nickm

Status: acceptedneeds_review

comment:41 Changed 18 months ago by nickm

Keywords: must-fix-before-028-rc added
Milestone: Tor: 0.2.7.x-final

These are backportable, but they are tied to the 0.2.8 rc.

comment:42 Changed 17 months ago by Sebastian

Status: needs_reviewneeds_revision

comment:43 in reply to:  36 Changed 17 months ago by nickm

Replying to Sebastian:

typos:

+ /* If not enough authorties listed this exact <ed,rsa> pair,

+ * by_rsa_sha1 above. We an include <NULL,RSA-SHA1> entries for votes that

These two are fixed in fixup commits.

dir-spec: Looks fine, but I think we should include a note about needing to look at the consensus to verify that an ed key listed in a descriptor is actually valid, at least until the ed key becomes the canonical key.

Calling this S1.

Comment from above that didn't get a reply, but I think it should:

"Should we add asserts to the functions that say "this may only be called after dircollator_collate"? seems like we have the is_collated field for that purpose."

Calling this S2.

comment:44 in reply to:  38 Changed 17 months ago by nickm

Replying to teor:

Spec / Design Review:

This addition to dir-spec.txt only makes sense if you've read the NoEdConsensus description, can we say "do not produce a majority consensus about a relay's Ed25519 key"?

+        * If consensus method 22 or later is used, and the votes do not
+          produce a majority consensus about Ed25519 key (see 3.8.0.1 below), the
+          consensus must include a NoEdConsensus flag on the "s" line.

Calling this T1.

I'm not seeing how the NoEdConsensus flag actually affects tor's behaviour. So I assume the client/relay/authority modifications for using the NoEdConsensus flag to avoid certain ed25519 keya will be handled in a separate ticket?

Yes, that's right.

Code review:

Things I think are wrong or incomplete:

routers_make_ed_keys_unique claims to check IP:ORPort, but I don't see where we check it in that function.

Will fix documentation. Calling thar T2. The issue is that it's not actually a disaster for IP:Port to be duplicated; only key duplication does horrible things to the consensus process.

This check preserves the existing digestmap entry if the published times are equal. But different authorities could choose different descriptors-with-identical-published-times, depending on the order they arrived at the authority.
So let's find a consistent way of selecting a descriptor to omit if their published times are identical - how about comparing signed_descriptor_digest? (And if the digests are identical, it really doesn't matter which descriptor we omit, because they have the same content.)

Good idea; Calling it T3.

We should do the same tie-breaking in the IP:Port checks too.

+    if ((ri2 = digest256map_get(by_ed_key, pk))) {
+      /* Duplicate; must omit one.  Omit the earlier. */
+      if (ri2->cache_info.published_on < ri->cache_info.published_on) {
+        digest256map_set(by_ed_key, pk, ri);
+        ri2->omit_from_vote = 1;
+      } else {
+        ri->omit_from_vote = 1;
+      }

Things I don't understand:

I have no idea what "treat all routers with that identity as the same" means here - does it mean: "a new descriptor with this id replaces any older descriptors with this id"?

+ * The rule is:
+ *    If an RSA identity key is listed by more than half of the authorities,
+ *    include it, and treat all routers with that identity as the same.
+ */

Calling this T4.

Should this read "no <ed,rsa> has been listed by more than half the authorities"?
Should it read "include <NULL, rsa>"?

+ *    Otherwise, if an <*,rsa> or <rsa> identity is listed by more than
+ *    half of the authorities, and no <ed,rsa> has been listed, include
+ *    it.

Calling this T5.

Typos:

I think you mean n'th and n'th here (to match dircollator_get_votes_for_router), not i'th and n'th:

+  /* The i'th member of this array corresponds to the vote_routerstatus_t (if
+   * any) received for this digest pair from the n'th voter. */

Can we decide whether to use i'th or n'th in all these added comments, or do they mean different things?

Calling this T6.

Thanks!

comment:45 Changed 17 months ago by nickm

Tor changes:

S2: fixed in 1d503db20a981981fc8cc445f00d66cd696764e3
T2: Fixup commit as 29d162f4009f5a
T3: fixed in 33dc09146d3005
T4,T5: fixed with 65cfd96b217e7460b449b5ffb76a1e75204abb8b
T6: fixed with 831bbb1eb7fbc460992e5ae7d0e3a824158082a5

The remaining issues above are torspec-only; moving on to those.

comment:46 Changed 17 months ago by nickm

Keywords: 201512-deferred TorCoreTeam201602 removed
Milestone: Tor: 0.2.8.x-final
Status: needs_revisionneeds_review

S1,T1: Fixed in ed25519_voting_again with 313eff650d3aa7d4a221ea2f21074eb7cc7d5c52.

comment:47 Changed 17 months ago by nickm

Keywords: 027-backport added

comment:48 Changed 17 months ago by Sebastian

I'm happy with S1, S2.

comment:49 Changed 17 months ago by teor

Keywords: needs-merge added
Version: Tor: 0.2.7

I'm happy with T1/S1 for torspec, and S2, T2, T4/T5, and T6 for tor.

T3's code for tor is ok, but we've changed the behaviour, so let's update the function comment:

/** If there are entries in <b>routers</b> with exactly the same ed25519 keys,
 * remove the older one. If they are exactly the same age, remove the one with
 * the greater descriptor digest. May alter the order of the list. */

After that, let's get it merged. (No need to re-review comment-only changes.)

comment:50 Changed 17 months ago by nickm

Fixed in ed25519_voting_fixes. Made new branch ed25519_voting_fixes_squashed to squash all the above onto 0.2.7. Merged it into maint-0.2.7 and forward.

comment:51 Changed 17 months ago by nickm

Resolution: fixed
Status: needs_reviewclosed

These are now merged into maint-0.2.7 and forwards.

comment:52 Changed 16 months ago by nickm

Keywords: 2016-bug-retrospective added

Marking these tickets (based on severity and hand-review) for inclusion in 2016 bug retrospective

comment:53 Changed 16 months ago by nickm

Mark more tickets for bug retrospective based on hand-review of changelogs from 0.2.5 onwards.

Note: See TracTickets for help on using tickets.