Opened 5 months ago

Closed 12 days ago

#23170 closed task (wontfix)

Include ed25519 relay id keys in the consensus

Reported by: asn Owned by: nickm
Priority: Very High Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-spec prop224 tor-dirauth tor-hs ed25519 needs-proposal TorCoreTeam201711.1
Cc: Actual Points:
Parent ID: Points: 1
Reviewer: Sponsor: SponsorR-can

Description

Proposal 220 specifies how ed25519 id keys should be encoded in the consensus, but the change has not been implemented yet.

This is a problem in the next gen HS design, since prop224 specifies that the HSDir hash ring should be formed using the relay ed25519 keys, but these are not in the consensus which means that the service and the clients need to fetch all the microdescriptors before getting an accurate picture of the hash ring. Uploading or fetching HS descriptors without having all the relay microdescriptors can lead to hash ring desynch and reachability failures.

Child Tickets

Attachments (1)

xxx-move-ed25519-into-consensus.txt (2.3 KB) - added by nickm 2 months ago.

Download all attachments as: .zip

Change History (24)

comment:1 Changed 5 months ago by nickm

Owner: set to nickm
Sponsor: SponsorR-can
Status: newaccepted

comment:2 Changed 5 months ago by nickm

Keywords: needs-proposal added

Looking at prop220, I don't think it specifies that ed25519 keys should go in the consensus. It says they should go in the votes and the microdescriptors.

We can put them in the consensus as well, I guess, with a plan to eventually remove them from the microdescriptors. I'm concerned about the overhead, but as more people migrate to consensus diffs, the impact should become less horrible.

comment:3 Changed 5 months ago by dgoulet

Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final

No chance we can pull this off before the freeze.

comment:4 Changed 4 months ago by teor

If this is an issue on the live network, we could increase the spread to compensate.

comment:5 Changed 4 months ago by dgoulet

Priority: MediumVery High

This "missing descriptors, waiting to download" problem is starting to be much more of an issue in terms of user experience and reachability. 032 works fine but sometimes those delays are inevitable which we've seen gone up to 10 minutes.

This would simplify massively everything and could be deploy as soon as 0.3.3.

comment:6 Changed 4 months ago by arma

Hopefully useful thoughts:

A) Is the only reason we need ed keys in the consensus for this hashring thing? That seems like a sad reason to put in all the overhead for every client for every hour forever.

A2) And if that is the only use, we could put like 4 characters of the key in, as just a hashring hint?

B) Is there a longer term plan for phasing out rsa keys entirely, and replacing them with ed keys? If not, maybe prop224 hashring should use rsa keys.

C) Maybe the consensus overhead isn't so bad because of consensus diffs. Or maybe it is? Has anybody done any follow-up on the consensus diff deployment to see what effect it has had?

comment:7 in reply to:  6 ; Changed 4 months ago by teor

Replying to arma:

Hopefully useful thoughts:

A) Is the only reason we need ed keys in the consensus for this hashring thing? That seems like a sad reason to put in all the overhead for every client for every hour forever.

Consensus diffs will make this into a once-off addition for 0.3.1 and later clients.

Also, if we're having trouble getting some descriptors for our primary guards, does this fix that issue as well?

A2) And if that is the only use, we could put like 4 characters of the key in, as just a hashring hint?

No, because the key is hashed, so you need the entire key.

B) Is there a longer term plan for phasing out rsa keys entirely, and replacing them with ed keys?

I hope so.

If not, maybe prop224 hashring should use rsa keys.

No, it's too late to change this for 0.3.2.

C) Maybe the consensus overhead isn't so bad because of consensus diffs. Or maybe it is? Has anybody done any follow-up on the consensus diff deployment to see what effect it has had?

Yes, ahf wrote an entire technical report on it. Search tor-project@ for Isabela's recent report.

comment:8 in reply to:  7 Changed 3 months ago by teor

Replying to teor:

Replying to arma:

Hopefully useful thoughts:

A) Is the only reason we need ed keys in the consensus for this hashring thing? That seems like a sad reason to put in all the overhead for every client for every hour forever.

Consensus diffs will make this into a once-off addition for 0.3.1 and later clients.

Also, if we're having trouble getting some descriptors for our primary guards, does this fix that issue as well?

Moving ed ids to the consensus means we can eventually remove them from microdescs, stop voting on NoEdConsensus, and just drop them from the consensus when they're wrong.

comment:9 Changed 3 months ago by nickm

Keywords: TorCoreTeam201711.1 added

comment:10 Changed 3 months ago by karsten

Quick request for clarification: when you say that IDs will be added to the consensus, you mean the microdesc-flavored consensus, whereas the unflavored consensus will remain unchanged, right? Where can I read more about the planned changes to dir-spec?

comment:11 in reply to:  10 Changed 3 months ago by teor

Keywords: tor-spec added

Replying to karsten:

Quick request for clarification: when you say that IDs will be added to the consensus, you mean the microdesc-flavored consensus, whereas the unflavored consensus will remain unchanged, right?

No, they need to be added to both consensuses, otherwise clients that set UseMicrodescriptors 0 won't work (or they won't work as well).
Ed25519 ids are only in the votes (and descriptors and microdescriptiors) right now:

https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n2260

Where can I read more about the planned changes to dir-spec?

When nickm writes a proposal, it will hopefully come with a dir-spec patch and some code :-)

Is there something in particular you are concerned about?
(If parsers are written to ignore unknown lines in consensuses, they should be fine.)

comment:12 Changed 3 months ago by karsten

Makes sense. No concerns, mostly curiosity. And yes, metrics-lib should handle those new lines just fine. Thanks for the quick response!

Changed 2 months ago by nickm

comment:13 Changed 2 months ago by nickm

Status: acceptedneeds_revision

Added a draft proposal in an attachment, but I don't think this is such a great idea. It causes a 40% regression in the size of compressed consensus documents.

comment:14 in reply to:  13 Changed 2 months ago by asn

Replying to nickm:

Added a draft proposal in an attachment, but I don't think this is such a great idea. It causes a 40% regression in the size of compressed consensus documents.

Hm, that's not good indeed.

I wonder what else we could do, since we do need to eventually migrate away from RSA1024.

comment:15 Changed 2 months ago by Sebastian

How big is the compressed regression when we eventually drop RSA keys from the consensus? Perhaps we can better justify doing this if the cost is temporary until we move away from rsa?

comment:16 Changed 2 months ago by nickm

So, one possibility to consider is to introduce another consensus flavor, as we did before when we added microdescriptors in the first place. That would potentially let us re-engineer a lot of stuff. Though I'm not sure we can actually drop RSA identity keys from the consensus as long as the legacy HS system exists, if we believe that the hash ring needs to work like this.

For the purposes of this ticket, maybe it would make more sense to figure out what the probability is for onion service clients/servers to upload/download from the wrong place, and look into other ways to constrain that. For example, we could increase the spread (as teor recommends), or require that a certain fraction of the network be known, or something like that.

Who knows? It may be that the problem that led to this ticket will go away once we've solved the various pending microdescriptor download issues (like #23985 and #23817)

comment:17 Changed 8 weeks ago by nickm

I've run some numbers on the necessary changes in order to get better performance for hsdir fetches. I'm assuming that we hold fetch_spread constant at 3, and that we vary store_spread. I'm also assuming that the onion service itself is never missing ed25519 keys when it uploads.

With those assumptions, here is the probability that the client finds a hsdir holding the descriptor on its first try:

fraction mds missing store_spread=3 store_spread=4 store_spread=5 store_spread=6
0.15 0.8500 0.9594 0.9903 0.99793
0.10 0.9000 0.9813 0.9970 0.99956
0.05 0.9499 0.9951 0.9996 0.99997
0.01 0.9899 0.9998 1.0000 1.00000

With this in mind, I suggest that we close or defer this ticket for now; that we increase the store_spread value to 5 or 6, and that we make sure that onion services won't upload until they have a high fraction of the HSDirs' descriptors.

comment:18 in reply to:  17 Changed 8 weeks ago by nickm

Replying to nickm:

that we increase the store_spread value to 5 or 6,

Opened #24425 for this.

and that we make sure that onion services won't upload until they have a high fraction of the HSDirs' descriptors.

Opened #24426 for this.

comment:19 in reply to:  17 Changed 8 weeks ago by dgoulet

Replying to nickm:

fraction mds missing store_spread=3 store_spread=4 store_spread=5 store_spread=6
0.15 0.8500 0.9594 0.9903 0.99793
0.10 0.9000 0.9813 0.9970 0.99956
0.05 0.9499 0.9951 0.9996 0.99997
0.01 0.9899 0.9998 1.0000 1.00000

Hmm... not all relays are HSDirs and not all HSDirs support v3 right now. So, does that reflect this reality?

comment:20 Changed 8 weeks ago by nickm

Sorry -- that should be "fraction of v3 hsdir mds missing"

comment:21 Changed 7 weeks ago by nickm

Hmm... not all relays are HSDirs and not all HSDirs support v3 right now. So, does that reflect this reality?

I believe so -- just relabel the first column as "fraction of hsdirv3 mds missing". :)

comment:22 Changed 7 weeks ago by nickm

I'm currently inclined to call this ticket "wontfix" based on the performance regression.

comment:23 in reply to:  22 Changed 12 days ago by dgoulet

Resolution: wontfix
Status: needs_revisionclosed

Replying to nickm:

I'm currently inclined to call this ticket "wontfix" based on the performance regression.

ACK.

Note: See TracTickets for help on using tickets.