Armadev discovered that hsv3 revision counters are harmful to scalability: if an onion service is hosted by multiple servers (like the fb one), every server needs visibility of the current revision counter in order to publish a descriptor.
We should figure out whether there is an easy way around that, or whether this is actually a big problem for scalable v3s. We should also consider how this works out with onionbalance-based designs.
Rev counters are there so that HSDirs (and other actors) cannot replay old HS descriptors. However, they are not really needed since now HS descriptors are only replayable for a day (before the blinded key gets refreshed), and also HSDirs could keep a replay cache of the descriptor assigned to a blinded key.
If we decide to rip them out, the way to do it is in two painful steps:
a) Remove rev counter checking from HSDirs, and do a replay cache or something.
b) In the far future, when all HSDirs have upgraded to (a), rip out the rev counter code from onion services.
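To make the scalability problem concrete, here is a minimal Python sketch (not Tor's C code). It assumes a toy HSDir that applies the "reject lower or equal" rule, and two hypothetical backend instances that each keep their own unshared counter. The class names and the string used as a blinded key are made up for illustration.

```python
class HSDir:
    """Toy HSDir: remembers, per blinded key, the revision counter it stored."""

    def __init__(self):
        self.latest = {}  # blinded key -> revision counter of stored descriptor

    def upload(self, blinded_key, revision_counter):
        """Accept only descriptors with a strictly larger revision counter."""
        if revision_counter <= self.latest.get(blinded_key, -1):
            return False  # treated as a stale or replayed descriptor
        self.latest[blinded_key] = revision_counter
        return True


class Backend:
    """Hypothetical service instance with its own, unshared counter."""

    def __init__(self):
        self.counter = 0

    def publish(self, hsdir, blinded_key):
        self.counter += 1
        return hsdir.upload(blinded_key, self.counter)


hsdir = HSDir()
a, b = Backend(), Backend()
results = [
    a.publish(hsdir, "blinded-key"),  # counter 1: accepted
    a.publish(hsdir, "blinded-key"),  # counter 2: accepted
    b.publish(hsdir, "blinded-key"),  # counter 1: rejected, 1 <= 2
    b.publish(hsdir, "blinded-key"),  # counter 2: rejected, 2 <= 2
    b.publish(hsdir, "blinded-key"),  # counter 3: accepted
]
```

Backend `b` cannot publish until its private counter happens to pass the one `a` already used, which is why every server would need to see the shared counter.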
WRT the replay cache idea of step (a) above, we probably do need a replay cache on the HSDirs, because there is a 24 hour window (before we change blinded pubkey) where HSDirs can replace the descriptors on other HSDirs with older versions of the descriptor. We probably want to avoid this and we should use a replay cache for this.
The right way to use a replay cache here is to store the hash of the HSV3 descriptor in the replay cache. We should investigate whether we need to hash the whole descriptor, or the whole descriptor minus the signature (in case the signature is malleable and an attacker can tweak it to bypass the replay cache). If we need the latter approach, we should add the right data to hs_cache_dir_descriptor_t as part of cache_dir_desc_new().
Implementation plan for step (a) above:
1. Introduce a global replay_cache_t *hs_cache_replay_cache in hs_cache.c next to hs_cache_v3_dir. We should index entries to this replay cache by blinded key, or maybe add an insertion timestamp so that we know when to clean it up.
2. In cache_store_v3_as_dir, remove the revision counter check, and instead query the replay cache for whether we already have seen this descriptor before. If we have seen this descriptor before we should treat it the same way we treat descriptors with a smaller or equal revision counter right now, that is, reject them and log_info.
3. We should clean up the replay cache when we are sure that the blinded key for a descriptor is now useless and will never be used by clients again. We should look in rend-spec-v3.txt to make sure when that is; probably at 24 or 48 hours or so.
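The plan above can be sketched in Python as follows. This is an illustration of the idea only, not the proposed C code: the class and helper names are hypothetical, SHA3-256 is an assumed choice of digest, the 48 hour lifetime is a placeholder until the spec is checked, and `strip_signature` assumes the signature is the final "signature ..." line of the descriptor. The lookup is meant to run after signature validation.

```python
import hashlib


def strip_signature(descriptor: bytes) -> bytes:
    """Return the descriptor without its trailing signature line (assumed layout)."""
    body, sep, _sig = descriptor.rpartition(b"\nsignature ")
    return body if sep else descriptor


class ReplayCache:
    """Remembers digests of descriptors already seen, per blinded key."""

    def __init__(self, max_age=48 * 3600):  # placeholder lifetime in seconds
        self.max_age = max_age
        self.seen = {}  # (blinded key, digest) -> insertion time

    def check_and_add(self, blinded_key: bytes, descriptor: bytes, now: float) -> bool:
        """Return True for a new descriptor, False if it was seen before."""
        digest = hashlib.sha3_256(strip_signature(descriptor)).digest()
        key = (blinded_key, digest)
        if key in self.seen:
            return False  # replay: reject it, as we do for a stale counter today
        self.seen[key] = now
        return True

    def clean(self, now: float) -> None:
        """Drop entries whose blinded key can no longer be in use."""
        cutoff = now - self.max_age
        self.seen = {k: t for k, t in self.seen.items() if t > cutoff}


cache = ReplayCache(max_age=100)
desc = b"hs-descriptor 3\nsuperencrypted ...\nsignature AAAA\n"
same_body_other_sig = b"hs-descriptor 3\nsuperencrypted ...\nsignature BBBB\n"
first = cache.check_and_add(b"blinded", desc, now=0)
replayed = cache.check_and_add(b"blinded", desc, now=10)
tweaked = cache.check_and_add(b"blinded", same_body_other_sig, now=20)
cache.clean(now=200)
after_expiry = cache.check_and_add(b"blinded", desc, now=200)
```

Hashing the body without the signature means a descriptor whose signature bytes were tweaked still hits the cache, which is the property discussed above.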
So as long as we get the new functionality into HSDirs before the next long-term-stable, the "far future" will just be a matter of waiting some months for intermediate stable versions to die out.
But hang on, do clients require descriptors to have revision counters?
If so, we can't rip out revision counters on services for a long time.
Latest plan: According to RFC 8032:
{{{
Some systems assume signatures are not malleable: that is, given a
valid signature for some message under some key, the attacker can't
produce another valid signature for the same message and key.
Ed25519 and Ed448 signatures are not malleable due to the
verification check that decoded S is smaller than l. Without this
check, one can add a multiple of l into a scalar part and still pass
signature verification, resulting in malleable signatures.
}}}
We should check if our ed25519 implementations do the above check. If they do, it should be possible to just replay cache the ED25519_SIG_LEN bytes of our ed25519_signature_t. I plan to look at our implementation this week to see if the above check is done, then send an email to Ian Goldberg, and if he agrees that it's legit, proceed with this plan.
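For illustration, the range check that RFC 8032 describes can be sketched in Python like this. It only performs the "decoded S is smaller than l" test on the second half of a 64-byte signature; it does not verify anything, and the all-zero R value is a placeholder rather than a real signature. The function names are hypothetical.

```python
# Order of the Ed25519 base point group, as given in RFC 8032.
L = 2**252 + 27742317777372353535851937790883648493

SIG_LEN = 64  # an Ed25519 signature is R (32 bytes) followed by S (32 bytes)


def s_is_canonical(signature: bytes) -> bool:
    """Return True if the scalar S in the signature is smaller than l."""
    if len(signature) != SIG_LEN:
        return False
    s = int.from_bytes(signature[32:], "little")
    return s < L


def with_scalar(r: bytes, s: int) -> bytes:
    """Build a 64-byte R || S encoding from a 32-byte R and an integer S."""
    return r + s.to_bytes(32, "little")


r = bytes(32)                          # placeholder R, not a valid point choice
original = with_scalar(r, 12345)
mauled = with_scalar(r, 12345 + L)     # same S mod l, different bytes

original_ok = s_is_canonical(original)
mauled_ok = s_is_canonical(mauled)
```

An implementation that skips this check would accept both encodings for the same message, so a replay cache keyed on the raw signature bytes could be bypassed by adding l to S.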
Hey @cypherpunks, we really appreciate you trying to help organise and triage our tickets, but in this case the HS team has a flow that works for them, so please don't set parent tickets on their stuff. Thanks!
To summarize, after a discussion with teor and asn, we'll go with a replaycache that stores a hash of the descriptor without the signature. And this would be done after desc signature validation.
Trac: Owner: asn to dgoulet Cc: dgoulet, franklin, isis to franklin, isis
Here is a fun fact. We use the revision counter in the computation of the descriptor encryption keys. See spec section HS-DESC-ENCRYPTION-KEYS.
So bottom line, this means that we have to remove it from secret_input computation but only if we can't find the counter in the plaintext data of the descriptor ("revision-counter" SP Integer NL).
Code wise, this isn't very complicated but I thought it would be wise to just throw it out there since it affects our crypto construction.
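A rough Python sketch of why this matters, written from memory of the HS-DESC-ENCRYPTION-KEYS section and worth checking against rend-spec-v3.txt before relying on it. The assumptions are: secret_input is SECRET_DATA | subcredential | INT_8(revision_counter) with an 8-byte big-endian integer, the KDF is SHAKE-256, and the key lengths and string constant are as shown. Falling back to 0 when the counter is missing is one possible handling, not a settled design.

```python
import hashlib

# Assumed lengths in bytes; confirm against the spec.
S_KEY_LEN, S_IV_LEN, MAC_KEY_LEN = 32, 16, 32


def derive_keys(secret_data, subcredential, revision_counter, salt,
                string_constant=b"hsdir-superencrypted-data"):
    """Derive (SECRET_KEY, SECRET_IV, MAC_KEY) for one descriptor layer."""
    if revision_counter is None:
        revision_counter = 0  # hypothetical fallback when the field is absent
    secret_input = (secret_data + subcredential
                    + revision_counter.to_bytes(8, "big"))
    total = S_KEY_LEN + S_IV_LEN + MAC_KEY_LEN
    keys = hashlib.shake_256(secret_input + salt + string_constant).digest(total)
    return (keys[:S_KEY_LEN],
            keys[S_KEY_LEN:S_KEY_LEN + S_IV_LEN],
            keys[S_KEY_LEN + S_IV_LEN:])


blinded_key = bytes(32)      # placeholder inputs for the example
subcredential = bytes(32)
salt = bytes(16)

keys_rev1 = derive_keys(blinded_key, subcredential, 1, salt)
keys_rev2 = derive_keys(blinded_key, subcredential, 2, salt)
keys_missing = derive_keys(blinded_key, subcredential, None, salt)
keys_zero = derive_keys(blinded_key, subcredential, 0, salt)
```

Because the counter feeds the KDF, a service and a client that disagree about its value (or about what to use when it is absent) derive different keys and the descriptor fails to decrypt.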
> So as long as we get the new functionality into HSDirs before the next long-term-stable, the "far future" will just be a matter of waiting some months for intermediate stable versions to die out.
> But hang on, do clients require descriptors to have revision counters?
So here is the fun part of this. Clients do look at the revision counter when caching, but only to decide whether they already have a newer descriptor in their cache. Thus, a revision counter that is always set to 0, for instance, wouldn't affect the client cache.
As for HSDirs, they won't accept a descriptor with a revision counter that is lower than or equal to the one they have in their cache. This means that 034+ services will still need to put the revision counter in their descriptor for a while, until <= 032 is phased out. Omitting the revision counter only for specific HSDirs would take a non-trivial amount of engineering work.
Now, this is where it gets messy. When decoding the plaintext part of a descriptor (where the rev. counter is), we hard require the counter to be in it (notice the T1()):
Thus, as long as we have 032 and 033 HSDirs and clients on the network, we can't remove the counter from the descriptor, or else they won't be able to store or access 034+ services, because every descriptor will fail to decode.
Thus to summarize, the only thing we can do for now is make HSDirs use a replay cache instead of the revision counter, make clients ignore the revision counter, and make the revision counter optional when decoding the descriptor.
But we MUST make the service keep putting the rev counter in the descriptor, with the current mechanism, for a while, otherwise it will break clients and HSDirs <= 033.
Or a more nuclear option, we bump the descriptor version to 4 which won't have the revision counter which will effectively make 034+ the birth of the onion service v4 :P...... not ideal ;).
To be honest, I currently don't see a way for services to stop putting the revision counter in the descriptor without breaking many clients, because of comment:15.
Seems like once all HSDirs <= 032 are phased out, we can then make the service stop putting it in the descriptor (which will break <= 032 clients...). This means that right now (it is in the branch) we have to either make the client ignore the revision counter in the secret_input construction, or always use 0 if there is no rev counter in the descriptor (which I did for simplicity).
Anyway, see the attempt above.
Trac: Status: assigned to needs_review Reviewer: N/A to asn
Make v3 onion services use the descriptor generation timestamp for the revision counter
Backport this change to all tor versions with v3 onion services (0.3.2 and later)
This fix will make v3 onion services scalable, by allowing multiple services to submit descriptors with a very small probability of revision number collisions. It also retains the property that newer descriptors replace older ones.
We can make a separate decision about replay caches on HSDirs.
We can make a separate decision about removing the revision counter entirely.
If we decide to keep it, we should check that it's a 64-bit field, so it lasts past 2038.
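A small Python sketch of the timestamp idea, under stated assumptions: the counter is the generation time in whole seconds (the granularity is an assumption), the field is an unsigned 64-bit integer, and the HSDir keeps its current "strictly greater" rule. The function names are hypothetical.

```python
import time

UINT64_MAX = 2**64 - 1
INT32_MAX = 2**31 - 1  # a signed 32-bit seconds counter overflows in 2038


def timestamp_revision_counter(now=None):
    """Use the descriptor generation time (whole seconds) as the counter."""
    if now is None:
        now = time.time()
    counter = int(now)
    if not 0 <= counter <= UINT64_MAX:
        raise ValueError("revision counter does not fit in 64 bits")
    return counter


def hsdir_accepts(stored_counter, new_counter):
    """Current HSDir rule: the new counter must be strictly greater."""
    return new_counter > stored_counter


# Two instances that share no state but have roughly synchronized clocks.
first = timestamp_revision_counter(now=1_000.7)
later = timestamp_revision_counter(now=1_060.2)
same_second = timestamp_revision_counter(now=1_000.1)

later_accepted = hsdir_accepts(first, later)           # newer replaces older
collision_accepted = hsdir_accepts(first, same_second)  # same second: rejected
post_2038 = timestamp_revision_counter(now=2**31)       # needs more than 32 bits
```

The only remaining collisions are uploads generated within the same second, and a rejected instance can simply retry later with a larger timestamp.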