Currently, an HSDir can relax on the hash ring, receive descriptor lookups, and harvest .onions for days. Furthermore, it doesn't even have to change its identity key, since its position on the hash ring changes every day (because of the timestamp), so it gets new .onions all the time.
This ticket is for research on how we can make .onion addresses harder to harvest.
Proposal 143 has some ideas that will reduce the exposure of .onions, but it doesn't solve the problem itself.
On actual solutions, Robert posted:
https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html
some months ago. I don't have the cryptographic skills to robustly analyze his idea, but if this is the only thing we have, we should point some cryptographers at it so that it gets some comments.
I also seem to recall that there was a paper suggesting that hidden services create ephemeral .onion addresses or something, but after asking Karsten and crawling anonbib I'm not sure that such a paper exists.
Are there any other proposed solutions out there? If not, this might be a fun research problem.
On actual solutions, Robert posted:
https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html
some months ago. I don't have the cryptographic skills to robustly analyze his idea, but if this is the only thing we have, we should point some cryptographers at it so that it gets some comments.
For an Ed25519-based signature scheme with both the public-key group element and the base point blinded, the verification equation is equivalent to S*B = (HB(nonce, B, A)^(-1))*R + H(R, HB(nonce, B, A)*B, HB(nonce, B, A)*A, M)*A, where R is carefully chosen to be a uniform random group element and HB(nonce, B, A) is (computationally) independent of R. This equation does not leak any more information about the log of A than the verification equation for unmodified Ed25519 does, so this cryptosystem is obviously as safe as Ed25519.
(My original proposal (only blind the public-key group element (A in Ed25519)) would have had a different, potentially leakier, verification equation.)
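The double-blinded verification equation above can be checked numerically. The sketch below uses a tiny multiplicative group (an order-11 subgroup of Z_23^*, NOT secure parameters) standing in for the Ed25519 group, and a SHA-256-based hash standing in for H and HB; it is an illustration of the algebra, not the actual proposed construction.

```python
# Toy check of the blinded verification equation: both the base point B and
# the public key A are raised to the same blinding exponent h = HB(nonce, B, A).
# Multiplicative notation: scalar multiplication becomes exponentiation.
import hashlib, secrets

p, q, B = 23, 11, 2          # order-11 subgroup of Z_23^*; toy parameters only

def H(*vals):                 # stand-in hash to a scalar in [0, q)
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

a = secrets.randbelow(q - 1) + 1      # long-term secret key
A = pow(B, a, p)                      # long-term public key
h = H("nonce", B, A) or 1             # blinding factor HB(nonce, B, A)
Bp, Ap = pow(B, h, p), pow(A, h, p)   # blinded base point and public key

M = "message"
r = secrets.randbelow(q - 1) + 1
R = pow(Bp, r, p)                     # commitment, uniform group element
c = H(R, Bp, Ap, M)                   # hash covers the whole blinded key
s = (r + c * a) % q                   # same secret a works for the blinded base

# Blinded form: checkable by a verifier who knows only (Bp, Ap).
assert pow(Bp, s, p) == (R * pow(Ap, c, p)) % p
# Equivalent unblinded form B^s = R^(1/h) * A^c, matching the equation above.
h_inv = pow(h, -1, q)
assert pow(B, s, p) == (pow(R, h_inv, p) * pow(A, c, p)) % p
```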
Are there any other proposed solutions out there? If not, this might be a fun research problem.
Tor's hidden service protocol requires a cryptosystem with the following properties:
a keypair generation function keygen: RNG -> (PubKey, PrivKey);
a blinding function blindPub: (PubKey, BlindingNonce) -> BlindedPubKey;
a signing function sign: (RNG, PrivKey, PubKey, BlindingNonce, Message) -> Signature;
a signature-verification function verify: (RNG, BlindedPubKey, Message, Signature) -> Bool;
blindPub cannot be inverted, even by an attacker who knows the BlindingNonce value passed to it;
the output distributions of blindPub for different public keys are indistinguishable, even when BlindingNonce is known; and
signature forgery is infeasible (under the usual definition of security for a signature system, where (PubKey, BlindingNonce) is the type of public keys and (PrivKey, BlindingNonce) is the type of private keys (this implies security against forgery by someone who only knows BlindedPubKey)).
(This is equivalent to a pair of signature schemes A and B, with a pair of blinding functions (A.PubKey, BlindingNonce) -> B.PubKey and (RNG, A.PrivKey, BlindingNonce) -> B.PrivKey, but specifying the properties that this pair of blinding functions needs would be more annoying.)
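To make the required function shapes concrete, here is a toy instantiation of the four functions listed above, again over a small multiplicative group (order-11 subgroup of Z_23^*, NOT secure). The group parameters, hash, and nonce strings are illustrative placeholders, not the proposed design.

```python
# Toy instantiation of keygen / blindPub / sign / verify from the list above.
import hashlib, secrets

p, q, B = 23, 11, 2   # field modulus, subgroup order, base element; toy only

def _H(*vals):         # stand-in hash to a scalar in [0, q)
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def keygen():
    priv = secrets.randbelow(q - 1) + 1
    return pow(B, priv, p), priv                    # (PubKey, PrivKey)

def blindPub(pub, nonce):
    e = _H(nonce, B, pub) or 1                      # blinding exponent
    return pow(B, e, p), pow(pub, e, p)             # BlindedPubKey = (B', A')

def sign(priv, pub, nonce, msg):
    Bp, Ap = blindPub(pub, nonce)
    r = secrets.randbelow(q - 1) + 1
    R = pow(Bp, r, p)
    c = _H(R, Bp, Ap, msg)
    return R, (r + c * priv) % q                    # Signature = (R, s)

def verify(blinded_pub, msg, sig):                  # needs only the blinded key
    Bp, Ap = blinded_pub
    R, s = sig
    return pow(Bp, s, p) == (R * pow(Ap, _H(R, Bp, Ap, msg), p)) % p

pub, priv = keygen()
sig = sign(priv, pub, "day-17012", "descriptor")
assert verify(blindPub(pub, "day-17012"), "descriptor", sig)
```

Note that `verify` never sees the unblinded `pub`, which is the unlinkability property the list above asks for.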
The obvious ways to blind RSA-like and McEliece-like public keys allow an attacker to link different blinded versions of the same original public key. It might be possible to blind public keys for multivariate-quadratic or Alekhnovich-type cryptosystems, but there's enough structure in MQ keys that I would expect to see linkability issues, and if there are suitable signature schemes based on the LPN-like problems, they won't be as well-understood as discrete-log-based signatures. So, for now, any cryptosystem that solves this problem will look a lot like my (revised) proposal (unless someone comes up with a screwy pairing-based version that provides lower security and lower performance).
(Finally, found the paper I mentioned in my first post. It's Lasse and Paul's _Valet Services: Improving Hidden Servers with a Personal Touch_, and the idea is detailed in the _Distributing Tickets_ subsection of section 3.2. Unfortunately, IIRC their idea depends on the Valet services design, which is unimplemented and controversial (proposal 142 is related). When I get some time I should reread the Valet services paper.)
Here is the Valet Services scheme (as presented in the _Distributing Tickets_ subsection of section 3.2) adapted to the current HS protocol:
a) HS finds the HSDirs responsible for it exactly like in the current HS protocol.
b) HS uploads its descriptor symmetrically encrypted with key = H(onion+'1'). HS also uploads an index for the descriptor, where index = H(onion+'2').
When a client wants to visit the HS:
a) The client derives the key and the index from the .onion.
b) The client fetches the descriptor from the HSDir and decrypts it with the key.
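The derivation steps above can be sketched in a few lines, with SHA-256 standing in for H and a toy XOR stream standing in for the symmetric cipher (a real implementation would use an authenticated cipher; the example .onion string is made up):

```python
import hashlib

def derive(onion):
    # key = H(onion+'1'), index = H(onion+'2'), as in the scheme above
    key = hashlib.sha256(onion.encode() + b"1").digest()
    index = hashlib.sha256(onion.encode() + b"2").digest()
    return key, index

def xor_stream(key, data):      # toy cipher; applying it twice decrypts
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out += bytes(a ^ b for a, b in zip(data[i:i + 32], block))
    return bytes(out)

# Service side: upload (index, encrypted descriptor) to the HSDir.
key, index = derive("duskgytldkxiuqc6")
blob = xor_stream(key, b"introduction points, keys, ...")

# Client side: re-derive key and index from the .onion, fetch, decrypt.
k2, i2 = derive("duskgytldkxiuqc6")
assert i2 == index
assert xor_stream(k2, blob) == b"introduction points, keys, ..."
```

The point of the construction is that the HSDir sees only `index` and an opaque blob; without the .onion it can derive neither the key nor the address.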
A problem pointed out by Nick is that this turns HSDir into cloud storage servers, since anyone can upload anything to it.
I guess we can make this a bit less painful, by partially encrypting the descriptor. Basically, by encrypting all the fields of the descriptor, but leaving the headers intact. And of course by enforcing a size limit.
The current v2 descriptor has encrypted fields that can be used to store arbitrary information anyway (like the introduction-points encrypted-string field).
Reposting with permission an old message of Robert Ransom's about Valet Services, their drawbacks, and better living through cryptography:
The system described in section 3.2 of that paper does not allow a directory server to verify that a hidden service descriptor is being published to a descriptor index by a party which "owns" the index. I see no way that the reverse-hash-chain technique described there could prevent an attacker from publishing a bogus descriptor with a targeted hidserv's descriptor index. At best, if the hidserv "wins the race", the hash-chain would allow it to hold on to its descriptor index; at worst, the attacker could win the race and lock out the real hidserv.

The only system I have come up with to hide a hidserv's identity key from the directory servers requires that the hidserv use a discrete-log cryptosystem for its identity key, and that the HS address contain enough of the identity key that a client can compute a scalar multiple of the identity key. (For example, the identity key can be a point on an elliptic curve in Weierstrass form, and the HS address can be the point's x coordinate.)

The system designer chooses a group, an element P of prime order p in that group, and a publicly computable one-way function h mapping bitstrings to integers in the interval [1, p-1].

The HS chooses a secret key n and computes its public key nP; its descriptor index in time period t can be computed by any party which knows its public key as h(t | nP)*nP, but only the HS will know the discrete logarithm h(t | nP)*n of the descriptor index with base P. The HS can therefore compute an ECDSA signature, for example, which the directory server can verify using the descriptor index as a public key.

> Also "Privacy-enhancing Technologies for Private Services", Karsten's
> PhD dissertation, also available from the anonbib.

> Also see Tor proposal
> 121-hidden-service-authentication.txt

I am aware of the current authentication mechanism for hidden services (which those documents describe), but it cannot hide the hidden service's identity key or name from the directory servers.
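The descriptor-index idea in the quoted message can be sketched numerically. The block below uses a small multiplicative group (NOT a secure curve) in place of the elliptic-curve group: anyone who knows the public key nP can compute the period-t index h(t | nP)*nP, but only the service knows its discrete log h(t | nP)*n, so only it can sign under the index. (Note: the code's `q` plays the role of the prime order called p in the quoted text; `p` below is the toy field modulus.)

```python
import hashlib, secrets

p, q, P = 23, 11, 2   # toy field modulus, prime subgroup order, element P

def h(*vals):          # public one-way map into [1, q-1], as in the text
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (q - 1) + 1

n = secrets.randbelow(q - 1) + 1   # hidserv secret key
nP = pow(P, n, p)                  # hidserv public key (recoverable from the HS address)

t = "period-42"
index = pow(nP, h(t, nP), p)       # anyone who knows nP can compute this...
index_log = (h(t, nP) * n) % q     # ...but only the HS knows its log base P

# So the HS, and only the HS, can sign with `index` as the public key,
# and the directory server can verify without ever learning nP.
assert index == pow(P, index_log, p)
```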
Robert, in the scheme of comment:1, where you use Ed25519 for the discrete-log-based long-term keypair, the HS address (".onion") is supposed to be the Ed25519 public key base32'ed, right?
If yes, Ed25519 public keys are 256 bits long, whereas old HS addresses were only 80 base32'ed bits. This means that new HS addresses will be much longer than the old ones, right?
If the difference is so big, why not just append a symmetric key to the end of the 80 bits that currently get base32'ed?
Robert, in the scheme of comment:1, where you use Ed25519 for the discrete-log-based long-term keypair, the HS address (".onion") is supposed to be the Ed25519 public key base32'ed, right?
I would use the Montgomery-form x coordinate of the public-key group element, not the Ed25519 public key format (which consists of the Edwards-form y coordinate plus an additional bit specifying the other Edwards coordinate's LSB). (Point-reduced Montgomery form is more convenient for single-scalar variable-base multiplication, which is what a client would need to do with that group element in order to blind the public key.)
For an elliptic curve without a secure ‘twist’, the Edwards-form y coordinate is probably better. (The extra bit identifying the other coordinate is still unnecessary.)
If yes, Ed25519 public keys are 256 bits long, whereas old HS addresses were only 80 base32'ed bits. This means that new HS addresses will be much longer than the old ones, right?
HS addresses for my protocol over the Curve25519/Ed25519 elliptic curve would be 255 bits long (so 51 base32 characters, not 52). They would be longer than the current protocol's addresses, but 80-bit addresses are too vulnerable to brute-force attacks to be used in any new protocol (about 2^(80-n) ‘operations’ to break the first of 2^n targets).

Note that if you want to continue to sacrifice security for usability with the new HS protocol, you can use an elliptic curve over a smaller coordinate field. For example, page 16 of http://cr.yp.to/newelliptic/twisted-20080313.pdf specifies a suitable curve over 2^160 - 47 (32 base32 characters); impersonating any HS over that curve would require a discrete-logarithm computation (around 2^80 ‘operations’ to impersonate the first HS of any number of targets; significantly harder than brute-forcing an 80-bit space). (There should be a good elliptic curve over a convenient 200-bit coordinate field too, if you want something in between that and Curve25519.)
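The address lengths quoted above follow from base32 encoding 5 bits per character, so a k-bit address needs ceil(k/5) characters:

```python
from math import ceil

# base32 packs 5 bits per character
for bits in (80, 160, 255):
    print(bits, "bits ->", ceil(bits / 5), "base32 chars")

assert ceil(80 / 5) == 16     # current onion addresses
assert ceil(160 / 5) == 32    # curve over 2^160 - 47
assert ceil(255 / 5) == 51    # Curve25519 x coordinate
```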
If the difference is so big, why not just append a symmetric key to the end of the 80 bits that currently get base32'ed?
What am I missing here?
80-bit addresses are no longer long enough to prevent impersonation by brute force.
My protocol is the only one that can be used to prevent HSDir servers from linking an HS's descriptors across days (or whatever time periods you choose for the new protocol). (If you want this feature, you'll also have to either encrypt HS descriptors or abandon introduction points at ‘day’ boundaries. Encrypting the bodies of HS descriptors (everything but the expiration time, day number, and serial number) is a good idea anyway.)
Apart from those reasons, I see several possible protocols that fit your description (or are plausible extensions of it) except for the length of the public-key hash. All of them are trickier to implement securely than my protocol, and require similar HS address lengths to my protocol for similar security levels.
On actual solutions, Robert posted:
https://lists.torproject.org/pipermail/tor-dev/2012-September/004026.html
some months ago. I don't have the cryptographic skills to robustly analyze his idea, but if this is the only thing we have, we should point some cryptographers at it so that it gets some comments.
For an Ed25519-based signature scheme with both the public-key group element and the base point blinded, the verification equation is equivalent to S*B = (HB(nonce, B, A)^(-1))*R + H(R, HB(nonce, B, A)*B, HB(nonce, B, A)*A, M)*A, where R is carefully chosen to be a uniform random group element and HB(nonce, B, A) is (computationally) independent of R. This equation does not leak any more information about the log of A than the verification equation for unmodified Ed25519 does, so this cryptosystem is obviously as safe as Ed25519.
Thanks for your last message (wrt onion address sizes using your scheme).
Another question: how did you end up with S*B = (HB(nonce, B, A)^(-1))*R + H(R, HB(nonce, B, A)*B, HB(nonce, B, A)*A, M)*A as the verification equation of Ed25519? How did the extra HB(nonce, B, A)*B get into H()?
I called it extra because the hash in the vanilla verification equation is H(R,A,M) while yours has 4 parameters: H(R, HB(nonce, B, A)*B, HB(nonce, B, A)*A, M)
I called it extra because the hash in the vanilla verification equation is H(R,A,M) while yours has 4 parameters: H(R, HB(nonce, B, A)*B, HB(nonce, B, A)*A, M)
In Ed25519, the public key is A. In my blinded-public-key variant of Ed25519, the blinded public key is (HB(nonce, B, A)*B, HB(nonce, B, A)*A). Since Ed25519 uses H(R, (public key), M) as its message hash, the obvious message hash to use for a blinded-public-key version was H(R, (blinded public key), M).
I called it extra because the hash in the vanilla verification equation is H(R,A,M) while yours has 4 parameters: H(R, HB(nonce, B, A)*B, HB(nonce, B, A)*A, M)
In Ed25519, the public key is A. In my blinded-public-key variant of Ed25519, the blinded public key is (HB(nonce, B, A)*B, HB(nonce, B, A)*A). Since Ed25519 uses H(R, (public key), M) as its message hash, the obvious message hash to use for a blinded-public-key version was H(R, (blinded public key), M).
I'm a bit puzzled by the basepoint updating etc. Could you state what your definition of S is (other than: it's the S that makes this work)? In general you want to avoid inversions modulo the group order.
I also doubt that basepoint blinding is a good idea: I understood the idea of this scheme to be that the server can "verify" signatures under the updated public keys and ensure that no party can override another party's data (unless by chance they hit the same daily public key, or one party breaks the DLP). However, if I let an attacker choose the basepoint, then he can sabotage the system by choosing the public key to be the same as mine (i.e. overriding my data) and computing a basepoint and matching secret key that make the signature verify on the server side. Of course this would be caught by the clients, because the signature is not valid under what should be today's basepoint, so it "only" affects availability, but that's not good. You cannot give the server the functionality to compute today's basepoint (if you want to hide A), so I don't know how to bootstrap from this. I don't have a similar concern with the original scheme posted in the email.
If you want proofs then of course it's necessary to come up with a simple cryptographic assumption that this can boil down to -- but it makes sense to define the scheme first (and rule out simple attacks).
Some comments on the earlier post:
I don't see what the trouble with the factor 8 on all sides would be - other than 3 extra doublings.
Here is ‘pseudocode’ (actually, ready-to-compile code written in a not-yet-implemented programming language) specifying exactly how to implement my signature scheme.
sig.s is computed in almost exactly the same way as Ed25519's S; the only change is in what is hashed along with the message. (I've used a lowercase s here instead of Ed25519's uppercase S so that it won't be mistaken for a group element.)
There is no need to skip Ed25519's multiplication of both sides of the verification equation by 8; I omitted it here (and in my post above) for the sake of clarity.
To prove that the verification equation for this signature scheme leaks no more information about the secret key than Ed25519's verification equation, divide both sides of it by blindingExponent and divide the definition of sig.R by blindingExponent. (The verification equation is equivalent to the equation I posted above.)
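Spelled out in additive notation (writing b for blindingExponent, c for the message hash, and following the definitions above), the division argument is:

```latex
% Blinded verification equation, with sig.R = r*(bB) and
% c = H(sig.R, bB, bA, M):
s\,(bB) = \mathrm{sig.R} + c\,(bA)
% Divide both sides, and the definition of sig.R, by b:
s\,B = b^{-1}\,\mathrm{sig.R} + c\,A, \qquad b^{-1}\,\mathrm{sig.R} = r\,B
% i.e. Ed25519's verification equation, with hash input
% (sig.R, bB, bA, M) in place of (R, A, M).
```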
Also note that the blinded public key contains the blinded base point; the attacker does not get to choose the base point separately from the blinded public-key group element.
Also, the blinded base point is a function of the non-blinded public key.
I understand, that's why I said that the clients will notice the attacker, but the server cannot check this (without knowing A -- and the whole point of this scheme is that the server should not know A). There is also no way that the server can verify that B and the other point have been multiplied by the same scalar -- again, that's a feature, otherwise you'd be back to enumeration attacks, so don't try using pairing-friendly curves on this.
My understanding is that the nonce will be time-dependent, so clients know the nonce and the public key of the hidden service, and everybody knows B. So they can compute HB(nonce, B, pub.A) and the updated key parameters. I assume that they then ask the server for the information on Aprime = scalarmult(blindingExponent, pub.A), right? If the server stores the encrypted routing information under Aprime, then I can override this -- with a different Bprime, but there is no way to say which one is right; you cannot take the first one either, otherwise at least insiders knowing A can compute the next Aprime just a tad faster and submit that along with a wrong Bprime.
Note that you need 2 different verification equations: one for the server (which does not know A and the nonce) and one for clients who first recompute Aprime and then check that the world matches, i.e. the verification needs 2 steps.
I realize that you can bootstrap from this by including Bprime in the storage location so that the real data and the attack data get written to different places, but then you suddenly have twice the length.
Could you describe what worried you about the version in which the basepoint does not vary?
I talked with hyperelliptic today and she explained her concerns from comment:17 to me. I'm just bumping this thread -- in case you missed the last comment or forgot about it -- because now I'm also wondering about the leak in the version where only the public key is blinded.
I talked with hyperelliptic today and she explained her concerns from comment:17 to me.
None of those concerns are legitimate.
I'm just bumping this thread -- in case you missed the last comment or forgot about it -- because now I'm also wondering about the leak in the version where only the public key is blinded.
I have told you three times now -- twice on this ticket, and another time in a semi-private discussion -- why my cryptosystem is as secure as Ed25519. This is the last time that I will explain it to you.
As you know, in all ElGamal-like discrete-logarithm-based signature schemes, each signature discloses a linear equation relating the long-term signing secret key to the per-message secret key. Because of this information leak, a signature scheme which is secure when used with independent, identically distributed long-term signing secret keys may not be secure when a signer uses multiple long-term secret keys which are related by publicly known equations. (Forgery remains as hard as recovering one of the secret keys, even when they are related.)
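For a Schnorr-style scheme such as Ed25519, the linear equation each signature disclosess is (with long-term secret a, per-message secret r, message hash c, and group order ℓ):

```latex
s \equiv r + c\,a \pmod{\ell}, \qquad c = H(R, A, M)
```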
My original idea was to use an existing signature scheme with multiple related secret keys. Since you and Nick claim to be too afraid of ‘new’ cryptographic primitives to consider using one in Tor, this would have required an easy-to-understand security argument, and I didn't expect it to have one.
In my revised cryptosystem, each signature discloses the same linear equation involving the hidden service's long-term signing secret key that an Ed25519 signature using a slightly different hash function would. Thus, this cryptosystem is provably as secure as Ed25519, and the proof is so simple that anyone who had read the Ed25519 paper would understand it.
Because my goal was to find a cryptosystem which would provide the security properties which a new hidden service protocol would require, I quit looking for a security argument for my original idea once I had found an alternative with such a simple proof of security. I don't know why you have put so much effort into finding a broken scheme to use instead.