Opened 2 years ago

Last modified 5 days ago

#8244 assigned enhancement

The HSDirs for a hidden service should not be predictable indefinitely into the future

Reported by: arma Owned by:
Priority: normal Milestone: Tor: 0.2.???
Component: Tor Version:
Keywords: tor-hs, needs-proposal, 026-triaged-1, 027-triaged-1-out Cc: rpw, tom@…, iang, tD66BSHWU@…, dave2008713@…
Actual Points: Parent ID: #12424
Points:

Description

When a hidden service chooses what HSDir relays to publish its hidden service descriptor to, it does it in a deterministic way based on day and .onion address. That way clients can do the same calculation to decide what HSDirs to contact when fetching the hidden service descriptor.

But a flaw in this approach is that anybody can predict which six HSDir relays will be responsible for a given hidden service, 22 days from now. There's no reason to have that property, and it makes attacks that temporarily censor a hidden service much more effective, since you can e.g. choose the identity keys of your Sybils such that there exists a day in the next 30 days when you'll be running all six of the HSDirs for your target hidden service.
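A simplified sketch of why this is possible (hypothetical Python; the real rend-spec descriptor-ID formula has more inputs, but the key point is that everything in it is predictable arbitrarily far in advance):

```python
import hashlib
import os
import struct

def descriptor_id(service_id: bytes, time_period: int, replica: int) -> bytes:
    # Simplified stand-in for the rend-spec computation: the real formula
    # also folds in a byte of the onion address and an optional descriptor
    # cookie, but every input is known to everyone ahead of time.
    secret = hashlib.sha1(struct.pack(">IB", time_period, replica)).digest()
    return hashlib.sha1(service_id + secret).digest()

def grind_sybil_fingerprint(target_desc_id: bytes, tries: int = 200000) -> bytes:
    # An attacker brute-forces relay identity keys until one's fingerprint
    # sorts just after the (predictable) descriptor ID on the hash ring,
    # making that relay a responsible HSDir for the target service.
    best = None
    for _ in range(tries):
        key = os.urandom(20)              # stand-in for generating an identity key
        fp = hashlib.sha1(key).digest()
        if fp > target_desc_id and (best is None or fp < best):
            best = fp
    return best
```

Because `descriptor_id` is a deterministic function of public inputs, the attacker can run the grinding loop weeks before the target day.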

One solution would be for the directory authorities to produce a periodic random string that is unpredictable until they have produced it. Then put that string in the consensus.

The first issue is whether a single authority can play tricks where it waits to vote until it sees the votes from the other authorities, and then chooses its random string to produce the desired consensus random string. This issue is actually really serious, since I bet for any six adversarial HSDirs, there exists a random string that puts them in charge of the target hidden service. See all the contortions we went through in http://freehaven.net/anonbib/#casc-rep about generating a consensus random number; I hope we don't need as many contortions here.

The second issue is how we should handle transitions between epochs. One option is to post two random strings (today's and tomorrow's), and then each hidden service uses both of them. Surely there's a more efficient answer here.

I guess issue number three is how the directory authorities should vote on a thing that doesn't have granularity of one vote period. Do they all just vote the random string that they voted at the beginning of the day, for the whole day? If I'm an authority and I missed the first hour of the day, do I get to add my vote on the second hour (I think the answer has to be no)? What if there weren't enough votes to make a consensus in the first hour? If I come up as an authority and can't get a proper recent consensus, but now it's time to vote, what do I vote?

And lastly, how do we transition? I think hidden services would publish to the old ones and the new ones, until clients that don't know about the new way are obsolete. In the meantime that increases the exposure of the hidden service to an adversary who just wants to get one of the n HSDirs for the hidden service for that period. (Is getting some-but-not-all that bad?)

(This ticket is inspired by rpw's upcoming Oakland paper)

Child Tickets

Change History (29)

comment:1 Changed 2 years ago by tom

  • Cc tom@… added

comment:3 Changed 23 months ago by nickm

(I haven't read it in enough detail to see whether it requires all of the crypto that it describes in section 2, and some of the constructions look pretty complex to implement, but I bet it's got a good set of references to start reading through at the very least.)

comment:4 follow-up: Changed 23 months ago by rransom

Pedersen's secret-sharing scheme (http://www.cs.huji.ac.il/~ns/Papers/pederson91.pdf (sic)) is simpler and is sufficient for generating a shared random nonce.

(Usually that paper is only cited for the Pedersen commitment scheme, which is more widely used as a building block in cryptosystems and protocols. This time, you want the secret-sharing parts.)

comment:5 Changed 23 months ago by iang

  • Cc iang added

comment:6 Changed 22 months ago by qSKvY

  • Cc tD66BSHWU@… added

comment:7 in reply to: ↑ 4 Changed 22 months ago by qSKvY

Replying to rransom:

Pedersen's secret-sharing scheme (http://www.cs.huji.ac.il/~ns/Papers/pederson91.pdf (sic)) is simpler and is sufficient for generating a shared random nonce.

(Usually that paper is only cited for the Pedersen commitment scheme, which is more widely used as a building block in cryptosystems and protocols. This time, you want the secret-sharing parts.)

Are you referring to the scheme outlined in section 5.2?

comment:8 Changed 21 months ago by nickm

I'm posting here some comments I've had from Ian Goldberg and Aniket Kate in an email thread. I'll excerpt the technical stuff that's most relevant to this ticket.

Nick

Do you have any quick insights on the correct "let's all securely come up with a random number" scheme? I'd imagine you know a good bit about the field.

Ian:

It very strongly depends on your network model. What can you assume about connectivity, for example? What can you assume about reachability of the participants? (Are the hosts up? Is the network up?) What is the adversary model? (Are some of the participants controlled by the adversary? How many?)

Depending on the answers, it can be as simple as "in each timeslot, everyone broadcasts a random string (all of which are XORed together) and a hash of their upcoming random string for the next timeslot", or as complicated as Aniket's thesis.
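The simple end of that spectrum can be sketched as follows (a toy Python model, not Tor code; the participant class and message shapes are invented for illustration):

```python
import hashlib
import os
from functools import reduce

class Participant:
    """In each timeslot, reveal the string committed to in the previous
    slot and broadcast a hash commitment to the next slot's string."""
    def __init__(self):
        self.next_secret = os.urandom(32)

    def tick(self):
        reveal = self.next_secret
        self.next_secret = os.urandom(32)
        commitment = hashlib.sha256(self.next_secret).hexdigest()
        return reveal, commitment

def slot_value(reveals, prior_commitments):
    # Verify each reveal against the commitment broadcast last slot,
    # then XOR all the reveals into the slot's shared random value.
    for r, c in zip(reveals, prior_commitments):
        if hashlib.sha256(r).hexdigest() != c:
            raise ValueError("reveal does not match prior commitment")
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), reveals)
```

The commitment pins each participant's contribution one slot in advance, so nobody can pick their string after seeing the others' reveals.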

Nick:

Hm. I think that for Tor directory authorities, our model is something like, "We assume that some fraction of them might be down; we want the algorithm to still work even if some fraction of them are compromised; we don't want compromised directories to be able to choose the output."

In general, we tolerate situations where a rogue authority can make the algorithm fail to produce a result, if we can point to which authority was failing.

Implementation complexity is a killer here, too.

Ian:

I seem to remember that the hourly consensus voting protocol has a couple of phases that execute synchronously, in ~5-minute windows just before the hour. Can you remind me of those phases? How suspicious is it if an authority participates in one phase, but not the last phase? Is it acceptable for a malicious authority to be able to choose from one of two output values: one if he participates honestly, and one if he drops out at the last minute?

Nick:

[summary of algorithm phases deleted]

How suspicious is it if an authority participates in one phase, but not the last phase?

Not horribly so if it happens once in a while. If it happened with any regularity, it would probably be suspicious. [though we'd probably suspect a bug rather than malice]

Is it acceptable for a malicious authority to be able to choose from one of two output values: one if he participates honestly, and one if he drops out at the last minute?

Hm. I don't think it's ideal, but it's certainly better than the status quo. I'd like to do a little analysis to see how much this ability helps or doesn't help an attacker's ability to do the various censorship attacks we're trying to prevent here.

It would be proportionally more troublesome if (say) three colluding authorities could pick any one of eight different values. The same analysis would be needed there.

I assume you're thinking of a system where honest authorities do a commitment in phase 1, reveal their secrets in some new phase 1C, and the shared secret is just (say) the hash of all the revealed secrets?

Ian:

That indeed would be the implementation-simple approach.
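That implementation-simple approach might look like this in outline (a hypothetical Python sketch, not the actual voting code; function names are made up):

```python
import hashlib
import os

def commit(secret: bytes) -> str:
    # Phase 1: each authority includes H(secret) in its vote.
    return hashlib.sha256(secret).hexdigest()

def shared_random(commitments, reveals) -> bytes:
    # Phase 1C: each authority reveals its secret; everyone checks the
    # reveals against the phase-1 commitments, then hashes the reveals
    # (in a canonical order) into the shared random value.
    for c, r in zip(commitments, reveals):
        if hashlib.sha256(r).hexdigest() != c:
            raise ValueError("reveal does not match commitment")
    h = hashlib.sha256()
    for r in sorted(reveals):
        h.update(r)
    return h.digest()
```

Note the bias discussed above: after seeing everyone else's reveals, a dishonest authority can compute the value both with and without its own reveal, and pick one by deciding whether to drop out at the last minute.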

There's also the possibility of just using the authorities' signatures on the consensus *as* their values. But then you'd need to be using a unique signature scheme (there's only one valid signature for any (key, message) pair); is that the case now?

Nick:

We're using RSA2048-PKCS1 signatures, which are deterministic. We're hoping to add Ed25519, which is deterministic, but not verifiably so by a third party.

(To be continued...)

comment:9 Changed 21 months ago by nickm

Then Aniket said:

  • In the synchronous setting, the round 1-commit and round 2-open approach is indeed an efficient approach if an authority has a lot to lose for not opening his/her commitment.

Nick:

Well, if it happens once in a while, an authority operator could always claim it was a bug.

Aniket also had some comments about the voting protocol itself; I'll open a fresh ticket for those.

comment:10 Changed 21 months ago by nickm

(The above was reposted with permission.)

comment:11 Changed 20 months ago by asn

This also looks relevant:
www.cypherpunks.ca/~iang/pubs/DKG.pdf

comment:12 Changed 20 months ago by qSKvY

Robust random number generation for peer-to-peer systems
http://www14.in.tum.de/personen/scheideler/papers/OPODIS-116b.pdf
Talks about the problem of distributed random number generation and presents a possible solution. The authors describe what is essentially this ticket's problem as their motivation.
Seems to be designed for situations where less than n/6 peers are adversarial.

Another paper by the same guys that seems very relevant is:
Group Spreading: A protocol for provably secure distributed name service
http://www.cs.jhu.edu/%7Escheideler/papers/p_icalp04.ps.gz

comment:13 Changed 19 months ago by asn

paper and software worth checking out:
https://crysp.uwaterloo.ca/software/DKG/download.html

looking at the protocol figures in the paper, it doesn't seem like a lightweight protocol.

comment:14 Changed 19 months ago by nickm

On the bright side, they appear to have an implementation. If the quality is up to snuff, we might be able to go with it. Unfortunately, ISTR that it's in C++, so it might not be easy to audit for safety. Also, I'm not sure if anybody else is using it--and if they aren't, we'd be at risk for becoming the sole users, which eventually leads to becoming the maintainers. :)

comment:15 Changed 19 months ago by iang

Indeed, I don't know of anyone using that code in a production environment. (Nor have I personally audited the code, in fact.)

comment:16 Changed 18 months ago by qSKvY

How [in]secure is Tor's consensus agreement / voting stuff?
A lot of the complexity in the DKG protocol seems to come from it solving the "agreement on a set problem".
The directory authorities also face this problem when they're voting and agreeing on consensus documents.
You might be able to simplify it quite a bit by piggybacking on what Tor already has, perhaps through use of "extra-info documents".

comment:17 Changed 18 months ago by asn

Apparently the GNUnet people are also in need of a similar mechanism. Specifically, they need a way to generate public keys in a collaborative Byzantine setup; this is harder than our use case, since we only need to generate a nonce. They are planning to implement a paper called "One Round Threshold Discrete-Log Key Generation without Private Channels".

Apparently, the scheme is not easy to implement and relies on some peculiar crypto constructions. Part of the complexity is caused by the fact that the resulting value must be a public key, so all the nodes have to do multiple checks to ensure that the resulting value actually is one.

comment:18 follow-up: Changed 18 months ago by qSKvY

I think if you just wanted to generate a nonce with that system you'd still need to perform the same checks. Most of them seem to be for proving that all the shares have been correctly created and encrypted and that nothing has been corrupted.

It looks like it'd be well suited for layering on top of an existing Byzantine fault tolerant consensus system [like Tor's directory protocol]. Their requirement of a synchronous network or an "incorruptible third party" is a little worrisome though.

comment:19 in reply to: ↑ 18 Changed 18 months ago by asn

Replying to qSKvY:

I think if you just wanted to generate a nonce with that system you'd still need to perform the same checks. Most of them seem to be for proving that all the shares have been correctly created and encrypted and that nothing has been corrupted.

It looks like it'd be well suited for layering on top of an existing Byzantine fault tolerant consensus system [like Tor's directory protocol]. Their requirement of a synchronous network or an "incorruptible third party" is a little worrisome though.

Yep, seems that way.

FWIW, here is Nick Hopper's proposal: https://lists.torproject.org/pipermail/tor-dev/2013-December/005944.html
We need to look into this more. See what kind of "proof of knowledge" can work, look into generation of threshold elgamal keypairs, etc.

comment:20 Changed 17 months ago by dave2008

  • Cc dave2008713@… added

comment:21 Changed 16 months ago by nickm

  • Milestone changed from Tor: 0.2.5.x-final to Tor: 0.2.6.x-final

comment:22 Changed 13 months ago by nickm

  • Keywords 026-triaged-1 added
  • Parent ID set to #12424

comment:23 Changed 8 months ago by nickm

  • Milestone changed from Tor: 0.2.6.x-final to Tor: 0.2.7.x-final

These are important, but they're almost certainly not going to get done by January.

comment:24 Changed 5 months ago by asn

And here is the strawman proposal by nick: https://lists.torproject.org/pipermail/tor-dev/2013-November/005878.html . And #14354 is a ticket for discussing strategies for making it easier to run dirauth scripts like the one required by this ticket.

Last edited 5 months ago by asn (previous) (diff)

comment:25 Changed 4 months ago by nickm

  • Status changed from new to assigned

comment:26 Changed 3 months ago by nickm

  • Keywords 027-triaged-1-out added

Marking triaged-out items from first round of 0.2.7 triage.

comment:27 Changed 3 months ago by nickm

  • Milestone changed from Tor: 0.2.7.x-final to Tor: 0.2.???

Make all non-needs_review, non-needs_revision, 027-triaged-1-out items belong to 0.2.???

comment:28 Changed 8 days ago by kernelcorn

There is an estimate of the consensus's entropy in my OnioNS paper. My analysis found ~8.9 kilobits of entropy based on routers changing selected fields in their descriptors, and 16–28 kilobits of entropy based on routers entering and leaving the Tor network. Using the consensus (at least cached-microdesc and cached-certs) as a global source of entropy appears to be a secure idea. It's also trivial to add more entropy. Although not everyone has the same consensus at the same time, consensuses are timestamped, so a specific one is easy to reference. That's what I'm doing with OnioNS anyway.

comment:29 Changed 5 days ago by nickm

The consensus is a poor source of secure entropy, because a single hostile authority can manipulate it.

All the authority has to do is wait for the honest authorities to vote, and then try generating as many different candidate votes as possible, computing the consensus from the honest votes and each candidate vote. Finally, it publishes the candidate vote that gives it the most favorable outcome.
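A toy Python model of that grinding attack (the consensus computation here is invented for illustration; the real consensus is far more complex, but any deterministic function of the votes can be ground the same way):

```python
import hashlib
import os

def consensus_value(votes):
    # Toy model: the consensus "random string" is just a hash of all the
    # votes' nonces, sorted so the result is order-independent.
    h = hashlib.sha256()
    for v in sorted(votes):
        h.update(v)
    return h.digest()

def grind(honest_votes, favorable, tries=10000):
    # A hostile authority waits for the honest votes, then searches for a
    # candidate vote of its own such that the resulting consensus value
    # satisfies some predicate "favorable" (e.g. it maps a target hidden
    # service onto the attacker's HSDirs).
    for _ in range(tries):
        candidate = os.urandom(32)
        if favorable(consensus_value(honest_votes + [candidate])):
            return candidate
    return None
```

Controlling n bits of the output costs the attacker roughly 2^n candidate votes, so fixing a byte takes only ~256 tries on average.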
