Opened 7 months ago

Last modified 7 months ago

#33712 new defect

Design a PoW scheme for HS DoS defence

Reported by: asn Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, tor-dos, network-team-roadmap-2020Q2, network-health, research
Cc: Actual Points:
Parent ID: #31223 Points: 96
Reviewer: Sponsor:

Description (last modified by asn)

Some initial research has been done on comment:13:ticket:31223 to sketch out a PoW/credential scheme that works for HS DoS defence.

This ticket is to continue scheming and working on this design.

See https://lists.torproject.org/pipermail/tor-dev/2020-April/014215.html for initial proposal.

Child Tickets

Ticket | Status | Owner | Summary | Component
#33650 | closed | mikeperry | Verify that intro2 cell extensions actually work | Core Tor/Tor
#33843 | assigned | dgoulet | Write detailed priority queue scheduler design on the proposal | Core Tor/Tor
#33844 | assigned | asn | Do next iteration of proposal by folding in comments from dgoulet/mike | Core Tor/Tor

Change History (9)

comment:1 Changed 7 months ago by asn

Description: modified (diff)

comment:2 Changed 7 months ago by asn

The strawman proposal of a basic PoW-over-INTRO scheme is:

1) Client sends INTRO1 with a special PoW extension
2) Intro sends back INTRO_CHALLENGE to client with a nonce
3) Client crafts PoW with that nonce and sends it back to the intro
4) Intro validates PoW difficulty and either forwards intro to service or rejects.
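
A minimal sketch of the challenge/response steps above, assuming a hashcash-style SHA-256 puzzle. The ticket does not fix a PoW function, so the hash, the `DIFFICULTY_BITS` value, and all function names here are hypothetical placeholders rather than anything Tor implements:

```python
import hashlib
import os

# Hypothetical difficulty, kept low so the sketch runs quickly.
DIFFICULTY_BITS = 12


def leading_zero_bits(digest: bytes) -> int:
    """Count the leading zero bits of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def make_challenge() -> bytes:
    """Step 2: the intro point picks a fresh random nonce."""
    return os.urandom(16)


def solve(nonce: bytes, bits: int = DIFFICULTY_BITS) -> int:
    """Step 3: the client searches for a counter whose hash meets the target."""
    counter = 0
    while True:
        digest = hashlib.sha256(nonce + counter.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= bits:
            return counter
        counter += 1


def verify(nonce: bytes, counter: int, bits: int = DIFFICULTY_BITS) -> bool:
    """Step 4: the intro point checks the solution with a single hash."""
    digest = hashlib.sha256(nonce + counter.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= bits


nonce = make_challenge()
solution = solve(nonce)
print(verify(nonce, solution))
```

The asymmetry is the point of the scheme: the client performs about 2^bits hashes on average, while the intro point performs one. A plain hash like this is not memory-hard, which is the limitation discussed below.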

This can come with various variants like the service encoding the nonce and parameters in the descriptor in an attempt to cut the challenge round trip (with extra complexity coming from replay detection etc.), or with clients doing PoW bidding (or "staking") as proposed in the recent call.


I wanted to mention this strawman proposal as a basic building block after reading that mtp-argon2-type protocols require far too much space for their proofs. I wondered whether the strawman approach above would work for us with argon2 as the hash function for memory-hardness, but then I understood that the space requirement comes from the merkle tree used to enforce memory-hardness; that is, argon2 by itself is not sufficient to enforce full memory-hardness.

So how do we go from a simple PoW scheme like the above, to something that works for us? Is it just the memory-hardness that we are losing by using the strawman approach over a more hardcore mtp-argon2 approach?

comment:3 in reply to:  2 Changed 7 months ago by mikeperry

Replying to asn:

The strawman proposal of a basic PoW-over-INTRO scheme is:

1) Client sends INTRO1 with a special PoW extension
2) Intro sends back INTRO_CHALLENGE to client with a nonce
3) Client crafts PoW with that nonce and sends it back to the intro
4) Intro validates PoW difficulty and either forwards intro to service or rejects.

This can come with various variants like the service encoding the nonce and parameters in the descriptor in an attempt to cut the challenge round trip (with extra complexity coming from replay detection etc.), or with clients doing PoW bidding (or "staking") as proposed in the recent call.

The above protocol sketch requires that the service only choose intropoints that have upgraded to fully support the protocol. This is risky, and will still require at least several months to deploy. Possibly much longer, if we are nervous about only a few IPs being available for use at a time by services under attack.

I think it is most important that we separate our protocols by how much network upgrade (and/or external infra) is required before they can be used.

To this end, here are some variants that we should keep in mind that require no network upgrades, or less extensive ones.

Service-as-validator (requires no IP upgrades):

  1. Descriptor lists a challenge input seed, updated every X minutes or every Y intros
  2. Client generates its own GUID challenge to combine with descriptor seed
  3. Client sends INTRO1 with PoW extension in encrypted extension section, in 253 bytes
  4. Service verifies client's GUID is unique since its last descriptor seed update
  5. Service itself verifies PoW (which is supposed to be fast)
  6. Service then builds rend if PoW passes (and drops otherwise)
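
The service-side checks in steps 4 to 6 could look roughly like the sketch below. It assumes a SHA-256 puzzle as a stand-in for whatever PoW function is eventually chosen, a 16-byte GUID, and an 8-byte counter (24 bytes in total, well under the 253-byte budget). The class and function names are hypothetical, not part of Tor:

```python
import hashlib
import os

GUID_LEN = 16  # hypothetical GUID size in bytes


def pow_ok(seed: bytes, guid: bytes, counter: int, bits: int) -> bool:
    """True if SHA-256(seed || guid || counter) has `bits` leading zero bits."""
    digest = hashlib.sha256(seed + guid + counter.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


class ServiceValidator:
    """Hypothetical service-side state: current seed plus GUIDs seen under it."""

    def __init__(self, bits: int = 12):
        self.bits = bits
        self.seed = os.urandom(32)   # step 1: published in the descriptor
        self.seen_guids = set()

    def rotate_seed(self) -> None:
        """New descriptor seed; the replay cache can be dropped with the old one."""
        self.seed = os.urandom(32)
        self.seen_guids.clear()

    def check_intro(self, guid: bytes, counter: int) -> bool:
        """Steps 4-6: reject replays, verify the PoW, then accept."""
        if len(guid) != GUID_LEN or guid in self.seen_guids:
            return False
        if not pow_ok(self.seed, guid, counter, self.bits):
            return False
        # Record the GUID only after the PoW passes, so junk cannot fill the cache.
        self.seen_guids.add(guid)
        return True


def client_solve(seed: bytes, guid: bytes, bits: int) -> int:
    """Steps 2-3: the client picks a GUID and grinds a counter for it."""
    counter = 0
    while not pow_ok(seed, guid, counter, bits):
        counter += 1
    return counter


service = ServiceValidator()
guid = os.urandom(GUID_LEN)
counter = client_solve(service.seed, guid, service.bits)
print(service.check_intro(guid, counter))  # first use is accepted
print(service.check_intro(guid, counter))  # replay of the same GUID is rejected
```

Tying the replay cache to the seed lifetime keeps the cache bounded: it only has to hold GUIDs seen since the last seed update.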

Now, at 253 bytes, we lose most or all memory hardness guarantees of the PoW, but it can still be computationally expensive and possibly also GPU-resistant.

One minor variant that also requires no network upgrades is to use an external credential server that accepts a full memory-hard 11KB Itsuku PoW and gives out smaller privacy pass credentials. The client sends such a credential directly to the service in the encrypted extension, and the service verifies it. However, the external credential server must be built and deployed, and will be subject to DoS attack too.

If we expand scope slightly to allow intropoint upgrades that are minimal enough to backport to all releases, we can remove the "1 intro1 cell per circuit" limit at the intropoint if rate limits are requested by the service (this is roughly a 3 line diff). Then, schemes like the following become possible:

  1. Descriptor lists a challenge input seed, updated every X minutes or every Y intros
  2. Client generates its own GUID challenge to combine with descriptor seed
  3. Client sends INTRO1 with "multi-cell" extension in encrypted extension section
  4. Client sends Itsuku proof (<11KB, since we don't need that much) over subsequent cells
  5. Service combines these chained INTROs to reassemble PoW
  6. Service verifies client's GUID is unique since its last descriptor seed update
  7. Service itself verifies PoW (which is supposed to be fast)
  8. Service then builds rend if PoW passes (and drops otherwise)
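
Steps 3 to 5 amount to fragmenting a large proof and reassembling it at the service. The sketch below shows one way to do that. The framing (an index and a total count per chunk), the 200-byte `CHUNK_PAYLOAD`, and the `max_chunks` cap are all assumptions made for illustration; the ticket does not define a cell layout:

```python
from typing import Dict, List, Optional, Tuple

CHUNK_PAYLOAD = 200  # hypothetical usable bytes per INTRO1 cell

Chunk = Tuple[int, int, bytes]  # (index, total, payload)


def split_proof(proof: bytes, chunk_size: int = CHUNK_PAYLOAD) -> List[Chunk]:
    """Client side: cut the proof into numbered chunks, one per cell."""
    total = max(1, -(-len(proof) // chunk_size))  # ceiling division
    return [
        (i, total, proof[i * chunk_size:(i + 1) * chunk_size])
        for i in range(total)
    ]


class Reassembler:
    """Service side: collect the chunks of one client's proof."""

    def __init__(self, max_chunks: int = 64):
        self.max_chunks = max_chunks  # bounds memory held per pending proof
        self.total: Optional[int] = None
        self.chunks: Dict[int, bytes] = {}

    def add(self, index: int, total: int, payload: bytes) -> Optional[bytes]:
        """Store one chunk; return the full proof once every chunk has arrived."""
        if not 1 <= total <= self.max_chunks or not 0 <= index < total:
            raise ValueError("chunk header out of range")
        if self.total is None:
            self.total = total
        elif total != self.total:
            raise ValueError("inconsistent chunk count")
        if index in self.chunks:
            raise ValueError("duplicate chunk")
        self.chunks[index] = payload
        if len(self.chunks) < self.total:
            return None
        return b"".join(self.chunks[i] for i in range(self.total))


proof = bytes(range(256)) * 4  # 1024-byte stand-in for a real proof
reassembler = Reassembler()
result = None
for chunk in split_proof(proof):
    result = reassembler.add(*chunk)
print(result == proof)
```

Capping the chunk count matters because reassembly state is itself a DoS surface: the service holds partial proofs in memory before it can verify any work.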

Because the change to allow protocols like this is small, hopefully we can backport it and deploy it to the network much, much faster than a full release cycle. If we can't do that, we should rule out this class of protocol.

comment:4 Changed 7 months ago by mikeperry

TL;DR of above is that there are multiple classes of protocol changes, and I think we should decide our high-level preference for what order we want to attempt them.

Here's a strawman ordering of the combination of (degree of network upgrade required, external infra required) pairs:

  1. No network upgrade required; no external infra deployment (i.e. client and service only).
  2. Backportable network upgrade required; no external infra deployment.
  3. No network upgrade required; external infra deployment is required.
  4. Backportable network upgrade required; external infra deployment is required.
  5. Non-backportable IP upgrade required; no external infra deployment.
  6. Non-backportable IP upgrade required; external infra deployment is required.
  7. Non-backportable full network upgrade; no external infra deployment.
  8. Non-backportable full network upgrade; external infra deployment is required.

Quick example of 1: Smuggle PoW inside a single INTRO1 cell
Quick example of 8: New wide-cell larger than 512 bytes to send a big anonymous credential provided by a captcha server.

Important questions to answer:

  • Does this ordering sound right in terms of what we should prefer/attempt to build first?
  • Does anyone want to propose a different ordering?
  • What sorts of changes are backportable for this problem?
Last edited 7 months ago by mikeperry

comment:5 Changed 7 months ago by arma

I think I arrive at the same conclusions as Mike, but for different reasons.

To me, the reason to focus on the "client to service" defense is that I believe whatever design we pick is going to need to have a "client to service" component, i.e. I don't think we can solve this problem solely with a "client to intro point" component. So if we need client-to-service, let's try to construct a system where that is sufficient.

I'm not much worried about time-to-upgrade. Let's think about the eventual result we want, and then get ourselves there. A few months for upgrades will pass before we know it, and if we have an important update that needs more relays to upgrade, we can ask them to do it.

Or, to rephrase: I currently think that client-to-service is the right area to focus on, because we're going to need it for the eventual solution, not because we have to constrain ourselves to solutions we can deploy this week. If we have a solution we like that involves parts of the network upgrading, let's be sure to keep it on the table so we don't accidentally rule out our best future just because it would be more logistics work to get there.

comment:6 Changed 7 months ago by gaba

Keywords: network-team-roadmap-2020Q2 added; network-team-roadmap-2020Q1 removed

comment:7 Changed 7 months ago by gaba

Points: 96

comment:9 Changed 7 months ago by asn

Description: modified (diff)
Summary: changed from "Design a PoW/credential scheme for HS DoS defence" to "Design a PoW scheme for HS DoS defence"