Opened 8 years ago

Closed 7 years ago

Last modified 6 years ago

#3460 closed task (fixed)

Replay-detection window for HS INTRODUCE2 cells causes HS reachability failures

Reported by: rransom
Owned by: rransom
Priority: Medium
Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor
Version:
Severity:
Keywords: tor-hs
Cc: arma, atoruser
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Currently, hidden services only accept a v3 INTRODUCE2 cell from a client if the timestamp it contains is within 30 minutes of the service's current time, so that the service doesn't need to keep entries in its replay-detection cache for very long. We should expand that window.

But in order to figure out how large the window should be, we need some statistics for how many entries are stored in a popular hidden service's replay cache. We should also investigate the typical lifetimes of HS descriptors and service-side introduction-point circuits, to find out whether we can remove the timestamp check entirely.
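The timestamp check described above can be sketched roughly as follows. This is a minimal illustration, not Tor's actual code: the function name, the constant name, and the assumption that the cell carries a 32-bit Unix timestamp are all hypothetical, and whether the real check is inclusive at exactly 30 minutes is not stated in this ticket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed constant: the 30-minute window from the description, in seconds. */
#define INTRO_TIMESTAMP_WINDOW_SECS (30 * 60)

/* Hypothetical helper, not Tor's actual function: return true if the
 * timestamp carried in an INTRODUCE2 cell is within the window of the
 * service's clock, in either direction. */
bool
intro_timestamp_in_window(uint32_t cell_timestamp, int64_t now)
{
  int64_t skew;
  assert(now >= 0); /* the sketch assumes a sane Unix-time clock */
  skew = now - (int64_t)cell_timestamp;
  if (skew < 0)
    skew = -skew;
  return skew <= INTRO_TIMESTAMP_WINDOW_SECS;
}
```

A client whose clock is off by more than the window in either direction fails this check, which is the reachability problem this ticket is about.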

Change History (12)

comment:1 Changed 8 years ago by Sebastian

Milestone: Tor: 0.2.2.x-final → Tor: 0.2.3.x-final

How would we do that? Seems like it would require a tor patch to gather this stuff?

comment:2 Changed 7 years ago by rransom

Milestone: Tor: 0.2.3.x-final → Tor: 0.2.2.x-final
Summary: Expand replay-detection window for HS INTRODUCE2 cells → Replay-detection window for HS INTRODUCE2 cells causes HS reachability failures

My plan for how to fix this no longer involves expanding the replay-detection window, even on 0.2.2.x.

The Right Thing is to split our current 60-minute per-hidden-service (at least I hope it's per-HS) replay-detection cache, which handles both clients' DH public keys and the RSA-encrypted portions of INTRODUCE2 cells, into two caches:

  • a per-HS DH public key replay cache that only holds entries for five minutes, purely as a performance improvement (so we continue to not launch multiple attempts to connect to a single rendezvous point), and
  • a per-intro-point replay cache that holds the non-malleable part of the INTRODUCE2 message for the lifetime of the intro point, to provide security against replay attacks.
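The split proposed above might look something like the sketch below. It is only an illustration under stated assumptions: every type, function, and constant name here is hypothetical (none is taken from the bug3460-v4 branch), the fixed-size array stands in for whatever map the real code uses, and the digest length is assumed to be 20 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_LEN 20                 /* assumed digest size */
#define CACHE_CAPACITY 64             /* arbitrary bound for this sketch */
#define DH_CACHE_LIFETIME_SECS (5 * 60)

typedef struct {
  uint8_t digest[DIGEST_LEN];
  int64_t added_at;
  bool in_use;
} replay_entry_t;

typedef struct {
  replay_entry_t entries[CACHE_CAPACITY];
  int64_t lifetime_secs;              /* 0: entries live as long as the cache */
} replay_cache_t;

typedef enum { REPLAY_NEW, REPLAY_SEEN, REPLAY_CACHE_FULL } replay_result_t;

void
replay_cache_init(replay_cache_t *cache, int64_t lifetime_secs)
{
  assert(cache);
  memset(cache, 0, sizeof(*cache));
  cache->lifetime_secs = lifetime_secs;
}

/* Record `digest` unless it is already present. Expired entries are
 * reclaimed lazily while scanning. */
replay_result_t
replay_cache_check_and_add(replay_cache_t *cache, const uint8_t *digest,
                           int64_t now)
{
  replay_entry_t *free_slot = NULL;
  assert(cache && digest);
  for (int i = 0; i < CACHE_CAPACITY; ++i) {
    replay_entry_t *e = &cache->entries[i];
    if (e->in_use && cache->lifetime_secs > 0 &&
        now - e->added_at >= cache->lifetime_secs)
      e->in_use = false;
    if (!e->in_use) {
      if (!free_slot)
        free_slot = e;
    } else if (memcmp(e->digest, digest, DIGEST_LEN) == 0) {
      return REPLAY_SEEN;
    }
  }
  if (!free_slot)
    return REPLAY_CACHE_FULL;
  memcpy(free_slot->digest, digest, DIGEST_LEN);
  free_slot->added_at = now;
  free_slot->in_use = true;
  return REPLAY_NEW;
}

/* Return true if the service should go on to launch a rendezvous circuit. */
bool
should_handle_introduce2(replay_cache_t *per_intro_point,
                         replay_cache_t *per_service_dh,
                         const uint8_t *encrypted_part_digest,
                         const uint8_t *dh_key_digest, int64_t now)
{
  /* Security check: kept for the whole lifetime of the intro point. */
  if (replay_cache_check_and_add(per_intro_point, encrypted_part_digest,
                                 now) != REPLAY_NEW)
    return false;
  /* Performance check: avoid a redundant rendezvous attempt. */
  return replay_cache_check_and_add(per_service_dh, dh_key_digest,
                                    now) == REPLAY_NEW;
}
```

In this sketch the per-intro-point cache is created with a lifetime of 0 and simply discarded together with its intro point, while the per-service cache is created with `DH_CACHE_LIFETIME_SECS`. No timestamp check is involved in either lookup.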

The easiest way to limit the size of the per-intro-point replay cache will be to limit the number of INTRODUCE2 cells sent to each intro point before it is replaced.
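A per-intro-point cell limit of that kind could be sketched as below. The struct, field, and function names are hypothetical, and the limit is left as a parameter because this comment does not specify a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
  uint32_t introduce2_seen;  /* INTRODUCE2 cells received so far */
  uint32_t introduce2_limit; /* replace the intro point at this count */
} intro_point_usage_t;

void
intro_point_usage_init(intro_point_usage_t *usage, uint32_t limit)
{
  assert(usage && limit > 0);
  usage->introduce2_seen = 0;
  usage->introduce2_limit = limit;
}

/* Count one INTRODUCE2 cell. Return true once the intro point has reached
 * its limit and should be replaced. The counter saturates at the limit. */
bool
intro_point_note_introduce2(intro_point_usage_t *usage)
{
  assert(usage);
  if (usage->introduce2_seen < usage->introduce2_limit)
    ++usage->introduce2_seen;
  return usage->introduce2_seen >= usage->introduce2_limit;
}
```

Because the counter never exceeds the limit, the per-intro-point replay cache never needs more than `introduce2_limit` entries, which is what bounds its memory use.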

I'm setting this ticket back to 0.2.2.x, because the scary part of this change will be making intro points expire after a while, and we need to apply that to 0.2.2.x in order to fix the service-side part of #3825.

comment:3 Changed 7 years ago by rransom

Cc: atoruser added
Status: new → needs_review

See bug3460-v4 (https://git.torproject.org/rransom/tor.git bug3460-v4) for a (not-yet-tested) fix. This is a service-side change, and will need to be merged to master for testing before we decide whether to merge it to maint-0.2.2.

We should collect statistics on how often this branch expires intro points, and why they expired, but I haven't dug into the stats code yet.

I'm CCing ‘atoruser’ on this ticket because this branch will need testing on a hidden service, and it is a prerequisite for the service-side fix for #3825.

comment:4 Changed 7 years ago by nickm

Okay, I've got some obvious stuff to sort out in my head before I can review this.

Stupid questions: What if, after we replace an intro point, we accidentally pick the same intro point later on? What if we stop, then restart and pick the same intro point? Is it just service key rotation that keeps this safe? (And am I right in thinking that everybody uses the introduce format that includes service keys?)

Also, it seems that this approach has a nasty possibility where I "just" make 16K bogus introduce attempts -- I don't need to even do a g^x; I only need to do the public RSA -- and make you choose a different intro point. Probably I could keep doing this until you're using an intro point I like. Not a terribly cheap attack, but could be worth analyzing. Maybe the right answer is to change only the service key, but keep the same introduction points until you would otherwise rotate them?

Here's another dumb question: Why take this approach rather than, say, just incrementing the window from 30 minutes to 12 hours?

comment:5 in reply to:  4 Changed 7 years ago by rransom

Replying to nickm:

Okay, I've got some obvious stuff to sort out in my head before I can review this.

The ‘service key’ is the only long-term identifier of a hidden service. In the v2 hidden-service directory protocol, the service key is only used to sign HS descriptors.

The ‘introduction keys’ (one for each introduction point) are ephemeral keys. We generate a new introduction key when we launch the introduction-point circuit we will use it on; this causes #1307.

Stupid questions: What if, after we replace an intro point, we accidentally pick the same intro point later on? What if we stop, then restart and pick the same intro point?

If we generate the same introduction key twice, we're way more screwed than having insufficient replay detection for INTRODUCE cells. If we don't generate the same intro key twice, no one will be able to link the old one to the new one (except possibly by sending the same DH public key in an INTRODUCE cell to both), even if we pick the same relay.

(An introduction point is identified at the intro-point relay only by its introduction key, not its service key.)

Is it just service key rotation that keeps this safe?

Introduction key rotation keeps this safe.

(And am I right in thinking that everybody uses the introduce format that includes service keys?)

We don't seem to have an INTRODUCE cell format that includes an identifier of the service key (unless an introduction point was created for use in the v0 HS directory protocol). In the current protocols, INTRODUCE cells only contain a digest of the introduction key, and it is sent as plaintext so that the intro-point relay can route the cell to the correct intro point.

Also, it seems that this approach has a nasty possibility where I "just" make 16K bogus introduce attempts -- I don't need to even do a g^x; I only need to do the public RSA -- and make you choose a different intro point. Probably I could keep doing this until you're using an intro point I like. Not a terribly cheap attack, but could be worth analyzing.

It sounds bad -- it could at least censor a service until its intro circs die of old age -- but flooding the intro-point relays with CREATE cells unrelated to the HS does that too, and is almost as easy.

Also, one thing I want to achieve (later, not in this branch) partly by expiring circuits is to spread the circuit-extension load of a popular HS over multiple intro points. This will involve creating many more intro points if a service appears to be popular. I'm hoping to be able to do this by building multiple new intro points when an intro point expires due to overuse (with the number of new intro points depending on how early the circuit expired). That and the fact that HSes never put multiple intro points on the same relay should make flooding with INTRODUCE2 cells not a useful attack.

Maybe the right answer is to change only the service key, but keep the same introduction points until you would otherwise rotate them?

You should mean ‘introduction key’ here, because we can't change the ‘service key’ for a hidden service.

I don't think we can reuse a service-side intro-point circuit for any other purpose (even as an intro-point circuit with a different introduction key) once we have sent the ESTABLISH_INTRODUCE cell.

Here's another dumb question: Why take this approach rather than, say, just incrementing the window from 30 minutes to 12 hours?

Merely increasing the replay-detection window increases RAM consumption on the HS. Increasing the window to 12 hours would mean that we keep some cells in the replay cache for 24 hours.
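The doubling works out as follows: a cell whose timestamp is a full window ahead of the service's clock is still accepted, and a replay of it keeps passing the timestamp check until the service's clock is a full window past that timestamp, so the entry has to stay cached for twice the window. A small sketch of the arithmetic, with hypothetical function names and a made-up cell rate used only for scale:

```c
#include <assert.h>
#include <stdint.h>

/* Longest time an accepted cell must stay in the replay cache when the
 * service accepts timestamps within +/- window_secs of its own clock. */
int64_t
replay_cache_retention_secs(int64_t window_secs)
{
  assert(window_secs >= 0);
  return 2 * window_secs;
}

/* Rough upper bound on cache entries at a steady arrival rate.
 * cells_per_minute is a hypothetical load figure, not a measurement. */
int64_t
replay_cache_max_entries(int64_t window_secs, int64_t cells_per_minute)
{
  assert(cells_per_minute >= 0);
  return (replay_cache_retention_secs(window_secs) / 60) * cells_per_minute;
}
```

With the current 30-minute window this gives the 60-minute cache mentioned in comment:2. At a hypothetical 10 cells per minute, moving to a 12-hour window would raise the bound from 600 entries to 14400.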

comment:6 Changed 7 years ago by nickm

Okay, I think I understand it now, and I'd like to merge it. A few final requests:

  • Could we rename one of the "accepted_intros" fields to something else, and document: that the per-service one works differently from the per-intropoint one; that one is for security and the other is to avoid redundant rendezvous; and that one lasts for the lifetime of the intro point while the other gets cleaned? It seems error-prone to have two maps with similar names and documentation but with crucially different semantics.
  • Can I have a patch to rend-spec.txt to document this new behavior?

comment:7 in reply to:  6 ; Changed 7 years ago by rransom

Milestone: Tor: 0.2.2.x-final → Tor: 0.2.3.x-final

Replying to nickm:

Okay, I think I understand it now, and I'd like to merge it. A few final requests:

  • Could we rename one of the "accepted_intros" fields to something else, and document: that the per-service one works differently from the per-intropoint one; that one is for security and the other is to avoid redundant rendezvous; and that one lasts for the lifetime of the intro point while the other gets cleaned? It seems error-prone to have two maps with similar names and documentation but with crucially different semantics.

Done and pushed to my bug3460-v4 branch, along with documentation-comment fixes.

  • Can I have a patch to rend-spec.txt to document this new behavior?

Real Soon Now. rend-spec.txt doesn't say what the timestamp field in an INTRODUCE* cell is for, so this Tor change doesn't actually make anything in rend-spec.txt wrong or obsolete. I also see some other things that need to be fixed (the description of HidServAuth is wrong).

I'm changing the milestone to 0.2.3.x-final, because you're not going to merge this change to 0.2.2.x.

comment:8 Changed 7 years ago by nickm

Resolution: fixed
Status: needs_review → closed

Okay, I like this now. I've reviewed it a bunch, and if there are more bugs here, I'm not seeing 'em at the moment.

I got a bunch of merge conflicts in rendservice.c, mostly related to router->node shifts. Please review 628b735fe39e13cc37afb carefully.

comment:9 in reply to:  8 Changed 7 years ago by rransom

Replying to nickm:

Okay, I like this now. I've reviewed it a bunch, and if there are more bugs here, I'm not seeing 'em at the moment.

I got a bunch of merge conflicts in rendservice.c, mostly related to router->node shifts. Please review 628b735fe39e13cc37afb carefully.

Looks good! (Except for the part where I added “XXXX WTF?” in commit e46d56a9b4458370cb8de0c92e10688402749845 because the code problem there was an unrelated bug (now #4607).)

comment:10 in reply to:  7 Changed 7 years ago by rransom

Replying to rransom:

Replying to nickm:

  • Can I have a patch to rend-spec.txt to document this new behavior?

Real Soon Now. rend-spec.txt doesn't say what the timestamp field in an INTRODUCE* cell is for, so this Tor change doesn't actually make anything in rend-spec.txt wrong or obsolete. I also see some other things that need to be fixed (the description of HidServAuth is wrong).

Opened #4608 for ‘actually document our replay-prevention code’, and #4609 for ‘fix the description of HidServAuth’.

comment:11 Changed 6 years ago by nickm

Keywords: tor-hs added

comment:12 Changed 6 years ago by nickm

Component: Tor Hidden Services → Tor