Opened 6 weeks ago

Last modified 2 days ago

#32709 needs_revision enhancement

hsv3: Support onionbalance keys when handling INTRO2 cells

Reported by: asn Owned by: dgoulet
Priority: Medium Milestone: Tor: 0.4.4.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs scaling onionbalance tor-spec network-team-roadmap-2020Q1 044-must
Cc: s7r, gk Actual Points:
Parent ID: #26768 Points: 5
Reviewer: nickm Sponsor: Sponsor27-must


We have encountered a small issue with onionbalance viability for v3s.

While the descriptor cross-certificates are no longer an issue (#29583), there is an issue with the ntor handshake during the INTRODUCE1/INTRODUCE2 handshake between the client and service.

In particular, as specified in rend-spec-v3.txt [NTOR-WITH-EXTRA-DATA] the subcredential (which includes the onion address) is used as part of the ntor key material to generate end-to-end encryption keys and MAC keys so that the service can communicate with the client:

   To make an INTRODUCE1 cell, the client must know a public encryption
   key B for the hidden service on this introduction circuit. The client
   generates a single-use keypair:
             x,X = KEYGEN()
   and computes:
             intro_secret_hs_input = EXP(B,x) | AUTH_KEY | X | B | PROTOID
             info = m_hsexpand | subcredential
             hs_keys = KDF(intro_secret_hs_input | t_hsenc | info, S_KEY_LEN+MAC_LEN)
             ENC_KEY = hs_keys[0:S_KEY_LEN]
             MAC_KEY = hs_keys[S_KEY_LEN:S_KEY_LEN+MAC_KEY_LEN]

The issue here is that when the client prepares the INTRO1 cell, it uses the subcredential of the frontend OBv3 service, but the INTRO2 cell actually goes to a backend OBv3 instance which has a different subcredential. This causes the backend instance to not be able to verify the MAC of the cell, and generally finish the ntor handshake....

Child Tickets

Change History (17)

comment:1 Changed 6 weeks ago by dgoulet

Owner: set to asn
Reviewer: dgoulet
Status: newassigned

comment:2 Changed 6 weeks ago by asn

The suggested approach here is to introduce a torrc option for the OB instances HiddenServiceOnionbalanceFrontendConfigFile that points to a file which contains info about the OB frontend. In particular, it will contain (a number of) FrontendOnionAddress which is the onion address of the frontend, and can be used to decrypt and finish the ntor handshake with the client.

It's still not decided if we should allow the instances with that configured to also be accessed over their regular instance onion or not.

I will be working on this and hopefully have something before the end of week. Unittests might be hard to do.

comment:3 Changed 4 weeks ago by gaba

Keywords: network-team-roadmap-2020Q1 added; network-team-roadmap-september removed

comment:4 Changed 10 days ago by dgoulet

Owner: changed from asn to dgoulet
Status: assignedaccepted

Will take over this for asn. We just had a conf call about it.

comment:5 Changed 10 days ago by dgoulet

Status: acceptedneeds_review

Branch to be tested: ticket32709_043_01

I have done ZERO test on it.

comment:6 Changed 9 days ago by asn

Status: needs_reviewneeds_revision

Hey hey,

this looks really good. I actually tested this (and after fixing a few issues) I managed to connect to the backend service and exchange data over an onionbalance descriptor! That's a first!

I pointed a few issues on GH. The main ones is an off-topic assertion failure on the logging subsystem, and the more important one is something weird going on with the hs_cell_parse_introduce2 data structure and client_pk gets wiped when it shouldn't. I fixed both of those issues quickly for testing but we need to fix them well. Ah, also the time period iteration feature is missing, and I made a comment to that effect.

Other than that, it looks good, plus it needs changes file and man page entry. Also, a unittest for the config parsing wouldn't be bad if possible.

Thanks a lot!

Last edited 9 days ago by asn (previous) (diff)

comment:7 Changed 6 days ago by dgoulet

Keywords: 043-must added

comment:8 Changed 6 days ago by nickm

Hi! I'm going to take a look at the code in detail, but for now, some top-level things, some of which we already talked about on IRC:

  • It needs unit tests
  • CI needs to pass: it looks like practracker is telling us to break up config_service_v3() into manageable pieces
  • I think we need a corresponding patch to rend-spec-v3.txt
  • We should talk a bit about whether we want to accept both possible subcredentials or only one.

comment:9 Changed 5 days ago by nickm

Keywords: postfreeze-ok added

comment:10 Changed 5 days ago by dgoulet

Status: needs_revisionneeds_information

PR and branch are still the same but totally updated/changed after initial review from nickm/asn.

Currently with unit tests and practracker happy. Spec patch is being written by asn with a justification for the usage of both subcredentials.

I've notified asn about this so he can continue OBv3 testing with this branch.

comment:11 Changed 4 days ago by asn

Status: needs_informationneeds_revision

OK did a review on the branch and posted on the PR. Been testing it on OBv3 the past few hours and it works just fine!

As discussed with David, I also wrote a spec branch documenting and analyzing the behavior introduced by this code: Please let me know if something is missing and should be added.

comment:12 Changed 4 days ago by dgoulet

Reviewer: dgouletnickm
Status: needs_revisionneeds_review

comment:13 Changed 3 days ago by nickm

Okay, reading through it all again. I'll make comments in the PR, but here are some higher-level questions.

1) Does it really make sense to compute the subcredential for three time periods? It seems to me that we are never close to more than two periods. Maybe we should calculate the subcredentials for "now" and "the closest period other than now".

2) The performance here is going to be needlessly bad. Keep in mind that every time we call hs_ntor_client_get_introduce1_keys(), we're doing a curve25519 calculation... but the curve25519 calculation will be the same here every time! The only input that changes is the subcredential, which is an input only to the XOF() part of the process. This could be a followup branch, I guess, though.

3) We should think about timing side channels here. This could also be a followup, however.

4) Test coverage on these changes should really be higher.

comment:14 Changed 3 days ago by nickm

Status: needs_reviewneeds_revision

comment:15 Changed 3 days ago by nickm

We talked about this on IRC; here's the current plan as I understand it.

We're going to aim this ticket for 0.4.4, since doing it properly will require more time than we have for 0.4.3. We'll plan to open the merge window for 0.4.4 early, for this sponsor only.

Between today and next Wednsday or so, I'll do a branch based on this one that will:

a) turn subcredential into a proper type
b) implement a constant-time multi-subcredential backend solution for items 2 and 3 above, on top of this branch.
c) refactor config_service_v3() a little

After that, for minimal viable product, we need:

d) improved test coverage,
e) to stop recomputing the subcredentials for every request

Later on, we can make these improvements:

f) compute fewer subcredentials

comment:16 Changed 3 days ago by nickm

Keywords: 044-must added; 043-must postfreeze-ok removed
Milestone: Tor: 0.4.3.x-finalTor: 0.4.4.x-final
Points: 25

comment:17 Changed 2 days ago by nickm

I have done (a) and (b) above; I'll get to (c) before Wednesday.

I'm putting my work in progress at ticket32709_044_01 with PR at in case you want to take a look. It conflicts with the merged #32137, so CI isn't running on it for the moment.

Note: See TracTickets for help on using tickets.