Opened 8 months ago

Closed 8 months ago

Last modified 8 months ago

#33650 closed task (fixed)

Verify that intro2 cell extensions actually work

Reported by: arma Owned by: mikeperry
Priority: Medium Milestone:
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-dos tor-dos-2020 anonymous-credentials research
Cc: mikeperry, dgoulet, asn Actual Points: 2.5
Parent ID: #33712 Points:
Reviewer: Sponsor:


In the future we're going to want to transport end-to-end tokens, proofs of work, or other blobs from client to onion service. We should make sure that we can actually add these into the cells without anything going wrong, like surprising asserts or surprising length enforcement.

(Now is the time to notice if things will go wrong, so we can fix them and deploy that fix, and then people will have upgraded by the time we're ready to actually use them.)

So: let's make a branch that adds "hi mom" on the client side, and reads it out again on the service side.

In spelunking through the code and the spec, I found that modern intro2 cells have an "extensions" field inside their encrypted component, which seems perfectly suited for transporting arbitrary blobs from client to service. Here's how we set it currently on the client side:

  /* Set extension data. None are used. */
  ext = trn_cell_extension_new();
  trn_cell_extension_set_num(ext, 0);
  trn_cell_introduce_encrypted_set_extensions(enc_cell, ext);

So that 0 needs to become a 1, and then we need something new that makes and sets a new extension in ext (modeled maybe on build_establish_intro_dos_extension(), and something on the receiving end that invokes trn_cell_extension_parse() and reads it out to us.

I am optimistic, because this thing is encrypted, so the intro point in the middle can't be looking at it very carefully. But if we have bugs on the client side or the service side, or surprise length enforcement in the middle, now is a great time to notice and fix them.

Child Tickets

Change History (17)

comment:1 Changed 8 months ago by asn

Keywords: tor-dos tor-dos-2020 anonymous-credentials research added

comment:2 Changed 8 months ago by mikeperry

We should also do the "hi mom" read + write thing in the HS descriptor, so we can be sure we can signal any required additional intro extension fields.

comment:3 Changed 8 months ago by mikeperry

Owner: set to mikeperry
Status: newaccepted

comment:4 Changed 8 months ago by mikeperry

Oh, and we should test unencrypted extensions in the intro1, so we can know if the intropoint can easily locate and read them, if we want to do intro-side processing. And also test how ugly it gets to have more than one of these extensions in the same cell (like one encrypted, one unencrypted).

comment:5 Changed 8 months ago by mikeperry

Notes for the INTRO1->INTRO2 codepaths:

  • Intro1 is sent from hs_circ_send_introduce1(), and built in hs_cell_build_introduce1()
  • There are actually two extension areas: one is in the unencrypted section, set in hs_cell_build_introduce1(), and one is in the encrypted section, set in introduce1_set_encrypted()
  • The Intropoint processes the INTRO1 cell in hs_intro_received_introduce1()
  • v3 INTRO1 cells go into handle_introduce1() and the cell's unencrypted fields are parsed and validated
  • Requested rate limits are checked in hs_dos_can_send_intro2()
  • The cell is sent as an INTRO2 exactly as is towards the service
  • The service receives it in hs_service_receive_introduce2()
  • The service checks the IP in service_handle_introduce2(), and if valid, parses the cell in hs_circ_handle_introduce2() (via hs_cell_parse_introduce2() and ultimately the intro1 parsing functions)
  • If parsing was successful, a rend circuit is launched from hs_circ_handle_introduce2()

comment:6 Changed 8 months ago by mikeperry

Notes on v3 HSDesc codepaths:

  • We probably want to add our field in the superencrypted section. The spec says this may be extended.
  • v3 descriptors are built via hs_desc_encode_descriptor(), which calls desc_encode_v3() via a table lookup
  • The superencrypted stuff is done in encode_superencrypted_data(), which builds the fields in build_secret_data()
  • Decoding on the client side happens via hs_desc_decode_descriptor(), which calls down to desc_decode_encrypted_v3(), and then hs_desc_decode_superencrypted() and hs_desc_decode_encrypted()

comment:7 Changed 8 months ago by mikeperry

The above notes are just for v3. Do we care about v2? Probably not, right?

comment:8 in reply to:  7 Changed 8 months ago by arma

Replying to mikeperry:

The above notes are just for v3. Do we care about v2? Probably not, right?

I'm thinking that focusing on v3 is the right answer.

First because we have limited time and people to spend here, so let's be sure to do v3 right.

And second because dgoulet and asn found some v2 bugs while they were doing v3 refactoring, and they opted to leave them in place rather than potentially breaking v2 and not noticing. So that v2 code is already kind of decaying.

comment:9 Changed 8 months ago by arma

Another piece of this ticket: let's discover how much extra space we actually have, in the cells, for these blobs.

In particular, if we want to include a blob for the intro point and also a blob for the service, then these two will compete with each other for the limited space.

And that realization opens up two directions for future research:

  • Space-compact tokens. For example, are there proof-of-work schemes where the proof can be communicated in very few bytes, yet it is still efficient for the checker to check it? As one example, I can imagine a scheme where you send only (x, y) where x is a short seed and y is a number of bytes you generate from that seed such that, when you hash the y bytes, you get the property you want. Thus the checker can just expand the y bytes and check the property, but the prover has to find an x and y that achieves the property. I just made that scheme up, and I bet other people have done it way better (i.e. more compact and/or more efficient for the verifier).
  • "Wide" intro cells. In the general case, we'll want to be able to send more than 100-200 bytes of tokens in these cells. So we'll want some way to send big cells, or to send multiple cells that are conceptually tied together. See proposal 249 ("Allow CREATE cells with >505 bytes of handshake data") for ideas that can hopefully be reused here.

comment:10 in reply to:  9 Changed 8 months ago by asn

Replying to arma:

Another piece of this ticket: let's discover how much extra space we actually have, in the cells, for these blobs.

I looked into this.

tl;dr INTRO1 cells have about 188 bytes of free space...

For more info, please see section [SPACE_CONSTRAINTS].

WRT my methodology, I simply went to the place where we send INTRODUCE1 cells and added a printf that printed the payload_len and compared it with RELAY_PAYLOAD_SIZE, then I connected to a v3 service and read the logfile.

Last edited 8 months ago by asn (previous) (diff)

comment:11 Changed 8 months ago by asn

Parent ID: #33703

comment:12 Changed 8 months ago by asn

OK I finished an experiment on the descriptor "hi-mom" test. I went to the encrypted section (which is confusingly the inner layer of the descriptor which only authorized clients should have access to) and added the following line:

+  smartlist_add_asprintf(lines, "hi-mom lets-defend\n");

right below the "single-onion-service" line and above the intro points.

I then started a chutney network with that patch, and ensured that clients can connect to the HS (effectively ignoring the foreign line), and that the string is present in the client-received descriptor:

create2-formats 2
intro-auth-required ed25519
hi-mom lets-defend
introduction-point AgIUQ0NDQ0NDQ0NDQ0NDQ0NDQ0NDQ0MABgECAwQjKQ==
Last edited 8 months ago by asn (previous) (diff)

comment:13 Changed 8 months ago by mikeperry

I dug pretty deep into the INTRO1/INTRO2 free bytes test. Turns out there is a layer of padding that tries to keep the pre-mac intro cell at least HS_CELL_INTRODUCE1_MIN_SIZE (246), as well as some optional things that are not always present. This tends to add about 100 bytes of semi-unused slack space on top of the 188 bytes of free space that asn reported.

In the unit tests, I was able to get 287 additional bytes into an INTRO1, and have it parse correctly, before hitting any asserts or errors.

In the live network, I had to disable sending ipv6 addresses of rend points in order to get close to this. But the good news is that hs_get_extend_info_from_lspecs() on the service side currently ignores all link specs other than ipv4, legacy rsa, and ed25519 (which are all required). So if the DoS/PoW defense is enabled, we can safely disable sending IPv6 RPs and no one will notice.

With IPv6 link spec sending disabled, I was able to get 253 bytes in either the unencrypted or encrypted sections. If I added more fields of extensions, that byte count went down. For example, if I used 10 fields in an extension, I was only able to send 217 bytes (because of the nine additional type and length specifiers). If I added a single encrypted extension field, and a single unencrypted extension field, I was able to fit 124 bytes in each extension (for 248 bytes total).

So we have more room here than 188 bytes. If we trim more unused fields, we might be able to slim it down even more?

See for working code.

No intropoints were harmed in this test. Everything seemed to work fine.

Are we satisfied with this analysis? Can we close this ticket? Should I report to tor-dev@lists?

Last edited 8 months ago by mikeperry (previous) (diff)

comment:14 Changed 8 months ago by mikeperry

Status: acceptedneeds_review

Setting to "needs review" to signal that this analysis is complete, and that folks should take a look.

This code should not be merged!

comment:15 in reply to:  13 Changed 8 months ago by asn

Replying to mikeperry:

Are we satisfied with this analysis? Can we close this ticket? Should I report to tor-dev@lists?

Great analysis.

I'm not sure how viable it is to remove IPv6 link specifiers if we want to maintain IPv6-only support in the future, but it's good to know that things work end-to-end and that we can easily calculate how many bytes we have available (with and without ipv6).

Let's close this ticket and report to tor-dev indeed.

We also need to figure out next steps here.

comment:16 Changed 8 months ago by asn

Actual Points: 2.5
Resolution: fixed
Status: needs_reviewclosed

I'm gonna close this ticket and let's continue research and epxloration in #33712. If someone disagrees, please feel free to reopen it.

comment:17 Changed 8 months ago by asn

Parent ID: #33703#33712
Note: See TracTickets for help on using tickets.