Verify that intro2 cell extensions actually work

Trac:
Parent Ticket: #33712 (moved)

added actualpoints::2.5 anonymous-credentials component::core tor/tor owner::mikeperry parent::33712 priority::medium research resolution::fixed severity::normal status::closed tor-dos tor-dos-2020 type::task labels

Trac:
Keywords: N/A deleted, tor-dos tor-dos-2020 anonymous-credentials research added

We should also do the "hi mom" read + write thing in the HS descriptor, so we can be sure we can signal any required additional intro extension fields.

Trac:
Owner: N/A to mikeperry
Status: new to accepted

Oh, and we should test unencrypted extensions in the intro1, so we can know if the intropoint can easily locate and read them, if we want to do intro-side processing. And also test how ugly it gets to have more than one of these extensions in the same cell (like one encrypted, one unencrypted).

Notes for the INTRO1->INTRO2 codepaths:

Intro1 is sent from hs_circ_send_introduce1(), and built in hs_cell_build_introduce1()
There are actually two extension areas: one is in the unencrypted section, set in hs_cell_build_introduce1(), and one is in the encrypted section, set in introduce1_set_encrypted()
The Intropoint processes the INTRO1 cell in hs_intro_received_introduce1()
v3 INTRO1 cells go into handle_introduce1() and the cell's unencrypted fields are parsed and validated
Requested rate limits are checked in hs_dos_can_send_intro2()
The cell is sent as an INTRO2 exactly as is towards the service
The service receives it in hs_service_receive_introduce2()
The service checks the IP in service_handle_introduce2(), and if valid, parses the cell in hs_circ_handle_introduce2() (via hs_cell_parse_introduce2() and ultimately the intro1 parsing functions)
If parsing was successful, a rend circuit is launched from hs_circ_handle_introduce2()

Notes on v3 HSDesc codepaths:

We probably want to add our field in the superencrypted section. The spec says this may be extended.
v3 descriptors are built via hs_desc_encode_descriptor(), which calls desc_encode_v3() via a table lookup
The superencrypted stuff is done in encode_superencrypted_data(), which builds the fields in build_secret_data()
Decoding on the client side happens via hs_desc_decode_descriptor(), which calls down to desc_decode_encrypted_v3(), and then hs_desc_decode_superencrypted() and hs_desc_decode_encrypted()

The above notes are just for v3. Do we care about v2? Probably not, right?

Replying to mikeperry:

The above notes are just for v3. Do we care about v2? Probably not, right?

I'm thinking that focusing on v3 is the right answer.

First because we have limited time and people to spend here, so let's be sure to do v3 right.

And second because dgoulet and asn found some v2 bugs while they were doing v3 refactoring, and they opted to leave them in place rather than potentially breaking v2 and not noticing. So that v2 code is already kind of decaying.

Another piece of this ticket: let's discover how much extra space we actually have, in the cells, for these blobs.

In particular, if we want to include a blob for the intro point and also a blob for the service, then these two will compete with each other for the limited space.

And that realization opens up two directions for future research:

Space-compact tokens. For example, are there proof-of-work schemes where the proof can be communicated in very few bytes, yet it is still efficient for the checker to check it? As one example, I can imagine a scheme where you send only (x, y) where x is a short seed and y is a number of bytes you generate from that seed such that, when you hash the y bytes, you get the property you want. Thus the checker can just expand the y bytes and check the property, but the prover has to find an x and y that achieves the property. I just made that scheme up, and I bet other people have done it way better (i.e. more compact and/or more efficient for the verifier).
"Wide" intro cells. In the general case, we'll want to be able to send more than 100-200 bytes of tokens in these cells. So we'll want some way to send big cells, or to send multiple cells that are conceptually tied together. See proposal 249 ("Allow CREATE cells with >505 bytes of handshake data") for ideas that can hopefully be reused here.

Replying to arma:

Another piece of this ticket: let's discover how much extra space we actually have, in the cells, for these blobs.

I looked into this.

tl;dr INTRO1 cells have about 188 bytes of free space...

For more info, please see section [https://lists.torproject.org/pipermail/tor-dev/2020-March/014198.html [SPACE_CONSTRAINTS]].

WRT my methodology, I simply went to the place where we send INTRODUCE1 cells and added a printf that printed the payload_len and compared it with RELAY_PAYLOAD_SIZE, then I connected to a v3 service and read the logfile.

Trac:
Parent: N/A to #33703 (moved)

OK I finished an experiment on the descriptor "hi-mom" test. I went to the encrypted section (which is confusingly the inner layer of the descriptor which only authorized clients should have access to) and added the following line:

+  smartlist_add_asprintf(lines, "hi-mom lets-defend\n");

right below the "single-onion-service" line and above the intro points.

I then started a chutney network with that patch, and ensured that clients can connect to the HS (effectively ignoring the foreign line), and that the string is present in the client-received descriptor:

create2-formats 2
intro-auth-required ed25519
single-onion-service
hi-mom lets-defend
introduction-point AgIUQ0NDQ0NDQ0NDQ0NDQ0NDQ0NDQ0MABgECAwQjKQ==
onion-key ntor AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
...

I dug pretty deep into the INTRO1/INTRO2 free bytes test. Turns out there is a layer of padding that tries to keep the pre-mac intro cell at least HS_CELL_INTRODUCE1_MIN_SIZE (246), as well as some optional things that are not always present. This tends to add about 100 bytes of semi-unused slack space on top of the 188 bytes of free space that asn reported.

In the unit tests, I was able to get 287 additional bytes into an INTRO1, and have it parse correctly, before hitting any asserts or errors.

In the live network, I had to disable sending ipv6 addresses of rend points in order to get close to this. But the good news is that hs_get_extend_info_from_lspecs() on the service side currently ignores all link specs other than ipv4, legacy rsa, and ed25519 (which are all required). So if the DoS/PoW defense is enabled, we can safely disable sending IPv6 RPs and no one will notice.

With IPv6 link spec sending disabled, I was able to get 253 bytes in either the unencrypted or encrypted sections. If I added more fields of extensions, that byte count went down. For example, if I used 10 fields in an extension, I was only able to send 217 bytes (because of the nine additional type and length specifiers). If I added a single encrypted extension field, and a single unencrypted extension field, I was able to fit 124 bytes in each extension (for 248 bytes total).

So we have more room here than 188 bytes. If we trim more unused fields, we might be able to slim it down even more?

See https://github.com/mikeperry-tor/tor/commits/ticket33650-intro-smuggle-test for working code.

No intropoints were harmed in this test. Everything seemed to work fine.

Are we satisfied with this analysis? Can we close this ticket? Should I report to tor-dev@lists?

Setting to "needs review" to signal that this analysis is complete, and that folks should take a look.

This code should not be merged!

Trac:
Status: accepted to needs_review

Replying to mikeperry:

Are we satisfied with this analysis? Can we close this ticket? Should I report to tor-dev@lists?

Great analysis.

I'm not sure how viable it is to remove IPv6 link specifiers if we want to maintain IPv6-only support in the future, but it's good to know that things work end-to-end and that we can easily calculate how many bytes we have available (with and without ipv6).

Let's close this ticket and report to tor-dev indeed.

We also need to figure out next steps here.

I'm gonna close this ticket and let's continue research and epxloration in #33712 (moved). If someone disagrees, please feel free to reopen it.

Trac:
Actualpoints: N/A to 2.5
Status: needs_review to closed
Resolution: N/A to fixed

Trac:
Parent: #33703 (moved) to #33712 (moved)

closed

added 20h of time spent

mentioned in issue #33712 (moved)

moved to tpo/core/tor#33650 (closed)

Verify that intro2 cell extensions actually work

Child items ...

Activity