Opened 9 months ago

Last modified 3 months ago

#21693 assigned task

prop224: HS descriptors do wasteful double-base64 encoding

Reported by: asn Owned by: asn
Priority: Medium Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, prop224, 031-stretch
Cc: Actual Points:
Parent ID: Points: 4
Reviewer: Sponsor: SponsorR-can

Description

In #21334 we implement the new prop224 HS descriptor format that does the double-layer encryption to implement the client auth functionality.

As part of that design, we first base64 the ciphertext of the inner layer of the descriptor. Then when we create the outer descriptor layer, we base64 the ciphertext of the middle layer which also includes the inner layer.

This results in a construction as follows:

middle_layer = base64(encrypt(client_auth_data + base64(encrypt(inner_layer))))
outer_layer = header + middle_layer.

Notice that in the above construction we actually base64 the inner layer twice which is wasteful. During design and development we glossed over this fact thinking that it's not that big of a waste, and since we are already padding the whole middle_layer to 10k bytes it's fine (a typical default size for middle_layer is about 4k bytes).

However, Nick brought this topic again during review and we decided to open a ticket to discuss this, since in theory we could define some sort of binary format for the middle layer and avoid the wasteful double base64.

The pros of this would be that we could fit more data in the middle layer. I estimated that we could fit an extra 1k bytes of data by addressing this. Based on some initial calculations this means that we could fit an extra intro point on the default 10k bytes descriptor, or maybe another block of 16 authed clients (if we are lucky since that's about 1.2k bytes).

The negative of this would be that we would have to go back into design stage to spec the binary format, and then we would need to write the code to implement that; whereas now we are using the same decoding function for both layers. That's basically a simple matter of programming and time and I'm definitely willing to do it if we decide it's the right thing to do.

We discussed this with David in IRC and decided that given the current state of development we should perhaps roll with the current design. That's because only a small number of hidden services would benefit from this change since default descriptors (of 3 intro points and no client auth) are just 4k bytes in size and they get padded to 10k bytes anyway. Only descriptors with many intro points and client auth data that are about 11k bytes would benefit from this change since they wouldn't need to get padded to 20k bytes, and they could actually fit in 10k.

Opening this ticket seems like a good idea since 0.3.1 is the time to do this change if we ever want to; so that we don't feel silly in the future.

Child Tickets

Change History (13)

comment:1 Changed 9 months ago by teor

I'm not sure I understand the issue here: what stops you replacing base64(encrypt(inner_layer)) with encrypt(inner_layer)?
Is there some requirement that it be plain-text?

comment:2 in reply to:  1 Changed 9 months ago by asn

Replying to teor:

I'm not sure I understand the issue here: what stops you replacing base64(encrypt(inner_layer)) with encrypt(inner_layer)?
Is there some requirement that it be plain-text?

Hmm, I think it's because we use the routerparse API (tokenize_string() etc.) to parse each layer, and the routerparse API mainly works on NUL-terminated strings.

Maybe we could play with the start and end arguments of tokenize_string() to tell it where it should stop parsing due to binary data. On this end, we could use the encrypted field of the descriptor as a separator between ASCII and binary data.

Of course, this way we lose the ASCIIness of the super-encrypted part of the descriptor; i.e. you can't nicely dump it on your terminal.

Last edited 9 months ago by asn (previous) (diff)

comment:3 Changed 9 months ago by nickm

Parent ID: #21334

comment:4 Changed 9 months ago by asn

A further point of complication here is that we apply NUL padding (up to nearest multiple of 10k bytes) on the superencrypted section to hide metadata about client auth details and intro points. So it's more like:

middle_layer = b64(encrypt(client_auth_data + b64(encrypt(inner_layer)) + nul_padding))
outer_layer = header + middle_layer.

So unfortunately it's not as simple as replacing b64(encrypt(inner_layer)) with encrypt(inner_layer) since then the binary ciphertext gets mangled with the NUL padding... :(

I guess this means we need some sort of frame on the binary data that specifies the length of encrypt(inner_layer), so that the decoding side can separate the ciphertext from the padding.

In my experience, these sort of frames need careful consideration due to all sorts of weird padding-oracle type of stuff (which probably dont apply in this scenario)... Will think some more, but this might be a reasonable topic for amsterdam as well...

Last edited 9 months ago by asn (previous) (diff)

comment:5 Changed 9 months ago by dgoulet

Keywords: tor-hs prop224tor-hs, prop224
Owner: set to asn
Status: newassigned
Summary: prop224 HS descriptors do wasteful double-base64 encodingprop224: HS descriptors do wasteful double-base64 encoding

comment:6 Changed 8 months ago by nickm

Something that would be almost as good: whenever we are going to encrypt, first compress, and then encrypt. That would mitigate most of the trouble.

comment:7 in reply to:  6 Changed 8 months ago by asn

Replying to nickm:

Something that would be almost as good: whenever we are going to encrypt, first compress, and then encrypt. That would mitigate most of the trouble.

Nice suggestion. Let's see how that could work out.

As discussed before here is how the descriptor layers work:

middle_layer = b64(encrypt(client_auth_data + b64(encrypt(inner_layer)) + nul_padding))
outer_layer = header + middle_layer

There are two encrypts here. We should probably not compress the plaintext of the middle layer, as that includes the NUL padding, which would basically get annihilated by the compression and lose all of its metadata hiding properties (we are using the NUL padding to hide the number of intro points and client auth details).

However, compressing the inner_layer before b64(encrypt(inner_layer)) could make sense. The inner layer is basically a small header with a bunch of intro points. It's usually about 3k bytes (for 3-4 intro points), and it compresses down to about 2k bytes using zlib (based on some testing I just did).

So by compressing the inner layer before encrypting we can save about 1k bytes (which is also about the cost of double-base64 encoding).

Do you think we should do this?
Is the zlib API a PITA to use, or do you think it's gonna be a straightforward patch?


That said, the above change will not change the final size of the HS descriptor because of the padding that gets applied on the layer above.

If we wanted to also reduce the size of the final descriptor we could relax the padding logic from padding to 10k bytes (total desc size: 14k bytes), to padding to 8k bytes or so (total desc size: 11.5k bytes). I think that might be OK to do, if we feel that having 14k bytes descriptors is a stupid thing.

comment:8 Changed 8 months ago by asn

WRT the zlib API, I looked at the test test_buffers_zlib_impl() and it does not seem super hard. It seems to require to require a buf_t, the use of write_to_buf_zlib(), and then extracting the compressed data from the buf_t into a uint8_t. Uncompressing seems even easier, just use tor_gzip_uncompress().

Maybe I should give this a try for the inner layer.

comment:9 Changed 8 months ago by nickm

It's even easier than that -- you can use tor_gzip_compress and tor_gzip_uncompress. No need to involve bufs.

And what about doing it like this --

  outer-layer = header + middle_layer
  middle-layer = b64(encrypt(compressed-part + nul-padding))
  compressed-part = gzip(client_auth_data + b64(encrypt(gzip(inner_layer))))

This keeps the length-hiding property of the nul padding, but allows more client_auth_data.

Thoughts?

comment:10 Changed 8 months ago by dgoulet

That scheme looks sane.

I'm curious on how much we'll win here to be honest. inner_layer is a part we can save a bit of space but it will be from ~3k to ~2k... (most common case). And then client_auth_data is pretty small (16 lines of small amount of text) so maybe 1k to 800B?

If HSDir were serving descriptor compressed that is gzip(outer-layer), we would go from 14k to ~11k for a client to download. (Same for uploading a desc for a service). So I'm guessing we can bring that 11k to maybe ~10k with compressing more inside the descriptor (inner-layer)?

The above implies two things. First, we need to change how HSDir are serving descriptors which is fine because we could just make that "if request comes in with URL.z, send it compressed". And we make client fetch it that way by default. Second, we need to implement the compression/encryption changes for the descriptor encoding/decoding in hs_descriptor.c which is a bit involving considering testing (maybe a day of work).

That being said and to be honest, I'm slightly unconvinced that all this is needed. Going between ~10k to ~14k for the common case, it is very little for something you "should" download every ~18h-24h. Nathan in AMS was telling me that the concern with mobile for instance is not much the bandwidth nowadays but rather the battery life which we are going to kill more with multiple gzip instead of download extra 3k *I think*. (speculation)

I'm missing some data here on how much going from ~3k (legacy code) to ~14k will affect people around the world... :S

comment:11 Changed 8 months ago by asn

Keywords: 031-stretch added

comment:12 Changed 7 months ago by asn

Milestone: Tor: 0.3.1.x-finalTor: 0.3.2.x-final

Deferring to 0.3.2 for now.

comment:13 Changed 3 months ago by dgoulet

Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final
Note: See TracTickets for help on using tickets.