#22502 closed defect (fixed)

Investigate Zstandard decompression error

Reported by: ahf Owned by: nickm
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version: Tor: 0.3.1.2-alpha
Severity: Normal Keywords: review-group-18
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor: Sponsor4

Description

make test-network with current tor HEAD yields a number of warning related to Zstandard decompression, which indicates we have a bug somewhere. The relevant error snippet is:

Warning: Error while uncompresing data: bad input? Number: 2
Warning: Zstandard decompression didn't finish: Unknown frame descriptor. Number: 2

Both Nick and I are able to reproduce this issue on FC and Debian stretch with different versions of libzstd, which probably indicate that this issue is in our code.

Child Tickets

TicketStatusOwnerSummaryComponent
#22626closedahfMissing stream NULL check in tor_compress_implCore Tor/Tor
#22628closedahfZSTD_decompressStream doesn't ever say TOR_COMPRESS_BUFFER_FULLCore Tor/Tor
#22629closednickmtor_compress_impl() ignores trailing input garbage when decompressingCore Tor/Tor
#22669closednickmWhen sending cached_dir_t objects compressed with zlib, do not claim zstd compression.Core Tor/Tor

Change History (16)

comment:1 Changed 11 months ago by ahf

Owner: set to ahf
Status: newassigned

comment:2 Changed 11 months ago by teor

This also happens for me on macOS 10.12 (x86_64):

Tor 0.3.1.2-alpha-dev (git-5955b63a9a4182f8) running on Darwin with Libevent 2.1.8-stable, OpenSSL 1.0.2l, Zlib 1.2.11, Liblzma 5.2.3, and Libzstd 1.2.0.

Last edited 11 months ago by teor (previous) (diff)

comment:3 Changed 10 months ago by teor

This issue sometimes goes away when lzma is enabled, I can reliably reproduce some error using ./configure --enable-zstd --disable-lzma, and then I get a chutney failure with slightly different output:

$ make test-network
...
Warning: Unable to decompress HTTP body (server '127.0.0.1:5000'). Number: 15
Warning: Unable to decompress HTTP body (server '127.0.0.1:5001'). Number: 12
Warning: Unable to decompress HTTP body (server '127.0.0.1:5002'). Number: 14
Warning: Unable to decompress HTTP body (server '127.0.0.1:7000'). Number: 12
Warning: Unable to decompress HTTP body (server '127.0.0.1:7001'). Number: 14
Warning: Unable to decompress HTTP body (server '127.0.0.1:7002'). Number: 10
$ src/or/tor --version
Tor version 0.3.1.3-alpha-dev (git-80ad374b8457e4c9).

make test-network works with --disable-lzma --disable-zstd and --enable-lzma --disable-zstd.

Last edited 10 months ago by teor (previous) (diff)

comment:4 Changed 10 months ago by teor

Status: assignedneeds_revision

My branch bug22502 on https://github.com/teor2345 has fixes for this issue, and all child issues.

They don't work yet, but they are a start: they allow chutney to get at least some consensuses decompressed.

I'm going to leave any further fixes and the changes file to someone else.

$ ./configure --enable-zstd --disable-lzma && make clean test-network
...
Warning: Error while uncompresing data: bad input? Number: 21
...
Warning: Zstandard decompression didn't finish: Unknown frame descriptor. Number: 21
...

comment:5 Changed 10 months ago by nickm

Status: needs_revisionneeds_review

Putting this in needs_review to try to review and merge what there is, under the assumption it's an improvement and will help us isolate the remaining problems.

comment:6 Changed 10 months ago by nickm

Keywords: review-group-18 added

comment:7 Changed 10 months ago by nickm

Note to self; I can fix this as needed.

952c9073ad255442adcf4e85f7b2150c3bbe91de:

  • I wonder whether this is quite right; does it keep us from handling concatenated objects correctly? Let's add a unit test for that.

All other commits here look okay. I'm going to merge the branch to 0.3.1, add the unit test mentioned above, and then start trying to hunt down any remaining issues.

comment:8 Changed 10 months ago by nickm

Owner: changed from ahf to nickm
Status: needs_reviewassigned

comment:9 Changed 10 months ago by nickm

Okay. My branch for #22629 has an additional fix; at this point, I'm going to try to debug harder and see what's up.

comment:10 Changed 10 months ago by nickm

So, I see those warnings happening still, but they happen as part of this:

Jun 20 10:35:55.522 [info] HTTP body from server '127.0.0.1:7000' was labeled as Zstandard compressed, but it seems to be deflated.  Trying both.
Jun 20 10:35:55.522 [warn] Zstandard decompression didn't finish: Unknown frame descriptor.
Jun 20 10:35:55.522 [warn] Error while uncompresing data: bad input?
Jun 20 10:35:55.522 [info] handle_response_fetch_status_vote(): Got votes (body size 10366) from server 127.0.0.1:7000

To me that looks as if we tried with both zstd and deflate, and zstd failed, and deflate succeeded? I'll add more logs and try again.

comment:11 Changed 10 months ago by nickm

I can confirm that the object is a vote, and is arriving with zlib compression, but is labeled as having been compressed with zstd.

comment:12 Changed 10 months ago by nickm

The other bug here is that, although we are trying two methods, we warn when only one fails.

comment:13 Changed 10 months ago by nickm

My branch bug22502_redux_031 now tries to fix the root causes here. I'm going to clean up infelicities in the logging and warning code for handling decompression in a separate ticket (#22670)

comment:14 Changed 10 months ago by nickm

Status: assignedneeds_review

comment:15 Changed 10 months ago by ahf

Status: needs_reviewmerge_ready

LGTM. Let's get this in together with the changes for #22629.

comment:16 Changed 10 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

Merged bug22502_redux_031.

Note: See TracTickets for help on using tickets.