Opened 5 years ago

Closed 3 years ago

#16914 closed enhancement (wontfix)

improve consensus download decompress performance

Reported by: teor Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.2.6.10
Severity: Normal Keywords: tor-03-unspecified-201612
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Reported by starlight.2015q3@… to tor-dev@

"Currently tor_gzip_uncompress() starts with
an assumed 50% compression guess. Typical
consensus document compression is slightly
better than 65% and this assumption results
in at least one realloc() and sometimes
two realloc() calls during decompression.

Adjust the starting assumption to 75%
compression to eliminate the cost
of the realloc()."

Child Tickets

Attachments (1)

tor-0.2.6.10-gz4x_guess.patch (671 bytes) - added by teor 5 years ago.
Patch attachment from mailing list

Download all attachments as: .zip

Change History (13)

Changed 5 years ago by teor

Patch attachment from mailing list

comment:2 Changed 5 years ago by teor

This modifies the estimate for every gzip decompression performed by tor.
We could go two ways on this:

  • allow the caller to supply an estimate in a new tor_gzip_uncompress_estimate() (and leave existing callers as-is)
  • modify the estimate globally

I will now check the range of compression ratios on tor directory files.

I plan to use gzip -9 as a proxy for tor_gzip_compress. Note that the gzip command generates file headers, but tor generates HTTP/[Tor/]TCP/IP headers. The difference is likely to be negligible for large files.

And it's the largest files that matter for decompression speed and memory usage.

comment:3 Changed 5 years ago by teor

Status: newneeds_information

Using gzip -9 -v * 2>&1 | cut -f1 -d% on some recent tor directory documents:

cached-consensus-20150827-1400-UTC:	   66.0
cached-descriptors-20150827-1400-UTC:	   55.7
cached-microdesc-consensus-20150826-1400-UTC:	   56.3
cached-microdesc-consensus-20150827-2300-UTC:	   56.1
cached-microdescs-20150826-1100-UTC:	   51.9
cached-microdescs-20150827-2300-UTC:	   52.1

The microdescriptor consensus and descriptors are downloaded by most Tor clients (and, therefore, by most Tor instances). The descriptors are downloaded individually, so their ratios may be slightly lower. (The "full" consensus and descriptors aren't used by most clients, so they can be ignored for the purposes of this analysis.)

I suggest we increase the expected ratio to 75%. We currently double the size of the buffer when we need to reallocate anyway, so we are already using that much RAM for every decompression, we're just allocating 50%, then reallocating 75%. (Except if the individual microdescriptor ratios fall under 50%, then we're only using 50%.)

Alternately, we could increase the expected ratio to 70%, saving 1 reallocation and 5% RAM in typical cases. This would be a win for both performance and RAM usage.

Open questions:

  • What is the range of compression ratios on recent microdescriptor consensuses? Do they vary much?

comment:4 Changed 5 years ago by teor

Further notes from my tor-dev email:

The micro descriptor consensuses are 1.2 MBytes, and the combined size of all cached microdescriptors is 2.9MBytes.

The 5% difference between 70% and 75% is 61 KByte for the 1.2 MByte micro descriptor consensus, which is the largest individual file. This optimisation costs us nothing, except that if we cut it too close, and the compression ratio improves, we get a memory doubling and reallocation during decompression, which we’re trying to avoid. This would be most likely to happen in the full consensus decompression (currently 66% compression ratio on 1.4 MBytes), which is the one we care least about, because it's on the most powerful servers [citation needed].

Do we want to go with the 70% option to save both RAM and reallocation performance?
Or do we want to go with the 75% option to avoid reallocation, even if the ratio improves?

comment:5 Changed 5 years ago by someone_else

For the last couple of months, the compression ratio for the full consensus is 66-68%. The micro desc consensus should also be quite stable.

Last edited 5 years ago by someone_else (previous) (diff)

comment:6 Changed 5 years ago by nickm

Keywords: TorCoreTeam201509 Post027Freeze removed
Milestone: Tor: 0.2.8.x-final

Seems like a decent idea; let's consider it at the start of 0.2.8. There's no reason to cram it into 0.2.7 at this point.

(Also, do we have any evidence that the realloc calls here actually matter?)

comment:7 in reply to:  6 Changed 5 years ago by someone_else

Replying to nickm:

(Also, do we have any evidence that the realloc calls here actually matter?)

It won't make a performance difference. memcpy is something like 50x faster than zlib inflate on current Intel CPUs.

comment:8 in reply to:  6 Changed 5 years ago by teor

Replying to nickm:

(Also, do we have any evidence that the realloc calls here actually matter?)

Only when they're buggy, on old versions of Fedora.
https://trac.torproject.org/projects/tor/ticket/15901#comment:40

comment:9 Changed 5 years ago by nickm

Milestone: Tor: 0.2.8.x-finalTor: 0.2.???

Move a few tickets out of 0.2.8. I would take a good patch for most of these if somebody writes one. (If you do, please make the ticket needs_review and move it back into maint-0.2.8 milestone. :) )

comment:10 Changed 4 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:11 Changed 4 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:12 Changed 3 years ago by nickm

Resolution: wontfix
Severity: Normal
Status: needs_informationclosed

ok; closing as wontfix, since the current guesses seem to be working ok

Note: See TracTickets for help on using tickets.