Opened 21 months ago

Last modified 20 months ago

#24806 new defect

LTS branch leaks memory continuously under stress/attack, requires back-port of 0.3.2.8-rc fixes to remain viable

Reported by: starlight Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

I am a fan of running LTS releases, however 0.2.9.14 leaks memory continuously on relays serving as HSDIR for an apparently popular onion service and while also under attack.

Daemon leaked 1GB in eight hours growing from 1.5GB VMRSS to 2.4GB VMRSS and was killed due to limited RAM.

To remain viable 0.2.9 branch requires backports of 0.3.2.8-rc memory fixes, otherwise must be deprecated.

Presently I am switching to 0.3.2.8-rc to see how the relay behaves the next time this happens. Lately high-volume onion services come its way shortly after the HSDIR flag is assigned.

see #24737 for some details

Child Tickets

TicketStatusOwnerSummaryComponent
#25141newenabling CellStatistics results in gigabytes of incremental memory consumptionCore Tor/Tor

Attachments (3)

tor-0.3.2.9-mtrace.patch (5.3 KB) - added by starlight 20 months ago.
mtrace patch
mtrace_1.out-clean.txt.xz (33.1 KB) - added by starlight 20 months ago.
full trace (sanitized)
mtrace_1.rpt-sym-clean.txt.gz (20.3 KB) - added by starlight 20 months ago.
lingering allocations (sanitized)

Download all attachments as: .zip

Change History (23)

comment:1 Changed 21 months ago by nickm

Do you have reason to think this is a memory leak, and not a bunch of allocations?

If it's the latter case, this could be a case where backporting #24666 specifically would help.

comment:2 Changed 21 months ago by nickm

(But if it's the former case, then #24666 won't help: that's not a memory leak.)

comment:3 Changed 21 months ago by starlight

The event had characteristics of a leak. I was watching on-and-off due to cycles of earlier memory exhaustion events followed by corrective tuning, along with current huge volumes of excess egress flow that matches the description of HSDIR query traffic. Tor daemon total memory gradually climbed for hours from 1.5GB. Saw it at 2.2GB; two hours later came back and observed the final footprint hit 2.4GB.

Egress traffic was continuously near saturation, limited by CPU rather than bandwidth and it's conceivable a pile-up of allocations consumed the memory. However CPU backed substantially off of 100% at times.

Earlier while working on attack mitigation, I observed in debug logging the attacker creates large numbers of circuits, destroys them all-at-once; repeats the cycle endlessly.

Hard to say for certain.

comment:4 Changed 21 months ago by starlight

I generally think nothing of it, but a possible unique element here is I monitor the daemon from another system using Vidalia. Could be that when Vidalia performs its periodic, inefficient view refresh and the daemon is under high load it leaks memory. I see 0.3.2.8-rc went from 800MB to 1.3GB today (queue-max 1GB). If the image continues to grow I'll look for bump-ups during refresh events. Determine if turning off Vidalia halts growth. Till recently would log up-times on the order of six-to-eight months, no problems at all.

comment:5 Changed 21 months ago by arma

To be clear, there's a good chance that this is not a memory leak.

Instead, it is more likely that it is memory fragmentation.

For context, when a program mallocs some memory, it gets a page of memory to itself, and further mallocs come out of that page, or maybe a new page, depending on complex algorithms inside the kernel. When the program frees chunks of memory, the kernel can't reclaim the page and give it to a different program until all of the chunks of memory on that page have been freed. So if you have an unfortunate pattern of allocations and frees, it can result in a larger and larger set of pages that are reserved for your program, even though you freed a bunch of the stuff that you allocated.

That pattern is why you see the Tor process growing over time and rarely shrinking.

The fix is generally hard. It involves identifying what patterns are causing inefficient usage of pages, and changing how Tor behaves to avoid those patterns.

comment:6 Changed 21 months ago by starlight

In general agree, though seeing the daemon click up 100MB every hour or two while under stress/attack causes me to keep an open mind regarding the cause.

After starting 0.3.2.8-rc the relay was still high rated and initially I selected the vanilla scheduler. Memory consumption climbed steadily until the consensus rating dropped, holding since at 1.5GB. Deactivated all local activities that could be an influence and memory increased overnight from 1.3GB to 1.5GB regardless. Since then decided to activate KIST scheduler after skimming the paper and a glance at the code.

Will be interesting to see how it looks the next time rating climbs and the HSDIR flag is assigned. Prepared a LSAN instrumented image to run if memory consumption takes another unbounded wander.

Have observed a couple of apparent circuit-extend attacks and it's performing better. Not seeing crypto thread-pair saturate; seeing no 'assign_to_cpuworker failed" errors and just one "computer too slow" per day with a much lower repeat count.

comment:7 Changed 21 months ago by starlight

migrating to SLAB memory allocation is worth considering

comment:8 Changed 21 months ago by dgoulet

Milestone: Tor: unspecified

Datapoint: I'm running a relay right now that keeps track of most the big memory chunk Tor uses including descriptors. I hope to learn something but so far I'm also on the side of memory frag :S.

comment:9 Changed 21 months ago by yawning

Could link against jemalloc or something and use it's introspection capabilities.

http://jemalloc.net/jemalloc.3.html

comment:10 Changed 21 months ago by starlight

quiet so far, have HSDIR again but no evidence of a popular onion service, rating backup up though not to previous max; no severe attacks

if this ticket has sent them off I could, on the next shutdown, gdb-attach the daemon with a breakpoint just before exit() and call malloc_stats(); will yield an indication on leak vs fragmentation; daemon was definitely exhibiting errant memory behavior when first started per comment:6 (holding steady at present); would take cores for further analysis

comment:11 Changed 21 months ago by teor

My relays are running 0.3.0 with a backport for #24666, they topped out at 7 GB before I started restricting incoming connections. So there may not be any memory leak here.

comment:12 Changed 20 months ago by starlight

Relay came under attack again and total memory utilization went from 1.6GB (drifted up from 1.5GB) to 2GB in matter of hours. This is 0.3.2.8-rc I'm talking about with MaxMemInQueues=1024MB.

Fiddling with it, suspended the daemon for a few minutes to see if that would shake off attack circuits and unfortunately this resulted in an immediate socket buffer memory consumption surge and kernel OOPS. So was unable to try a gdb-assisted shutdown and obtain pre-exit() statistics. The LSAN build has some problem causing it to trap so no progress on identifying the exact leak(s). I am now nearly convinced this a memory leak. Have iptables blocking direct connections from abusive clients, so it's clearly distributed circuit-extend attack. Two crypto threads were at 35% CPU each while the main event thread was pegged at 100%.

Also observed an attack on my exit in recent days and it survived, having 16GB of RAM--throwing hardware at it is one mitigation. Don't have time right now to deploy the replacement for the "temporary," ancient box swapped in due to a hardware failure last summer, so setting MaxMemInQueues=512MB and will try again with existing box. Will try to prepare a functional LSAN image.

comment:13 Changed 20 months ago by starlight

Finally identified the cause of excessive memory consumption: Running with CellStatistics enabled. Opened #25141.

Figuring it out, I ended up tracing malloc once a stable operating state is reached (#24914 patch applied) and netting-out frees such that only lingering allocations were visible. Tor's malloc/free behavior at steady-state is a terrible thing to behold.

Last edited 20 months ago by starlight (previous) (diff)

comment:14 Changed 20 months ago by arma

So, is this ticket now resolved, now that #25141 exists?

Can you publish the results of your "tracing malloc" work?

Thanks!

comment:15 in reply to:  14 Changed 20 months ago by starlight

Replying to arma:

So, is this ticket now resolved, now that #25141 exists?

I suppose, though when I opened it I was thinking about
the DESTROY cell fix in #24666. Perhaps that memory
burner is small beer in comparison to a persistent
struct-per-Circuit?

Can you publish the results of your "tracing malloc" work?

Sure. . .later this week.

Thanks!

My pleasure.

comment:16 Changed 20 months ago by nickm

CellStatistics is definitely a special-purpose option; anything that we do to make it use less memory in 0.3.4 or beyond via #25141 is probably not going to be backported.

The #24666 stuff, on the other hand, will get backported along with #24902. We're going to need another alpha to shake all the issues out of #24902, however.

Changed 20 months ago by starlight

Attachment: tor-0.3.2.9-mtrace.patch added

mtrace patch

Changed 20 months ago by starlight

Attachment: mtrace_1.out-clean.txt.xz added

full trace (sanitized)

Changed 20 months ago by starlight

lingering allocations (sanitized)

comment:17 in reply to:  14 Changed 20 months ago by starlight

Replying to arma:

Can you publish the results of your "tracing malloc" work?

I wrote a quick but reasonable patch that activates and deactivates the standard mtrace() facility in response to a pair of RT signals (40-on, 41-off) or control-channel commands. Then traced something like 120 seconds after the relay had warmed up. mtrace.pl provided alloc/free subtraction. Standard glibc hook functions are terribly inefficient and nearly kill it, but strong hints regarding the cause of the problem jumped off the page nonetheless and I set aside further refinement. Was also trying to coax LSAN into producing a pretty report, but LSAN is "too smart," and builds a graph of references by searching all allocations for pointers to other allocations--reliably produced zilch. No option to tune LSAN for stupidity and that exercise was a frustrating waste of time.

The mtrace() patch can be enhanced with fast hook callbacks that write binary structs including full call-arcs, and I was prepared to do that (notes in the patch). A small C/C++ utility to set-subtract frees from mallocs and pretty-print the results is required; should summarize by call-arc as Valgrind, ASAN, LSAN etc. do (anyone remember Purify?).

I'm willing to work this into a proper enhancement if of interest. My idea is that it can be an always-available on-demand memory trace one could activate and deactivate on a long-running instance, even one with months of uptime.

Am a bit burned-out on hacking Tor at the moment and would prefer to take a month or two to finish (more like two). If this is viewed as near-term desirable one of the core devs could certainly complete it and I won't feel upstaged (as long I get a little credit).

comment:18 Changed 20 months ago by dgoulet

Very useful stuff!!! Thanks for this.

Quick question, can you tell us which torrc options are turned on? Especially in terms of Statistics. Thanks!

comment:19 Changed 20 months ago by starlight

good to learn the trace is helpful

relevant configurations in effect during the trace run were

AvoidDiskWrites 1
RunAsDaemon 1
NumCPUs 1
DownloadExtraInfo 1
DisableAllSwap 1
MaxMemInQueues 512MB
CellStatistics 1
EntryStatistics 1
ConnDirectionStatistics 1
HiddenServiceStatistics 1
ExitPolicy reject *:*

comment:20 Changed 20 months ago by starlight

Seems to me ConnDirectionStatistics is a problem as well. Disable it on the relay yesterday.

Version 0, edited 20 months ago by starlight (next)
Note: See TracTickets for help on using tickets.