I am a fan of running LTS releases; however, 0.2.9.14 leaks memory continuously on relays serving as HSDIR for an apparently popular onion service while also under attack.
The daemon leaked nearly 1GB in eight hours, growing from 1.5GB VmRSS to 2.4GB VmRSS, and was killed due to limited RAM.
To remain viable, the 0.2.9 branch requires backports of the 0.3.2.8-rc memory fixes; otherwise it must be deprecated.
Presently I am switching to 0.3.2.8-rc to see how the relay behaves the next time this happens. Lately high-volume onion services come its way shortly after the HSDIR flag is assigned.
The event had the characteristics of a leak. I was watching on and off because of earlier cycles of memory exhaustion followed by corrective tuning, along with currently huge volumes of excess egress traffic matching the description of HSDIR query traffic. The Tor daemon's total memory climbed gradually for hours from 1.5GB. I saw it at 2.2GB; two hours later I came back and observed the final footprint hit 2.4GB.
Egress traffic was continuously near saturation, limited by CPU rather than bandwidth, so it's conceivable a pile-up of allocations consumed the memory. However, CPU backed off substantially from 100% at times.
Earlier, while working on attack mitigation, I observed in debug logging that the attacker creates large numbers of circuits, destroys them all at once, and repeats the cycle endlessly.
I generally think nothing of it, but one possibly unique element here is that I monitor the daemon from another system using Vidalia. It could be that when Vidalia performs its periodic, inefficient view refresh while the daemon is under high load, the daemon leaks memory. I see 0.3.2.8-rc went from 800MB to 1.3GB today (MaxMemInQueues=1GB). If the image continues to grow I'll look for bump-ups during refresh events and determine whether turning off Vidalia halts the growth. Until recently I would log uptimes on the order of six to eight months with no problems at all.
To be clear, there's a good chance that this is not a memory leak.
Instead, it is more likely memory fragmentation.
For context, when a program mallocs some memory, it gets a page of memory to itself, and further mallocs come out of that page, or maybe a new page, depending on complex algorithms inside the allocator. When the program frees chunks of memory, the allocator can't return the page to the kernel for use by a different program until all of the chunks of memory on that page have been freed. So if you have an unfortunate pattern of allocations and frees, it can result in a larger and larger set of pages being reserved for your program, even though you freed a bunch of the stuff that you allocated.
That pattern is why you see the Tor process growing over time and rarely shrinking.
The fix is generally hard. It involves identifying what patterns are causing inefficient usage of pages, and changing how Tor behaves to avoid those patterns.
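As a minimal illustration of that pattern (a toy program, not Tor code): allocate a large number of small blocks, free almost all of them, and glibc's malloc_stats() shows the "system bytes" footprint staying roughly put while "in use bytes" collapses, because each surviving block keeps its page pinned.

    /* frag_demo.c: toy heap-fragmentation demo (assumes glibc for malloc_stats()). */
    #include <malloc.h>     /* malloc_stats() */
    #include <stdlib.h>

    int main(void)
    {
        enum { N = 1 << 18, KEEP_EVERY = 512 };   /* ~256k allocations of 512 bytes */
        static char *chunks[N];

        for (int i = 0; i < N; i++)
            chunks[i] = malloc(512);
        malloc_stats();                           /* footprint with everything live */

        /* Free all but one block in every run of KEEP_EVERY; the survivors are
         * spread across the heap, so the allocator can return very little of it
         * to the kernel even though ~99.8% of the bytes are now free. */
        for (int i = 0; i < N; i++) {
            if (i % KEEP_EVERY != 0) {
                free(chunks[i]);
                chunks[i] = NULL;
            }
        }
        malloc_stats();                           /* "in use" collapses, "system" barely moves */
        return 0;
    }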
In general I agree, though seeing the daemon tick up 100MB every hour or two while under stress/attack causes me to keep an open mind regarding the cause.
After starting 0.3.2.8-rc the relay was still highly rated, and initially I selected the vanilla scheduler. Memory consumption climbed steadily until the consensus rating dropped, and it has held at 1.5GB since. I deactivated all local activities that could be an influence, and memory still increased overnight from 1.3GB to 1.5GB. Since then I have decided to activate the KIST scheduler, after skimming the paper and taking a glance at the code.
Will be interesting to see how it looks the next time the rating climbs and the HSDIR flag is assigned. Prepared an LSAN-instrumented image to run if memory consumption takes another unbounded wander.
Have observed a couple of apparent circuit-extend attacks and it's performing better. Not seeing the crypto thread pair saturate; seeing no "assign_to_cpuworker failed" errors and just one "computer too slow" per day, with a much lower repeat count.
Datapoint: I'm running a relay right now that keeps track of most of the big memory chunks Tor uses, including descriptors. I hope to learn something, but so far I'm also on the side of memory frag :S.
quiet so far; have HSDIR again but no evidence of a popular onion service; rating back up though not to previous max; no severe attacks
if this ticket has sent them off I could, on the next shutdown, gdb-attach the daemon with a breakpoint just before exit() and call malloc_stats(); that will yield an indication of leak vs. fragmentation; the daemon was definitely exhibiting errant memory behavior when first started, per comment:6 (holding steady at present); would take core dumps for further analysis
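If gdb at shutdown turns out to be awkward, a hedged alternative (the helper name below is made up) would be compiling in an atexit() hook so the same malloc_stats() snapshot is emitted on every clean exit; comparing the "in use bytes" line against the "system bytes" line is what gives the leak-vs-fragmentation hint.

    /* Sketch only: print glibc heap statistics on clean shutdown. */
    #include <malloc.h>     /* malloc_stats(), glibc-specific */
    #include <stdlib.h>     /* atexit() */

    static void dump_heap_stats(void)
    {
        malloc_stats();     /* writes "system bytes" / "in use bytes" per arena to stderr */
    }

    /* Hypothetical helper; would be called once early in startup. */
    void install_heap_stats_at_exit(void)
    {
        (void)atexit(dump_heap_stats);
    }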
My relays are running 0.3.0 with a backport for #24666 (moved); they topped out at 7 GB before I started restricting incoming connections. So there may not be any memory leak here.
The relay came under attack again and total memory utilization went from 1.6GB (having drifted up from 1.5GB) to 2GB in a matter of hours. This is 0.3.2.8-rc I'm talking about, with MaxMemInQueues=1024MB.
While fiddling with it, I suspended the daemon for a few minutes to see if that would shake off attack circuits; unfortunately this resulted in an immediate surge in socket buffer memory consumption and a kernel OOPS. So I was unable to try a gdb-assisted shutdown and obtain pre-exit() statistics. The LSAN build has some problem causing it to trap, so no progress on identifying the exact leak(s). I am now nearly convinced this is a memory leak. Have iptables blocking direct connections from abusive clients, so it's clearly a distributed circuit-extend attack. The two crypto threads were at 35% CPU each while the main event thread was pegged at 100%.
Also observed an attack on my exit in recent days, and it survived, having 16GB of RAM; throwing hardware at it is one mitigation. Don't have time right now to deploy the replacement for the "temporary," ancient box swapped in due to a hardware failure last summer, so I am setting MaxMemInQueues=512MB and will try again with the existing box. Will try to prepare a functional LSAN image.
Finally identified the cause of the excessive memory consumption: running with CellStatistics enabled. Opened #25141 (moved).
To figure it out, I ended up tracing malloc once a stable operating state was reached (#24914 (moved) patch applied) and netting out frees so that only lingering allocations were visible. Tor's malloc/free behavior at steady state is a terrible thing to behold.
So, is this ticket now resolved, now that #25141 (moved) exists?
I suppose, though when I opened it I was thinking about the DESTROY cell fix in #24666 (moved). Perhaps that memory burner is small beer in comparison to a persistent struct-per-Circuit?
Can you publish the results of your "tracing malloc" work?
CellStatistics is definitely a special-purpose option; anything that we do to make it use less memory in 0.3.4 or beyond via #25141 (moved) is probably not going to be backported.
The #24666 (moved) stuff, on the other hand, will get backported along with #24902 (moved). We're going to need another alpha to shake all the issues out of #24902 (moved), however.
> Can you publish the results of your "tracing malloc" work?
I wrote a quick but reasonable patch that activates and deactivates the standard mtrace() facility in response to a pair of RT signals (40 = on, 41 = off) or control-channel commands. I then traced for something like 120 seconds after the relay had warmed up, with mtrace.pl providing the alloc/free subtraction. The standard glibc hook functions are terribly inefficient and nearly kill the relay, but strong hints regarding the cause of the problem jumped off the page nonetheless, so I set aside further refinement. I was also trying to coax LSAN into producing a pretty report, but LSAN is "too smart": it builds a graph of references by searching all allocations for pointers to other allocations, and it reliably produced zilch. There is no option to tune LSAN for stupidity, and that exercise was a frustrating waste of time.
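For reference, a minimal sketch of the toggle mechanism (not the actual patch; the function names are invented, and it assumes MALLOC_TRACE is exported so mtrace() has a file to write to). Because mtrace()/muntrace() are not async-signal-safe, the handlers here only set a flag that the main loop polls:

    #include <mcheck.h>     /* mtrace(), muntrace() */
    #include <signal.h>
    #include <string.h>

    static volatile sig_atomic_t want_trace = 0;

    static void trace_on(int sig)  { (void)sig; want_trace = 1; }
    static void trace_off(int sig) { (void)sig; want_trace = 0; }

    /* Call once at startup. SIGRTMIN+6/+7 land on signals 40/41 with typical
     * Linux glibc, where SIGRTMIN is usually 34. */
    void install_mtrace_toggles(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sigemptyset(&sa.sa_mask);
        sa.sa_handler = trace_on;
        sigaction(SIGRTMIN + 6, &sa, NULL);
        sa.sa_handler = trace_off;
        sigaction(SIGRTMIN + 7, &sa, NULL);
    }

    /* Poll from the main event loop; applies any pending toggle. */
    void poll_mtrace_toggle(void)
    {
        static int tracing = 0;
        if (want_trace && !tracing)      { mtrace();   tracing = 1; }
        else if (!want_trace && tracing) { muntrace(); tracing = 0; }
    }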
The mtrace() patch can be enhanced with fast hook callbacks that write binary structs including full call arcs, and I was prepared to do that (notes in the patch). A small C/C++ utility to set-subtract frees from mallocs and pretty-print the results is required; it should summarize by call arc as Valgrind, ASAN, LSAN, etc. do (anyone remember Purify?).
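A rough sketch of the set-subtraction step follows. It assumes a simplified trace format of "+ <hex-addr> <size> <call-site>" and "- <hex-addr>" records rather than the exact mtrace log format, uses a fixed-size table, and omits the call-arc summarization; it only prints whatever allocations were never freed.

    /* net_allocs.c: sketch of netting frees out of mallocs from a simplified trace. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define NSLOTS (1u << 18)              /* fixed-size open-addressing table */

    struct rec { unsigned long addr, size; int live; char site[64]; };
    static struct rec table[NSLOTS];

    static struct rec *slot_for(unsigned long addr)
    {
        unsigned long h = (addr >> 4) & (NSLOTS - 1);
        while (table[h].addr != 0 && table[h].addr != addr)
            h = (h + 1) & (NSLOTS - 1);    /* linear probe; reused addresses reuse their slot */
        return &table[h];
    }

    int main(void)
    {
        char line[512], op, site[64];
        unsigned long addr, size;

        while (fgets(line, sizeof line, stdin)) {
            site[0] = '\0';
            size = 0;
            if (sscanf(line, " %c %lx %lu %63s", &op, &addr, &size, site) < 2)
                continue;                   /* skip anything unparsable */
            struct rec *r = slot_for(addr);
            if (op == '+') {                /* allocation: record as live */
                r->addr = addr;
                r->size = size;
                r->live = 1;
                strncpy(r->site, site, sizeof r->site - 1);
                r->site[sizeof r->site - 1] = '\0';
            } else if (op == '-') {         /* free: cancel the matching malloc */
                if (r->addr == addr)
                    r->live = 0;
            }
        }

        unsigned long count = 0, bytes = 0;
        for (unsigned long i = 0; i < NSLOTS; i++) {
            if (table[i].live) {
                count++;
                bytes += table[i].size;
                printf("%#lx\t%lu bytes\t%s\n", table[i].addr, table[i].size, table[i].site);
            }
        }
        fprintf(stderr, "%lu allocations, %lu bytes never freed\n", count, bytes);
        return 0;
    }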
I'm willing to work this into a proper enhancement if there's interest. My idea is that it could be an always-available, on-demand memory trace one could activate and deactivate on a long-running instance, even one with months of uptime.
Am a bit burned out on hacking Tor at the moment and would prefer to take a month or two to finish it (more like two). If this is viewed as near-term desirable, one of the core devs could certainly complete it and I won't feel upstaged (as long as I get a little credit).