Due to recent DoS attacks, much incorrect advice has been tossed around on tor-relays regarding the application of MaxMemInQueues.
Many seem to believe that MaxMemInQueues should be set to 75-80% of available memory, but this is painfully (in the sense of OOM crashes) incorrect.
Proper advice is to set MaxMemInQueues to 45% of the physical memory available for the instance, assuming DisableAllSwap=1 is also in effect; 40% is a safer, more conservative value.
One of my relays configured with MaxMemInQueues=1024MB recently emitted:
{{{
We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Removed 1063029792 bytes by killing 1 circuits; 21806 circuits remain alive. Also killed 0 non-linked directory connections.
}}}
after which the tor daemon was observed to consume precisely 2GB per /proc/<pid>/status:VmRSS.
The aforementioned incorrect advice was followed in #22255 (moved), and the operator continues to experience OOM failures.
Another mitigation is to establish conservative Linux memory management with these sysctl settings:
vm.overcommit_memory = 2
vm.overcommit_ratio = X
where X is set such that /proc/meminfo:CommitLimit is approximately 80% of physical memory (90% if 16GB or more is present).
These settings will prevent sparse-memory applications from running (e.g. ASAN-instrumented code), but they are appropriate for dedicated tor relay systems. They effectively disable the OOM killer and should result in graceful memory-exhaustion behavior, though I have not investigated how the tor daemon responds when malloc() fails and returns a NULL pointer.
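A minimal sketch of how X can be derived (assuming vm.overcommit_memory=2 with no vm.overcommit_kbytes override, in which case CommitLimit is roughly SwapTotal + MemTotal * overcommit_ratio / 100; the 80%/90% targets are the ones suggested above):
{{{
#!/usr/bin/env python3
# Sketch: choose vm.overcommit_ratio so CommitLimit lands near a target share of
# physical RAM (80%, or 90% on machines with 16 GB or more, per the advice above).
# With vm.overcommit_memory=2 the kernel computes approximately:
#   CommitLimit = SwapTotal + MemTotal * overcommit_ratio / 100

def meminfo_kb(field):
    """Return a /proc/meminfo field value in kB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

mem_total = meminfo_kb("MemTotal")
swap_total = meminfo_kb("SwapTotal")    # 0 if no swap is configured

target = 0.90 if mem_total >= 16 * 1024 * 1024 else 0.80
ratio = int((target * mem_total - swap_total) * 100 / mem_total)

print("vm.overcommit_memory = 2")
print("vm.overcommit_ratio = %d" % max(ratio, 1))
}}}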
I'm not sure what you want us to do in response to this ticket.
If you can write up a short wiki page with some advice, we could point to it rather than trying to guess the right setting.
Due to recent DoS attacks, much incorrect advice has been tossed around on tor-relays regarding the application of MaxMemInQueues.
Many seem to believe that MaxMemInQueues should be set to 75-80% of available memory, but this is painfully (in the sense of OOM crashes) incorrect.
Proper advice is to set MaxMemInQueues to 45% of the physical memory available for the instance, assuming DisableAllSwap=1 is also in effect; 40% is a safer, more conservative value.
I don't think percentages are helpful - I think creating a table mapping free RAM to MaxMemInQueues values would be more helpful. (See below.)
One of my relays configured with MaxMemInQueues=1024MB recently emitted:
{{{
We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Removed 1063029792 bytes by killing 1 circuits; 21806 circuits remain alive. Also killed 0 non-linked directory connections.
}}}
after which the tor daemon was observed to consume precisely 2GB per /proc/<pid>/status:VmRSS.
To be more precise: MaxMemInQueues doesn't track destroy queues, nor does it track various other Tor data structures,
so you have to set it at a level that allows space for a few hundred megabytes of Tor data, and then some destroy queues.
At 1024 MB per instance, this means 512 MB or less.
But with 10 GB per instance, it really is ok to allow 5-7 GB in queues.
(I have a relay that allows the default 8 GB in queues, and it's fine.)
The aforementioned incorrect advice was followed in #22255 (moved), and the operator continues to experience OOM failures.
Are you the operator?
Have they tried 0.3.2.8-rc and reopened another ticket?
…
These settings will prevent sparse-memory applications from running (e.g. ASAN-instrumented code), but they are appropriate for dedicated tor relay systems. They effectively disable the OOM killer and should result in graceful memory-exhaustion behavior, though I have not investigated how the tor daemon responds when malloc() fails and returns a NULL pointer.
The tor daemon will assert and exit if malloc returns NULL.
I'm not sure what you want us to do in response to this ticket.
If you can write up a short wiki page with some advice, we could point to it rather than trying to guess the right setting.
I suggest adding some verbiage to the Tor manual, which is where most people would look first when adjusting MaxMemInQueues.
I don't think percentages are helpful - I think creating a table mapping free RAM to MaxMemInQueues values would be more helpful. (See below.)
. . .
To be more precise: MaxMemInQueues doesn't track destroy queues, nor does it track various other Tor data structures,
so you have to set it at a level that allows space for a few hundred megabytes of Tor data, and then some destroy queues.
At 1024 MB per instance, this means 512 MB or less.
But with 10 GB per instance, it really is ok to allow 5-7 GB in queues.
(I have a relay that allows the default 8 GB in queues, and it's fine.)
My observation is that when MaxMemInQueues triggers a circuit kill, the daemon will have consumed approximately twice the setting value in physical memory. Of course YMMV on the precise amount, but this observational rule of thumb is far from the suggestion that 120-130% of MaxMemInQueues will be used.
The aforementioned incorrect advice was followed in #22255 (moved), and the operator continues to experience OOM failures.
Are you the operator?
Have they tried 0.3.2.8-rc and reopened another ticket?
Not the operator on that ticket. It came up in a search, and it seems to me his MaxMemInQueues is too high relative to RAM.
The tor daemon will assert and exit if malloc returns NULL.
Ah, well then vm.overcommit_memory=2 will cause the daemon to die sooner rather than later, instead of a more graceful response such as killing one circuit. Still better than allowing the Linux OOM handler to choose a victim to kill.
Alternately, my advice for hardy souls willing to expend such effort:
1) leave the default vm.overcommit_memory=0 in effect
2) write a script to set /proc/<pid>/task/<tid>/oom_adj to -17 for every process in the system (sketched after this list)
3) have a script set oom_adj=0 for a process you would rather have die than the tor daemon
3b) if one sets -17 for every process, then Linux will suspend the memory requester until some becomes available; this could result in a hung system, a crashed system, or it could result in a semi-graceful recovery in the case where socket-buffer memory is freed as queues drain
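A minimal sketch of steps 2) and 3) (the sacrificial process name is a placeholder; oom_adj is the legacy interface referred to above, and newer kernels prefer oom_score_adj):
{{{
#!/usr/bin/env python3
# Sketch of steps 2) and 3): exempt every task from the Linux OOM killer
# (oom_adj = -17), then re-expose one expendable process (oom_adj = 0) so it
# is chosen before the tor daemon. The process name below is a placeholder.
import glob
import os

SACRIFICE = "some-expendable-daemon"   # hypothetical name of the process to leave killable

# Step 2: -17 ("never kill") for every task on the system
for path in glob.glob("/proc/[0-9]*/task/[0-9]*/oom_adj"):
    try:
        with open(path, "w") as f:
            f.write("-17\n")
    except OSError:
        pass   # task exited, or insufficient privileges

# Step 3: 0 (default, killable) for the designated sacrificial process
for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open("/proc/%s/comm" % pid) as f:
            name = f.read().strip()
        if name == SACRIFICE:
            with open("/proc/%s/oom_adj" % pid, "w") as f:
                f.write("0\n")
    except OSError:
        pass
}}}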
Additionally, one should set vm.min_free_kbytes=131072 or even 262144. By default Linux sets this value so low that a sudden surge in arriving network traffic will use up all free memory faster than the OOM killer and dirty-cache writeback can keep pace, and the system will OOPS (hard crash).
The observation that tor's worst-case memory consumption is 2x MaxMemInQueues does not include kernel socket-buffer memory; socket memory can be substantial and must be allowed for, which is how I came up with setting MaxMemInQueues to 40% of the free memory available for a given instance. Note that KIST lowers egress socket buffering, but ingress socket memory utilization is left to the kernel and to the behavior of remote peers.
The inspiration for this ticket was the OOM kill of a daemon configured with MaxMemInQueues=2G running on a 4G machine, and subsequently the event mentioned in the description -- both apparently were "sniper attacks".
If someone writes a few sentences, we can add them to the MaxMemInQueues man page entry.
We might want to have a different recommendation for older and newer Tor versions, as some of the bugs you mention were fixed in 0.3.2.8-rc.
The detailed steps you wrote for Linux would be more appropriate in doc/TUNING, or in a wiki page entry.
Trac: Summary: "oft given MaxMemInQueues advice is wrong" changed to "Recommend a MaxMemInQueues value in the Tor man page"; Points: N/A changed to 0.5; Milestone: Tor: unspecified changed to Tor: 0.3.2.x-final.
I'd still like to see someone repeat this analysis with 0.3.2.8-rc, and post the results to #24737 (moved).
It's going to be hard for us to close that ticket without any idea of the effect of our changes.
I'm not willing to run a newer version until one is declared LTS, but I can say that even when my relay is not under attack, memory consumption goes to 1.5G with the 1G max-queue setting. It seems to me the 2x max-queues memory consumption is a function of the overheads associated with tor daemon queues and related processing, including malloc slack space.
Saying 2x is a useful guide, but I think we can do better, because I see very different behaviour on systems with a lot more RAM.
This is how the overheads work on my 0.3.0 relay with 8 GB per tor instance, and a high MaxMemInQueues:
512 MB per instance with no circuits
256 - 512 MB extra per instance with relay circuits
256 - 512 MB extra per instance with exit streams
The RAM usage will occasionally spike to a few gigabytes, but I've never seen it all used.
So I think we should document the following RAM usage and MaxMemInQueues settings:
Relays: minimum 768 MB, set MaxMemInQueues to (RAM per instance - 512 MB)*N
Exits: minimum 1GB, set MaxMemInQueues to (RAM per instance - 768 MB)*N
For all versions without the destroy cell patch (0.3.2.7-rc and all current versions as of 1 January 2018), N should be 0.5 or lower. It's reasonable to expect destroy cell queues and other objects to take up approximately the same amount of RAM as the queues.
For all versions with the destroy cell patch (0.3.2.8-rc and all versions released after 1 January 2018), N should be 0.75 or lower. It's reasonable to expect destroy cell queues and other objects to take up a third of the queue RAM.
Now we just have to turn this into a man page patch and wiki entry.
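A minimal sketch of that rule (the function name and example values are illustrative only):
{{{
# Sketch of the rule above: MaxMemInQueues = (RAM per instance - overhead) * N,
# with a 512 MB overhead for relays (768 MB for exits), N = 0.5 without the
# destroy-cell patch and 0.75 with it (0.3.2.8-rc and later).

def max_mem_in_queues_mb(ram_per_instance_mb, is_exit=False, has_destroy_cell_patch=False):
    overhead_mb = 768 if is_exit else 512
    n = 0.75 if has_destroy_cell_patch else 0.5
    return max(int((ram_per_instance_mb - overhead_mb) * n), 0)

# Example: non-exit relay, 4 GB per instance, running 0.3.2.8-rc or later
print(max_mem_in_queues_mb(4096, is_exit=False, has_destroy_cell_patch=True))   # 2688
}}}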
Anyone running a busy relay on an older/slower system with MaxMemInQueues=1024MB can check /proc/<pid>/status to see how much memory is consumed. Be sure DisableAllSwap=1 is set and the queue limit is not higher, since the point is to observe actual memory consumed relative to a limit likely to be approached under normal operation.
Another idea is to add an option to the daemon to cause queue memory preallocation. This would be a nice hardening feature, as it would reduce malloc() calls issued under stress and, of course, allow more accurate estimates of worst-case memory consumption. If OOM strikes with preallocated queues, that would indicate memory leakage.
I'd still like to see someone repeat this analysis with 0.3.2.8-rc, and post the results to #24737 (moved).
It's going to be hard for us to close that ticket without any idea of the effect of our changes.
I'm not willing to run a newer version until one is declared LTS, but I can say that even when my relay is not under attack, memory consumption goes to 1.5G with the 1G max-queue setting. It seems to me the 2x max-queues memory consumption is a function of the overheads associated with tor daemon queues and related processing, including malloc slack space.
Saying 2x is a useful guide, but I think we can do better, because I see very different behaviour on systems with a lot more RAM.
This is how the overheads work on my 0.3.0 relay with 8 GB per tor instance, and a high MaxMemInQueues:
512 MB per instance with no circuits
256 - 512 MB extra per instance with relay circuits
256 - 512 MB extra per instance with exit streams
The RAM usage will occasionally spike to a few gigabytes, but I've never seen it all used.
So I think we should document the following RAM usage and MaxMemInQueues settings:
Relays: minimum 768 MB, set MaxMemInQueues to (RAM per instance - 512 MB)*N
Exits: minimum 1GB, set MaxMemInQueues to (RAM per instance - 768 MB)*N
For all versions without the destroy cell patch (0.3.2.7-rc and all current versions as of 1 January 2018), N should be 0.5 or lower. It's reasonable to expect destroy cell queues and other objects to take up approximately the same amount of RAM as the queues.
For all versions with the destroy cell patch (0.3.2.8-rc and all versions released after 1 January 2018), N should be 0.75 or lower. It's reasonable to expect destroy cell queues and other objects to take up a third of the queue RAM.
Now we just have to turn this into a man page patch and wiki entry.
Here's some advice I've just given some relay operators:
If you have 4 Tor Exits, a 1 Gbps connection, and this much RAM, use this setting:
8 GB RAM -> MaxMemInQueues 256 MB
16 GB RAM -> MaxMemInQueues 1 GB
32 GB RAM -> MaxMemInQueues 2 GB
I think this is the right level of detail for a man page.
We could probably afford 3 GB with 32 GB of RAM, but there are other issues:
do we really benefit from buffering more than a minute of traffic?
how much extra CPU load do we get if we set MaxMemInQueues too high?
how low does MaxMemInQueues need to be to resist a sniper attack?
I also opened #24782 (moved) so we change the default in Tor itself.
Here's a MaxMemInQueues setting that's easier to understand:
Set MaxMemInQueues to half your available RAM per tor instance.
(It doesn't track all of Tor's memory usage.)
If your machine has one relay, if you have this much RAM, try this setting:
4 GB -> MaxMemInQueues 512 MB
8 GB -> MaxMemInQueues 2 GB
16 GB -> MaxMemInQueues 4 GB
32 GB -> MaxMemInQueues 8 GB
(If you have more than one relay on the machine, divide MaxMemInQueues by the
number of relays. If you still have RAM issues, take down one relay.)
Here's a list of other options relay operators can use for load tuning, probably appropriate for a wiki page:
I think the suggestions in comments 6 and 7 are a bit conservative (but OK), and I still like my approximately 40% of available memory per instance. So on my 4G machine, allowing ~1G for the kernel, I set MaxMemInQueues=1024MB for one relay instance and have some room for some other daemons. With this setting, tor daemon 0.2.9.14 goes to 1.5GB under heavy load (old slow CPU and medium-fast FiOS connection, YMMV), and when hit with a known sniper attack it went to 2GB and survived, with Tor's OOM logic killing a 1GB circuit (event-log entries above). That leaves quite a bit of space for socket-buffer memory and about 500-700MB for other daemons. Note I prefer DisableAllSwap=1 and recommend it strongly, so all Tor daemon memory falls under the Unevictable/Mlocked accounting and cannot be paged to disk (a detrimental behavior, no doubt).
Put another way, MMIQ=1G -> daemon 2GB (80%), plus a socket-buffer-delta guess of 500MB, for a 2.5GB total budget (100%) for the instance.
I see kernel SLAB around 900MB (buffer frees tuned lazily with ~7000 active TCP connections at the time of observation, peak around ~9000 connections).
On a 4G machine running just Tor and nothing else, I'd take 40% of 3G and get MMIQ=1228MB.
Don't forget sysctl.conf
vm.min_free_kbytes = 262144
which causes Linux to attempt to keep 1/4 GB of memory free. Linux will take aggressive action to page out idle memory and free cached files when this threshold is hit--it's not an absolute impediment to allocations. The idea is that a huge sudden burst of network traffic will rapidly chew up free memory for socket buffers, and if /proc/meminfo:MemFree hits zero and the kernel needs to allocate memory while servicing a network interrupt, the system will OOPS/crash. So one wants Linux to maintain a nice cushion against hard memory exhaustion. Non-dirty /proc/meminfo:Cached memory is the easiest target for obtaining truly free memory, but Cached pages cannot be converted to MemFree during interrupt service--that takes some time, i.e. a few hundred microseconds to a couple of milliseconds, depending on how busy the scheduler is.
On an 8GB machine I'd still take 1G for the kernel and then 40% of 7G for MaxMemInQueues=2800MB; with two daemons, MMIQ=1400MB each. On big-memory systems (16GB and up) I don't bother setting MMIQ higher than 4096MB (4G) for an instance.
Memory leaks in tor are more severe than reported at the top of this ticket.
My relay became an HSDir earlier today while also under attack, and the Tor daemon leaked memory all the way from 1.5GB of total memory utilization to 2.4GB and was killed.
0.2.9.14 is dead (so much for LTS) and I am forced to upgrade to 0.3.2.8-rc.
Memory leaks in tor are more severe than reported at the top of this ticket.
My relay became an HSDir earlier today while also under attack, and the Tor daemon leaked memory all the way from 1.5GB of total memory utilization to 2.4GB and was killed.
0.2.9.14 is dead (so much for LTS) and I am forced to upgrade to 0.3.2.8-rc.
Please open a different ticket with this information, or we will lose track of it.
Observed similar values in recent months running 0.3.3, including the final days of last winter's overload attacks.
In light of these observations and the numerous improvements in OOM memory accounting, reporting, and mitigation, plus the new circuit queued-cell maximum logic, it appears safe to recommend MaxMemInQueues values incorporating reasonable premiums that allow for the usual OS-process overheads. Perhaps physical memory of 120% or 130% of MaxMemInQueues per daemon instance? If Shadow-environment tests for simulating attacks exist, it would be worth running them against 0.3.4 before arriving at final recommendations.
The KIST scheduler is effective at minimizing data queued in egress socket buffers, but ingress socket memory is determined by the TCP/IP stack and remote-peer behavior. Perhaps, then, 150% of MaxMemInQueues provides a better margin? A Shadow test simulating all-out botnet attack scenarios would help greatly in determining extreme worst-case memory consumption.
Also check net.ipv4.tcp_mem, which is the absolute maximum memory allocated to socket buffers, in 4096-byte pages. Checking a couple of systems, the default values vary from two thirds of a 3G virtual machine to 20% of an 8GB physical machine. Not to be confused with the tcp_rmem and tcp_wmem per-socket tuning parameters.
Correct advice for preventing OOM daemon crashes in worst-case scenarios should probably be something like: find out what tcp_mem is and subtract that from physical memory to arrive at the memory available for the daemon; subtract an additional 384-512MB for the kernel. Tune tcp_mem if you don't like the defaults.
The remaining memory is allocated to one or more tor daemons, where each daemon is allotted 130% of its MaxMemInQueues (sketched below).
The above can be turned into a table indicating MaxMemInQueues values for different typical distros easily enough, though hopefully most operators are able to divide a number by 1.3.
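A minimal sketch of that arithmetic (the two-daemon count and the 512 MB kernel reserve are illustrative; the 1.3 premium is the figure suggested above, and net.ipv4.tcp_mem's third field is the hard limit in pages):
{{{
#!/usr/bin/env python3
# Sketch of the budget above:
#   MaxMemInQueues = (MemTotal - tcp_mem max - kernel reserve) / daemons / 1.3

KERNEL_RESERVE_MB = 512            # from the 384-512 MB kernel allowance above
N_DAEMONS = 2                      # hypothetical: two tor instances on the box
PAGE_SIZE_KB = 4                   # tcp_mem is expressed in 4096-byte pages

with open("/proc/sys/net/ipv4/tcp_mem") as f:
    tcp_mem_max_pages = int(f.read().split()[2])     # third field is the hard limit

with open("/proc/meminfo") as f:
    mem_total_kb = int(next(l for l in f if l.startswith("MemTotal:")).split()[1])

tcp_mem_max_mb = tcp_mem_max_pages * PAGE_SIZE_KB // 1024
available_mb = mem_total_kb // 1024 - tcp_mem_max_mb - KERNEL_RESERVE_MB
print("MaxMemInQueues %d MB" % max(int(available_mb / N_DAEMONS / 1.3), 0))
}}}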