Opened 4 months ago

Closed 8 days ago

Last modified 7 days ago

#24782 closed defect (fixed)

Set a lower default MaxMemInQueues value

Reported by: teor
Owned by: ahf
Priority: Medium
Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-relay, tor-dos, 033-must, security, 033-triage-20180320, 033-included-20180320
Cc:
Actual Points:
Parent ID:
Points: 0.5
Reviewer: dgoulet
Sponsor:

Description

The default MaxMemInQueues value of 0.75*RAM assumes:

  • there is one tor instance per machine, and
  • MaxMemInQueues covers most queue memory.

If we instead assumed:

  • two tor instances, and
  • MaxMemInQueues covers half of queue memory,

this would make Tor more resilient to attacks.

To do this, we should set MaxMemInQueues to 0.2*RAM, at least if we have a lot of RAM (like 4-8GB or more).
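For illustration, a minimal C sketch of the proposed rule (the function name and the 8 GB threshold for "a lot of RAM" are assumptions for this sketch, not the patch that was eventually merged):

    /* Sketch of the proposal: keep the old 0.75 fraction on small machines,
     * but drop to roughly 0.2 of total RAM once the machine has plenty of
     * memory (two tor instances, and queues being only ~half of what Tor
     * allocates: 0.75 / 2 / 2 is roughly 0.2). */
    #include <stdint.h>

    #define GiB (UINT64_C(1) << 30)

    static uint64_t
    proposed_max_mem_in_queues(uint64_t total_ram)
    {
      if (total_ram >= 8 * GiB)
        return total_ram / 5;        /* ~0.2 * RAM on large machines */
      return (total_ram / 4) * 3;    /* the old 0.75 * RAM default */
    }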

Child Tickets

Change History (28)

comment:1 Changed 4 months ago by teor

This was split off from #24737.

comment:2 Changed 4 months ago by ahf

Owner: set to ahf
Status: new → assigned

comment:3 Changed 4 months ago by nickm

FWIW, I would expect that the kistlite bug #24671, fixed in 0.3.2.8-rc, might have made Tor use way too much kernel RAM; we can take this change, but we should keep monitoring Tor's memory usage to see whether our estimates are right. (Also, over time, we should make MaxMemInQueues cover more and more of the things that Tor allocates for. But that doesn't affect this ticket.)

comment:4 Changed 4 months ago by teor

I'm still seeing some of my Guards use 5-7 GB even with the destroy cell fix and MaxMemInQueues 2 GB. They have 11000 - 160000 connections open each. (This is process RAM, and they don't use KISTLite.) So I think this supports decreasing the default for systems with a lot of RAM.

comment:5 Changed 3 months ago by dgoulet

We could also explore making that value a moving target at runtime. It is a bit more dicey and complicated, but because Tor at startup looks at the "Total memory" instead of the "Available memory" to estimate that value, things can go badly quickly: if only 4 of 16 GB of RAM are available, Tor will still use 12 GB as its limit... and even with a fairly good amount of swap, the process is likely to be killed by the OS's OOM killer at some point.

On the flip side, a fast relay stuck with a startup estimate of 1 GB or 2 GB of RAM that Tor can use won't be "fast" for long before the OOM kicks in and starts killing old circuits. It is difficult to tell how much RAM a normal fast relay will need for Tor over time, but from what I can tell with my relays, between 1 and 2 GB is usually what I see (in non-DoS conditions and non-Exit).

I do believe the network is still fairly usable right now because we have big Guards able to use 5, 10, 12 GB of RAM... It is unclear to me whether firing the OOM more frequently would improve the situation, but we should be very careful not to make every relay use too little RAM :S.

comment:6 in reply to:  5 ; Changed 3 months ago by teor

Replying to dgoulet:

We could also explore making that value a moving target at runtime. It is a bit more dicey and complicated, but because Tor at startup looks at the "Total memory" instead of the "Available memory" to estimate that value, things can go badly quickly: if only 4 of 16 GB of RAM are available, Tor will still use 12 GB as its limit... and even with a fairly good amount of swap, the process is likely to be killed by the OS's OOM killer at some point.

On the flip side, a fast relay stuck with a startup estimate of 1 GB or 2 GB of RAM that Tor can use won't be "fast" for long before the OOM kicks in and starts killing old circuits.

This is not what I have observed. I have some fast Guards. Under normal load they don't ever use much more than 1 - 2 GB total RAM.

It is difficult to tell how much RAM a normal fast relay will need for Tor over time, but from what I can tell with my relays, between 1 and 2 GB is usually what I see (in non-DoS conditions and non-Exit).

I usually see 1-2 GB for non-exits, and closer to 2 GB for exits.

I do believe the network is still fairly usable right now because we have big Guards able to use 5, 10, 12 GB of RAM... It is unclear to me whether firing the OOM more frequently would improve the situation, but we should be very careful not to make every relay use too little RAM :S.

If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB of RAM is 100 seconds of traffic. Is it really useful to buffer 100 seconds of traffic? (Or, under the current load, tens of thousands of useless circuits?)
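As a rough illustration of that arithmetic in code form (the 1 Gbps rate and 12 GB limit are simply the figures from this comment):

    /* Back-of-envelope: a relay saturating 1 Gbps moves 125 MB per second,
     * so a 12 GB queue limit corresponds to roughly 100 seconds of
     * buffered traffic. Illustrative only. */
    #include <stdio.h>

    int main(void)
    {
      const double bytes_per_sec = 1e9 / 8.0;   /* 1 Gbps = 125 MB/s */
      const double max_mem_bytes = 12e9;        /* 12 GB queue limit */
      printf("seconds of traffic buffered: %.0f\n",
             max_mem_bytes / bytes_per_sec);
      return 0;
    }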

So I'm not sure if using more RAM for queues actually helps. In my experience, it just increases the number of active connections and CPU usage. I don't know how to measure if this benefits or hurts clients. (I guess I could tweak my guard and test running a client through it?)

Here's what happened when I followed my own advice in this thread:
https://lists.torproject.org/pipermail/tor-relays/2018-January/014021.html

I have a few big guards that are very close to a lot of the new clients. They were using 150% CPU, 4-8 GB RAM, and 15000 connections each. But they were not actually carrying much useful traffic.

I tried reducing MaxMemInQueues to 2 GB and 1 GB, and they started using 3-7 GB RAM. This is on 0.3.0 with the destroy cell fix. (But on my slower Guards and my Exit, MaxMemInQueues worked really well, reducing the RAM usage to 0.5 - 1.5 GB, without reducing the consensus weight.)

I tried reducing the number of file descriptors; that reduced the CPU to around 110%, because the new connections were closed earlier. It pushed a lot of the sockets into the kernel TIME_WAIT state, about 10,000 on top of the regular 10,000. (Maybe these new Tor clients didn't do exponential backoff?)

I tried DisableOOSCheck 0, and it didn't seem to make much difference to RAM or CPU, but it made a small difference to sockets (and it makes sure that I don't lose important sockets, like new control port sockets, so I left it on).

I already set RelayBandwidthRate, but now I also set MaxAdvertisedBandwidth to about half the RelayBandwidthRate. Hopefully this will make the clients go elsewhere. But this isn't really a solution for the network.

So I'm out of options to try and regulate traffic on these guards. And I need to have them working in about a week or so, because I need to run safe stats collections on them.

I think my only remaining option is to drop connections when the number of connections per IP goes above some limit. From the tor-relays posts, it seems like up to 10 connections per IP is normal, but these clients will make hundreds of connections at once. I think I should DROP rather than RST, because that forces the client to timeout, rather than immediately making another connection.

comment:7 in reply to:  6 Changed 3 months ago by yawning

Replying to teor:

I do believe the network is still fairly usable right now because we have big Guards able to use 5, 10, 12 GB of RAM... It is unclear to me whether firing the OOM more frequently would improve the situation, but we should be very careful not to make every relay use too little RAM :S.

If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB of RAM is 100 seconds of traffic. Is it really useful to buffer 100 seconds of traffic? (Or, under the current load, tens of thousands of useless circuits?)

No, but you don't have a choice because you can't drop cells except if things are going catastrophically wrong and you're willing to tear down the circuit.

http://yuba.stanford.edu/~nickm/papers/sigcomm2004.pdf

However as the paper says:

It is a little difficult to persuade the operator of a
functioning, profitable network to take the risk and remove
99% of their buffers. But that has to be the next step, and
we see the results presented in this paper as a first step
towards persuading an operator to try it.

(The CoDel work also supports their hypothesis.)

comment:8 in reply to:  6 Changed 3 months ago by dgoulet

Replying to teor:

Replying to dgoulet:

We could also explore making that value a moving target at runtime. It is a bit more dicey and complicated, but because Tor at startup looks at the "Total memory" instead of the "Available memory" to estimate that value, things can go badly quickly: if only 4 of 16 GB of RAM are available, Tor will still use 12 GB as its limit... and even with a fairly good amount of swap, the process is likely to be killed by the OS's OOM killer at some point.

On the flip side, a fast relay stuck with a startup estimate of 1 GB or 2 GB of RAM that Tor can use won't be "fast" for long before the OOM kicks in and starts killing old circuits.

This is not what I have observed. I have some fast Guards. Under normal load they don't ever use much more than 1 - 2 GB total RAM.

Oh, that was in the context of the ongoing "DDoS" on the network. I also usually never go above 1.2 GB for a ~12 MB/s relay, but right now I'm at ~3 GB, so an estimate of 1 GB of RAM would just reduce my relay's capacity.

If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB of RAM is 100 seconds of traffic. Is it really useful to buffer 100 seconds of traffic? (Or, under the current load, tens of thousands of useless circuits?)

So I'm not sure if using more RAM for queues actually helps. In my experience, it just increases the number of active connections and CPU usage. I don't know how to measure if this benefits or hurts clients. (I guess I could tweak my guard and test running a client through it?)

I think this could come down to lots of traffic being queued because the next hops are overloaded: if your relay is very big, Tor is happy to keep the cells queued while waiting to relay them to the much slower next hop. However, I'm seriously uncertain about this, and about whether it is even really what is happening... Need more investigation on my part.

[snip]

Yeah, the rest of your response is good knowledge, and I'm honestly also uncertain what to do for now.

comment:9 Changed 3 months ago by teor

Status: assigned → needs_information

Ok, so do we want to hold off on this change?
Or do we want to try 0.4*Total RAM?

comment:10 Changed 8 weeks ago by dgoulet

Keywords: tor-dos added; tor-ddos removed

Consolidate our DoS keywords to only use "tor-dos"

comment:11 in reply to:  9 Changed 8 weeks ago by dgoulet

Keywords: 033-must added
Milestone: Tor: 0.3.2.x-final → Tor: 0.3.3.x-final

Replying to teor:

Ok, so do we want to hold off on this change?
Or do we want to try 0.4*Total RAM?

I'm up for it. Tor badly estimates the limit for the OOM :S...

Big relays probably have a huge amount of RAM (total), like 32 or 64 GB, so this would mean a limit of 12 GB or 25 GB with 0.4, which is possibly way overestimated anyway.

For a 2 GB relay, it would be 800 MB. I think 1 GB or even 1.2 GB would be better, but it is a good start.

We could also do that computation, then look at the "Available memory", and if it is lower than our estimate, adjust downwards without going below the minimum. But that can backfire: if the machine uses lots of memory and a minute later frees half of it, Tor wouldn't adapt unless we make the limit a moving target.
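A rough, Linux-only sketch of that clamping idea (the sysinfo() call, the 256 MB floor, and the helper name are illustrative assumptions, not merged Tor code; a real implementation would likely want MemAvailable from /proc/meminfo rather than freeram, which ignores reclaimable page cache):

    /* Clamp the startup estimate against memory that is actually free
     * right now, without going below a minimum floor. */
    #include <stdint.h>
    #include <sys/sysinfo.h>

    #define MiB (UINT64_C(1) << 20)

    static uint64_t
    clamp_to_available(uint64_t estimated_limit)
    {
      struct sysinfo si;
      if (sysinfo(&si) == 0) {
        uint64_t avail = (uint64_t)si.freeram * si.mem_unit;
        if (avail < estimated_limit)
          estimated_limit = avail;
      }
      if (estimated_limit < 256 * MiB)
        estimated_limit = 256 * MiB;   /* never go below the minimum */
      return estimated_limit;
    }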

comment:12 Changed 4 weeks ago by nickm

Keywords: security added

comment:13 Changed 4 weeks ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:14 Changed 4 weeks ago by nickm

Keywords: 033-included-20180320 added

Mark 033-must tickets as triaged-in for 0.3.3

comment:15 Changed 3 weeks ago by ahf

We should probably come to a decision here on what we'd like to do in order to get this into 033 (if we think it's still important).

I think David's comment in comment 11 looks reasonable, but reaching some kind of consensus would be good.

comment:16 Changed 3 weeks ago by arma

Do we have a handle on what most of the memory is being used for, when usage gets huge?

I ask because if it's queued cells on circuits, the proposed work on #25226 should help us to kill problematic circuits before they trigger the OOM.

And we might want to expand that technique to tackle other categories of things, where if we can identify a "woah that is using way more than it should" situation we can take care of it before the OOM killer has to do it.

comment:17 Changed 2 weeks ago by ahf

Status: needs_information → needs_review

comment:18 Changed 2 weeks ago by dgoulet

Reviewer: dgoulet

comment:19 Changed 2 weeks ago by dgoulet

Status: needs_review → needs_revision

lgtm; except for one comment in the code. Putting it back in needs_revision but after that, it should go in merge_ready.

comment:20 Changed 2 weeks ago by ahf

Status: needs_revision → needs_review

Updated comment.

comment:21 Changed 2 weeks ago by dgoulet

Status: needs_review → merge_ready

Thanks ahf. Love it. Sorry about the nitpick!

lgtm;

comment:22 Changed 11 days ago by nickm

One more comment tweak needed -- I'll do it postmerge.

comment:23 Changed 11 days ago by nickm

Status: merge_ready → needs_revision

Hang on -- this branch seems to be based on master.

Please do the remaining comment tweak, and base it on maint-0.3.3?

comment:24 Changed 8 days ago by dgoulet

Status: needs_revision → merge_ready

See branch: ticket24782_033_01.

I cherry picked the commits from ahf's branch and addressed nickm's review in fixup 418d1ac115babbe6.

comment:25 Changed 8 days ago by nickm

Resolution: fixed
Status: merge_ready → closed

Thanks; merging!

comment:26 Changed 8 days ago by nickm

Note that this patch caused 32-bit builds to break. I've tried to fix this with 4aaa4215e7e11f318c5a50124e29dc0b50ce21e1.

We should have Travis check 32-bit builds somehow if we can.

comment:27 Changed 8 days ago by nickm

Fixed another compile-time warning in the tests, this time with 46795a7be63b9a1b90a59fcf9efda4f4f1eacc37

comment:28 in reply to:  26 Changed 7 days ago by teor

Replying to nickm:

Note that this patch caused 32-bit builds to break. I've tried to fix this with 4aaa4215e7e11f318c5a50124e29dc0b50ce21e1.

We should have Travis check 32-bit builds somehow if we can.

Travis doesn't have 32 bit machines yet:
https://github.com/travis-ci/travis-ci/issues/986

We could do a 32-bit build and test on 64-bit macOS, but we don't use macOS on Travis because it's slow.
(Homebrew also doesn't support 32-bit libraries on macOS, so we'd have to install them through MacPorts, which is slow.)

I'm not sure if there is a similar option for Linux.
