Opened 4 months ago

Closed 8 days ago

Last modified 7 days ago

#24782 closed defect (fixed)

Set a lower default MaxMemInQueues value

Reported by: teor
Owned by: ahf
Priority: Medium
Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-relay, tor-dos, 033-must, security, 033-triage-20180320, 033-included-20180320
Cc:
Actual Points:
Parent ID:
Points: 0.5
Reviewer: dgoulet
Sponsor:

Description

The default MaxMemInQueues value of 0.75*RAM assumes:

  • there is one tor instance per machine, and
  • MaxMemInQueues covers most queue memory.

If we instead assumed:

  • two tor instances, and
  • MaxMemInQueues covers half of queue memory,

this would make Tor more resilient to attacks.

To do this, we should set MaxMemInQueues to 0.2*RAM, at least if we have a lot of RAM (like 4-8GB or more).
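For illustration, a minimal C sketch of the proposed rule (the function name and the 8 GB threshold for "a lot of RAM" are assumptions for this sketch, not the patch that was eventually merged):

    /* Sketch of the proposal: keep the old 0.75 fraction on small machines,
     * but drop to roughly 0.2 of total RAM once the machine has plenty of
     * memory (two tor instances, and queues being only ~half of what Tor
     * allocates: 0.75 / 2 / 2 is roughly 0.2). */
    #include <stdint.h>

    #define GiB (UINT64_C(1) << 30)

    static uint64_t
    proposed_max_mem_in_queues(uint64_t total_ram)
    {
      if (total_ram >= 8 * GiB)
        return total_ram / 5;        /* ~0.2 * RAM on large machines */
      return (total_ram / 4) * 3;    /* the old 0.75 * RAM default */
    }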

Child Tickets

Change History (28)

comment:1 Changed 4 months ago by teor

This was split off from #24737.

comment:2 Changed 4 months ago by ahf

Owner: set to ahf
Status: new → assigned

comment:3 Changed 4 months ago by nickm

FWIW, I would expect that the kistlite bug #24671, fixed in 0.3.2.8-rc, might have made Tor use way too much kernel RAM; we can take this change, but we should keep monitoring Tor's memory usage to see whether our estimates are right. (Also, over time, we should make MaxMemInQueues cover more and more of the things that Tor allocates for. But that doesn't affect this ticket.)

comment:4 Changed 4 months ago by teor

I'm still seeing some of my Guards use 5-7 GB even with the destroy cell fix and MaxMemInQueues 2 GB. They have 11000 - 160000 connections open each. (This is process RAM, and they don't use KISTLite.) So I think this supports decreasing the default for systems with a lot of RAM.

comment:5 Changed 3 months ago by dgoulet

We could also explore making that value a moving target at runtime. It is a bit more dicey and complicated, but because Tor at startup looks at the "Total memory" instead of the "Available memory" to estimate that value, things can go badly quickly: if only 4 of 16 GB of RAM are available, Tor will still use 12 GB as its limit... and even with a fairly good amount of swap, the process is likely to be killed by the OS's OOM killer at some point.

On the flip side, a fast relay stuck with a startup estimate of 1 GB or 2 GB of RAM that Tor can use won't be "fast" for long before the OOM kicks in and starts killing old circuits. It is difficult to tell how much RAM a normal fast relay will need for Tor over time, but from what I can tell with my relays, between 1 and 2 GB is usually what I see (in non-DoS conditions and non-Exit).

I do believe the network is still fairly usable right now because we have big Guards able to use 5, 10, 12 GB of RAM... It is unclear to me whether firing the OOM more frequently would improve the situation, but we should be very careful not to make every relay use too little RAM :S.

comment:6 in reply to:  5 ; Changed 3 months ago by teor

Replying to dgoulet:

We could also explore making that value a moving target at runtime. It is a bit more dicey and complicated, but because Tor at startup looks at the "Total memory" instead of the "Available memory" to estimate that value, things can go badly quickly: if only 4 of 16 GB of RAM are available, Tor will still use 12 GB as its limit... and even with a fairly good amount of swap, the process is likely to be killed by the OS's OOM killer at some point.

On the flip side, a fast relay stuck with a startup estimate of 1 GB or 2 GB of RAM that Tor can use won't be "fast" for long before the OOM kicks in and starts killing old circuits.

This is not what I have observed. I have some fast Guards. Under normal load they don't ever use much more than 1 - 2 GB total RAM.

It is difficult to tell how much RAM a normal fast relay will need for Tor over time, but from what I can tell with my relays, between 1 and 2 GB is usually what I see (in non-DoS conditions and non-Exit).

I usually see 1-2 GB for non-exits, and closer to 2 GB for exits.

I do believe the network is still fairly usable right now because we have big Guards able to use 5, 10, 12 GB of RAM... It is unclear to me whether firing the OOM more frequently would improve the situation, but we should be very careful not to make every relay use too little RAM :S.

If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB of RAM is 100 seconds of traffic. Is it really useful to buffer 100 seconds of traffic? (Or, under the current load, tens of thousands of useless circuits?)
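As a rough illustration of that arithmetic in code form (the 1 Gbps rate and 12 GB limit are simply the figures from this comment):

    /* Back-of-envelope: a relay saturating 1 Gbps moves 125 MB per second,
     * so a 12 GB queue limit corresponds to roughly 100 seconds of
     * buffered traffic. Illustrative only. */
    #include <stdio.h>

    int main(void)
    {
      const double bytes_per_sec = 1e9 / 8.0;   /* 1 Gbps = 125 MB/s */
      const double max_mem_bytes = 12e9;        /* 12 GB queue limit */
      printf("seconds of traffic buffered: %.0f\n",
             max_mem_bytes / bytes_per_sec);
      return 0;
    }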

So I'm not sure if using more RAM for queues actually helps. In my experience, it just increases the number of active connections and CPU usage. I don't know how to measure if this benefits or hurts clients. (I guess I could tweak my guard and test running a client through it?)

Here's what happened when I followed my own advice in this thread:
https://lists.torproject.org/pipermail/tor-relays/2018-January/014021.html

I have a few big guards that are very close to a lot of the new clients. They were using 150% CPU, 4-8 GB RAM, and 15000 connections each. But they were not actually carrying much useful traffic.

I tried reducing MaxMemInQueues to 2 GB and 1 GB, and they started using 3-7 GB RAM. This is on 0.3.0 with the destroy cell fix. (But on my slower Guards and my Exit, MaxMemInQueues worked really well, reducing the RAM usage to 0.5 - 1.5 GB, without reducing the consensus weight.)

I tried reducing the number of file descriptors; that reduced the CPU to around 110%, because the new connections were closed earlier. It pushed a lot of the sockets into the kernel TIME_WAIT state, about 10,000 on top of the regular 10,000. (Maybe these new Tor clients didn't do exponential backoff?)

I tried DisableOOSCheck 0, and it didn't seem to make much difference to RAM or CPU, but it made a small difference to sockets (and it makes sure that I don't lose important sockets, like new control port sockets, so I left it on).

I already set RelayBandwidthRate, but now I also set MaxAdvertisedBandwidth to about half the RelayBandwidthRate. Hopefully this will make the clients go elsewhere. But this isn't really a solution for the network.

So I'm out of options to try and regulate traffic on these guards. And I need to have them working in about a week or so, because I need to run safe stats collections on them.

I think my only remaining option is to drop connections when the number of connections per IP goes above some limit. From the tor-relays posts, it seems like up to 10 connections per IP is normal, but these clients will make hundreds of connections at once. I think I should DROP rather than RST, because that forces the client to timeout, rather than immediately making another connection.

comment:7 in reply to:  6 Changed 3 months ago by yawning

Replying to teor:

I do believe the network is still fairly usable right now because we have big Guards able to use 5, 10, 12 GB of RAM... It is unclear to me whether firing the OOM more frequently would improve the situation, but we should be very careful not to make every relay use too little RAM :S.

If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB of RAM is 100 seconds of traffic. Is it really useful to buffer 100 seconds of traffic? (Or, under the current load, tens of thousands of useless circuits?)

No, but you don't have a choice because you can't drop cells except if things are going catastrophically wrong and you're willing to tear down the circuit.

http://yuba.stanford.edu/~nickm/papers/sigcomm2004.pdf

However as the paper says:

It is a little difficult to persuade the operator of a
functioning, profitable network to take the risk and remove
99% of their buffers. But that has to be the next step, and
we see the results presented in this paper as a first step
towards persuading an operator to try it.

(The CoDel work also supports their hypothesis.)

comment:8 in reply to:  6 Changed 3 months ago by dgoulet

Replying to teor:

Replying to dgoulet:

We could also explore making that value a moving target at runtime. It is a bit more dicey and complicated, but because Tor at startup looks at the "Total memory" instead of the "Available memory" to estimate that value, things can go badly quickly: if only 4 of 16 GB of RAM are available, Tor will still use 12 GB as its limit... and even with a fairly good amount of swap, the process is likely to be killed by the OS's OOM killer at some point.

On the flip side, a fast relay stuck with a startup estimate of 1 GB or 2 GB of RAM that Tor can use won't be "fast" for long before the OOM kicks in and starts killing old circuits.

This is not what I have observed. I have some fast Guards. Under normal load they don't ever use much more than 1 - 2 GB total RAM.

Oh, that was in the context of the ongoing "DDoS" on the network. I also usually never go above 1.2 GB for a ~12 MB/s relay, but right now I'm at ~3 GB, so an estimate of 1 GB of RAM would just reduce my relay's capacity.

If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB of RAM is 100 seconds of traffic. Is it really useful to buffer 100 seconds of traffic? (Or, under the current load, tens of thousands of useless circuits?)

So I'm not sure if using more RAM for queues actually helps. In my experience, it just increases the number of active connections and CPU usage. I don't know how to measure if this benefits or hurts clients. (I guess I could tweak my guard and test running a client through it?)

I think this could come down to lots of traffic being queued because the next hops are overloaded: if your relay is very big, Tor is happy to keep the cells queued while waiting to relay them to the much slower next hop. However, I'm seriously uncertain about this, and about whether it is even really what is happening... Need more investigation on my part.

[snip]

Yeah, the rest of your response is good knowledge, and I'm honestly also uncertain what to do for now.

comment:9 Changed 3 months ago by teor

Status: assigned → needs_information

Ok, so do we want to hold off on this change?
Or do we want to try 0.4*Total RAM?

comment:10 Changed 8 weeks ago by dgoulet

Keywords: tor-dos added; tor-ddos removed

Consolidate our DoS keywords to only use "tor-dos"

comment:11 in reply to:  9 Changed 8 weeks ago by dgoulet

Keywords: 033-must added
Milestone: Tor: 0.3.2.x-final → Tor: 0.3.3.x-final

Replying to teor:

Ok, so do we want to hold off on this change?
Or do we want to try 0.4*Total RAM?

I'm up for it. Tor badly estimates the limit for the OOM :S...

Big relays probably have a huge amount of RAM (total), like 32 or 64 GB, so this would mean a limit of 12 GB or 25 GB with 0.4, which is possibly way overestimated anyway.

For a 2 GB relay, it would be 800 MB. I think 1 GB or even 1.2 GB would be better, but it is a good start.

We could also do that computation, then look at the "Available memory", and if it is lower than our estimate, adjust downwards without going below the minimum. But that can backfire: if the machine uses lots of memory and a minute later frees half of it, Tor wouldn't adapt unless we make the limit a moving target.
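A rough, Linux-only sketch of that clamping idea (the sysinfo() call, the 256 MB floor, and the helper name are illustrative assumptions, not merged Tor code; a real implementation would likely want MemAvailable from /proc/meminfo rather than freeram, which ignores reclaimable page cache):

    /* Clamp the startup estimate against memory that is actually free
     * right now, without going below a minimum floor. */
    #include <stdint.h>
    #include <sys/sysinfo.h>

    #define MiB (UINT64_C(1) << 20)

    static uint64_t
    clamp_to_available(uint64_t estimated_limit)
    {
      struct sysinfo si;
      if (sysinfo(&si) == 0) {
        uint64_t avail = (uint64_t)si.freeram * si.mem_unit;
        if (avail < estimated_limit)
          estimated_limit = avail;
      }
      if (estimated_limit < 256 * MiB)
        estimated_limit = 256 * MiB;   /* never go below the minimum */
      return estimated_limit;
    }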

comment:12 Changed 4 weeks ago by nickm

Keywords: security added

comment:13 Changed 4 weeks ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:14 Changed 4 weeks ago by nickm

Keywords: 033-included-20180320 added

Mark 033-must tickets as triaged-in for 0.3.3

comment:15 Changed 3 weeks ago by ahf

We should probably come to a decision here on what we'd like to do in order to get this into 033 (if we think it's still important).

I think David's comment in comment 11 looks reasonable, but reaching some kind of consensus would be good.

comment:16 Changed 3 weeks ago by arma

Do we have a handle on what most of the memory is being used for, when usage gets huge?

I ask because if it's queued cells on circuits, the proposed work on #25226 should help us to kill problematic circuits before they trigger the OOM.

And we might want to expand that technique to tackle other categories of things, where if we can identify a "woah that is using way more than it should" situation we can take care of it before the OOM killer has to do it.

comment:17 Changed 2 weeks ago by ahf

Status: needs_information → needs_review

comment:18 Changed 2 weeks ago by dgoulet

Reviewer: dgoulet

comment:19 Changed 2 weeks ago by dgoulet

Status: needs_review → needs_revision

lgtm; except for one comment in the code. Putting it back in needs_revision but after that, it should go in merge_ready.

comment:20 Changed 2 weeks ago by ahf

Status: needs_revision → needs_review

Updated comment.

comment:21 Changed 2 weeks ago by dgoulet

Status: needs_review → merge_ready

Thanks ahf. Love it. Sorry about the nitpick!

lgtm;

comment:22 Changed 11 days ago by nickm

One more comment tweak needed -- I'll do it postmerge.

comment:23 Changed 11 days ago by nickm

Status: merge_ready → needs_revision

Hang on -- this branch seems to be based on master.

Please do the remaining comment tweak, and base it on maint-0.3.3?

comment:24 Changed 8 days ago by dgoulet

Status: needs_revision → merge_ready

See branch: ticket24782_033_01.

I cherry picked the commits from ahf's branch and addressed nickm's review in fixup 418d1ac115babbe6.

comment:25 Changed 8 days ago by nickm

Resolution: fixed
Status: merge_ready → closed

Thanks; merging!

comment:26 Changed 8 days ago by nickm

Note that this patch caused 32-bit builds to break. I've tried to fix this with 4aaa4215e7e11f318c5a50124e29dc0b50ce21e1.

We should have Travis check 32-bit builds somehow if we can.

comment:27 Changed 8 days ago by nickm

Fixed another compile-time warning in the tests, this time with 46795a7be63b9a1b90a59fcf9efda4f4f1eacc37

comment:28 in reply to:  26 Changed 7 days ago by teor

Replying to nickm:

Note that this patch caused 32-bit builds to break. I've tried to fix this with 4aaa4215e7e11f318c5a50124e29dc0b50ce21e1.

We should have Travis check 32-bit builds somehow if we can.

Travis doesn't have 32 bit machines yet:
https://github.com/travis-ci/travis-ci/issues/986

We could do a 32-bit build and test on 64-bit macOS, but we don't use macOS on Travis because it's slow.
(Homebrew also doesn't support 32-bit libraries on macOS, so we'd have to install them through MacPorts, which is slow.)

I'm not sure if there is a similar option for Linux.
