Opened 5 years ago

Closed 5 years ago

Last modified 3 years ago

#10169 closed defect (fixed)

Extend OOM handler to cover channels/connection buffers

Reported by: nickm Owned by:
Priority: High Milestone: Tor: 0.2.5.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay, oom, 024-backport, 2016-bug-retrospective
Cc: robgjansen, Flo, nickm, rpw Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Right now, the OOM handler looks at circuits but not streams. We should fix that. Any ideas for a good algorithm?

Child Tickets

Attachments (2)

ram.time.png (7.2 KB) - added by robgjansen 5 years ago.
The oom defense was not triggered.
all.ram.time.png (14.9 KB) - added by robgjansen 5 years ago.
OOM defense works well in controlled test network.

Download all attachments as: .zip

Change History (28)

comment:1 Changed 5 years ago by robgjansen

This is for the upload to server variant of the sniper attack. For each exit stream at the exit relay, track a LastReadTimestamp - the last time that stream was read from by the destination. Then, while the OOM killer is checking all circuits for the one with the oldest cell, have it also consider the exit streams' LastReadTimestamps and kill the oldest circuit/stream accordingly.

What breaks here?

comment:2 Changed 5 years ago by nickm

Keywords: 024-backport added

comment:3 Changed 5 years ago by nickm

"How long has the oldest cell on the circuit been there" (what we did for #9093) is not quite the same thing as "when did this stream last successfully write" (what I think you're proposing above. The latter check is trivially defeated by a slow but steady drain, right?

comment:4 Changed 5 years ago by robgjansen

Yeah, you're right, I wasn't very clear. I was trying to suggest something similar to a timestamp per cell like we have with circuit queues. How about we keep a timestamp queue on the streams. We append a timestamp for every N bytes written from the circuit to the edge buffer, and then pop the head of the timestamp queue after we flush N bytes. Will this allow us to track how long every N byte chunk has been waiting in the buffer?

comment:5 Changed 5 years ago by nickm

Hm. I bet we could do something clever with the buf_chunk_t structure.

comment:6 Changed 5 years ago by nickm

Branch "bug10169_023" is a very hurried implementation of an approximation of this.

What it is missing is:

  • it needs to aggressively free the bytes in the buffers rather than just marking the connections, to free up RAM faster.
  • it needs to count the bytes in the buffers as it's removing them, to know when it's freed 10% of the total allocations here.
  • probably something else; I'm in a hurry

comment:7 Changed 5 years ago by nickm

Status: newneeds_review

I've resolved the items I know about above, and added a #9686 fix. Please review this branch? (bug10169_023)

Despite its name, I would be happy to let it percolate in 0.2.5 for a while before merging it into 0.2.4 or 0.2.3.

comment:8 Changed 5 years ago by nickm

Re-pinging folks: I really do need a review for this one. Anybody?

comment:9 Changed 5 years ago by nickm

I've forward-ported to bug10169_024 and bug10169_025; I'm writing unit tests for the latter since the unit testing framework in 0.2.5 is what I need here. I'm doing a branch that combines unit tests and fixes as bug10169_025_tmp; I'm going to split them up into separate commits and cherry-pick the bugfixes to bug10169_023 while cherry-picking the tests in bug10169_025.

comment:10 Changed 5 years ago by sysrqb

Reviewed bug10169_024.

total_bytes_allocated_in_chunks still has a DOCDOC in buffers.c

I wonder if END_CIRC_REASON_RESOURCELIMIT can be used to perform a modified sniper-based oracle attack if there's another way to fill the remaining 10% when the 90% memory usage is reached.

It might be useful to the relay operator if Tor says how many circuits remained alive after circuits_handle_oom runs, e.g. modify the log notice at the end of that function.

  int n_circuits_alive = smartlist_len(circlist) - n_circuits_killed;
  log_notice(LD_GENERAL, "Removed "U64_FORMAT" bytes by killing %d "
                         "circuits, %d circuits remain alive.",
             U64_PRINTF_ARG(mem_recovered), n_circuits_killed,
             n_circuits_alive);

Maybe the comments for circuit_get_streams_max_data_age and marked_circuit_streams_free_bytes should notate that they are helpers for circuit_max_queued_data_age and marked_circuit_free_stream_bytes, respectively?

Is it possible that when the stream buffers are "aggressively freed" using chunk_free_unchecked() they may not actually be freed, but prepended to a freelist and, as a result, less memory is actually freed in circuits_handle_oom() than is expected?

comment:11 Changed 5 years ago by nickm

I've added some fixes to bug10169_023, bug10169_024, and bug10169_025_v2. (Note the new branch!)

Bug10169_025_v2 now has some tests for oom and other issues. I still need to address sysrqb's comments above.

comment:12 in reply to:  10 Changed 5 years ago by nickm

Replying to sysrqb:

Reviewed bug10169_024.

total_bytes_allocated_in_chunks still has a DOCDOC in buffers.c

Will fix later in 0.2.5.

I wonder if END_CIRC_REASON_RESOURCELIMIT can be used to perform a modified sniper-based oracle attack if there's another way to fill the remaining 10% when the 90% memory usage is reached.

The idea being, cause a node to go nearly OOM, and then see which streams (as a client!) got END_STREAM_REASON_RESOURCELIMIT, so you know that you're nearly at the OOM point, and then somehow make the node consume another .1 * MaxMem ?

If that's what you meant, it would work, but I think it only means that MaxMem needs to be set conservatively, and we need to be on the lookout for other ways to pump up a node's memory consumption. Even if we didn't send END_STREAM_REASON_RESOURCELIMIT, an attacker could still snipe a node if they know a way to make it run out of memory without its buffers and cell queues exceeding .9*MaxMem.

It might be useful to the relay operator if Tor says how many circuits remained alive after circuits_handle_oom runs, e.g. modify the log notice at the end of that function.

  int n_circuits_alive = smartlist_len(circlist) - n_circuits_killed;
  log_notice(LD_GENERAL, "Removed "U64_FORMAT" bytes by killing %d "
                         "circuits, %d circuits remain alive.",
             U64_PRINTF_ARG(mem_recovered), n_circuits_killed,
             n_circuits_alive);

Done in 79c234e0e3fa22d76029bd3b5e2c52072709cff3

Maybe the comments for circuit_get_streams_max_data_age and marked_circuit_streams_free_bytes should notate that they are helpers for circuit_max_queued_data_age and marked_circuit_free_stream_bytes, respectively?

Is it possible that when the stream buffers are "aggressively freed" using chunk_free_unchecked() they may not actually be freed, but prepended to a freelist and, as a result, less memory is actually freed in circuits_handle_oom() than is expected?

I think I fixed that in fd28754dd3dce0e00304825d531348414c0a354b

The branches to review are still bug10169_023 (for the main patch), bug10169_024 (for the merge), and bug10169_025_v2 (for the merge and the tests)

comment:13 Changed 5 years ago by andrea

Code review for NickM's bug10169_023 branch:

https://gitweb.torproject.org/nickm/tor.git/shortlog/refs/heads/bug10169_023
91ec6f7269bd7a5b73629f38e9779e84a0fb84f2..79c234e0e3fa22d76029bd3b5e2c52072709cff3

91ec6f7269bd7a5b73629f38e9779e84a0fb84f2:

  • I don't like the name buf_get_oldest_chunk_timestamp() for a function that returns an age rather than a timestamp.
  • Can anything horrible happen with all this if the clock gets reset?
    • Perhaps it would be wise to use clock_gettime(CLOCK_MONOTONIC, ...) where available if we aren't doing so already.
  • The adjustment in chunk_grow() is wrong but you fixed it in 79515917449c7e0d92f16db0d1e5af4a0370bbab

eabcab2b7caab75ba8607c7dac225e4533998e80:

  • This looks okay to me.

a406f6d0f05f812ced4bfc048f30bc8692be7e93:

  • This looks okay to me.

03ff21b018c8d0b005d2a60c3ba2bf08d6cb00bb:

  • This looks okay to me.

e572ec856dff263d7f43ec6d42fe3fc3b7557f73:

  • This looks okay to me.

647248729fa65f0e51d062e2af8f4e8b38592bf5:

  • This looks okay to me.

79515917449c7e0d92f16db0d1e5af4a0370bbab:

  • This looks okay to me.

fd28754dd3dce0e00304825d531348414c0a354b:

  • Yeah, this might be a good move :)

79c234e0e3fa22d76029bd3b5e2c52072709cff3:

  • This looks okay to me.

Code review for NickM's bug10169_024 branch:

https://gitweb.torproject.org/nickm/tor.git/shortlog/refs/heads/bug10169_024

  • All of bug10169_023 plus merges:
    • 5c45a333c3cdfc4c7a817425a1c3ae88085c389b
    • 05d8111eedee9e11e4bb1c42e93ae2fc168d52ec

5c45a333c3cdfc4c7a817425a1c3ae88085c389b:

  • This looks okay to me.

05d8111eedee9e11e4bb1c42e93ae2fc168d52ec:

  • This looks okay to me.

Code review for bug10169_025 branch:

https://gitweb.torproject.org/nickm/tor.git/shortlog/refs/heads/bug10169_025_v2

  • All of bug10169_024 plus 87fb1e324c1b3214765c46bec3d9ec6adc3fa83d..c8d41da52d6ae1edd1c4999e328b1e7eadc0ab5b

87fb1e324c1b3214765c46bec3d9ec6adc3fa83d:

  • This looks okay to me.

eb6f433bdbd6cf44e1f35272ed04a6c0e14f3c2d:

  • This looks okay to me.

f425cf833852c4e1b4660cbba6190d04b070f6b6:

  • This looks okay to me.

d379fc6e0ffce916753e5ef1ac0783703d150fa5:

  • This looks okay to me.

52d222aafbc21d674624fdd4c8fc834a40af69c7:

  • This looks okay to me.

9a07ec751ff73062b958a8fc9f8437bed72e144c:

  • This looks okay to me.

48877e24a880be41e4dad50f5ebb6e0671e9f92f:

  • This looks okay to me.

c8d41da52d6ae1edd1c4999e328b1e7eadc0ab5b:

  • This looks okay to me.

comment:14 in reply to:  13 Changed 5 years ago by nickm

Replying to andrea:

Code review for NickM's bug10169_023 branch:

https://gitweb.torproject.org/nickm/tor.git/shortlog/refs/heads/bug10169_023
91ec6f7269bd7a5b73629f38e9779e84a0fb84f2..79c234e0e3fa22d76029bd3b5e2c52072709cff3

91ec6f7269bd7a5b73629f38e9779e84a0fb84f2:

  • I don't like the name buf_get_oldest_chunk_timestamp() for a function that returns an age rather than a timestamp.

Okay, let's change that in the 0.2.5 version.

  • Can anything horrible happen with all this if the clock gets reset?

We could kill the wrong circuits if the clock goes backwards and then doesn't catch up with itself before we hit an OOM.

  • Perhaps it would be wise to use clock_gettime(CLOCK_MONOTONIC, ...) where available if we aren't doing so already.

Can that be an 0.2.5 only thing? Doing a portable monotonic timer is a bit tricksy. On Linux, you want clock_gettime(CLOCK_MONOTONIC_COARSE). On OSX, you want mach_absolute_time(). On other Unix, you want clock_gettime(CLOCK_MONOTONIC) if possible. On Windows, there's a complicated mishmash of things using QueryPerformanceCounter(), GetTickCount64(), and GetTickCount(). As a fallback, you can use gettimeofday() and check the result to make sure it doesn't go backwards.

I guess we could do just the fallback check-and-latch in 0.2.3/0.2.4, and aim for the more complex ones in 0.2.5 or later?

comment:15 Changed 5 years ago by nickm

I've added 833d027778ba97020fb5ded1d94e4b21fbcab766 to bug10169_023 ; if you like it, I'll merge it forwards. It implements a trivial latch to make sure that our time can't go backwards there.

comment:16 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-finalTor: 0.2.4.x-final

Andrea said this looked good to merge into 0.2.5. I've updated the branches "bug10169_024" and "bug10169_025_v2", and merged the latter.

We should consider this for backport to 0.2.4.

comment:17 in reply to:  16 Changed 5 years ago by robgjansen

Replying to nickm:

Andrea said this looked good to merge into 0.2.5. I've updated the branches "bug10169_024" and "bug10169_025_v2", and merged the latter.

FYI, I have not forgotten about testing this defense. I've been trying to test bug10169_025_v2 in Shadow for the last week. I've been running into bugs that appeared after updating the version of Tor I'm using in Shadow. Stay tuned.

Changed 5 years ago by robgjansen

Attachment: ram.time.png added

The oom defense was not triggered.

comment:18 Changed 5 years ago by robgjansen

After merge hell, I finally got this working. I set MaxMemInQueues to 50MB, which automatically gets pushed up to the minimum of 256MB. I've attached a graph showing that circuits_handle_oom() did not appear to be triggered.

https://trac.torproject.org/projects/tor/attachment/ticket/10169/ram.time.png

Is there some way for me to tell what the victim thought its memory was over time? Or should I just add a log message for that?

Last edited 5 years ago by robgjansen (previous) (diff)

comment:19 Changed 5 years ago by nickm

I'd suggest adding a log message. Also, what are you testing that gets you "merge hell"? I'd suggest that you just test master, or 0.2.5.3-alpha.

comment:20 in reply to:  19 Changed 5 years ago by robgjansen

Replying to nickm:

I'd suggest adding a log message.

OK.

Also, what are you testing that gets you "merge hell"? I'd suggest that you just test master, or 0.2.5.3-alpha.

This was mostly due to the fact that I implemented some necessary client pieces for the attack on 0.2.3.25, and made the mistake of optimistically assuming everything would work OOTB instead of starting my testing on a minimal network. So yes, its my own fault.

comment:21 Changed 5 years ago by nickm

ok. Please try against 0.2.5.3-alpha in that case if at all possible?

Changed 5 years ago by robgjansen

Attachment: all.ram.time.png added

OOM defense works well in controlled test network.

comment:22 Changed 5 years ago by robgjansen

TLDR, the defense seems to be working correctly.

I tried this out on my small 10 node test network in Shadow, where all relays has ample 10 MiB/s connections. I merged both my sniper attack code and nickm's bug10169_025_v2 with tor-0.2.5.2-alpha. Then I tested the sniper attack using 1 team of 10 circuits (1 client instance to use a ping circuit to measure rtt, 1 client instance to launch 9 sniper circuits). I tested the attack without nickm's defense, and with nickm's defense using MaxMemInQueues 50 MB (which automatically gets adjusted up to 256MB). Then I ran a second test with 2 teams of 10 circuits.

The results are in the attack graph. Both the graph and the log file indicates that the sniper's circuits were successfully killed after memory exceeded the 256MB limit.

I'm not exactly sure why the defense was not being triggered before, but looking back at my config I may have been using MaxMemInQueues of 500 MB (which would have been to large to trigger OOM killer).

comment:23 Changed 5 years ago by nickm

Recommendation: unsure about 0.2.4.22 backport due to complexity.

comment:24 Changed 5 years ago by arma

I think given the complexity, and given that some fast relays are running 0.2.5, it is not critical to get this fix to the remaining 0.2.4 relays.

comment:25 Changed 5 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: 0.2.5.x-final
Resolution: fixed
Status: needs_reviewclosed

Okay, no backport to 0.2.4 for these, for stability reasons.

comment:26 Changed 3 years ago by nickm

Keywords: 2016-bug-retrospective added

Mark more tickets for bug retrospective based on hand-review of changelogs from 0.2.5 onwards.

Note: See TracTickets for help on using tickets.