Opened 8 years ago

Closed 7 years ago

Last modified 7 years ago

#5263 closed defect (fixed)

Busy/infinite Libevent loops

Reported by: robgjansen
Owned by: nickm
Priority: High
Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor
Version:
Severity:
Keywords: tor-relay
Cc: nickm, arma, robgjansen
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

There is a bug causing busy loops in Libevent and infinite loops in the Shadow simulator. The bug is triggered by a connection that is marked for close and wants to flush, and so is held open to flush, but is rate-limited (its write token bucket is empty).

A patch against the latest alpha (tor-0.2.3.12-alpha) is attached; details are below.

This currently happens in the read and write callbacks when the active socket is marked for close. In that case, Tor doesn't actually try to complete the read or write (it returns from those functions when the connection is marked), but instead tries to clear the connection with conn_close_if_marked(). Tor will not close a marked connection that still contains data: it must be flushed first. The bug occurs when this flush cannot happen because the connection is rate-limited (its write token bucket is empty).

The fix is to detect when rate limiting is preventing a marked connection from flushing. In this case, the connection should be flagged as read/write_blocked_on_bandwidth and its read/write events de-registered from Libevent. When the token bucket is refilled, the refill code checks the associated read/write_blocked_on_bandwidth flag and adds the read/write event back to Libevent, which causes it to fire again. This time, the connection is properly flushed and closed.

The reason that both the read and write events are de-registered when the marked connection cannot flush is that both result in the same behavior. Read and write events on marked connections will never again do any actual reads or writes; they are only useful to trigger the flush and close the connection. By setting the associated read/write_blocked_on_bandwidth flag, we ensure that the event gets added back to Libevent and the connection is properly flushed and closed.
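A minimal sketch of that logic (this is not the attached patch verbatim; placing the check inside conn_close_if_marked() and testing the bucket via connection_bucket_write_limit() are assumptions, though those helpers and the read/write_blocked_on_bw flags do exist in the tree):

/* Sketch only: inside conn_close_if_marked(), when the marked connection
 * still has data to flush but rate limiting blocks the write. */
if (connection_wants_to_flush(conn) &&
    connection_bucket_write_limit(conn, approx_time()) <= 0) {
  /* Remember that we are blocked on bandwidth, and take both events out
   * of Libevent so they stop firing uselessly. */
  if (connection_is_reading(conn)) {
    conn->read_blocked_on_bw = 1;
    connection_stop_reading(conn);
  }
  if (connection_is_writing(conn)) {
    conn->write_blocked_on_bw = 1;
    connection_stop_writing(conn);
  }
  /* connection_bucket_refill() sees the *_blocked_on_bw flags, re-adds the
   * events, and the next callback flushes and closes the connection. */
  return 0;
}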

Why is this important? Every Shadow event occurs at a discrete time instant. If Tor does not properly de-register Libevent events that fire but result in Tor doing essentially nothing, Libevent will fire the event again immediately. In Shadow this means an infinite loop; outside of Shadow it means wasted CPU cycles.
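To make the failure mode concrete, here is a minimal, self-contained Libevent example (not Tor code): a persistent write event whose callback does nothing. Level-triggered readiness plus a still-registered event means the callback fires on every loop iteration.

#include <event2/event.h>

static void do_nothing_cb(evutil_socket_t fd, short what, void *arg)
{
  (void)fd; (void)what; (void)arg;
  /* The socket stays writable and the event stays registered, so
   * event_base_dispatch() invokes this callback again immediately. */
}

int main(void)
{
  struct event_base *base = event_base_new();
  /* fd 1 (stdout; assume a tty or pipe) is essentially always writable. */
  struct event *ev = event_new(base, 1, EV_WRITE | EV_PERSIST,
                               do_nothing_cb, NULL);
  event_add(ev, NULL);
  event_base_dispatch(base);   /* spins at 100% CPU; never returns */
  return 0;
}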

Child Tickets

Attachments (2)

0001-Fix-busy-Libevent-loops-infinite-loops-in-Shadow.patch (3.3 KB) - added by robgjansen 8 years ago.
config_vanilla.tar.xz (37.9 KB) - added by robgjansen 7 years ago.
'small' Shadow config bundle (uses about 10 GiB of RAM)


Change History (35)

comment:1 Changed 8 years ago by nickm

Component: - Select a component → Tor Relay
Milestone: Tor: 0.2.2.x-final

comment:2 Changed 8 years ago by nickm

Priority: normal → major

comment:3 Changed 8 years ago by nickm

Status: new → needs_review

This is much simpler than the last version of this patch; thanks!

The reason that both the read and write events are de-registered when the marked connection cannot flush is that both result in the same behavior. Read and write events on marked connections will never again do any actual reads or writes; they are only useful to trigger the flush and close the connection.

I think I agree with this patch except for this question. Is it really the case that we reach this point with the connection still reading? Keep in mind that this case can only happen if the connection is marked and is busy flushing; a connection in that condition *shouldn't* be reading. Did you run into some that were? If I missed something, though: sz==0 in that case doesn't mean that read is blocked on bandwidth; no read was attempted in the code above, and the read bucket wasn't examined.

Can you tell me more about the read case here?

comment:4 in reply to:  3 ; Changed 8 years ago by robgjansen

Cc: jansen@… added

The reason that both the read and write events are de-registered when the marked connection cannot flush is that both result in the same behavior. Read and write events on marked connections will never again do any actual reads or writes; they are only useful to trigger the flush and close the connection.

I think I agree with this patch except for this question. Is it really the case that we reach this point with the connection still reading? Keep in mind that this case can only happen if the connection is marked and is busy flushing; a connection in that condition *shouldn't* be reading. Did you run into some that were? If I missed something, though: sz==0 in that case doesn't mean that read is blocked on bandwidth; no read was attempted in the code above, and the read bucket wasn't examined.

Can you tell me more about the read case here?

I did run into a case where a connection was reading.

You are technically correct about sz==0 not implying that the read is blocked on bandwidth. The reason I added that check was to catch reading connections that are flushing but not registered as writing, because we need a way for the event to later get activated, flushed, and removed. Would it be better to remove the read event and set write_blocked_on_bandwidth to ensure removal later?

comment:5 in reply to:  4 Changed 8 years ago by nickm

Replying to robgjansen:

The reason that both the read and write events are de-registered when the marked connection cannot flush is that both result in the same behavior. Read and write events on marked connections will never again do any actual reads or writes; they are only useful to trigger the flush and close the connection.

I think I agree with this patch except for this question. Is it really the case that we reach this point with the connection still reading? Keep in mind that this case can only happen if the connection is marked and is busy flushing; a connection in that condition *shouldn't* be reading. Did you run into some that were? If I missed something, though: sz==0 in that case doesn't mean that read is blocked on bandwidth; no read was attempted in the code above, and the read bucket wasn't examined.

Can you tell me more about the read case here?

I did run into a case where a connection was reading.

You are technically correct about sz==0 not implying that the read is blocked on bandwidth. The reason I added that check was to catch reading connections that are flushing but not registered as writing, because we need a way for the event to later get activated, flushed, and removed. Would it be better to remove the read event and set write_blocked_on_bandwidth to ensure removal later?

If the connection is marked, it really shouldn't be reading at all. I'd be curious what kind of connections these are, and what made them read. Disabling 'read' if it's on is okay, I think. And if we tried to write but couldn't because of bandwidth, then write_blocked_on_bandwidth does sound appropriate.

comment:6 Changed 8 years ago by nickm

To be concrete, can you try out my branch "bug5263" based on 0.2.2 with your patch? It adds a log message if the thing that I think can't happen tries to happen anyway.

comment:7 Changed 8 years ago by nickm

Milestone: Tor: 0.2.2.x-final → Tor: 0.2.3.x-final

comment:8 Changed 8 years ago by nickm

Added a few more fixups and a changes file. Rob, did you ever have a chance to test the version from 3 months ago?

comment:9 Changed 8 years ago by nickm

err, that should be "3 weeks ago". Can't read today.

comment:10 in reply to:  9 ; Changed 8 years ago by robgjansen

Replying to nickm:

err, that should be "3 weeks ago". Can't read today.

I have not changed the patch or anything in the bug5263 branch. And sorry for the delay, but I'll try to run the bug5263 branch in Shadow by the end of the week.

comment:11 Changed 8 years ago by robgjansen

Cc: nickm arma robgjansen added; nickm@… arma@… jansen@… removed

comment:12 in reply to:  10 Changed 8 years ago by robgjansen

Replying to robgjansen:

Replying to nickm:

err, that should be "3 weeks ago". Can't read today.

I have not changed the patch or anything in the bug5263 branch. And sorry for the delay, but I'll try to run the bug5263 branch in Shadow by the end of the week.

So, I've been running this with your branch, and I've found another problem. In the middle of connection.c, around line 2844, the following statement causes it:

else if (conn->state == OR_CONN_STATE_TLS_SERVER_RENEGOTIATING) {
  return connection_handle_read(conn);
}

This basically causes a read call on a write callback. When that call to connection_handle_read returns 0, the write callback returns without deregistering itself: an infinite loop in Shadow.

Something like the following should fix it:

else if (conn->state == OR_CONN_STATE_TLS_SERVER_RENEGOTIATING) {
  int read_result = connection_handle_read(conn);
  if (read_result <= 0) {
    /* We are supposed to be writing, but we swapped over to reading
     * and that didn't work either, so stop writing. */
    if (connection_is_writing(conn)) {
      conn->write_blocked_on_bw = 1;
      connection_stop_writing(conn);
    }
    return read_result;
  }
}

I think there may be more issues too. Any reason we didn't use 0.2.3.12-alpha (instead of 0.2.2) to isolate my original bug?

comment:13 Changed 8 years ago by nickm

The branch is based on 0.2.2 because that's where I thought we were originally focusing it; it should merge cleanly into almost any 0.2.3.x tag, though. master would be particularly good.

Have you tested the fix you posted above, and did it work?

Did the warning from the branch show up during your testing?

comment:14 Changed 8 years ago by nickm

Oh, also, you say:

I think there may be more issues too.

In your opinion, is it still worthwhile to merge this into 0.2.3.x, or do you think we should trace down these other issues first?

comment:15 in reply to:  13 Changed 8 years ago by robgjansen

Replying to nickm:

Have you tested the fix you posted above, and did it work?

It got past the first place where it was looping, but seemed to get stuck at another place a little later in the experiment.

Did the warning from the branch show up during your testing?

I have not seen the warning yet, mostly because I think it hasn't had a chance to go down that code path yet (see above).

comment:16 in reply to:  14 Changed 8 years ago by robgjansen

Replying to nickm:

Oh, also, you say:

I think there may be more issues too.

In your opinion, is it still worthwhile to merge this into 0.2.3.x, or do you think we should trace down these other issues first?

It's still worthwhile. I've been running many experiments on 0.2.3.13-alpha with my original patch without any of these other issues showing up. My plan is to run more experiments after merging into master, as you suggested.

comment:17 Changed 8 years ago by nickm

wrt the OR_CONN_STATE_TLS_SERVER_RENEGOTIATING issue -- I think the problem might be here, around line 2818 (in master connection.c):

      case TOR_TLS_WANTWRITE:
        connection_start_writing(conn);
        return 0;
      case TOR_TLS_WANTREAD: /* we're already reading */
      case TOR_TLS_DONE: /* no data read, so nothing to process */
        result = 0;
        break; /* so we call bucket_decrement below */

The "wantread" case isn't correct if connection_handle_read is called from connection_handle_write_impl. I think instead it should say something like

      case TOR_TLS_WANTREAD:
        if (connection_is_writing(conn))
          connection_stop_writing(conn);
        if (! connection_is_reading(conn))
          connection_start_reading(conn);
      ...

or something of that kind. (Very sleepy now; I may be doing this poorly; please read the above critically.)

comment:18 Changed 8 years ago by nickm

(BTW, is this something I can reproduce locally? I've set up a shadow instance at home to the extent of being able to run the scallion.xml test, but I have no idea what I would do to see whether I'm hitting infinite loops, whether a patch fixes this, etc.)

comment:19 Changed 8 years ago by nickm

Hm. What I said above about making the change in connection_read_to_buf: I think that's wrong; we don't want to stop writing every time we get a wantread. We only want to stop writing in the case you mentioned, where the write callback ends up calling connection_handle_read(), that read fails, and we get a "wantread". But we don't want to do that in just any case where read_result <= 0: only in the case where it actually said wantread. Also, we need some way to make sure that once the operation eventually succeeds, we enable writing again, unless there already is such a thing.

comment:20 in reply to:  18 Changed 8 years ago by robgjansen

Replying to nickm:

(BTW, is this something I can reproduce locally? I've set up a shadow instance at home to the extent of being able to run the scallion.xml test, but I have no idea what I would do to see whether I'm hitting infinite loops, whether a patch fixes this, etc.)

Yes. It takes about 10 seconds to hit the infinite loop in master without my patch, so you won't have to worry about having enough memory. I'll help you get Shadow set up over email.

comment:21 in reply to:  19 ; Changed 8 years ago by robgjansen

Replying to nickm:

Hm. What I said above about making the change in connection_read_to_buf: I think that's wrong; we don't want to stop writing every time we get a wantread: we only want to stop writing in the case you mentioned, where we try to read because connection_handle_read() failed, and we get a "wantread". But we don't want to do that in just any case where read_result <= 0 : only in the case where it actually said wantread. Also, we need some way to make sure that once the operation eventually succeeds, we enable writing again, unless there already is such a thing.

Maybe some type of 'event source type' code or something would be helpful here.

I merged master into a local copy of your bug5263 branch. The OR_CONN_STATE_TLS_SERVER_RENEGOTIATING and other issues no longer appear. I'm waiting for the bug log message. I recall the bug appearing after a few hours of an experiment.

comment:22 Changed 8 years ago by robgjansen

I just realized that, to notice the infinite-loop bug when running in Shadow, you'll have to comment out or remove lines 45 and 46 in scallion/contrib/patch.sh before building and installing Scallion.

Changed 7 years ago by robgjansen

Attachment: config_vanilla.tar.xz added

'small' Shadow config bundle (uses about 10 GiB of RAM)

comment:23 in reply to:  21 Changed 7 years ago by robgjansen

Replying to robgjansen:

I merged master into a local copy of your bug5263 branch. The OR_CONN_STATE_TLS_SERVER_RENEGOTIATING and other issues no longer appear. I'm waiting for the bug log message. I recall the bug appearing after a few hours of an experiment.

Using Shadow and Scallion master as of yesterday, Tor master merged into your bug5263 branch, and this config bundle, the new BUG message prints once:

3:34:38:184038 [thread-0] 0:50:23:979790999 [scallion-warning] 
 [nonexit25-195.1.0.0] [intercept_logv] [tor-warn] BUG: conn_close_if_marked() 
 Marked connection (fd 1000151, type OR) is still reading; that shouldn't happen.

comment:24 Changed 7 years ago by nickm

I'm guessing that the very very big fd is a Shadow behavior, and not a bug. I wonder how this is possible. I think that right now I should add a little more information to that debugging statement, merge it as-is, and see if the other errors come back, and fix them as they do. No sense in not merging the fixes we have simply because we think we will need more fixes as well.

comment:25 Changed 7 years ago by nickm

Owner: set to nickm
Status: needs_review → accepted

I have rebased, squashed, and added a little more information to the BUG log. The branch is now "bug5263_023". I am merging it into master.

Putting this back into state "accepted" as we look for more issues.

comment:26 Changed 7 years ago by nickm

(I bet that we will find that the OR state of the connections that give these warnings is some kind of handshaking state.)

comment:27 in reply to:  24 Changed 7 years ago by robgjansen

Replying to nickm:

I'm guessing that the very very big fd is a Shadow behavior, and not a bug. I wonder how this is possible.

Shadow gives out descriptors starting at 1000000 so it can distinguish system calls on virtual sockets (which it intercepts and manages) from those on file descriptors (which it lets the underlying OS handle).
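Purely as an illustration of that design choice (the constant and the function names below are made up, not Shadow's actual interception code), an intercepted syscall wrapper can branch on the descriptor range:

#include <unistd.h>

#define VIRTUAL_FD_BASE 1000000   /* assumption: illustrative constant */

/* Hypothetical Shadow-side handler, declared only to keep the sketch
 * self-contained. */
ssize_t shadow_virtual_write(int fd, const void *buf, size_t count);

/* Hypothetical shape of an intercepted write(): descriptors at or above
 * the base are Shadow-managed virtual sockets; anything below is passed
 * through to the real OS. */
ssize_t intercepted_write(int fd, const void *buf, size_t count)
{
  if (fd >= VIRTUAL_FD_BASE)
    return shadow_virtual_write(fd, buf, count);
  return write(fd, buf, count);
}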

comment:28 Changed 7 years ago by nickm

Does Shadow still have these issues and need to patch Tor because of them, or is this bug cleared up now?

comment:29 Changed 7 years ago by nickm

Status: accepted → needs_information

comment:30 in reply to:  28 Changed 7 years ago by robgjansen

Replying to nickm:

Does Shadow still have these issues and need to patch Tor because of them, or is this bug cleared up now?

Multiple experiments indicate that it has been cleared up.

comment:31 Changed 7 years ago by nickm

Resolution: fixed
Status: needs_information → closed

Great! I'll close this then. Please reopen if we turn out to be wrong; please open a new ticket linking to this one if you discover any new similar issues.

Thanks again!

comment:32 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:33 Changed 7 years ago by nickm

Component: Tor Relay → Tor