Opened 4 weeks ago

Closed 3 weeks ago

#23676 closed defect (fixed)

kist on 0.3.2.1-alpha-dev beats its head against a wall trying to flush a conn that's closed

Reported by: arma Owned by:
Priority: High Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: regression, cpu, tor-sched, 0.3.2.2-alpha-must
Cc: pastly Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

moria1 running "0.3.2.1-alpha-dev (git-a1f31c0671c82f95)" last wrote to its info-level log this morning at 05:15:

Sep 27 05:15:09.482 [info] connection_close_immediate(): fd 334, type OR, state open, 146672 bytes on outbuf.

That was 9 hours ago.

The process is still alive, and it's using 100% cpu.

I did a gdb attach, and it looks like it is doing...stuff. Here are three snapshots:

#0  channel_flush_some_cells (chan=0x7f2e04c84c80, num_cells=1)
    at src/or/channel.c:2355
#1  0x00007f2df60bd915 in kist_scheduler_run () at src/or/scheduler_kist.c:589
#2  0x00007f2df60bb7e2 in scheduler_evt_callback (fd=<value optimized out>,
    events=<value optimized out>, arg=<value optimized out>)
    at src/or/scheduler.c:215
#3  0x00007f2df6201040 in event_process_active_single_queue ()
#4  0x00007f2df6201610 in event_process_active ()
#5  0x00007f2df6201d74 in event_base_loop ()
#6  0x00007f2df60530dd in run_main_loop_once () at src/or/main.c:2631
#7  run_main_loop_until_done () at src/or/main.c:2685
#8  do_main_loop () at src/or/main.c:2598
#9  0x00007f2df605421d in tor_main (argc=<value optimized out>,
    argv=<value optimized out>) at src/or/main.c:3780
#10 0x00007f2df604fd99 in main (argc=<value optimized out>,
    argv=<value optimized out>) at src/or/tor_main.c:34
#0  channel_more_to_flush (chan=0x7f2e04c84c80) at src/or/channel.c:2609
#1  0x00007f2df60bd868 in kist_scheduler_run () at src/or/scheduler_kist.c:600
#2  0x00007f2df60bb7e2 in scheduler_evt_callback (fd=<value optimized out>,
    events=<value optimized out>, arg=<value optimized out>)
    at src/or/scheduler.c:215
#3  0x00007f2df6201040 in event_process_active_single_queue ()
#4  0x00007f2df6201610 in event_process_active ()
#5  0x00007f2df6201d74 in event_base_loop ()
#6  0x00007f2df60530dd in run_main_loop_once () at src/or/main.c:2631
#7  run_main_loop_until_done () at src/or/main.c:2685
#8  do_main_loop () at src/or/main.c:2598
#9  0x00007f2df605421d in tor_main (argc=<value optimized out>,
    argv=<value optimized out>) at src/or/main.c:3780
#10 0x00007f2df604fd99 in main (argc=<value optimized out>,
    argv=<value optimized out>) at src/or/tor_main.c:34
#0  0x00007f2df60bd42e in socket_table_s_HT_FIND_P_ (
    chan=<value optimized out>, table=0x7f2df64f0ee0)
    at src/or/scheduler_kist.c:50
#1  socket_table_s_HT_FIND (chan=<value optimized out>, table=0x7f2df64f0ee0)
    at src/or/scheduler_kist.c:51
#2  socket_table_search (chan=<value optimized out>, table=0x7f2df64f0ee0)
    at src/or/scheduler_kist.c:168
#3  0x00007f2df60bd4a9 in socket_can_write (chan=<value optimized out>,
    table=0x7f2df64f0ee0) at src/or/scheduler_kist.c:372
#4  0x00007f2df60bd968 in kist_scheduler_run () at src/or/scheduler_kist.c:630
#5  0x00007f2df60bb7e2 in scheduler_evt_callback (fd=<value optimized out>,
    events=<value optimized out>, arg=<value optimized out>)
    at src/or/scheduler.c:215
#6  0x00007f2df6201040 in event_process_active_single_queue ()
#7  0x00007f2df6201610 in event_process_active ()
#8  0x00007f2df6201d74 in event_base_loop ()
#9  0x00007f2df60530dd in run_main_loop_once () at src/or/main.c:2631
#10 run_main_loop_until_done () at src/or/main.c:2685
#11 do_main_loop () at src/or/main.c:2598
#12 0x00007f2df605421d in tor_main (argc=<value optimized out>,
    argv=<value optimized out>) at src/or/main.c:3780
#13 0x00007f2df604fd99 in main (argc=<value optimized out>,
    argv=<value optimized out>) at src/or/tor_main.c:34

David suggested that I try strace -f -p PID -o /tmp/strace.log, and I ran it for 30 seconds or so, and it produced this file:

13376 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13375 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13374 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13373 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13372 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13371 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13370 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13369 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <unfinished ...>
13368 futex(0x7f2dff2de65c, FUTEX_WAIT_PRIVATE, 755787, NULL <detached ...>

Child Tickets

Change History (25)

comment:1 Changed 4 weeks ago by arma

(gdb) break scheduler_kist.c:589
Breakpoint 1 at 0x7f2df60bd908: file src/or/scheduler_kist.c, line 589.
(gdb) continue
Continuing.

Breakpoint 1, kist_scheduler_run () at src/or/scheduler_kist.c:589
589           flush_result = (int)channel_flush_some_cells(chan, 1); // 1 for num cells
(gdb) next
591           if (flush_result > 0) {
(gdb) print flush_result
$1 = 0

comment:2 Changed 4 weeks ago by arma

Summary: relay running 0.3.2.1-alpha-dev wedged: kist bug? → kist on 0.3.2.1-alpha-dev beats its head against a wall trying to flush a conn that's closed

comment:3 Changed 4 weeks ago by arma

(gdb) break channel_flush_some_cells
Breakpoint 1 at 0x7f2df60cbd60: file src/or/channel.c, line 2347.
(gdb) continue
Continuing.

Breakpoint 1, channel_flush_some_cells (chan=0x7f2e04c84c80, num_cells=1)
    at src/or/channel.c:2347
2347    {
(gdb) print chan->state
$1 = CHANNEL_STATE_CLOSING
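
Putting this together with comment:1 (flush_result is 0, yet the scheduler keeps going), the failure looks like a busy loop. Below is a tiny standalone C model of that loop shape -- the toy_ names are made up for illustration, it is not the real scheduler_kist.c code -- which spins the same way when flushing returns 0 but "more to flush" stays true:

/* Toy model of the busy loop implied by the backtraces: flushing a closing
 * channel returns 0, but "more to flush" stays true, so the scheduler never
 * gives up on the channel.  Pure illustration -- none of this is Tor code. */
#include <stdio.h>
#include <stdbool.h>

typedef struct { bool closing; int cells_queued; } toy_channel_t;

/* Stand-in for channel_flush_some_cells(): a closing channel flushes nothing. */
static int toy_flush_some_cells(toy_channel_t *chan, int num_cells) {
  (void) num_cells;
  return chan->closing ? 0 : 1;
}

/* Stand-in for channel_more_to_flush(): still reports queued cells even
 * though the channel is closing. */
static bool toy_more_to_flush(const toy_channel_t *chan) {
  return chan->cells_queued > 0;
}

int main(void) {
  toy_channel_t chan = { .closing = true, .cells_queued = 3 };
  long iterations = 0;

  /* The real loop never exits; cap it here so the demo terminates. */
  while (toy_more_to_flush(&chan) && iterations < 1000000) {
    int flushed = toy_flush_some_cells(&chan, 1);
    chan.cells_queued -= flushed;   /* flushed == 0, so this never shrinks */
    iterations++;
  }
  printf("spun %ld times, %d cells still queued\n",
         iterations, chan.cells_queued);
  return 0;
}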

comment:4 Changed 4 weeks ago by nickm

Cc: pastly dgoulet added
Keywords: regression cpu added
Priority: Medium → High

comment:5 Changed 4 weeks ago by dgoulet

Cc: dgoulet removed
Keywords: tor-sched added

comment:6 Changed 4 weeks ago by pastly

Status: new → needs_review

bug23676_032_01 on https://github.com/pastly/public-tor.git

Needs code review and for a couple people to try it (arma and me, probably).

comment:7 Changed 4 weeks ago by pastly

If you're picky like arma, and thus want the new function in one commit and the bugfix in another, you will enjoy bug23676_032_02.

comment:8 Changed 4 weeks ago by pastly

I hit the new log line once already. With an open channel. Interesting.

I wonder if the bad bug only appears when the channel is closing, it's the only channel pending, and it has cells to send. That could explain why we never had a relay lock up before.

comment:9 Changed 4 weeks ago by pastly

I hit the log line ~450 times overnight, always with an open channel. We probably want more information about why we aren't flushing.

comment:10 Changed 4 weeks ago by pastly

I fear my discussion with dgoulet today on IRC may have fallen into the trap of assuming things about where cells are and how many of them are in various places. I think we really need that patch that logs more info about cell counts in the various queues/bufs when we hit this case.

That said, I believe the branch provided above (bug23676_032_02 on https://github.com/pastly/public-tor.git) is a fix for the problem.

There's concern that channel_more_to_flush is checking the wrong queue. David is working on that as I understand it. It might deserve its own ticket. This function has been doing this for a long time, so a change should be tested on a relay or two.

I also understand David to be working on changing some logic in channel_flush_some_cells, based on the logic from our IRC discussion (which I'd like to see verified; see the first paragraph).

comment:11 Changed 4 weeks ago by nickm

Keywords: 0.3.2.2-alpha-must added

comment:12 Changed 4 weeks ago by dgoulet

I've been debugging this one for a while and I think we have more problems than just this. However, I can explain how we get into this situation where we have a channel scheduled with *no* cells to flush.

The offending party is circuitmux_append_destroy_cell(), which basically creates one DESTROY cell, pushes it on the cmux queue and then flushes it with channel_flush_from_first_active_circuit(). That function does MANY things, and among those, it writes the cell directly to the outbuf if the outgoing_queue of the channel is empty (which turns out to be ALWAYS empty...).

So then scheduler_channel_wants_writes() is called from the connection's libevent write path, connection_handle_write_impl(), which schedules the empty channel for a write to the kernel, but that write has already been done by that function...
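
To make that sequence easier to follow, here is a toy standalone model of it. Everything prefixed with toy_ is invented for this illustration; only the function names referenced in the comments are the real Tor ones, and the behavior is a reading of the description above, not the literal code:

/* Toy model of the sequence described above: the DESTROY cell goes straight
 * to the outbuf, yet the channel still ends up handed to the scheduler,
 * which then finds nothing left to flush. */
#include <stdio.h>
#include <stdbool.h>

typedef struct {
  int outgoing_queue_cells;  /* channel-level queue (observed to always be empty) */
  int outbuf_cells;          /* cells' worth of data on the connection's outbuf */
  bool scheduled;            /* channel is in the scheduler's pending list */
} toy_channel_t;

/* ~ circuitmux_append_destroy_cell() + channel_flush_from_first_active_circuit():
 * since outgoing_queue is empty, the DESTROY cell is written directly to the
 * outbuf instead of staying queued on the channel. */
static void toy_append_destroy_cell(toy_channel_t *chan) {
  if (chan->outgoing_queue_cells == 0)
    chan->outbuf_cells += 1;         /* cell goes straight to the outbuf */
  else
    chan->outgoing_queue_cells++;    /* (never happens in practice) */
}

/* ~ connection_handle_write_impl() -> scheduler_channel_wants_writes():
 * the libevent write path has already pushed the outbuf to the kernel, yet it
 * (re-)schedules the channel anyway. */
static void toy_handle_write(toy_channel_t *chan) {
  chan->outbuf_cells = 0;            /* write to the kernel already happened */
  chan->scheduled = true;            /* ...but the channel ends up pending */
}

int main(void) {
  toy_channel_t chan = { 0, 0, false };
  toy_append_destroy_cell(&chan);
  toy_handle_write(&chan);
  /* The scheduler now sees a pending channel with zero cells left to flush. */
  printf("scheduled=%d outgoing_queue=%d outbuf=%d\n",
         chan.scheduled, chan.outgoing_queue_cells, chan.outbuf_cells);
  return 0;
}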

comment:13 in reply to:  12 ; Changed 4 weeks ago by arma

Replying to dgoulet:

The offending party is circuitmux_append_destroy_cell(), which basically creates one DESTROY cell, pushes it on the cmux queue and then flushes it with channel_flush_from_first_active_circuit(). That function does MANY things, and among those, it writes the cell directly to the outbuf if the outgoing_queue of the channel is empty (which turns out to be ALWAYS empty...).

Does that mean the destroy cell is going out before the other cells, with the new kist scheduler?

comment:14 in reply to:  13 Changed 4 weeks ago by dgoulet

Replying to arma:

Replying to dgoulet:

The offending party is circuitmux_append_destroy_cell(), which basically creates one DESTROY cell, pushes it on the cmux queue and then flushes it with channel_flush_from_first_active_circuit(). That function does MANY things, and among those, it writes the cell directly to the outbuf if the outgoing_queue of the channel is empty (which turns out to be ALWAYS empty...).

Does that mean the destroy cell is going out before the other cells, with the new kist scheduler?

I don't think so because at that point, the circuit cell queue is empty. The destroy cell is only flushed if the outbuf of the connection is empty.

comment:15 in reply to:  7 Changed 4 weeks ago by arma

Replying to pastly:

If you're picky like arma, and thus want the new function in one commit and the bugfix in another, you will enjoy bug23676_032_02.

Ok, back we all go to remedial pointer school. What's wrong with this function:

+static void
+add_chan_to_readd_list(smartlist_t *to_readd, channel_t *chan)
+{
+    if (!to_readd) {
+      to_readd = smartlist_new();
+    }
+    smartlist_add(to_readd, chan);
+}

Hint: what happens when the to_readd argument is NULL? We make a smartlist_new, and assign it to this local variable, and then add something to it, and then... we return?
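
To see the failure mode in isolation, here is a minimal standalone program (plain C with malloc instead of Tor's smartlist, so it is only an illustration) showing why the allocation made inside the by-value version never reaches the caller:

/* Minimal standalone illustration of the bug above: the function receives a
 * copy of the caller's pointer, so allocating into that copy is invisible to
 * the caller (and the new list leaks).  Plain C stand-ins, not Tor code. */
#include <stdio.h>
#include <stdlib.h>

typedef struct { int len; } fake_list_t;

/* Buggy shape: same as the add_chan_to_readd_list() above.  "list" is a copy
 * of the caller's pointer. */
static void add_by_value(fake_list_t *list) {
  if (!list)
    list = calloc(1, sizeof(*list));  /* only updates the local copy */
  list->len++;                        /* caller's pointer is still NULL afterwards */
}

/* Fixed shape: pass a pointer to the caller's pointer, as in the patch below. */
static void add_by_reference(fake_list_t **list) {
  if (!*list)
    *list = calloc(1, sizeof(**list)); /* updates the caller's pointer */
  (*list)->len++;
}

int main(void) {
  fake_list_t *a = NULL, *b = NULL;
  add_by_value(a);        /* a stays NULL; the new list leaks */
  add_by_reference(&b);   /* b now points at a list with len == 1 */
  printf("by value: a=%p   by reference: b=%p len=%d\n",
         (void *) a, (void *) b, b ? b->len : 0);
  free(b);
  return 0;
}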

comment:16 Changed 4 weeks ago by arma

diff --git a/src/or/scheduler_kist.c b/src/or/scheduler_kist.c
index 7b53f08..7f5a720 100644
--- a/src/or/scheduler_kist.c
+++ b/src/or/scheduler_kist.c
@@ -417,12 +417,12 @@ update_socket_written(socket_table_t *table, channel_t *chan, size_t bytes)
  * have to, and we now do this in more than once place in kist_scheduler_run.
  */
 static void
-add_chan_to_readd_list(smartlist_t *to_readd, channel_t *chan)
+add_chan_to_readd_list(smartlist_t **to_readd, channel_t *chan)
 {
-    if (!to_readd) {
-      to_readd = smartlist_new();
+    if (!*to_readd) {
+      *to_readd = smartlist_new();
     }
-    smartlist_add(to_readd, chan);
+    smartlist_add(*to_readd, chan);
 }
 
 /*
@@ -611,7 +611,7 @@ kist_scheduler_run(void)
                  "likely the issue) and stop scheduling it this round.",
                  channel_state_to_string(chan->state));
         chan->scheduler_state = SCHED_CHAN_WAITING_TO_WRITE;
-        add_chan_to_readd_list(to_readd, chan);
+        add_chan_to_readd_list(&to_readd, chan);
         continue;
       }
     }
@@ -660,7 +660,7 @@ kist_scheduler_run(void)
        * in the next scheduling round.
        */
       chan->scheduler_state = SCHED_CHAN_WAITING_TO_WRITE;
-      add_chan_to_readd_list(to_readd, chan);
+      add_chan_to_readd_list(&to_readd, chan);
       log_debug(LD_SCHED, "chan=%" PRIu64 " now waiting_to_write",
                 chan->global_identifier);
     } else {

is now the patch I am running on moria1. :)

I encourage somebody -- maybe pastly? -- to !fixup that into pastly's ce363803 commit, so we can erase the history of this buglet.

David, I wonder if this fix helps in your crazy debugging stuff too?

comment:17 in reply to:  16 Changed 4 weeks ago by pastly

I encourage somebody -- maybe pastly? -- to !fixup that into pastly's ce363803 commit, so we can erase the history of this buglet.

Done.

Running on a relay.

comment:18 Changed 3 weeks ago by dgoulet

See branch: bug23676_032_03.

Because we move the channel back to waiting_for_cells, we don't need to re-add it to the pending channels, so I removed the offending commit that introduced add_chan_to_readd_list().

You'll also notice in the commit message and code that I explicitly state that there is probably a deeper problem behind this flush_result = 0, and I want us to be aware of it and have a record of why we made this fix.
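
Roughly, the shape of what the branch does when a channel reports nothing to flush is sketched below. This is an approximation pieced together from the description above and the earlier diff, not the literal patch; in particular, SCHED_CHAN_WAITING_FOR_CELLS is a guess at the enum name for the waiting_for_cells state.

      flush_result = (int) channel_flush_some_cells(chan, 1); // 1 for num cells
      if (flush_result == 0) {
        /* Nothing came off this channel.  Instead of re-adding it to the
         * pending list (the add_chan_to_readd_list() approach, now dropped),
         * park it back in waiting_for_cells; it gets rescheduled when cells
         * actually arrive. */
        log_debug(LD_SCHED, "chan=%" PRIu64 " flushed 0 cells; moving back "
                  "to waiting_for_cells", chan->global_identifier);
        chan->scheduler_state = SCHED_CHAN_WAITING_FOR_CELLS;
        continue; /* stop scheduling this channel this round */
      }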

comment:19 Changed 3 weeks ago by nickm

Looks plausible. Did the debug message come up while you were testing this, so you know it's working?

comment:20 in reply to:  19 ; Changed 3 weeks ago by dgoulet

Replying to nickm:

Looks plausible. Did the debug message come up while you were testing this, so you know it's working?

It does come up. It was info level before, and I downgraded it to debug because it does happen regularly. I've opened many tickets for things that could be causing this, but for the next alpha this is the best we've got for now (#23709, #23710, #23711, #23712).

Would be nice if pastly ACKs it before any merge.

Last edited 3 weeks ago by dgoulet

comment:21 in reply to:  20 Changed 3 weeks ago by pastly

Replying to dgoulet:

Would be nice if pastly ACKs it before any merge.

Consider it ACKed.

Looked at it. It compiles. It passes all tests. Yes, there are more issues (some of which have probably existed for a long time), but this is a good band-aid for this alpha.

comment:22 Changed 3 weeks ago by dgoulet

Status: needs_review → merge_ready

comment:23 Changed 3 weeks ago by isis

I also reviewed dgoulet's bug23676_032_03, and I agree that it should be merged for the alpha so that we can further diagnose the other possible issues/causes.

comment:24 Changed 3 weeks ago by nickm

merged to master! Is this ready to close?

comment:25 in reply to:  24 Changed 3 weeks ago by nickm

Resolution: fixed
Status: merge_ready → closed

Replying to nickm:

merged to master! Is this ready to close?

dgoulet says yes
