Opened 18 months ago

Closed 18 months ago

Last modified 18 months ago

#24502 closed defect (fixed)

scheduler_release_channel: Non-fatal assertion

Reported by: toralf Owned by: dgoulet
Priority: Low Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor Version: Tor: 0.3.2.6-alpha
Severity: Normal Keywords: tor-sched, regression
Cc: Actual Points:
Parent ID: Points:
Reviewer: pastly Sponsor:

Description

Got this at a stable hardened Gentoo Linux exit :

Dec 02 18:20:17.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)

... <repeating>

Dec 03 01:03:59.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
Dec 03 01:04:10.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
Dec 03 01:07:47.000 [warn] tor_bug_occurred_(): Bug: src/or/scheduler.c:631: scheduler_release_channel: Non-fatal assertion !(smartlist_pos(channels_pending, chan) == -1) failed. (on Tor 0.3.2.6-alpha 87012d076ef58bb9)

Child Tickets

Change History (14)

comment:1 Changed 18 months ago by dgoulet

Status: newneeds_information

Oh, you should have much more log information after that warning. tor_bug_occurred_() prints much more usually of the current scheduler context?

comment:2 Changed 18 months ago by dgoulet

Keywords: tor-sched added
Milestone: Tor: 0.3.2.x-final

Oh also, you think you can find in your notice logs the Scheduler type. You are looking for something like (could be something else than "KIST"):

[notice] Scheduler type KIST has been enabled.

comment:3 Changed 18 months ago by dgoulet

Priority: MediumHigh

Bumping to High because this could mean we have a channel in a wrong scheduler state and thus might have been handled improperly through out its lifetime.

However, the scheduler will recover here, this is just a sanity checks during a free channel.

comment:4 Changed 18 months ago by toralf

Indeed, here's the missing info :

/tmp/warn.log-Dec 03 01:19:02.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:20:50.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:21:51.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
mr-fox ~ # grep -A 100 sched /tmp/warn*
/tmp/warn.log:Dec 03 01:07:47.000 [warn] tor_bug_occurred_(): Bug: src/or/scheduler.c:631: scheduler_release_channel: Non-fatal assertion !(smartlist_pos(channels_pending, chan) == -1) failed. (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log:Dec 03 01:07:47.000 [warn] Bug: Non-fatal assertion !(smartlist_pos(channels_pending, chan) == -1) failed in scheduler_release_channel at src/or/scheduler.c:631. Stack trace: (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(log_backtrace+0x53) [0x55ad3ea64e73] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0xc5) [0x55ad3ea81b75] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log:Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(scheduler_release_channel+0x19a) [0x55ad3e99312a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(+0xc7408) [0x55ad3e99f408] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(connection_or_about_to_close+0x4d) [0x55ad3e9f43dd] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(+0x4ff8c) [0x55ad3e927f8c] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(+0x50788) [0x55ad3e928788] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/lib64/libevent-2.1.so.6(+0x2373a) [0x7ff15d09073a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/lib64/libevent-2.1.so.6(event_base_loop+0x57f) [0x7ff15d09168f] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x24d) [0x55ad3e929f3d] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(tor_main+0x1c2d) [0x55ad3e92d9bd] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(main+0x28) [0x55ad3e9252b8] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /lib64/libc.so.6(__libc_start_main+0xfd) [0x7ff15bc055ad] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Bug:     /usr/bin/tor(_start+0x2a) [0x55ad3e92530a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log:Dec 03 01:07:47.000 [warn] scheduler_bug_occurred(): Bug: Channel 431733 in state channel error and scheduler state 3. Num cells on cmux: 0. Connection outbuf len: 0. Num pending channels: 127. Channel in pending list: no. (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
/tmp/warn.log-Dec 03 01:07:47.000 [warn] Scheduler asked to release channel 431733 but it wasn't in channels_pending
/tmp/warn.log-Dec 03 01:10:05.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:11:45.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:11:47.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:13:36.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:16:41.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:18:19.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)
/tmp/warn.log-Dec 03 01:19:02.000 [warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)

and FWIW I set the loglevel from warn to notice.

comment:5 Changed 18 months ago by dgoulet

Keywords: regression added
Priority: HighLow
Status: needs_informationneeds_review

Thanks toralf!

Bottom line here is that a release on a channel has been done by connection_or_about_to_close() meaning the channel's connection was closed in error state. This seems to be coming from the libevent main loop, most likely the write_event of the connection.

I think what happened is that the channel was in pending state in the scheduler, it got into the first condition if (socket_can_write(&socket_table, chan)) { but then because the channel wasn't opened, we moved to the next channel.

This had two consequences. First, the channel scheduler state is kept at PENDING but it is not put back in the channel pending list. Second, because it is still in that state, scheduler_release_channel() triggered that bug but at least clean it up properly. A PENDING state for a channel implies that it *has* to be in the pending list.

The KIST scheduler has 4 conditions and each of them should end with setting the channel scheduler state to reflect what has been done with it.

The condition I outlined above is really the only place I can find where we fail to update the state properly. It has been introduced with dcabf801e52 (0.3.2.4-alpha). Fortunately, this is harmless but annoying.

Branch: bug24502_032_01

comment:6 Changed 18 months ago by dgoulet

Owner: set to dgoulet
Status: needs_reviewaccepted

comment:7 Changed 18 months ago by dgoulet

Reviewer: pastly
Status: acceptedneeds_review

comment:8 Changed 18 months ago by toralf

Another example from today:

Dec 04 21:03:05.000 [warn] tor_bug_occurred_(): Bug: src/or/scheduler.c:631: scheduler_release_channel: Non-fatal assertion !(smartlist_pos(channels_pending, chan) =
= -1) failed. (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug: Non-fatal assertion !(smartlist_pos(channels_pending, chan) == -1) failed in scheduler_release_channel at src/or/scheduler.c:631. Sta
ck trace: (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(log_backtrace+0x53) [0x55ad3ea64e73] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0xc5) [0x55ad3ea81b75] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(scheduler_release_channel+0x19a) [0x55ad3e99312a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(+0xc7408) [0x55ad3e99f408] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(connection_or_about_to_close+0x4d) [0x55ad3e9f43dd] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(+0x4ff8c) [0x55ad3e927f8c] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(+0x50788) [0x55ad3e928788] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/lib64/libevent-2.1.so.6(+0x2373a) [0x7ff15d09073a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/lib64/libevent-2.1.so.6(event_base_loop+0x57f) [0x7ff15d09168f] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x24d) [0x55ad3e929f3d] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(tor_main+0x1c2d) [0x55ad3e92d9bd] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(main+0x28) [0x55ad3e9252b8] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /lib64/libc.so.6(__libc_start_main+0xfd) [0x7ff15bc055ad] (on Tor 0.3.2.6-al
pha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Bug:     /usr/bin/tor(_start+0x2a) [0x55ad3e92530a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] scheduler_bug_occurred(): Bug: Channel 1204645 in state channel error and scheduler state 3. Num cells on cmux: 3. Connection outbuf len: 0. Num pending channels: 98. Channel in pending list: no. (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Dec 04 21:03:05.000 [warn] Scheduler asked to release channel 1204645 but it wasn't in channels_pending

comment:9 Changed 18 months ago by pastly

Status: needs_reviewmerge_ready

dgoulet's bug24502_032_01 LGTM

comment:10 Changed 18 months ago by nickm

Status: merge_readyneeds_revision

LGTM too but maybe we should have a test here?

comment:11 in reply to:  10 Changed 18 months ago by dgoulet

Replying to nickm:

LGTM too but maybe we should have a test here?

Test added in 1a55a5ff06f9022a

comment:12 Changed 18 months ago by dgoulet

Status: needs_revisionmerge_ready

comment:13 Changed 18 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

Thank you!

Merged to master.

comment:14 Changed 18 months ago by nickm

whoops -- also merged to 0.3.2

Note: See TracTickets for help on using tickets.