#25316 closed defect (duplicate)

Assertion failure in scale_active_circuits

Reported by: Logforme Owned by:
Priority: High Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version: Tor: 0.3.2.9
Severity: Major Keywords: crash, 033-must, 033-triage-20180320, 033-included-20180320
Cc: Actual Points:
Parent ID: Points: 0.5
Reviewer: Sponsor:

Description

Relay 855BC2DABE24C861CD887DB9B2E950424B49FC34 crashed with the following log entries:

Feb 20 14:33:40.000 [err] tor_assertion_failed_(): Bug: ../src/or/circuitmux_ewma.c:711: scale_active_circuits: Assertion e->last_adjusted_tick == pol->active_circuit_pqueue_last_recalibrated failed; aborting. (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug: Assertion e->last_adjusted_tick == pol->active_circuit_pqueue_last_recalibrated failed in scale_active_circuits at ../src/or/circuitmux_ewma.c:711. Stack trace: (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x7f4b7ab65fb2] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(tor_assertion_failed_+0x9f) [0x7f4b7ab80b3f] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(+0xe74b2) [0x7f4b7aac74b2] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(circuitmux_notify_xmit_cells+0xdb) [0x7f4b7aac5a7b] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(channel_flush_from_first_active_circuit+0x269) [0x7f4b7aa5b7d9] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(channel_flush_some_cells+0xe4) [0x7f4b7aaaf8d4] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(+0xbd369) [0x7f4b7aa9d369] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(+0xbb2a2) [0x7f4b7aa9b2a2] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x414) [0x7f4b7a0ee254] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(do_main_loop+0x255) [0x7f4b7aa35d75] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(tor_main+0x1e75) [0x7f4b7aa397d5] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(main+0x19) [0x7f4b7aa31539] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7f4b79103ead] (on Tor 0.3.2.9 )
Feb 20 14:33:40.000 [err] Bug:     /usr/bin/tor(+0x51589) [0x7f4b7aa31589] (on Tor 0.3.2.9 )

Relay runs on Debian wheezy. uname -a:
Linux 3.2.0-5-amd64 #1 SMP Debian 3.2.96-3 x86_64 GNU/Linux

Uptime before the crash was 23 days. Memory and CPU usage was normal:

Feb 20 12:55:12.000 [notice] Heartbeat: Tor's uptime is 23 days 17:59 hours, with 30352 circuits open. I've sent 20574.36 GB and received 20125.28 GB.

Child Tickets

Change History (10)

comment:1 Changed 22 months ago by gk

Component: - Select a componentCore Tor/Tor

comment:2 Changed 22 months ago by teor

Keywords: crash must-033 added
Milestone: Tor: 0.3.3.x-final
Points: 0.5
Priority: MediumHigh
Severity: NormalMajor
Version: Tor: 0.3.2.9

comment:3 Changed 22 months ago by dgoulet

Fun one...

I can only see this assert() being triggered if we added twice the same ewma circ policy to the active circuit queue. So in the loop, the first pass would change the last_adjusted_tick and the second pass, the assert would be triggered. The only way to do that would be to have called notify_circ_active() on an already active circuit for which I can't find a way how that would be possible....

We can't have an entry last_adjusted_tick in the policy active circuit pqueue that wasn't synchronized with the active_circuit_pqueue_last_recalibrated value. This is done in add_cell_ewma() which adds the ewma circuit data to the active list and just before does:

last_adjusted_tick = active_circuit_pqueue_last_recalibrated

We can't end up in scale_active_circuits() for a circuit that is *not* in the active queue because of the assert at the end of ewma_notify_xmit_cells() which makes sure that we've xmit a cell on a active circuit of the policy:

  tmp = pop_first_cell_ewma(pol);
  tor_assert(tmp == cell_ewma);
  add_cell_ewma(pol, cell_ewma);

Confusing.....

comment:4 Changed 22 months ago by nickm

Keywords: 033-must added; must-033 removed

s/must-033/033-must/

comment:5 Changed 22 months ago by nickm

David says:

I can only see this assert() being triggered if we added twice the same ewma circ policy to the active circuit queue.

That reminds me of #24700 , which is fixed in 0.3.3.2-alpha, but not in 0.3.2.9. Could this be another case of that bug?

comment:6 in reply to:  5 Changed 22 months ago by dgoulet

Replying to nickm:

That reminds me of #24700 , which is fixed in 0.3.3.2-alpha, but not in 0.3.2.9. Could this be another case of that bug?

Yes, I also had that in mind when I saw this ticket and it seems very plausible! #24700 will get backported fortunately.

comment:7 Changed 21 months ago by nickm

Status: newneeds_information

Okay. I'm going to put this in needs_info for now.

Everyone: please let us know if you see this error in any 0.3.2.x version that is at least 0.3.2.10,
or any version that is at least 0.3.3.2-alpha!

comment:8 Changed 21 months ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:9 Changed 21 months ago by nickm

Keywords: 033-included-20180320 added

Mark 033-must tickets as triaged-in for 0.3.3

comment:10 Changed 21 months ago by nickm

Resolution: duplicate
Status: needs_informationclosed

Closing as probable duplicate of #24700. Please reopen if this occurs in any 0.3.2.x version that is at least 0.3.2.10, or any version that is at least 0.3.3.2-alpha!

Note: See TracTickets for help on using tickets.