Opened 3 weeks ago

Last modified 3 weeks ago

#32564 new defect

Assertion pol->magic failed

Reported by: Logforme
Owned by:
Priority: High
Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor
Version: Tor: 0.4.1.6
Severity: Normal
Keywords: assert crash backport
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Guard relay 855BC2DABE24C861CD887DB9B2E950424B49FC34 crashed with the following log:

Nov 21 00:18:01.000 [err] tor_assertion_failed_(): Bug: ../src/core/or/circuitmux_ewma.c:165: TO_EWMA_POL_CIRC_DATA: Assertion pol->magic == 0x761e7747U failed; aborting. (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug: Assertion pol->magic == 0x761e7747U failed in TO_EWMA_POL_CIRC_DATA at ../src/core/or/circuitmux_ewma.c:165: . Stack trace: (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(log_backtrace_impl+0x47) [0x55d5e9f968e7] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(tor_assertion_failed_+0x147) [0x55d5e9f919c7] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(+0x8fe84) [0x55d5e9e14e84] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(+0xb839f) [0x55d5e9e3d39f] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(circuit_receive_relay_cell+0x29a) [0x55d5e9e419fa] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(command_process_cell+0x2fc) [0x55d5e9e23a1c] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(channel_tls_handle_cell+0x333) [0x55d5e9e030d3] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(+0xa773f) [0x55d5e9e2c73f] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(connection_handle_read+0x990) [0x55d5e9df0500] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(+0x707ee) [0x55d5e9df57ee] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x6a0) [0x7fb82bb5f5a0] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(do_main_loop+0x105) [0x55d5e9df6b25] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(tor_run_main+0x1225) [0x55d5e9de4545] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(tor_main+0x3a) [0x55d5e9de193a] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(main+0x19) [0x55d5e9de14b9] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7fb82a3b32e1] (on Tor 0.4.1.6 )
Nov 21 00:18:01.000 [err] Bug:     /usr/bin/tor(_start+0x2a) [0x55d5e9de150a] (on Tor 0.4.1.6 )

After the relay restarted automatically, the log had the following warning:

Nov 21 00:18:07.000 [warn] Incorrect ed25519 signature(s)

Possibly the same as #16423 according to Nick.

Relay is one of two relays running on a Debian box. Memory and CPU usage are normal.

Odd things about the relay:

  1. About a week ago the log had entries (since rotated away, unfortunately) about being unable to apply consensus diffs (wrong hash), and eventually "no longer serving directory info to clients".
  2. Lately the two relays have seen an upswing in traffic. Sometimes their combined bandwidth hits my ISP's ceiling.
  3. Over time the memory usage of the relays grows. Initially they are at around 700MB each. Once they use most of the RAM (4GB) I reboot the machine. When the assert happened, the RAM usage was nowhere near that.
  4. I run my home-brewed monitoring software, which uses "SETEVENTS BW" and calls "GETINFO orconn-status ns/id/<fingerprint> status/fresh-relay-descs" every 10 minutes.
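
For anyone trying to reproduce the control-port load from item 4, here is a minimal sketch of how such a monitor might build its periodic query. The function name cp_build_getinfo is hypothetical and is not taken from the reporter's software. The sketch only formats the line (control-port commands end in CRLF); opening the socket and authenticating are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format the GETINFO line quoted in item 4.
 * Returns the number of bytes written (excluding the NUL), or -1 if the
 * fingerprint is not 40 hex digits or the buffer is too small. */
static int cp_build_getinfo(char *buf, size_t buflen, const char *fingerprint)
{
  static const char hex[] = "0123456789ABCDEFabcdef";
  int n;

  assert(buf != NULL && fingerprint != NULL);

  /* A relay fingerprint is 40 hex characters. */
  if (strlen(fingerprint) != 40 || strspn(fingerprint, hex) != 40)
    return -1;

  n = snprintf(buf, buflen,
               "GETINFO orconn-status ns/id/%s status/fresh-relay-descs\r\n",
               fingerprint);
  if (n < 0 || (size_t)n >= buflen)
    return -1; /* output was truncated */
  return n;
}
```

A monitor would send this line on an authenticated control connection every 10 minutes and read replies until the final "250 OK".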

Change History (2)

comment:1 Changed 3 weeks ago by dgoulet

The only path into the EWMA subsystem that I can find from circuit_receive_relay_cell() is:

  • circuit_receive_relay_cell()
    • append_cell_to_circuit_queue()
      • update_circuit_on_cmux_()
        • circuitmux_set_num_cells()
          • circuitmux_make_circuit_inactive() _OR_
          • circuitmux_make_circuit_active()

These two call notify_circ_inactive() and notify_circ_active() respectively, which are function pointers set by the EWMA subsystem.
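
To make that dispatch concrete, here is a rough sketch of a mux that notifies a pluggable policy through function pointers when a circuit's queue goes from empty to non-empty or back. All mx_* names and signatures are hypothetical and simplified; this is not the actual circuitmux code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy vtable: the policy (EWMA in the ticket) fills these in. */
typedef struct mx_policy {
  void (*notify_circ_active)(void *policy_data);
  void (*notify_circ_inactive)(void *policy_data);
} mx_policy_t;

/* Hypothetical per-circuit entry, standing in for a map value. */
typedef struct mx_muxinfo {
  unsigned cell_count;
  void *policy_data; /* opaque pointer handed back to the policy */
} mx_muxinfo_t;

/* Update the cell count and notify the policy on 0 <-> non-zero transitions. */
static void mx_set_num_cells(const mx_policy_t *pol, mx_muxinfo_t *info,
                             unsigned n_cells)
{
  unsigned old;

  assert(pol != NULL && info != NULL);
  old = info->cell_count;
  info->cell_count = n_cells;

  if (old == 0 && n_cells > 0) {
    if (pol->notify_circ_active)
      pol->notify_circ_active(info->policy_data);
  } else if (old > 0 && n_cells == 0) {
    if (pol->notify_circ_inactive)
      pol->notify_circ_inactive(info->policy_data);
  }
}

/* Tiny demo policy that just counts notifications. */
typedef struct mx_counts {
  int activations;
  int deactivations;
} mx_counts_t;

static void mx_on_active(void *d) { ((mx_counts_t *)d)->activations++; }
static void mx_on_inactive(void *d) { ((mx_counts_t *)d)->deactivations++; }

static const mx_policy_t mx_counting_policy = { mx_on_active, mx_on_inactive };
```

The point of the sketch is that the mux passes policy_data through without inspecting it, so a stale pointer in the entry is only noticed once the policy callback tries to use it.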

This means that the chanid_circid_muxinfo_map contains an entry whose data was either already freed or never initialized.
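
For readers unfamiliar with the magic-number pattern behind the assertion, here is a minimal sketch of a checked downcast. The constant is the one from the log; every mg_* name is hypothetical and the layout is not Tor's. The poison step overwrites a still-valid object to imitate what freed or uninitialized memory could look like, without actually reading freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value from the assertion in the log; all names below are hypothetical. */
#define MG_MAGIC 0x761e7747U

/* Generic part that the mux hands to the policy callbacks. */
typedef struct mg_base {
  uint32_t magic;
} mg_base_t;

/* Policy-specific object that embeds the generic part. */
typedef struct mg_ewma {
  double cell_count;
  mg_base_t base;
} mg_ewma_t;

/* Initialize the object and stamp the magic value. */
static void mg_init(mg_ewma_t *e)
{
  memset(e, 0, sizeof(*e));
  e->base.magic = MG_MAGIC;
}

/* Overwrite the object with a filler byte, as a stand-in for memory that
 * was freed and reused. The storage itself stays valid in this sketch. */
static void mg_poison(mg_ewma_t *e)
{
  memset(e, 0xCD, sizeof(*e));
}

/* Non-aborting check, handy for diagnostics. */
static int mg_magic_ok(const mg_base_t *pol)
{
  return pol != NULL && pol->magic == MG_MAGIC;
}

/* Checked downcast: aborts if the magic does not match. */
static mg_ewma_t *mg_downcast(mg_base_t *pol)
{
  if (pol == NULL)
    return NULL;
  assert(pol->magic == MG_MAGIC);
  return (mg_ewma_t *)((char *)pol - offsetof(mg_ewma_t, base));
}
```

After mg_init() the downcast succeeds; after mg_poison() the same call would abort on the assertion, which is the kind of failure the log shows.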

comment:2 Changed 3 weeks ago by nickm

Keywords: crash backport added
Milestone: Tor: 0.4.3.x-final
Priority: Medium → High