Opened 11 months ago

Last modified 3 months ago

#25733 merge_ready defect

Bug: Assertion bin_counts > 0 failed in circuit_build_times_get_xm at ../src/or/circuitstats.c:772.

Reported by: cstest
Owned by: mikeperry
Priority: Medium
Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: crash, 029-backport, 031-unreached-backport, 032-unreached-backport
Cc:
Actual Points:
Parent ID:
Points:
Reviewer: nickm
Sponsor:

Description (last modified by catalyst)

The server runs a couple of hundred HS domains. Tor crashed. A spammer is attempting a simple DDoS.

Apr 06 20:36:28.000 [notice] Received reload signal (hup). Reloading config and resetting internal state.
Apr 06 20:36:28.000 [notice] Read configuration file "-------------------".
Apr 06 20:36:28.000 [notice] Tor 0.3.2.10 (git-0edaa32732ec8930) opening log file.
Apr 06 20:36:42.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
Apr 06 20:36:42.000 [warn] Error launching circuit to node $F0F5074A6DADD3DC22E1FAA18FD6D89C-------- at --------- for service ---------------.
Apr 06 20:36:56.000 [warn] Your Guard remedy ($3FEBFB6A491D30CACC2C2995EDB41717A6F94E95) is failing a very large amount of circuits. Most likely this means the Tor network is overloaded, but it could$
Apr 06 20:36:59.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
Apr 06 20:37:26.000 [warn] Your Guard AlienZone ($80392DC1522E647C56457CEBA58DD84CC56AEC44) is failing a very large amount of circuits. Most likely this means the Tor network is overloaded, but it co$
Apr 06 20:37:29.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
Apr 06 20:37:29.000 [warn] Error launching circuit to node $9AEF164F5BE5618509C9E60----------- at ---------------- for service ---------------------------.
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 996 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 997 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 998 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
........
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] tor_assertion_failed_(): Bug: ../src/or/circuitstats.c:772: circuit_build_times_get_xm: Assertion bin_counts > 0 failed; aborting. (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug: Assertion bin_counts > 0 failed in circuit_build_times_get_xm at ../src/or/circuitstats.c:772. Stack trace: (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(log_backtrace+0x43) [0x55ff6c113313] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(tor_assertion_failed_+0x8d) [0x55ff6c12e54d] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(circuit_build_times_set_timeout+0x83e) [0x55ff6c0760be] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(circuit_expire_building+0x1012) [0x55ff6c07a5b2] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(+0x52bd7) [0x55ff6bfdfbd7] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6(+0x1f6aa) [0x7faafc9836aa] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6(event_base_loop+0x5a7) [0x7faafc984227] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(do_main_loop+0x28d) [0x55ff6bfe085d] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(tor_main+0xe1d) [0x55ff6bfe367d] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(main+0x19) [0x55ff6bfdc0a9] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7faafb18b1c1] (on Tor 0.3.2.10 )
Apr 06 20:37:49.000 [err] Bug:     /usr/sbin/tor(_start+0x2a) [0x55ff6bfdc0fa] (on Tor 0.3.2.10 )

Change History (17)

comment:1 Changed 10 months ago by catalyst

Description: modified (diff)

Fix formatting.

comment:2 Changed 10 months ago by catalyst

Keywords: 034-proposed added
Milestone: Tor: unspecified
Summary: Bug: Could not determine largest build time (0). Xm is 3925ms and we've abandoned 999 out of 1000 circuits. → Bug: Assertion bin_counts > 0 failed in circuit_build_times_get_xm at ../src/or/circuitstats.c:772.

Changed summary to reflect the assertion failure.

comment:3 Changed 10 months ago by mikeperry

Quick thought: This crash might be avoided by the fix for #23100. If a client is learning its circuit build timeout from only a few non-hidden service circuits, and the rest are hidden service circuits that are being ignored by the 0.3.2 CBT, then it is possible that it picked a timeout that is too low for the hidden service circuits to ever complete, and thus all are timing out without being counted properly. I could see this happening under DDoS. #23100 changes the CBT code to count HS circs in build times.
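The failure mode described above can be illustrated with a small, self-contained C sketch. It is not the code in circuitstats.c: the sentinel value, the bin width, and every `sketch_` name are assumptions made for illustration. It estimates Xm as the midpoint of the most populated histogram bin, and it reports failure instead of asserting when every recorded sample is an abandoned circuit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for a circuit abandoned before it finished
 * building. Chosen for this sketch, not taken from the Tor source. */
#define SKETCH_ABANDONED UINT32_MAX
#define SKETCH_BIN_WIDTH_MS 50u
#define SKETCH_N_BINS 2000u

/* Estimate Xm as the midpoint of the most populated histogram bin.
 * Returns 1 and writes *xm_out on success. Returns 0 and leaves
 * *xm_out untouched when no completed build time exists, which is the
 * situation where a "bin count > 0" assertion would fire. */
static int
sketch_estimate_xm(const uint32_t *times_ms, size_t n, uint32_t *xm_out)
{
  uint32_t bins[SKETCH_N_BINS] = {0};
  uint32_t best_count = 0;
  size_t best_bin = 0;

  assert(xm_out != NULL);

  for (size_t i = 0; i < n; i++) {
    if (times_ms[i] == SKETCH_ABANDONED)
      continue; /* abandoned circuits contribute no build time */
    size_t b = times_ms[i] / SKETCH_BIN_WIDTH_MS;
    if (b >= SKETCH_N_BINS)
      b = SKETCH_N_BINS - 1; /* clamp very slow builds into the last bin */
    bins[b]++;
  }

  for (size_t b = 0; b < SKETCH_N_BINS; b++) {
    if (bins[b] > best_count) {
      best_count = bins[b];
      best_bin = b;
    }
  }

  if (best_count == 0)
    return 0; /* every sample was abandoned: nothing to estimate from */

  *xm_out = (uint32_t)(best_bin * SKETCH_BIN_WIDTH_MS +
                       SKETCH_BIN_WIDTH_MS / 2);
  return 1;
}
```

With a history that contains only abandoned circuits the function returns 0, so the caller can keep its previous timeout rather than abort.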

If this is a critical problem, it is worth trying Tor 0.3.3.4-alpha or later, which has this fix. 0.3.3 should be released as either rc or stable next week, so it should be in good shape for production. If you try this, please report whether the crash still happens.

Regardless, we should try to fix the underlying assert, of course. I suppose we could make it a tor_fragile_assert or similar for now, also.
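A nonfatal check of the kind suggested above might look like the following sketch. The `SKETCH_BUG` macro and the helper names are hypothetical stand-ins written for this example, not Tor's own macros, and the fallback value is an assumption. The point is only that a zero bin count gets logged and handled instead of aborting the process.

```c
#include <assert.h>
#include <stdio.h>

/* Counts how often a nonfatal check has tripped (for this sketch only). */
static int sketch_bug_count = 0;

/* Log the failed expectation and return 1 so the caller can recover. */
static int
sketch_bug_tripped(const char *expr, const char *file, int line)
{
  sketch_bug_count++;
  fprintf(stderr, "Bug: %s:%d: nonfatal check (%s) tripped; recovering.\n",
          file, line, expr);
  return 1;
}

/* Evaluates to nonzero, after logging, when the "should never happen"
 * condition is true. Unlike assert(), it does not abort. */
#define SKETCH_BUG(cond) \
  ((cond) ? sketch_bug_tripped(#cond, __FILE__, __LINE__) : 0)

/* Average the weighted mode sum over the number of samples in those
 * modes. If no samples were counted, fall back to a caller-supplied
 * value rather than dividing by zero or crashing. */
static unsigned
sketch_mean_of_modes(unsigned weighted_sum_ms, unsigned bin_counts,
                     unsigned fallback_ms)
{
  assert(fallback_ms > 0); /* caller contract, independent of runtime data */

  if (SKETCH_BUG(bin_counts == 0))
    return fallback_ms;

  return weighted_sum_ms / bin_counts;
}
```

The trade-off is that the process keeps running with a fallback estimate, so the log message needs to be clear enough for an operator to recognize that the estimate was not computed from real data.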

comment:4 Changed 10 months ago by nickm

Keywords: crash added

Since this is a crash, it can automatically get into a release milestone if we think it's warranted.

Mike -- is it safe just to turn this into a nonfatal assertion for now in 0.3.2? And should we look at backporting #23100?

comment:5 Changed 10 months ago by cstest

So far the crash has not recurred. Still using the same Tor version.

comment:6 Changed 10 months ago by mikeperry

Status: new → needs_review

Ok here is a branch to avoid the assert: https://oniongit.eu/mikeperry/tor/commits/bug25733_032

If we like that, we should probably backport as far back as we go (what is that, 0.2.9?). Since this maybe could happen in some other rare case, we want both this and #23100 going forward.

#23100 might be nice to backport too, so that this timeout miscalculation doesn't happen quite so easily for onion service clients. It was tested as a package deal with #23114, though.

comment:7 Changed 10 months ago by nickm

Keywords: 029-backport 031-backport 032-backport added; 034-proposed removed
Milestone: Tor: unspecified → Tor: 0.3.3.x-final

If you think we should backport this to 0.2.9, could you branch based on maint-0.2.9?

comment:8 Changed 10 months ago by nickm

Reviewer: nickm

(Other than that, this patch looks okay to me. Only one request -- could you edit the log message so that it's easier for operators to tell what it means? The one that's there right now will only confuse people.)

comment:9 Changed 10 months ago by nickm

Status: needs_review → needs_revision

comment:10 Changed 10 months ago by dgoulet

Owner: set to mikeperry
Status: needs_revision → assigned

comment:11 Changed 10 months ago by dgoulet

Status: assigned → needs_revision

comment:12 Changed 10 months ago by mikeperry

Status: needs_revision → merge_ready

https://oniongit.eu/mikeperry/tor/commits/bug25733_029 Rebased on 029, with fixed log line. Also in my torproject remote.

comment:13 Changed 10 months ago by nickm

Milestone: Tor: 0.3.3.x-final → Tor: 0.3.2.x-final

Merged to 0.3.3 and forward; marking for backport.

comment:14 Changed 8 months ago by teor

Keywords: 031-unreached-backport added; 031-backport removed

0.3.1 is end of life, there are no more backports.
Tagging with 031-unreached-backport instead.

comment:15 Changed 3 months ago by teor

Keywords: 032-unreached-backport added; 032-backport removed

0.3.2 is end of life, so 032-backport is now 032-unreached-backport.

comment:16 Changed 3 months ago by teor

Version: Tor: 0.3.2.10

comment:17 Changed 3 months ago by teor

Milestone: Tor: 0.3.2.x-final → Tor: 0.2.9.x-final

These tickets can't be backported to 0.3.2, because it is end of life.
But they can still be backported to 0.2.9.
