segfault while choosing nodes
On tor26, running r10939, Tor segfaults within seconds of starting.
Core was generated by `/usr/sbin/tor'. Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0 0xb7d992de in mallopt () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7d98b1d in mallopt () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7d9950a in mallopt () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7d98135 in realloc () from /lib/tls/i686/cmov/libc.so.6
#4 0x080c9fa8 in _tor_realloc (ptr=0x11, size=17) at util.c:149
#5 0x080d129e in smartlist_add (sl=0xb7e57f5c, element=0x15d2a318) at container.c:92
#6 0x08054501 in compute_preferred_testing_list (answer=0x9319d58 "") at circuitbuild.c:1533
#7 0x08054684 in choose_good_middle_server (purpose=17 '\021', state=0x8844378, head=0x8ee2b08, cur_len=0) at circuitbuild.c:1582
#8 0x08054ab4 in onion_extend_cpath (circ=0x87688e0) at circuitbuild.c:1693
#9 0x08051358 in onion_populate_cpath (circ=0x87688e0) at circuitbuild.c:269
#10 0x0805144c in circuit_establish_circuit (purpose=17 '\021', onehop_tunnel=17, exit=0x11, need_uptime=17, need_capacity=17, internal=17) at circuitbuild.c:315
#11 0x0805ca5e in circuit_launch_by_router (purpose=17 '\021', onehop_tunnel=17, exit=0x11, need_uptime=17, need_capacity=17, internal=17) at circuituse.c:809
#12 0x080ae71b in consider_testing_reachability (test_or=1, test_dir=1) at router.c:602
#13 0x08085b99 in connection_dir_client_reached_eof (conn=0x91f41e8) at directory.c:1306
#14 0x080865cf in connection_dir_reached_eof (conn=0x91f41e8) at directory.c:1476
#15 0x0806cd1c in connection_handle_read (conn=0x91f41e8) at connection.c:1817
#16 0x0809a779 in conn_read_callback (fd=1009, event=2, _conn=0x91f41e8) at main.c:498
#17 0xb7fa0c79 in event_base_priority_init () from /usr/lib/libevent-1.1a.so.1
#18 0xb7fa0f65 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#19 0xb7fa0dcb in event_loop () from /usr/lib/libevent-1.1a.so.1
...
In this particular instance Tor first logged 'smartlist_choose_by_bandwidth(): Bug: Round-off error in computing bandwidth had an effect on which router we chose.' (see bug #470), but it does not log this before every crash.
After committing r1940 (the found=0 compile-time warning), I got two different backtraces:
(gdb) bt
#0 0xb7d372de in mallopt () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7d36b1d in mallopt () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7d3750a in mallopt () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7d36135 in realloc () from /lib/tls/i686/cmov/libc.so.6
#4 0x080c9fb8 in _tor_realloc (ptr=0xb7df69e8, size=3084872168) at util.c:149
#5 0x080d12ae in smartlist_add (sl=0xb7df5f5c, element=0x2209b030) at container.c:92
#6 0x080b2ac0 in router_add_running_routers_to_smartlist (sl=0x875caa0, allow_invalid=0, need_uptime=1, need_capacity=1, need_guard=0) at routerlist.c:1103
#7 0x080b3232 in router_choose_random_node (preferred=0x0, excluded=0x0, excludedsmartlist=0x82cebe0, need_uptime=1, need_capacity=1, need_guard=0, allow_invalid=0, strict=0, weight_for_exit=0) at routerlist.c:1412
#8 0x0805488b in choose_good_entry_server (purpose=5 '\005', state=0x87effa8) at circuitbuild.c:1642
#9 0x080549e5 in onion_extend_cpath (circ=0x85c2288) at circuitbuild.c:1689
#10 0x08051358 in onion_populate_cpath (circ=0x85c2288) at circuitbuild.c:269
#11 0x0805144c in circuit_establish_circuit (purpose=5 '\005', onehop_tunnel=-1210095128, exit=0xb7df69e8, need_uptime=-1210095128, need_capacity=-1210095128, internal=-1210095128) at circuitbuild.c:315
#12 0x0805ca5e in circuit_launch_by_router (purpose=5 '\005', onehop_tunnel=-1210095128, exit=0xb7df69e8, need_uptime=-1210095128, need_capacity=-1210095128, internal=-1210095128) at circuituse.c:809
#13 0x0805bebe in circuit_predict_and_launch_new () at circuituse.c:408
#14 0x0809ba11 in run_scheduled_events (now=1185498960) at main.c:1041
#15 0x0809bf03 in second_elapsed_callback (fd=-1, event=1, args=0x0) at main.c:1178
#16 0xb7f3ec79 in event_base_priority_init () from /usr/lib/libevent-1.1a.so.1
#17 0xb7f3ef65 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#18 0xb7f3edcb in event_loop () from /usr/lib/libevent-1.1a.so.1
#19 0x0809c453 in do_main_loop () at main.c:1379
#20 0x0809d57d in tor_main (argc=-1210095128, argv=0xb7df69e8) at main.c:2622
#21 0x080c90fb in main (argc=-1210095128, argv=0xb7df69e8) at tor_main.c:28
and one with a broken stack frame:
(gdb) bt
#0 0xb7cc42de in mallopt () from /lib/tls/i686/cmov/libc.so.6
#1 0xb7cc3b1d in mallopt () from /lib/tls/i686/cmov/libc.so.6
#2 0xb7cc2e83 in malloc () from /lib/tls/i686/cmov/libc.so.6
#3 0xb7dc648f in default_malloc_ex () from /usr/lib/i686/cmov/libcrypto.so.0.9.7
#4 0x000002c0 in ?? ()
#5 0x00000002 in ?? ()
#6 0xb7e93d24 in JCR_LIST () from /usr/lib/i686/cmov/libcrypto.so.0.9.7
#7 0xb7dc60c3 in CRYPTO_malloc () from /usr/lib/i686/cmov/libcrypto.so.0.9.7
#8 0x000002c0 in ?? ()
[...]
#22 0x00000049 in ?? ()
#23 0xbfdf5fe8 in ?? ()
#24 0xb7cc2e83 in malloc () from /lib/tls/i686/cmov/libc.so.6
Previous frame inner to this frame (corrupt stack?)
[Automatically added by flyspray2trac: Operating System: All]