Opened 12 years ago

Last modified 7 years ago

#411 closed defect (Fixed)

Tor Dies, SVN 9918 (relay.c:1562, main.c:1271, tor_main.c:22)

Reported by: xiando
Owned by: nickm
Priority: High
Milestone:
Component: Core Tor/Tor
Version: 0.2.0.0-alpha-dev
Severity:
Keywords:
Cc: xiando
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

VER: Checked out revision 9918.

LOG:

Apr 01 06:23:12.256 [err] Bug: relay.c:1562: next_circ_on_conn_p: Assertion conn == orcirc->p_conn failed; aborting.

# gdb /usr/bin/tor /var/lib/tor/core.7405
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1".

Core was generated by `/usr/bin/tor -f /etc/tor/torrc --pidfile /var/run/tor/tor.pid --log notice file'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/tls/libpthread.so.0...done.
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /usr/lib/libevent-1.3b.so.1...Reading symbols from /usr/lib/debug/usr/lib/libevent-1.3b.so.1.0.3.debug...done.
done.
Loaded symbols for /usr/lib/libevent-1.3b.so.1
Reading symbols from /lib/libssl.so.4...done.
Loaded symbols for /lib/libssl.so.4
Reading symbols from /lib/libcrypto.so.4...done.
Loaded symbols for /lib/libcrypto.so.4
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /usr/lib/libkrb5.so.3...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /lib/libcom_err.so.2...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libk5crypto.so.3...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
#0 0x0060e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) bt
#0 0x0060e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0064e7a5 in raise () from /lib/tls/libc.so.6
#2 0x00650209 in abort () from /lib/tls/libc.so.6
#3 0x08087870 in make_circuit_inactive_on_conn (circ=0x8b6c940, conn=0x921d1b8) at relay.c:1562
#4 0x08054a23 in circuit_set_circid_orconn_helper (circ=0x8b6c940, id=0, conn=0x0, old_id=57696, old_conn=0x921d1b8, active=1) at circuitlist.c:107
#5 0x08054c8b in circuit_set_p_circid_orconn (circ=0x8b6c940, id=0, conn=0x0) at circuitlist.c:152
#6 0x08059e6f in command_process_cell (cell=0xbffa09f0, conn=0x921d1b8) at or.h:1555
#7 0x0806cb76 in connection_or_process_inbuf (conn=0x921d1b8) at connection_or.c:780
#8 0x080638ad in connection_process_inbuf (conn=Variable "conn" is not available.
) at or.h:962
#9 0x08065d70 in connection_handle_read (conn=0x921d1b8) at connection.c:1514
#10 0x08083038 in conn_read_callback (fd=30, event=2, _conn=0x921d1b8) at main.c:427
#11 0x0098162d in event_base_loop (base=0x86affd8, flags=Variable "flags" is not available.
) at event.c:315
#12 0x009816e0 in event_loop (flags=0) at event.c:366
#13 0x00981704 in event_dispatch () at event.c:329
#14 0x08084e27 in tor_main (argc=15, argv=0xbffa1174) at main.c:1271
#15 0x080a3c23 in main (argc=15, argv=0xbffa1174) at tor_main.c:22
(gdb)

[Automatically added by flyspray2trac: Operating System: Other Linux]

Change History (12)

comment:1 Changed 12 years ago by arma

I just got an assert that might be related or might not be. Running svn 9917,
on my Tor client on my laptop.

Apr 03 04:53:00.419 [info] append_cell_to_circuit_queue(): Made a circuit active.
Apr 03 04:53:00.419 [info] exit circ (length 3): $9B7E25A1B2E52723077AC4DCC57DA8A6655346AB(open) $C6E52E4FBB825CA161A98682255CDE3411A98124(open) $75075ECBDAD2AEFD14ED77AF3186CB194036FDA9(open)
Apr 03 04:53:00.419 [info] connection_ap_handshake_send_begin(): Address/port sent, ap socket 23, n_circ_id 42307
Apr 03 04:53:00.419 [info] circuit_predict_and_launch_new(): Have 0 clean circs (0 internal), need another exit circ.
Apr 03 04:53:00.421 [info] choose_good_exit_server_general(): Found 329 servers that might support 0/0 pending connections.
Apr 03 04:53:00.422 [info] choose_good_exit_server_general(): Chose exit server 'torxmission'
Apr 03 04:53:00.423 [info] append_cell_to_circuit_queue(): Made a circuit active.
Apr 03 04:53:00.423 [info] append_cell_to_circuit_queue(): Primed a buffer.
Apr 03 04:53:00.423 [info] connection_or_flush_from_first_active_circuit(): Made a circuit inactive.
Apr 03 04:53:00.423 [info] circuit_send_next_onion_skin(): First hop: finished sending CREATE_FAST cell to 'anonserver'
Apr 03 04:53:00.423 [err] Bug: circuitlist.c:172: circuit_set_n_circid_orconn: Assertion bool_eq(active, circ->next_active_on_n_conn) failed; aborting.
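
For readers following the "Made a circuit active" / "Made a circuit inactive" messages and the failed assertion, here is a minimal sketch of the kind of bookkeeping that assertion seems to guard. It assumes that a circuit counts as "active" when it has queued cells, and that active circuits are linked into a circular doubly-linked list per connection. All names here (toy_circ_t, toy_conn_t, toy_make_active, toy_make_inactive, toy_bool_eq, toy_invariant_holds) are hypothetical and are not Tor's real code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; not Tor's real structs. */
typedef struct toy_circ {
  int queued_cells;              /* cells waiting to be flushed */
  struct toy_circ *next_active;  /* NULL when the circuit is not on the ring */
  struct toy_circ *prev_active;
} toy_circ_t;

typedef struct toy_conn {
  toy_circ_t *active_circuits;   /* some member of the ring, or NULL */
} toy_conn_t;

/* Compare two values by truthiness only (assumed meaning of bool_eq). */
#define toy_bool_eq(a, b) (!(a) == !(b))

/* Link circ into conn's ring of active circuits, if not already linked. */
static void toy_make_active(toy_conn_t *conn, toy_circ_t *circ) {
  if (circ->next_active)
    return;
  if (!conn->active_circuits) {
    circ->next_active = circ->prev_active = circ;
    conn->active_circuits = circ;
  } else {
    toy_circ_t *head = conn->active_circuits;
    toy_circ_t *tail = head->prev_active;
    circ->next_active = head;
    circ->prev_active = tail;
    tail->next_active = circ;
    head->prev_active = circ;
  }
}

/* Unlink circ from conn's ring and clear its list pointers. */
static void toy_make_inactive(toy_conn_t *conn, toy_circ_t *circ) {
  if (!circ->next_active)
    return;
  if (circ->next_active == circ) {
    conn->active_circuits = NULL;   /* circ was the only member */
  } else {
    circ->prev_active->next_active = circ->next_active;
    circ->next_active->prev_active = circ->prev_active;
    if (conn->active_circuits == circ)
      conn->active_circuits = circ->next_active;
  }
  circ->next_active = circ->prev_active = NULL;
}

/* The invariant: "has queued cells" must agree with "is on the ring". */
static int toy_invariant_holds(const toy_circ_t *circ) {
  return toy_bool_eq(circ->queued_cells > 0, circ->next_active != NULL);
}
```

In this sketch, draining queued_cells to zero without calling toy_make_inactive() (or the reverse) leaves toy_invariant_holds() false, which is the sort of mismatch the assertion in the log reports.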

comment:2 Changed 12 years ago by arma

My backtrace was:

#0 0x401b183b in raise () from /lib/tls/libc.so.6
#1 0x401b2fa2 in abort () from /lib/tls/libc.so.6
#2 0x080577a3 in circuit_set_n_circid_orconn (circ=0x84281b8, id=0, conn=0x0) at circuitlist.c:172
#3 0x080581ad in circuit_free (circ=0x84281b8) at circuitlist.c:412
#4 0x08057c2e in circuit_close_all_marked () at circuitlist.c:272
#5 0x080990af in run_scheduled_events (now=1175590380) at main.c:956
#6 0x080993cb in second_elapsed_callback (fd=-1, event=1, args=0x0) at main.c:1068
#7 0x40053c79 in event_base_priority_init () from /usr/lib/libevent-1.1a.so.1
#8 0x40053f65 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#9 0x40053dcb in event_loop () from /usr/lib/libevent-1.1a.so.1
#10 0x40053cb0 in event_dispatch () from /usr/lib/libevent-1.1a.so.1
#11 0x08099780 in do_main_loop () at main.c:1271
#12 0x0809aa04 in tor_main (argc=5, argv=0xbffff964) at main.c:2497
#13 0x080c5dba in main (argc=5, argv=0xbffff964) at tor_main.c:22

(gdb) up
#3 0x080581ad in circuit_free (circ=0x84281b8) at circuitlist.c:412
412 circuit_set_n_circid_orconn(circ, 0, NULL);
(gdb) print *circ
$1 = {magic = 892424771, n_conn_cells = {head = 0x0, tail = 0x0, n = 0},
  n_conn = 0x0, n_conn_id_digest = "\233~%¡²å'#\azÄÜÅ}¨¦eSF«", n_circ_id = 0,
  n_port = 9980, n_addr = 1123968570, streams_blocked_on_n_conn = 0,
  streams_blocked_on_p_conn = 0, package_window = 1000, deliver_window = 1000,
  onionskin = 0x0, timestamp_created = 1175590356,
  timestamp_dirty = 1175589165, state = 3 '\003', purpose = 5 '\005',
  marked_for_close = 554, marked_for_close_file = 0x80ded43 "circuituse.c",
  next_active_on_n_conn = 0x830d030, prev_active_on_n_conn = 0x830d030,
  next = 0x82214a0}
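
A note on reading the dump above: n_conn_cells.n is 0 while next_active_on_n_conn and prev_active_on_n_conn are both non-NULL. If `active` is derived from the queue length (an assumption; the ticket does not show how it is computed), the two sides of the bool_eq() check disagree. The sketch below replays just those three values. The type and function names (dump_snapshot_t, snapshot_is_consistent, snapshot_from_comment2) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the three dumped fields that matter here. */
typedef struct dump_snapshot {
  int n_conn_cells_n;               /* n_conn_cells.n from the dump */
  uintptr_t next_active_on_n_conn;  /* pointer value as printed by gdb */
  uintptr_t prev_active_on_n_conn;
} dump_snapshot_t;

/* Returns 1 when "has queued cells" agrees with "is linked as active". */
static int snapshot_is_consistent(const dump_snapshot_t *s) {
  int active = s->n_conn_cells_n > 0;          /* assumed definition */
  int linked = s->next_active_on_n_conn != 0;
  return !active == !linked;
}

/* Values copied from the print *circ output in this comment. */
static dump_snapshot_t snapshot_from_comment2(void) {
  dump_snapshot_t s = { 0, 0x830d030u, 0x830d030u };
  return s;
}
```

Under that assumption, snapshot_is_consistent() returns 0 for the dumped values: the circuit has an empty queue but still appears to be linked into an active list.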

comment:3 Changed 12 years ago by nickm

I think the patches I just pushed as 9928 and 9929 should solve this.

comment:4 Changed 12 years ago by arma

Running r9933, I just triggered the
tor_assert(bool_eq(active, circ->next_active_on_n_conn))
one, after running my exit node on my laptop for about a half hour.

comment:5 Changed 12 years ago by xiando

Crashes!!!

Checked out revision 9933.

(gdb) bt
#0 0x0060e7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0064e7a5 in raise () from /lib/tls/libc.so.6
#2 0x00650209 in abort () from /lib/tls/libc.so.6
#3 0x08054e8d in circuit_set_n_circid_orconn (circ=0x9cf22a0, id=0, conn=0x0) at circuitlist.c:186
#4 0x08055475 in circuit_free (circ=0x9cf22a0) at circuitlist.c:426
#5 0x080557f4 in circuit_close_all_marked () at circuitlist.c:282
#6 0x080838fd in second_elapsed_callback (fd=-1, event=1, args=0x0) at main.c:956
#7 0x0098162d in event_base_loop (base=0x9a57fd8, flags=Variable "flags" is not available.
) at event.c:315
#8 0x009816e0 in event_loop (flags=0) at event.c:366
#9 0x00981704 in event_dispatch () at event.c:329
#10 0x08084f9b in tor_main (argc=15, argv=0xbfea0fd4) at main.c:1271
#11 0x080a3d97 in main (argc=15, argv=0xbfea0fd4) at tor_main.c:22
(gdb) up
#1 0x0064e7a5 in raise () from /lib/tls/libc.so.6
(gdb)

comment:6 Changed 12 years ago by nickm

Possibly fixed in 9934.

comment:7 Changed 12 years ago by nickm

Possibly not. I keep getting the variant called from circuit_free(). I've added some log messages to my
sandbox to try to hunt it down.

comment:8 Changed 12 years ago by nickm

#0 0x0000002a95e43545 in raise () from /lib/libc.so.6
#1 0x0000002a95e44cce in abort () from /lib/libc.so.6
#2 0x000000000040ec8d in circuit_set_p_circid_orconn (circ=0x1108fa0, id=0, conn=0x0) at circuitlist.c:177
#3 0x000000000040fe32 in circuit_unlink_all_from_or_conn (conn=0xe61200, reason=8) at circuitlist.c:696
#4 0x000000000041ce34 in connection_about_to_close_connection (conn=0xe61200) at or.h:968
#5 0x000000000043f205 in connection_unlink (conn=0xe61200, remove=1) at main.c:229
#6 0x0000000000440215 in conn_close_if_marked (i=23076) at main.c:557
#7 0x000000000043fc47 in close_closeable_connections () at main.c:408
#8 0x000000000043fdaa in conn_read_callback (fd=23076, event=23076, _conn=0x6) at main.c:443
#9 0x0000002a9599b82d in event_base_priority_init () from /usr/lib/libevent-1.1a.so.1
#10 0x0000002a9599ba72 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#11 0x0000002a9599b8e5 in event_loop () from /usr/lib/libevent-1.1a.so.1
#12 0x0000002a9599b84b in event_dispatch () from /usr/lib/libevent-1.1a.so.1
#13 0x00000000004413ba in do_main_loop () at main.c:1271
#14 0x0000000000442056 in tor_main (argc=23076, argv=0x5a24) at main.c:2497

(failure is on _entry_ to circuit_set_p_circid_orconn)

comment:9 Changed 12 years ago by nickm

#2 0x000000000040ec8d in circuit_set_p_circid_orconn (circ=0xd93ba0, id=0, conn=0x0) at circuitlist.c:177
#3 0x0000000000414bcf in command_process_destroy_cell (cell=0x7fbfffeed0, conn=0x1b5c3d0) at or.h:1561
#4 0x000000000041429d in command_process_cell (cell=0x7fbfffeed0, conn=0x73cb) at command.c:146
#5 0x0000000000428b1b in connection_or_process_cells_from_inbuf (conn=0x1b5c3d0) at connection_or.c:780
#6 0x000000000042796f in connection_or_process_inbuf (conn=0x73cb) at connection_or.c:229
#7 0x00000000004207d1 in connection_process_inbuf (conn=0x73cb, package_partial=29643) at or.h:968
#8 0x000000000041f020 in connection_handle_read (conn=0x1b5c3d0) at connection.c:1514
#9 0x000000000043fcb1 in conn_read_callback (fd=29643, event=29643, _conn=0x6) at main.c:427
#10 0x0000002a9599b82d in event_base_priority_init () from /usr/lib/libevent-1.1a.so.1
#11 0x0000002a9599ba72 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#12 0x0000002a9599b8e5 in event_loop () from /usr/lib/libevent-1.1a.so.1
#13 0x0000002a9599b84b in event_dispatch () from /usr/lib/libevent-1.1a.so.1

(again, failure is on entry.)

comment:10 Changed 12 years ago by nickm

Okay; r9936 seems much happier on peacetime. Unless somebody can make this happen again, I'm going
to close this.

comment:11 Changed 12 years ago by nickm

flyspray2trac: bug closed.
Seems fixed in r9936.

comment:12 Changed 7 years ago by nickm

Component: Tor Relay → Tor