Opened 11 years ago

Last modified 7 years ago

#632 closed defect (Fixed)

Tor v0.2.1.0-alpha-dev (r14101): eventdns(?): Assertion conn->read_event failed

Reported by: Safari
Owned by:
Priority: High
Milestone: 0.2.0.22-rc
Component: Core Tor/Tor
Version: 0.1.2.19
Severity:
Keywords:
Cc: Safari, nickm
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Under high concurrency, Tor's eventdns crashes.
To reproduce, resolve 100 IP addresses with a concurrency of 100+100 (for PTR->A and A->PTR) using this command:
random-ip 100 64|DNSCACHEIP=127.0.0.69 dnsfilter -c 100 | DNSCACHEIP=127.0.0.69 dnsfilter -c 100 -p
Tor hits abort() in less than a minute.

2008-03-18 19:14:58.980894879 [debug] connection_remove(): removing socket -1 (type Socks), n_conns now 897
2008-03-18 19:14:59.301429897 [debug] conn_write_callback(): socket 24 wants to write.
2008-03-18 19:14:59.301432330 [debug] flush_chunk_tls(): flushed 3901 bytes, 12483 ready to flush, 12483 remain.
2008-03-18 19:14:59.301433263 [debug] flush_chunk_tls(): flushed 4057 bytes, 8426 ready to flush, 8426 remain.
2008-03-18 19:14:59.301434092 [debug] flush_chunk_tls(): flushed 4057 bytes, 4369 ready to flush, 4369 remain.
2008-03-18 19:14:59.301434943 [debug] flush_chunk_tls(): flushed 4057 bytes, 312 ready to flush, 312 remain.
2008-03-18 19:14:59.301435738 [debug] flush_chunk_tls(): flushed 312 bytes, 0 ready to flush, 0 remain.
2008-03-18 19:14:59.301455409 [debug] connection_handle_write(): After TLS write of 16384: 0 read, 16722 written
2008-03-18 19:14:59.301456530 [err] Bug: main.c:300: connection_start_reading: Assertion conn->read_event failed; aborting.
2008-03-18 19:14:59.301457413 main.c:300 connection_start_reading: Assertion conn->read_event failed; aborting.
2008-03-18 19:15:00.340741083 [notice] Tor v0.2.1.0-alpha-dev (r14101). This is experimental software. Do not rely on it for strong anonymity. (Running on Linux x86_64)

I have a big log file, but this should be easily reproducible.

Other notes about eventdns (maybe these should go in a separate bug report):

  • O(N) lookup for transaction IDs (this may or may not show up in the profiles, but is nevertheless not a very clever tactic)
  • using malloc+memset instead of calloc
  • ad-hoc handling of lists (e.g., it would be more readable with a static inline function list_add instead of writing ns->next = server_head->next; ns->prev = server_head; server_head->next = ns; etc. by hand)
  • useless casts: resolv = (u8 *) malloc((size_t)st.st_size + 1);
  • why is it doing CLEAR so often before free?
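To illustrate the list_add, calloc, and cast points from the notes above, here is a minimal sketch. It is not eventdns code: struct node, list_init, and node_new are hypothetical names, and the list is assumed to be circular and doubly linked with a sentinel head, which the ns->next / ns->prev snippet suggests but does not confirm.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type; the real eventdns nameserver struct has more fields. */
struct node {
    struct node *next;
    struct node *prev;
    int id;
};

/* Turn `head` into an empty circular list (it points at itself). */
static inline void list_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert `n` directly after `head`, fixing up all four pointers in one place. */
static inline void list_add(struct node *head, struct node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* calloc returns zeroed memory, so no separate memset is needed,
 * and in C the void * result converts without a cast. */
static struct node *node_new(int id)
{
    struct node *n = calloc(1, sizeof(*n));
    if (n != NULL)
        n->id = id;
    return n;
}
```

Callers would then write list_add(&head, node_new(1)) (after checking for NULL) instead of repeating the pointer assignments at every insertion site.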

[Automatically added by flyspray2trac: Operating System: All]

Change History (9)

comment:1 Changed 11 years ago by nickm

eventdns bugs should get reported on libevent's bugtracker at sourceforge.net; we're hoping to stop duplicating
it some time in the next release or two and just require a recent libevent version.

Could you post a backtrace from the assert?

comment:2 Changed 11 years ago by Safari

Well, I don't know whether this is an eventdns bug.

No backtraces were logged.

comment:3 Changed 11 years ago by nickm

The "other notes about eventdns" are eventdns bugs. :)

To get a backtrace, you'll need to run Tor under gdb, or unlimit your coredumpsize and get Tor to give you a core dump,
then pass the core to gdb.

(I haven't been able to reproduce the original bug yet; the next step will be installing djbdns to reproduce your
command line exactly. :/)
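For reference, "unlimit your coredumpsize" is normally done in the shell before starting Tor. The same effect can be had from inside a process, sketched below under the assumption of a POSIX system. enable_core_dumps is a hypothetical helper, not a Tor function, and it can only raise the soft limit as far as the hard limit allows.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the current hard limit.
 * Returns 0 on success and -1 on failure (errno is set by the failing call). */
static int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;  /* soft limit may go up to, but not past, the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

If the hard limit is itself 0, no core file will be written and the limit has to be raised by the parent shell or the system configuration instead.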

comment:4 Changed 11 years ago by nickm

(Installed djbdns, but haven't managed to reproduce the bug yet. The dnsfilter in djbdns 1.05 claims not to
have a -p option. What dnsfilter are you using?)

comment:5 Changed 11 years ago by Safari

Oh, I thought Tor supported printing a backtrace with some magic switch (some apps can do it without gdb or a core dump) :)

It should not matter which client you use for querying, but you need a patch for dnsfilter to get the -p option:
http://safari.iki.fi/patches/djbdns/djbdns-1.05-dnsfilter-features.diff
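On the "magic switch" idea: applications that print a backtrace without gdb typically do it from their assertion handler. A minimal sketch follows, assuming glibc's <execinfo.h> is available. dump_backtrace and ASSERT_BT are hypothetical names, and nothing in this ticket says Tor has such a facility.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Write the current call stack to stderr and return the number of frames captured.
 * backtrace_symbols_fd writes straight to the descriptor and does not call malloc. */
static int dump_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}

/* Hypothetical assertion macro: report the failed expression, print the stack, abort. */
#define ASSERT_BT(expr)                                                  \
    do {                                                                 \
        if (!(expr)) {                                                   \
            fprintf(stderr, "%s:%d: Assertion %s failed; aborting.\n",   \
                    __FILE__, __LINE__, #expr);                          \
            dump_backtrace();                                            \
            abort();                                                     \
        }                                                                \
    } while (0)
```

Function names for non-exported symbols only show up if the binary is linked with -rdynamic; otherwise the output is mostly raw addresses, so gdb or a core dump remains the more informative route.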

comment:6 Changed 11 years ago by nickm

Ah, there we go!

#3 0x080a2707 in connection_start_reading (conn=0x9c219b0) at main.c:300
#4 0x080ac8a9 in set_streams_blocked_on_circ (circ=<value optimized out>,
    orconn=<value optimized out>, block=0) at relay.c:1822
#5 0x080add78 in connection_or_flush_from_first_active_circuit (
    conn=0x9abd330, max=1) at relay.c:1875
#6 0x08078f61 in connection_or_flushed_some (conn=0x9abd330)
    at connection_or.c:294
#7 0x0806ad04 in connection_flushed_some (conn=0x9abd330) at connection.c:2713
#8 0x0807048c in connection_handle_write (conn=0x9abd330, force=0)
    at connection.c:2247
#9 0x080a5061 in conn_write_callback (fd=13, events=4, _conn=0x9abd330)
    at main.c:489

#10 0x001141d5 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#11 0x001143f9 in event_loop () from /usr/lib/libevent-1.1a.so.1
#12 0x080a4db9 in do_main_loop () at main.c:1446
#13 0x080a4f6b in tor_main (argc=5, argv=0xbf9e5784) at main.c:1986
#14 0x080d9762 in main (argc=Cannot access memory at address 0x1fde
) at tor_main.c:29

comment:7 Changed 11 years ago by nickm

Okay, I think I have this fixed. Try r14110 or later?

comment:8 Changed 11 years ago by nickm

flyspray2trac: bug closed.
Can no longer reproduce, marking as closed. Please tag for reopening if it shows up again.

comment:9 Changed 7 years ago by nickm

Component: Tor Relay → Tor