Opened 14 years ago

Last modified 7 years ago

#221 closed defect (Fixed)

connection_start_writing: Assertion conn->write_event failed

Reported by: weasel
Owned by:
Priority: Low
Milestone:
Component: Core Tor/Tor
Version: 0.1.1.10-alpha
Severity:
Keywords:
Cc: weasel
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

I've been running Tor nodes in a dedicated test network.

This Tor node was started, ran for 30 to 90 seconds and was
then sent a TERM signal, either via the controller interface
or directly as a Unix signal.

Dec 19 09:43:43.120 [err] main.c:347: connection_start_writing: Assertion conn->write_event failed; aborting.

bt:
#0 0x556f583b in raise () from /lib/tls/libc.so.6
#1 0x556f6fa2 in abort () from /lib/tls/libc.so.6
#2 0x0808b62d in connection_start_writing (conn=0xd7d81d8) at main.c:347
#3 0x08069190 in connection_write_to_buf (string=0xffff9420 "POST ", len=5, conn=0xd7d81d8) at connection.c:1557
#4 0x0807ccfd in directory_send_command (conn=0xd7d81d8, platform=0xc31d7d0 "Tor 0.1.1.10-alpha-cvs on Linux x86_64", purpose=8, resource=0x0, payload=0xd7dac40 "", payload_len=288) at directory.c:587
#5 0x0807c1e0 in directory_initiate_command (address=0xc317f58 "127.0.0.1", addr=2130706433, dir_port=19031, platform=0xc31d7d0 "Tor 0.1.1.10-alpha-cvs on Linux x86_64", digest=0x80fcfe4 "Cª!àñÓátUèã½&®¦hÈõ)\023", purpose=8 '\b', private_connection=1, resource=0x0, payload=0xd7dac40 "", payload_len=288) at directory.c:465
#6 0x0807b9cb in directory_initiate_command_routerstatus (status=0x80fcfcc, purpose=8 '\b', private_connection=1, resource=0x0, payload=0xd7dac40 "", payload_len=288) at directory.c:293
#7 0x0807b599 in directory_post_to_dirservers (purpose=8 '\b', payload=0xd7dac40 "", payload_len=288) at directory.c:139
#8 0x08098e4b in upload_service_descriptor (service=0x80fd140, version=0) at rendservice.c:914
#9 0x08099380 in rend_consider_services_upload (now=1134981823) at rendservice.c:1064
#10 0x0808c8cd in run_scheduled_events (now=1134981823) at main.c:876
#11 0x0808cb3f in second_elapsed_callback (fd=-1, event=1, args=0x0) at main.c:957
#12 0x556c7c79 in event_base_priority_init () from /usr/lib/libevent-1.1a.so.1
#13 0x556c7f65 in event_base_loop () from /usr/lib/libevent-1.1a.so.1
#14 0x556c7dcb in event_loop () from /usr/lib/libevent-1.1a.so.1
#15 0x556c7cb0 in event_dispatch () from /usr/lib/libevent-1.1a.so.1
#16 0x0808cf64 in do_main_loop () at main.c:1114
#17 0x0808df2d in tor_main (argc=3, argv=0xffffba44) at main.c:2077
#18 0x080abf7e in main (argc=3, argv=0xffffba44) at tor_main.c:22

[Automatically added by flyspray2trac: Operating System: All]

Change History (6)

comment:1 Changed 14 years ago by weasel

Notice level log:
Dec 19 09:43:18.644 [notice] Tor 0.1.1.10-alpha-cvs opening log file.
Dec 19 09:43:19.141 [notice] I learned some more directory information, but not enough to build a circuit.
Dec 19 09:43:19.141 [notice] update_router_descriptor_downloads(): Launching request for all routers
Dec 19 09:43:20.273 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
Dec 19 09:43:20.273 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
[about 150 more of those lines]
Dec 19 09:43:20.334 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
Dec 19 09:43:20.346 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
Dec 19 09:43:22.064 [notice] I learned some more directory information, but not enough to build a circuit.
Dec 19 09:43:32.368 [notice] We now have enough directory information to build circuits.
Dec 19 09:43:35.677 [notice] Tor has successfully opened a circuit. Looks like it's working.
Dec 19 09:43:36.759 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
Dec 19 09:43:36.760 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
[about 1000 more of them]
Dec 19 09:43:42.759 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
Dec 19 09:43:43.120 [warn] connection_add(): Failing because we have 991 connections already. Please raise your ulimit -n.
Dec 19 09:43:43.120 [err] main.c:347: connection_start_writing: Assertion conn->write_event failed; aborting.

comment:2 Changed 14 years ago by weasel

I think it's possible that there were well over 1000
server descriptors in the directory, most of them
for servers that no longer existed.

Is Tor smart enough to try only 'Running' nodes?

comment:3 Changed 14 years ago by weasel

For my own reference, here's a list of Tor C cores with this error:

./tor-create-and-go-away-43/core.26094.connection_start_writing
./tor-create-and-go-away-44/core.7909.connection_start_writing
./tor-create-and-go-away-45/core.13068.connection_start_writing
./tor-create-and-go-away-45/core.26659.connection_start_writing
./tor-create-and-go-away-45/core.5620.connection_start_writing
./tor-create-and-go-away-46/core.2606.connection_start_writing

comment:4 Changed 14 years ago by weasel

At one time Roger requested *conn:
(gdb)
#2 0x0808b62d in connection_start_writing (conn=0xd7d81d8) at main.c:347
347 tor_assert(conn->write_event);
(gdb) p *conn
$1 = {magic = 2084319310, type = 9 '\t', state = 2 '\002', purpose = 8 '\b',
wants_to_read = 0, wants_to_write = 0, hold_open_until_flushed = 0,
has_sent_end = 0, control_events_are_extended = 0, is_obsolete = 0,
s = 1002, poll_index = -1, read_event = 0x0, write_event = 0x0,
inbuf = 0xd7d82a8, inbuf_reached_eof = 0, timestamp_lastread = 1134981823,
outbuf = 0xd7d8190, outbuf_flushlen = 0, timestamp_lastwritten = 1134981823,
timestamp_created = 1134981823, timestamp_lastempty = 0, addr = 2130706433,
port = 19031, marked_for_close = 0, marked_for_close_file = 0x0,
address = 0xd7d81b0 "127.0.0.1", identity_pkey = 0x0,
identity_digest = "Cª!àñÓátUèã½&®¦hÈõ)\023", nickname = 0x0,
chosen_exit_name = 0x0, tls = 0x0, bandwidth = 0, receiver_bucket = 0,
circ_id_type = CIRC_ID_TYPE_LOWER, n_circuits = 0, next_with_same_id = 0x0,
next_circ_id = 28895, stream_id = 0, next_stream = 0x0, cpath_layer = 0x0,
package_window = 0, deliver_window = 0, requested_resource = 0x0,
socks_request = 0x0, global_identifier = 2263, event_mask = 0,
incoming_cmd_len = 0, incoming_cmd_cur_len = 0, incoming_cmd = 0x0,
on_circuit = 0x0, rend_query = '\0' <repeats 16 times>,
incoming_cmd_type = 0}

comment:5 Changed 14 years ago by nickm

flyspray2trac: bug closed.
Fixed in CVS. The problem was that we weren't checking the return value of connection_add in directory_initiate_command().
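
To illustrate the failure mode described in this comment, here is a minimal sketch of the pattern. It is not Tor's code: conn_t, MAX_CONNS and the sketch_* functions are hypothetical stand-ins, and the assumption (suggested by the connection_add() warnings and by write_event = 0x0 in the dump above) is that a failed add leaves the write event unset, so any later attempt to start writing trips the assertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical small limit standing in for the real connection limit. */
#define MAX_CONNS 4

/* Hypothetical, stripped-down connection record (not Tor's connection_t). */
typedef struct conn_t {
  int write_event_set;   /* stands in for conn->write_event being non-NULL */
  int marked_for_close;  /* set when the caller gives up on the connection */
} conn_t;

static int n_conns = 0;

/* Register a connection. Returns 0 on success, -1 if the table is full.
 * On failure the write event is deliberately left unset. */
static int sketch_connection_add(conn_t *conn)
{
  if (n_conns >= MAX_CONNS)
    return -1;
  conn->write_event_set = 1;
  n_conns++;
  return 0;
}

/* Stand-in for connection_start_writing(): asserts the event exists. */
static void sketch_start_writing(conn_t *conn)
{
  assert(conn->write_event_set);
}

/* Stand-in for the caller. The point of the sketch is the return-value
 * check: without it, a full table would lead straight into the assertion. */
static int sketch_initiate_command(conn_t *conn)
{
  if (sketch_connection_add(conn) < 0) {
    conn->marked_for_close = 1;  /* hypothetical cleanup on failure */
    return -1;
  }
  sketch_start_writing(conn);
  return 0;
}
```

With the check in place, a fifth call to sketch_initiate_command() after four successful ones returns -1 instead of aborting. Removing the check would reproduce the kind of abort shown in the backtrace.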

comment:6 Changed 7 years ago by nickm

Component: Tor Relay → Tor