Opened 10 years ago

Last modified 7 years ago

#1073 closed defect (Fixed)

tor crashed, core dumped

Reported by: aaronsw
Owned by:
Priority: Low
Milestone:
Component: Core Tor/Tor
Version: 0.2.1.19
Severity:
Keywords:
Cc: aaronsw, Sebastian, arma, nickm, karsten
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

I was running tor when it crashed abruptly. Unfortunately the binary was stripped so this backtrace is rather uninformative. I've built a new binary with symbols in it and will try to catch it next time.

Core was generated by `/usr/sbin/tor'.
Program terminated with signal 11, Segmentation fault.
[New process 8221]
#0 0x0000000000474bea in ?? ()
(gdb) bt
#0 0x0000000000474bea in ?? ()
#1 0x0000000000419338 in ?? ()
#2 0x000000000042fb10 in ?? ()
#3 0x000000000040ed12 in ?? ()
#4 0x000000000046e4c9 in ?? ()
#5 0x000000000046effb in ?? ()
#6 0x000000000041a376 in ?? ()
#7 0x0000000000434a1d in ?? ()
#8 0x0000000000435e7c in ?? ()
#9 0x000000000042bcc8 in ?? ()
#10 0x00000000004620be in ?? ()
#11 0x00007f293171f67d in event_base_loop () from /usr/lib/libevent-1.3e.so.1
#12 0x0000000000461cc6 in ?? ()
#13 0x0000000000461f15 in ?? ()
#14 0x00007f29309cd466 in __libc_start_main () from /lib/libc.so.6
#15 0x0000000000407419 in ?? ()
#16 0x00007fff39d685c8 in ?? ()
#17 0x000000000000001c in ?? ()
#18 0x0000000000000001 in ?? ()
#19 0x00007fff39d68a50 in ?? ()
#20 0x0000000000000000 in ?? ()
(gdb)

[Automatically added by flyspray2trac: Operating System: Other Linux]

Change History (16)

comment:1 Changed 10 years ago by Sebastian

Hey, if you're able to reproduce this, please also tell us which OS you're
running and which package you installed or how you built Tor. That might be helpful.

Thanks
Sebastian

comment:2 Changed 10 years ago by aaronsw

It happens repeatably. Ubuntu Intrepid with the latest noreply packages.

comment:3 Changed 10 years ago by arma

Install the tor-dbg package from noreply. It's got your symbols. Then gdb on the
Tor you have and the core you have ought to work.

comment:4 Changed 10 years ago by aaronsw

Nice!

Program terminated with signal 11, Segmentation fault.
[New process 8221]
#0 0x0000000000474bea in rend_client_send_introduction (introcirc=0x24da3c0,
    rendcirc=0x2c992e0) at rendclient.c:110
110 log_warn(LD_BUG, "Internal error: could not find intro key; we "
(gdb) bt
#0 0x0000000000474bea in rend_client_send_introduction (introcirc=0x24da3c0,
    rendcirc=0x2c992e0) at rendclient.c:110
#1 0x0000000000419338 in connection_ap_handshake_attach_circuit (
    conn=0x2c04bd0) at circuituse.c:1521
#2 0x000000000042fb10 in connection_ap_attach_pending ()
    at connection_edge.c:497
#3 0x000000000040ed12 in circuit_send_next_onion_skin (circ=0x24da3c0)
    at circuitbuild.c:657
#4 0x000000000046e4c9 in connection_edge_process_relay_cell (
    cell=0x7fff39d67ee0, circ=0x24da3c0, conn=0x0, layer_hint=0x2690190)
    at relay.c:1121
#5 0x000000000046effb in circuit_receive_relay_cell (cell=0x7fff39d67ee0,
    circ=0x24da3c0, cell_direction=CELL_DIRECTION_IN) at relay.c:179
#6 0x000000000041a376 in command_process_cell (cell=0x7fff39d67ee0,
    conn=0x2aaac90) at command.c:414
#7 0x0000000000434a1d in connection_or_process_cells_from_inbuf (
    conn=0x2aaac90) at connection_or.c:1250
#8 0x0000000000435e7c in connection_or_process_inbuf (conn=0x4)
    at connection_or.c:265
#9 0x000000000042bcc8 in connection_handle_read (conn=0x2aaac90)
    at connection.c:2014
#10 0x00000000004620be in conn_read_callback (fd=<value optimized out>,
    event=4096, _conn=0x4e2250) at main.c:456
---Type <return> to continue, or q <return> to quit---
#11 0x00007f293171f67d in event_base_loop () from /usr/lib/libevent-1.3e.so.1
#12 0x0000000000461cc6 in do_main_loop () at main.c:1435
#13 0x0000000000461f15 in tor_main (argc=1, argv=<value optimized out>)
    at main.c:2061
#14 0x00007f29309cd466 in __libc_start_main () from /lib/libc.so.6
#15 0x0000000000407419 in _start ()
(gdb)

comment:5 Changed 10 years ago by arma

Great. So to make sure, this is the 0.2.1.19 deb?

Anything good in the logs? E.g. some warning messages or asserts?

comment:6 Changed 10 years ago by aaronsw

Yep, the 0.2.1.19 deb. Nothing in the logs; they ended with just the normal messages like:

[notice] Closing stream for '[scrubbed].onion': hidden service is unavailable (try again later).
[notice] Tried for 120 seconds to get a connection to [scrubbed]:80. Giving up. (waiting for rendezvous desc)

comment:7 Changed 10 years ago by arma

Ok. The first part of the fix is clear. We're calling
rend_cache_lookup_entry(introcirc->rend_data->onion_address, 0, &entry)
and it's failing, so it's not writing anything into entry. Then we ask
for entry->parsed and we seg fault. That part is easy to fix.
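
The failure pattern described above can be sketched in isolation. This is only an illustration, not the code in rendclient.c: every `sketch_` name, the struct layouts, the "knownaddress" value, and the return convention (1 on a hit, 0 on a miss with the output pointer left untouched) are assumptions made for the example. The point is that the caller must check the lookup result before dereferencing the entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for Tor's descriptor types; not the real layouts. */
typedef struct { int n_intro_points; } sketch_desc_t;
typedef struct { sketch_desc_t *parsed; } sketch_cache_entry_t;

static sketch_desc_t sketch_known_desc = { 3 };
static sketch_cache_entry_t sketch_known_entry = { &sketch_known_desc };

/* Assumed contract: return 1 and set *entry_out on a hit; return 0 and
 * leave *entry_out untouched on a miss. */
static int
sketch_cache_lookup(const char *onion_address,
                    sketch_cache_entry_t **entry_out)
{
  assert(onion_address != NULL && entry_out != NULL);
  if (strcmp(onion_address, "knownaddress") == 0) {
    *entry_out = &sketch_known_entry;
    return 1;
  }
  return 0;
}

/* Return the number of intro points in the cached descriptor, or -1 if
 * the lookup failed. The entry is only dereferenced after the check. */
static int
sketch_count_intro_points(const char *onion_address)
{
  sketch_cache_entry_t *entry = NULL; /* initialized, so a miss is visible */
  if (sketch_cache_lookup(onion_address, &entry) < 1 || entry == NULL) {
    fprintf(stderr, "no cached descriptor for this address; giving up\n");
    return -1;
  }
  return entry->parsed->n_intro_points;
}
```

With these stand-ins, `sketch_count_intro_points("knownaddress")` returns 3 and an unknown address returns -1 instead of crashing.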

The harder part is that both of those clauses appear to be for writing
out warn details for a situation that "shouldn't" happen. Karsten added
those logs, so I'm going to wait for him to get back from his weekend
and tell us what's going on.

(It looks like we made an intro circuit as a client, yet when it came
time to use it, the descriptor we have for the hidden service doesn't
list that intro point. That doesn't sound like it should be marked as
a bug -- it can probably legitimately happen if we update our descriptor
after launching the intro circuit. Perhaps the right behavior is to hunt
for and kill in-progress intro circuits whenever we get an updated
descriptor that no longer lists the intro point we're intending to use?)
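
To make the idea in the parenthetical concrete, here is a rough sketch of what "hunt for and kill stale intro circuits" could look like. It is not Tor code: the `sketch_` types and functions are hypothetical, intro points are reduced to plain string identifiers, and "kill" is modeled as setting a flag. It says nothing about whether replacement circuits should be launched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a client-side intro circuit. */
typedef struct {
  const char *intro_point_id; /* intro point this circuit targets */
  int marked_for_close;       /* nonzero once we decide to tear it down */
} sketch_intro_circ_t;

/* Return 1 if `id` appears in the descriptor's intro point list. */
static int
sketch_desc_lists(const char *const *ids, size_t n_ids, const char *id)
{
  for (size_t i = 0; i < n_ids; i++) {
    if (strcmp(ids[i], id) == 0)
      return 1;
  }
  return 0;
}

/* After receiving an updated descriptor, mark every still-open intro
 * circuit whose intro point is no longer listed. Returns how many
 * circuits were newly marked. */
static size_t
sketch_close_stale_intro_circs(sketch_intro_circ_t *circs, size_t n_circs,
                               const char *const *ids, size_t n_ids)
{
  size_t n_closed = 0;
  assert(circs != NULL || n_circs == 0);
  for (size_t i = 0; i < n_circs; i++) {
    if (circs[i].marked_for_close)
      continue; /* already being torn down */
    if (!sketch_desc_lists(ids, n_ids, circs[i].intro_point_id)) {
      circs[i].marked_for_close = 1;
      n_closed++;
    }
  }
  return n_closed;
}
```

For example, with circuits to intro points "A", "B", and "C" and an updated descriptor listing only "A" and "C", one circuit (the one to "B") gets marked.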

comment:8 Changed 10 years ago by karsten

I fixed the easy part in my public branch fix-1073. Is that what you had in
mind, Roger?

As for hunting for and killing in-progress intro circuits whenever we get
an updated descriptor, that is a somewhat invasive change that might
introduce new bugs. Also, should we start new introduction circuits when we
realize that we have to kill one or two introduction circuits?

Actually, I think this bug is a very rare case. For now, it's probably
fine to just abort the connection attempt and let the user retry. After
all, the introduction points must have changed anyway, so we want to fall
back to trying the other introduction points and/or downloading a new
rendezvous descriptor.

What we probably should do at some point is enumerate all client-side
states in the process of accessing a hidden service and make sure that all
cases are handled correctly in the code. That might include finding a
better fix for this bug, too. But this analysis should probably focus on
0.2.2.x and not touch the 0.2.1.x code anymore.

Does that sound sane?

comment:9 Changed 10 years ago by Sebastian

If it is a rare case, I wonder why it would be reliably reproducible for Aaron.
I've tried to reproduce it, but haven't been able to do so so far... hrm.

comment:10 Changed 10 years ago by aaronsw

It's possible it was crashing for other reasons the other times; I wasn't looking at backtraces. FWIW, it hasn't crashed since this.

comment:11 Changed 10 years ago by aaronsw

FWIW, it just happened again:

Aug 30 07:32:55.885 [notice] Closing stream for '[scrubbed].onion': hidden service is unavailable (try again later).
Aug 30 07:32:55.885 [notice] Closing stream for '[scrubbed].onion': hidden service is unavailable (try again later).
/etc/init.d/tor: line 112: 15849 Segmentation fault (core dumped)

[...]

Program terminated with signal 11, Segmentation fault.
[New process 8221]
#0 0x0000000000474bea in rend_client_send_introduction (introcirc=0x24da3c0,
    rendcirc=0x2c992e0) at rendclient.c:110
110 log_warn(LD_BUG, "Internal error: could not find intro key; we "
(gdb) bt
#0 0x0000000000474bea in rend_client_send_introduction (introcirc=0x24da3c0,
    rendcirc=0x2c992e0) at rendclient.c:110
#1 0x0000000000419338 in connection_ap_handshake_attach_circuit (
    conn=0x2c04bd0) at circuituse.c:1521
#2 0x000000000042fb10 in connection_ap_attach_pending ()
    at connection_edge.c:497
#3 0x000000000040ed12 in circuit_send_next_onion_skin (circ=0x24da3c0)
    at circuitbuild.c:657
#4 0x000000000046e4c9 in connection_edge_process_relay_cell (
    cell=0x7fff39d67ee0, circ=0x24da3c0, conn=0x0, layer_hint=0x2690190)
    at relay.c:1121
#5 0x000000000046effb in circuit_receive_relay_cell (cell=0x7fff39d67ee0,
    circ=0x24da3c0, cell_direction=CELL_DIRECTION_IN) at relay.c:179
#6 0x000000000041a376 in command_process_cell (cell=0x7fff39d67ee0,
    conn=0x2aaac90) at command.c:414
#7 0x0000000000434a1d in connection_or_process_cells_from_inbuf (
    conn=0x2aaac90) at connection_or.c:1250
#8 0x0000000000435e7c in connection_or_process_inbuf (conn=0x4)
    at connection_or.c:265
#9 0x000000000042bcc8 in connection_handle_read (conn=0x2aaac90)
    at connection.c:2014
#10 0x00000000004620be in conn_read_callback (fd=<value optimized out>,
    event=4096, _conn=0x4e2250) at main.c:456
---Type <return> to continue, or q <return> to quit---
#11 0x00007f293171f67d in event_base_loop () from /usr/lib/libevent-1.3e.so.1
#12 0x0000000000461cc6 in do_main_loop () at main.c:1435
#13 0x0000000000461f15 in tor_main (argc=1, argv=<value optimized out>)
    at main.c:2061
#14 0x00007f29309cd466 in __libc_start_main () from /lib/libc.so.6
#15 0x0000000000407419 in _start ()

comment:12 Changed 10 years ago by arma

Hi Aaron,

Are you just patiently waiting for us to fix it at this point? :) Is the crash
continuing to happen?

I've put the fix into the maint-0.2.1 branch, so it will go out with 0.2.1.20. We're
planning to continue delaying that release ("no urgent bugfixes yet"), but if the bug
is actually impacting you, we can move it up.

Also, we preemptively fixed the bug a different way in 0.2.2.1-alpha, so that snapshot
should work better too.

comment:13 Changed 10 years ago by aaronsw

I'm just running tor from a while loop now so that it restarts when the bug gets triggered. It seems to be working OK, although there is some downtime as it restarts and I end up getting paged constantly.

comment:14 Changed 10 years ago by arma

Ok. Tor 0.2.1.20 is out, which fixes this bug. I'm going to close this. Let us know
if it resumes being a problem. Thanks!

comment:15 Changed 10 years ago by arma

flyspray2trac: bug closed.

comment:16 Changed 7 years ago by nickm

Component: Tor Client → Tor