Opened 12 years ago

Last modified 7 years ago

#614 closed defect (Fixed)

Tor crash on 0.2.0.20-rc

Reported by: Fredzupy
Owned by: nickm
Priority: Very High
Milestone: 0.2.0.22-rc
Component: Core Tor/Tor
Version: 0.2.0.20-rc
Cc: Fredzupy, nickm, arma

Description

Here are the logs:

Feb 25 21:28:23.074 [warn] TLS error while renegotiating handshake with [scrubbed]: sslv3 alert handshake failure (in SSL routines:SSL3_READ_BYTES)
Feb 26 05:40:15.124 [warn] TLS error while renegotiating handshake with [scrubbed]: sslv3 alert handshake failure (in SSL routines:SSL3_READ_BYTES)
Feb 26 07:51:54.217 [warn] No ciphers on session
Feb 26 07:51:54.221 [err] Bug: connection.c:1580: connection_buckets_decrement: Assertion num_written < INT_MAX failed; aborting.

Sorry, I don't have a core file.

I've relaunched the same version to see if it happens again.

[Automatically added by flyspray2trac: Operating System: Other Linux]

Change History (14)

comment:1 Changed 12 years ago by nickm

The SSL warnings are unrelated to the crash bug.

It looks like connection_buckets_decrement() is getting called with a num_written value that is greater than INT_MAX.
This shouldn't be happening unless Tor just managed to write or read over 2GB of data on a single connection in
a single call. More likely, there's an underflow bug someplace.
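
To illustrate the kind of underflow suspected here, a minimal sketch follows. It is not Tor's code: `bytes_since` and `out_of_range` are hypothetical helpers, and the 32-bit counter width is an assumption. Subtracting a larger snapshot from a smaller one wraps around in unsigned arithmetic and lands far above INT_MAX.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: bytes written since the last snapshot.
 * If `now` is smaller than `last` (for example because the counter was
 * reset), the unsigned subtraction wraps around instead of going negative. */
uint32_t bytes_since(uint32_t now, uint32_t last)
{
  return now - last;
}

/* Returns 1 if the value would fail a "num_written < INT_MAX" check. */
int out_of_range(uint32_t n)
{
  return n >= (uint32_t)INT_MAX;
}
```

With these helpers, `bytes_since(1000, 17)` is an ordinary small count, while `bytes_since(17, 1000)` wraps to a value just below 2^32 and is flagged by `out_of_range`.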

comment:2 Changed 12 years ago by nickm

BTW, if you get a core the next time this happens, the really useful things to find out would be a) a stack trace,
and b) the value of *conn in either connection_handle_write() or connection_handle_read(). (This function is called
from one of those two functions.)

comment:3 Changed 12 years ago by nickm

I've changed this from an assert to a "Bug" warning. It's still a bug, but it's not worth shutting down servers
for. With any luck, we'll turn it up soon.
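
A rough sketch of what "warn instead of assert" can look like, under stated assumptions: `buckets_decrement_checked` and its bucket parameters are hypothetical stand-ins, not the real function. Whether the real change skips the bucket update or handles the bad value some other way is not stated in this ticket; skipping is an assumption of this sketch.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the bucket bookkeeping; not Tor's real code.
 * Returns 0 on success, -1 if a count is out of range. */
int buckets_decrement_checked(size_t num_read, size_t num_written,
                              long *read_bucket, long *write_bucket)
{
  if (num_read >= (size_t)INT_MAX || num_written >= (size_t)INT_MAX) {
    /* Previously this condition would have been an assertion failure. */
    fprintf(stderr, "Bug: Value out of range. num_read=%zu, num_written=%zu\n",
            num_read, num_written);
    return -1; /* assumption: leave the buckets untouched and keep running */
  }
  *read_bucket -= (long)num_read;
  *write_bucket -= (long)num_written;
  return 0;
}
```

The server keeps running, and the logged values give a starting point for tracking down where the bad count came from.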

comment:4 Changed 11 years ago by Fredzupy

It happened again.

Mar 04 03:58:18.077 [err] connection_buckets_decrement(): Bug: Value out of range. num_read=261, num_written=4294966313, connection type=OR, state=waiting for renegotiation (TLS)

comment:5 Changed 11 years ago by nickm

Interesting! That definitely looks like an integer underflow in num_written. The fact that this is a TLS connection
means that the value is coming out of tor_tls_get_n_raw_bytes().

What version of openssl are you using? Am I right in guessing that you're on a 32-bit platform?

comment:6 Changed 11 years ago by Fredzupy

openssl 0.9.8g
Right, 32-bit platform.

comment:7 Changed 11 years ago by shamrock

I am seeing a similar error message in the log:

Mar 10 06:00:23.333 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mar 10 06:00:23.595 [warn] TLS error while renegotiating handshake with [scrubbed]: sslv3 alert handshake failure (in SSL routines:SSL3_READ_BYTES)
Mar 10 06:00:25.781 [err] connection_buckets_decrement(): Bug: Value out of range. num_read=85, num_written=18446744073709550549, connection type=OR, state=waiting for renegotiation (TLS)
Mar 10 06:01:23.038 [notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.
Mar 10 06:01:43.843 [warn] TLS error while renegotiating handshake with [scrubbed]: sslv3 alert handshake failure (in SSL routines:SSL3_READ_BYTES)

This happens only once, when Tor first starts up, and never afterwards.

Tor v0.2.0.21-rc (r13812) on 64-bit Debian etch. Standard Debian package.

Linux 2.6.22-4-amd64
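
For reference, both reported values are what a small negative difference looks like when stored in an unsigned integer of the platform's word size: 4294966313 is 2^32 - 983 and 18446744073709550549 is 2^64 - 1067. A small sketch to check this arithmetic (`unwrap32` and `unwrap64` are hypothetical helper names, not part of Tor):

```c
#include <stdint.h>

/* Recover the signed difference behind a wrapped 32-bit unsigned value. */
int64_t unwrap32(uint32_t v)
{
  if (v > (uint32_t)INT32_MAX)
    return (int64_t)v - ((int64_t)1 << 32); /* v - 2^32 */
  return (int64_t)v;
}

/* Same idea for 64 bits. For v >= 2^63, ~v fits in int64_t, and
 * -(~v) - 1 equals v - 2^64 without any out-of-range conversion. */
int64_t unwrap64(uint64_t v)
{
  if (v > (uint64_t)INT64_MAX)
    return -(int64_t)(~v) - 1;
  return (int64_t)v;
}
```

Here `unwrap32(4294966313u)` gives -983 and `unwrap64(18446744073709550549ull)` gives -1067, which is consistent with a byte counter that moved backwards by about a kilobyte between two readings.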

comment:8 Changed 11 years ago by arma

Nick: I wonder if openssl, somewhere deep inside, resets the value of
BIO_number_written(SSL) when we think about renegotiating?

comment:9 Changed 11 years ago by nickm

I do too; I've been looking at the code to try to find anything like this, but to little avail.

comment:10 Changed 11 years ago by nickm

So, the only things that change bio->num_write are actually writing data onto the BIO (with BIO_write or BIO_puts),
or changing the BIO's method (with BIO_set). And the SSL code doesn't call either of those after starting...

Oh! But there's also ssl_init_wbio_buffer(), which apparently replaces a previous ssl->wbio with a new buffered
BIO. Big fun.
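
A self-contained simulation of why a replaced wbio would produce the out-of-range value. Everything here is a mock: `mock_bio`, `mock_tls`, `naive_bytes_written` and `push_buffer_bio` are hypothetical names, and the assumption that the caller reports the difference between the current counter and the last one it saw is mine, not something confirmed in this ticket.

```c
#include <stddef.h>

/* Mock of a BIO with only the fields needed here; not OpenSSL's struct. */
struct mock_bio {
  unsigned long num_write;  /* bytes written through this BIO */
  struct mock_bio *next;    /* next BIO in the chain, or NULL */
};

struct mock_tls {
  struct mock_bio *wbio;       /* current write BIO */
  unsigned long last_written;  /* counter value seen on the previous call */
};

/* Naive accounting: trust whatever BIO is currently installed. */
unsigned long naive_bytes_written(struct mock_tls *tls)
{
  unsigned long now = tls->wbio->num_write;
  unsigned long delta = now - tls->last_written; /* wraps if now < last */
  tls->last_written = now;
  return delta;
}

/* Mimics the effect described above: a fresh buffering BIO, whose own
 * counter starts at zero, is placed in front of the original one. */
void push_buffer_bio(struct mock_tls *tls, struct mock_bio *buf)
{
  buf->num_write = 0;
  buf->next = tls->wbio;
  tls->wbio = buf;
}
```

If the original BIO had counted 1000 bytes and the new buffering BIO has counted only 17, the naive delta is 17 - 1000 in unsigned arithmetic, which is a huge number rather than a small negative one.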

comment:11 Changed 11 years ago by nickm

Okay, I've checked in a possible fix as r13975. The comment there may be interesting:

  /* We want the number of bytes actually for real written. Unfortunately,
   * sometimes OpenSSL replaces the wbio on tls->ssl with a buffering bio,
   * which makes the answer turn out wrong. Let's cope with that. Note
   * that this approach will fail if we ever replace tls->ssl's BIOs with
   * buffering bios for reasons of our own. As an alternative, we could
   * save the original BIO for tls->ssl in the tor_tls_t structure, but
   * that would be tempting fate. */
  wbio = SSL_get_wbio(tls->ssl);
  if (wbio->method == BIO_f_buffer() && (tmpbio = BIO_next(wbio)) != NULL)
    wbio = tmpbio;
  w = BIO_number_written(wbio);

I'm going to try trunk out on peacetime for a while. Assuming it doesn't fall over, I'll backport the fix
to 0.2.0.x.
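
The idea behind the check can be exercised without OpenSSL using a mock. This is only a sketch: `mock_bio`, its `is_buffer` flag (standing in for the `BIO_f_buffer()` method comparison) and `raw_bytes_written` are hypothetical names.

```c
#include <stddef.h>

/* Mock of a BIO chain element; not OpenSSL's real structure. */
struct mock_bio {
  int is_buffer;            /* stands in for method == BIO_f_buffer() */
  unsigned long num_write;  /* stands in for BIO_number_written() */
  struct mock_bio *next;    /* stands in for BIO_next() */
};

/* If the outermost BIO is a buffering layer with something behind it,
 * read the counter of the BIO behind it instead. */
unsigned long raw_bytes_written(const struct mock_bio *wbio)
{
  if (wbio->is_buffer && wbio->next != NULL)
    wbio = wbio->next;
  return wbio->num_write;
}
```

With a buffering mock that has counted 17 bytes sitting in front of a socket mock that has counted 1000, `raw_bytes_written` reports the socket's count in both the wrapped and unwrapped case, so the counter the caller sees no longer jumps backwards when the buffering layer appears.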

comment:12 Changed 11 years ago by nickm

That was a fun one. The fix was backported as r13982. With any luck, it should work now. Marking this bug as
fixed.

comment:13 Changed 11 years ago by nickm

flyspray2trac: bug closed.

comment:14 Changed 7 years ago by nickm

Component: Tor Relay → Tor