Opened 5 years ago

Closed 5 years ago

Last modified 4 years ago

#13471 closed defect (fixed)

router daemon crashes with openssl built no_ssl3

Reported by: starlight Owned by:
Priority: Immediate Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor Version: Tor: 0.2.4.24
Severity: Keywords: tor-relay, ssl3, poodle, 025-backport, 024-backport, 023-backport, 2016-bug-retrospective
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Access violation when 'tor' is run with
openssl-1.0.1j built with no_ssl3 option and

openssl s_client -ssl3 -connect x.x.x.x:x

is run against the OR port

Child Tickets

Attachments (5)

tor_nossl3_segv (1.1 KB) - added by starlight 5 years ago.
stack trace
do_Configure (197 bytes) - added by starlight 5 years ago.
openssl configuration command
openssl-1.0.1j-lto_as.patch (1.1 KB) - added by starlight 5 years ago.
openssl patch, apply after running do_Configure
tor-0.2.4.24-as.do_configure (169 bytes) - added by starlight 5 years ago.
configuration command for tor
tor-0.2.4.24-as.patch (2.7 KB) - added by starlight 5 years ago.
patch for tor, apply after running configure

Download all attachments as: .zip

Change History (28)

Changed 5 years ago by starlight

Attachment: tor_nossl3_segv added

stack trace

Changed 5 years ago by starlight

Attachment: do_Configure added

openssl configuration command

comment:1 Changed 5 years ago by nickm

Keywords: tor-relay ssl3 poodle 025-backport 024-backport 023-backport added
Milestone: Tor: 0.2.6.x-final
Priority: normalmajor

comment:2 Changed 5 years ago by nickm

So if the version is right, and the stack trace can be trusted, that's happening in tor_tls_free's call to SSL_set_tlsext_host_name(tls->ssl, NULL);

And if the stack trace can be trusted, the problem is happening in " SSL_ctrl openssl-1.0.1j-as/ssl_lib.c:1106". In a stock openssl-1.0.1j, that line is

  return(s->method->ssl_ctrl(s,cmd,larg,parg));

So, the possibilities seem to be that s->method is junk, that s->method->ssl_ctrl is junk, or that something about my analysis above is wrong.

Could you try taking a debugger to look at the contents of tls->ssl and tls->ssl->method in tor_tls_free() when this violation happens?

comment:3 Changed 5 years ago by starlight

I believe the stack trace is good. Have three
examples that agree, this one with -O2 and
two with -O1 -fsanitize-address.

The macro used in the call is

tls1.h:304

#define SSL_set_tlsext_host_name(s,name) \
SSL_ctrl(s,SSL_CTRL_SET_TLSEXT_HOSTNAME,TLSEXT_NAMETYPE_host_name,(char *)name)

where SSL_CTRL_SET_TLSEXT_HOSTNAME is 55 per ssl/ssl.h:1583
and the chosen switch/case of default: matches the
indicated code.

I'm not setup to debug the tor daemon and it would
take a day. Sorry don't have time right now, but
this should be trivial to reproduce in an existing
debug environment.

comment:4 Changed 5 years ago by nickm

ok, anybody else able to try to debug this?

comment:5 Changed 5 years ago by dgoulet

I tried with the provided configure options and couldn't reproduce it (on tor master branch).

@starlight: can you provide the commit id or git tag of tor you used or package version ?

comment:6 Changed 5 years ago by nickm

You might need to hand-revert ab4b29625db720817f9af502199ebf1ee3ac5af7 to reproduce this on any of the git branches. (That's the one that disables SSLv3 use in Tor.)

Last edited 5 years ago by nickm (previous) (diff)

comment:7 Changed 5 years ago by nickm

Oops, no, not that one! The one you might want to hand-revert is af73d3e4d83ba7f404068008ad617e02b8a0a77b.

comment:8 in reply to:  5 Changed 5 years ago by yawning

Replying to dgoulet:

I tried with the provided configure options and couldn't reproduce it (on tor master branch).

@starlight: can you provide the commit id or git tag of tor you used or package version ?

Likewise, can't reproduce on master, or with an older copy of master that I'm fairly sure predates nickm's change. However I can reproduce it on 0.2.4.24, so here's the stacktrace.

Program received signal SIGSEGV, Segmentation fault.
0x00005555556cfbcd in SSL_ctrl (s=0x55555777b6c0, cmd=55, larg=0, parg=0x0)
    at ssl_lib.c:1106
1106			return(s->method->ssl_ctrl(s,cmd,larg,parg));
(gdb) bt
#0  0x00005555556cfbcd in SSL_ctrl (s=0x55555777b6c0, cmd=55, larg=0, parg=0x0)
    at ssl_lib.c:1106
#1  0x00005555556ba291 in tor_tls_free (tls=0x55555777b660)
    at src/common/tortls.c:2029
#2  0x000055555563f701 in connection_free_ (conn=conn@entry=0x55555777b080)
    at src/or/connection.c:512
#3  0x00005555556400d3 in connection_free (conn=conn@entry=0x55555777b080)
    at src/or/connection.c:631
#4  0x0000555555595555 in connection_unlink (conn=conn@entry=0x55555777b080)
    at src/or/main.c:414
#5  0x00005555555961b4 in conn_close_if_marked (i=<optimized out>)
    at src/or/main.c:930
#6  0x000055555559622a in close_closeable_connections () at src/or/main.c:699
#7  0x0000555555596636 in conn_read_callback (fd=<optimized out>, 
    event=<optimized out>, _conn=0x555555b3c980) at src/or/main.c:734
#8  0x00007ffff7689104 in event_base_loop () from /usr/lib/libevent-2.0.so.5
#9  0x0000555555597ae7 in do_main_loop () at src/or/main.c:1987
#10 0x00005555555988a4 in tor_main (argc=3, argv=0x7fffffffe4e8)
    at src/or/main.c:2703
#11 0x0000555555593249 in main (argc=<optimized out>, argv=<optimized out>)
    at src/or/tor_main.c:30
(gdb) p s
$1 = (SSL *) 0x55555777b6c0
(gdb) p s->method
$2 = (const SSL_METHOD *) 0x0

Edit: Derp, my old copy is linked against my system openssl.

Last edited 5 years ago by yawning (previous) (diff)

comment:9 Changed 5 years ago by starlight

Used the official release TAR tor-0.2.4.24.tar.gz
SHA1SUM b9f33aed3a2a4a6afa952b6595f5c9a611ffb2bd.

No idea what the GIT tag is, but if it's unclear
I suggest starting with the TAR and working back
from there.

I'm uploading the .patch files used to tweak
'openssl' and 'tor' for running with Address
Sanitizer in production as is done here.
However it looks to me like a straightforward
NULL pointer dereference and all ASAN is doing
is providing a decent stack-trace (with the
assistance of asan_symbolize.py). Using GCC 4.9.1
which so far has not given me any trouble.

comment:10 Changed 5 years ago by nickm

Priority: majorblocker

Changed 5 years ago by starlight

Attachment: openssl-1.0.1j-lto_as.patch added

openssl patch, apply after running do_Configure

Changed 5 years ago by starlight

configuration command for tor

Changed 5 years ago by starlight

Attachment: tor-0.2.4.24-as.patch added

patch for tor, apply after running configure

comment:11 Changed 5 years ago by yawning

Applying af73d3e4d83ba7f404068008ad617e02b8a0a77b (See: #13426) fixes the issue, so it's probably the path of least resistance since the branch is good to go from 0.2.3.x up.

comment:12 Changed 5 years ago by starlight

I'll try the patch and report back.

comment:13 Changed 5 years ago by starlight

Can't find the patch. Direct gitweb link would be much appreciated.

comment:15 Changed 5 years ago by starlight

The patch worked.

some points of interest
==================

Actual patch is

https://gitweb.torproject.org/tor.git/commitdiff_plain/c1c83eb376a7c89fadb01d1c7082d4aa4125333d?hp=0eec8e2aa51f779e458fb3831b0ed8ae1db896ec

and it it applies with some "fuzz" warnings so
perhaps should be adapted for the 0.2.4.24 branch

openssl s_client -ssl3 -connect x.x.x.x:x  # CentOS openssl-0.9.8e-27.el5_10.4

returns

CONNECTED(00000003)
11175:error:14094410:SSL routines:SSL3_READ_BYTES:sslv3 alert handshake failure:s3_pkt.c:1092:SSL alert number 40
11175:error:1409E0E5:SSL routines:SSL3_WRITE_BYTES:ssl handshake failure:s3_pkt.c:536:

but

openssl s_client -ssl2 -connect x.x.x.x:x

returns

CONNECTED(00000003)
write:errno=104

comment:16 Changed 5 years ago by nickm

It appears that there is an associated openssl ticket:

http://marc.info/?l=openssl-dev&m=141357408522028&w=2

I think on the Tor side, the right fix is as follows.

  1. Add a new changes entry to indicate that our fix for #13426 also fixes #13471
  2. Put out a new 0.2.4 and 0.2.5-rc release.
  3. Hope that openssl does something sensible too

I've done item 1 above as c1dd598df8972a6b7b8bdeff6a1549a7c171d7dc.

comment:17 Changed 5 years ago by starlight

As a point of interest, this bug handily facilitated
the completion of a goal I've had to configure
useable core-dumps on the stripped-down production
relay system.

The key bit was an undocumented ASAN environment option:

unmap_shadow_on_exit=1

per

https://code.google.com/p/address-sanitizer/issues/detail?id=345

so running with

ASAN_OPTIONS="disable_core=0:unmap_shadow_on_exit=1:abort_on_error=1"

I was finally able to obtain a good core file
of this particular event. Then one brings
the chroot_tor jail over to the dev system
and can use gdb to examine the file. E.G.

$ gdb /ww/chroot_tor/usr/local/bin/tor-sanitize-0.2.4.24
Reading symbols from /ww/chroot_tor/usr/local/bin/tor-sanitize-0.2.4.24...done.
(gdb) set sysroot /ww/chroot_tor
(gdb) add-auto-load-safe-path /ww/chroot_tor/lib64/libthread_db-1.0.so
(gdb) core-file core.8353
warning: core file may not match specified executable file.
[New LWP 8353]
[New LWP 8356]
[New LWP 8355]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/ww/chroot_tor/lib64/libthread_db.so.1".
Core was generated by `/usr/local/bin/tor -f /home/tor/torrc --quiet'.
Program terminated with signal SIGABRT, Aborted.
#0  0x? in raise () from /ww/chroot_tor/lib64/libc.so.6
#1  0x? in abort () from /ww/chroot_tor/lib64/libc.so.6
#2  0x? in __sanitizer::Abort () at ...
#3  0x? in __asan::AsanDie () at ...
#4  0x? in __sanitizer::Die () at ...
#5  0x? in __asan::ScopedInErrorReport::~ScopedInErrorReport (this=<optimized out>, __in_chrg=<optimized out>) at ...
#6  0x? in __asan::ReportSIGSEGV (pc=, sp=, bp=, addr=) ...
#7  0x? in __asan::ASAN_OnSIGSEGV (siginfo=<optimized out>, context=0x?) at ...
#8  <signal handler called>

#9  0x? in SSL_ctrl (s=0x?, cmd=55, larg=0, parg=0x0) at ssl_lib.c:1106
#10 0x? in tor_tls_free (tls=0x?) at src/common/tortls.c:2029
#11 0x? in connection_free_ (conn=conn@entry=0x?) at src/or/connection.c:512
#12 0x? in connection_free (conn=conn@entry=0x?) at src/or/connection.c:631
#13 0px in connection_unlink (conn=conn@entry=0x?) at src/or/main.c:414
#14 0x? in conn_close_if_marked (i=<optimized out>) at src/or/main.c:930
#15 close_closeable_connections () at src/or/main.c:699
#16 0x? in conn_read_callback (fd=<optimized out>, event=<optimized out>, _conn=0x?) at src/or/main.c:734
#17 0x? in event_process_active_single_queue (activeq=0x?, base=0x?) at event.c:1350
#18 event_process_active (base=<optimized out>) at event.c:1420
#19 event_base_loop (base=0x?, flags=flags@entry=0) at event.c:1621
#20 0x? in do_main_loop () at src/or/main.c:1987
#21 0x? in tor_main (argc=4, argv=0x?) at src/or/main.c:2703
#22 0x? in main (argc=<optimized out>, argv=<optimized out>) at src/or/tor_main.c:30

(gdb) directory /w/gpl/openssl-1.0.1j-as-no_ssl3/ssl/
Source directories searched: /w/gpl/openssl-1.0.1j-as-no_ssl3/ssl:$cdir:$cwd
(gdb) up 9
#9  0x? in SSL_ctrl (s=0x?, cmd=55, larg=0, parg=0x0) at ssl_lib.c:1106
1106                    return(s->method->ssl_ctrl(s,cmd,larg,parg));
(gdb) print s
$3 = (SSL *) 0x?
(gdb) print s->method
$4 = (const struct ssl_method_st *) 0x0

Had to add libthread_db-1.0.so to the copy of the jail for 'gdb'.

Of course the point of all this is to obtain core files
that can be used for postmortem analysis a one-off failures
that may be impossible to reproduce.

comment:18 Changed 5 years ago by nickm

Interesting stuff!

Ideally, I'd like the sanatizer stuff that gets included when you build tor with --enable-expensive-hardening to include "unmap_shadow_on_exit=1" if that will help us get good debuggability. Can we turn it on by default?

comment:19 Changed 5 years ago by starlight

I'll have to take a look at --enable-expensive-hardening.

I see in

gcc-4.9.1/libsanitizer/include/sanitizer/asan_interface.h

  // This function may be optionally provided by user and should return
  // a string containing ASan runtime options. See asan_flags.h for details.
  const char* __asan_default_options();

which seems like it will do the job. Be aware that
many references are made in web-postings regarding
ASAN to usage similar to

#if defined(__has_feature)
#if __has_feature(address_sanitizer)
  __sanitizer_sandbox_on_notify(NULL);
#endif
#endif

which I call to allow ASAN to work with
/chroot_tor/proc unmounted after
startup, but the conditional compile is
specific to CLANG and does not work for
GCC. You should create your own -DASANFLAG
conditional compilation flag. I only just
figured this out and the above code was
not working at all though I was
laboring under the misconception
that it was.

Also be sure to either direct stdout/stderr
to a file (as in my patch above) or configure
the "log_path" ASAN option or ASAN will not
work. I favor standard I/O since sometimes
glibc will write a message that may be
of value.

comment:20 Changed 5 years ago by starlight

Note all of the options should be used

ASAN_OPTIONS="disable_core=0:unmap_shadow_on_exit=1:abort_on_error=1"

Apparently abort_on_error=1 is necessary for
SEGV traps to produce a core per the Google
code issue linked above.

comment:21 Changed 5 years ago by yawning

Resolution: fixed
Status: newclosed

New release tarballs with the workaround for this have been uploaded (0.2.4.25, 0.2.5.9-rc), mailing list posting and website changes will happen shortly.

The ASAN improvements are now at #13499, please let me know if I missed anything important, or add a comment there, thanks.

comment:22 Changed 4 years ago by nickm

Keywords: 2016-bug-retrospective added

Marking for bug retrospective based on Priority.

comment:23 Changed 4 years ago by nickm

Mark more tickets for bug retrospective based on hand-review of changelogs from 0.2.5 onwards.

Note: See TracTickets for help on using tickets.