Opened 3 years ago

Closed 3 years ago

#14901 closed defect (fixed)

Segfault when calling SETCONF

Reported by: atagar Owned by:
Priority: High Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: regression
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Awwww, ran into one last weekend too. *sob*

For tor...

% git checkout 5644d92
% git clean -fdx
% make dist-clean; ./autogen.sh && ./configure && make
% mkdir /tmp/tor_test
[made a torrc...]

% cat /tmp/tor_test/torrc 
DataDirectory /tmp/tor_test
ControlPort 1111

% tor -f /tmp/tor_test/torrc

And then...

atagar@odin:~/Desktop/stem$ telnet localhost 1111
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
AUTHENTICATE 
250 OK
GETINFO version
250-version=0.2.6.2-alpha-dev (git-5644d92dd7081e4a)
250 OK
SETCONF ORPort=9090
Connection closed by foreign host.

The tor instance fails with...

Feb 14 12:05:03.000 [notice] Now checking whether ORPort 174.21.175.181:9090 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)

============================================================ T= 1423944303
Tor 0.2.6.2-alpha-dev (git-5644d92dd7081e4a) died: Caught signal 11
tor(+0x12234e)[0xb772c34e]
/lib/i386-linux-gnu/libpthread.so.0(__pthread_mutex_lock+0x17)[0xb7332cb7]
/lib/i386-linux-gnu/libpthread.so.0(__pthread_mutex_lock+0x17)[0xb7332cb7]
tor(tor_mutex_acquire+0x2c)[0xb774634c]
tor(threadpool_queue_update+0x55)[0xb77452a5]
tor(cpuworkers_rotate_keyinfo+0x4f)[0xb76f69af]
tor(set_options+0xf98)[0xb76ce5b8]
tor(options_trial_assign+0xd7)[0xb76d0567]
tor(+0xe4b81)[0xb76eeb81]
tor(connection_control_process_inbuf+0x6e4)[0xb76f2bf4]
tor(+0xca744)[0xb76d4744]
tor(connection_handle_read+0x7c7)[0xb76dadb7]
tor(+0x28d41)[0xb7632d41]
/usr/lib/libevent-2.0.so.5(event_base_loop+0x209)[0xb7554ce9]
tor(do_main_loop+0x1bb)[0xb763372b]
tor(tor_main+0x1f6d)[0xb76370dd]
tor(main+0x33)[0xb762fa03]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb718f4d3]
tor(+0x25a4d)[0xb762fa4d]
Aborted

Child Tickets

Change History (8)

comment:1 Changed 3 years ago by atagar

Mr. Sebastian requested a valgrind dump - hopefully I'm doing this right...

atagar@odin:~/Desktop/tor/tor$ valgrind --leak-check=yes tor -f /tmp/tor_test/torrc 
==9554== Memcheck, a memory error detector
==9554== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==9554== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==9554== Command: tor -f /tmp/tor_test/torrc
==9554== 
Feb 14 12:18:56.472 [notice] Tor v0.2.6.2-alpha-dev (git-5644d92dd7081e4a) running on Linux with Libevent 2.0.16-stable, OpenSSL 1.0.1 and Zlib 1.2.3.4.
Feb 14 12:18:56.552 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Feb 14 12:18:56.553 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Feb 14 12:18:56.572 [notice] Read configuration file "/tmp/tor_test/torrc".
Feb 14 12:18:56.907 [warn] ControlPort is open, but no authentication method has been configured.  This means that any program on your computer can reconfigure your Tor.  That's bad!  You should upgrade your Tor controller as soon as possible.
Feb 14 12:18:57.166 [notice] Opening Socks listener on 127.0.0.1:9050
Feb 14 12:18:57.189 [notice] Opening Control listener on 127.0.0.1:1111
Feb 14 12:18:59.000 [notice] Bootstrapped 0%: Starting
Feb 14 12:19:09.000 [notice] We now have enough directory information to build circuits.
Feb 14 12:19:09.000 [notice] Bootstrapped 80%: Connecting to the Tor network
Feb 14 12:19:12.000 [notice] Bootstrapped 85%: Finishing handshake with first hop
==9554== Conditional jump or move depends on uninitialised value(s)
==9554==    at 0x4A11624: ASN1_STRING_set (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x49FB092: ASN1_mbstring_ncopy (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x49FB31A: ASN1_mbstring_copy (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x49FC3E8: ASN1_STRING_to_UTF8 (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x49FE09B: ??? (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x49FE74F: ??? (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x4A061A7: ASN1_item_ex_d2i (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554==    by 0x4A06F2E: ??? (in /lib/i386-linux-gnu/libcrypto.so.1.0.0)
==9554== 
Feb 14 12:19:14.000 [notice] Bootstrapped 90%: Establishing a Tor circuit
==9554== Conditional jump or move depends on uninitialised value(s)
==9554==    at 0x4851DD8: inflateReset2 (in /lib/i386-linux-gnu/libz.so.1.2.3.4)
==9554==    by 0x4851EC7: inflateInit2_ (in /lib/i386-linux-gnu/libz.so.1.2.3.4)
==9554==    by 0x24DC2B: tor_gzip_uncompress (torgzip.c:309)
==9554==    by 0x1FE0EA: connection_dir_client_reached_eof (directory.c:1735)
==9554==    by 0x1FEB28: connection_dir_reached_eof (directory.c:2200)
==9554==    by 0x1D8FB1: connection_handle_read (connection.c:4695)
==9554==    by 0x130D40: conn_read_callback (main.c:777)
==9554==    by 0x4897CE8: event_base_loop (in /usr/lib/libevent-2.0.so.5.1.4)
==9554==    by 0x13172A: do_main_loop (main.c:2100)
==9554==    by 0x1350DC: tor_main (main.c:3077)
==9554==    by 0x12DA02: main (tor_main.c:30)
==9554== 
Feb 14 12:19:19.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Feb 14 12:19:19.000 [notice] Bootstrapped 100%: Done
Feb 14 12:19:30.000 [notice] New control connection opened from 127.0.0.1.
Feb 14 12:19:39.000 [notice] Your ContactInfo config option is not set. Please consider setting it, so we can contact you if your server is misconfigured or something else goes wrong.
Feb 14 12:19:39.000 [notice] Based on detected system memory, MaxMemInQueues is set to 2048 MB. You can override this by setting MaxMemInQueues by hand.
Feb 14 12:19:39.000 [warn] ControlPort is open, but no authentication method has been configured.  This means that any program on your computer can reconfigure your Tor.  That's bad!  You should upgrade your Tor controller as soon as possible.
Feb 14 12:19:39.000 [warn] Tor is running as an exit relay with the default exit policy. If you did not want this behavior, please set the ExitRelay option to 0. If you do want to run an exit Relay, please set the ExitRelay option to 1 to disable this warning, and for forward compatibility.
Feb 14 12:19:39.000 [warn] In a future version of Tor, ExitRelay 0 may become the default when no ExitPolicy is given.
Feb 14 12:19:39.000 [notice] Opening OR listener on 0.0.0.0:9090
Feb 14 12:19:41.000 [notice] Your Tor server's identity key fingerprint is 'Unnamed 4A6FB0F6E9655B1368B11C7B83F89F5587051A64'
Feb 14 12:19:41.000 [notice] Now checking whether ORPort 174.21.175.181:9090 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)
==9554== Invalid read of size 4
==9554==    at 0x4ADDCB7: pthread_mutex_lock (pthread_mutex_lock.c:50)
==9554==    by 0x24434B: tor_mutex_acquire (compat_pthreads.c:125)
==9554==    by 0x2432A4: threadpool_queue_update (workqueue.c:329)
==9554==    by 0x1F49AE: cpuworkers_rotate_keyinfo (cpuworker.c:180)
==9554==    by 0x1CC5B7: set_options (config.c:1728)
==9554==    by 0x1CE566: options_trial_assign (config.c:2063)
==9554==    by 0x1ECB80: control_setconf_helper (control.c:739)
==9554==    by 0x1F0BF3: connection_control_process_inbuf (control.c:786)
==9554==    by 0x1D2743: connection_process_inbuf (connection.c:4583)
==9554==    by 0x1D8DB6: connection_handle_read (connection.c:3339)
==9554==    by 0x130D40: conn_read_callback (main.c:777)
==9554==    by 0x4897CE8: event_base_loop (in /usr/lib/libevent-2.0.so.5.1.4)
==9554==  Address 0x5c is not stack'd, malloc'd or (recently) free'd
==9554== 

============================================================ T= 1423945181
Tor 0.2.6.2-alpha-dev (git-5644d92dd7081e4a) died: Caught signal 11
tor(+0x12234e)[0x22a34e]
/lib/i386-linux-gnu/libpthread.so.0(__pthread_mutex_lock+0x17)[0x4addcb7]
/lib/i386-linux-gnu/libpthread.so.0(__pthread_mutex_lock+0x17)[0x4addcb7]
tor(tor_mutex_acquire+0x2c)[0x24434c]
tor(threadpool_queue_update+0x55)[0x2432a5]
tor(cpuworkers_rotate_keyinfo+0x4f)[0x1f49af]
tor(set_options+0xf98)[0x1cc5b8]
tor(options_trial_assign+0xd7)[0x1ce567]
tor(+0xe4b81)[0x1ecb81]
tor(connection_control_process_inbuf+0x6e4)[0x1f0bf4]
tor(+0xca744)[0x1d2744]
tor(connection_handle_read+0x7c7)[0x1d8db7]
tor(+0x28d41)[0x130d41]
/usr/lib/libevent-2.0.so.5(event_base_loop+0x209)[0x4897ce9]
tor(do_main_loop+0x1bb)[0x13172b]
tor(tor_main+0x1f6d)[0x1350dd]
tor(main+0x33)[0x12da03]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x4b124d3]
tor(+0x25a4d)[0x12da4d]
==9554== 
==9554== HEAP SUMMARY:
==9554==     in use at exit: 5,667,729 bytes in 95,749 blocks
==9554==   total heap usage: 192,329 allocs, 96,580 frees, 29,171,834 bytes allocated
==9554== 
==9554== LEAK SUMMARY:
==9554==    definitely lost: 0 bytes in 0 blocks
==9554==    indirectly lost: 0 bytes in 0 blocks
==9554==      possibly lost: 0 bytes in 0 blocks
==9554==    still reachable: 5,667,729 bytes in 95,749 blocks
==9554==         suppressed: 0 bytes in 0 blocks
==9554== Reachable blocks (those to which a pointer was found) are not shown.
==9554== To see them, rerun with: --leak-check=full --show-reachable=yes
==9554== 
==9554== For counts of detected and suppressed errors, rerun with: -v
==9554== Use --track-origins=yes to see where uninitialised values come from
==9554== ERROR SUMMARY: 13 errors from 3 contexts (suppressed: 0 from 0)
Aborted

comment:2 Changed 3 years ago by yawning

This is a easy one. There is one, and only one place where we call cpuworker.c:cpu_init(), which is in main.c:do_main_loop():

  if (server_mode(get_options())) {
    /* launch cpuworkers. Need to do this *after* we've read the onion key. */
    cpu_init();
  }

Your reproduction examples is starting tor as a client, so the initialization is never done.

comment:3 Changed 3 years ago by Sebastian

Yup, in the past cpu_init didn't do very much so this wasn't a problem, but now we have to think a second about whether we should just call it always, no matter if we're a relay or not, if maybe we should just call it frequently as it looks mostly idempotent with the possible exception of crypto_seed_weak_rng which I haven't looked at, or if we need to do a bigger refactoring. My intuition would be to just call this on startup independent of whether we're a relay or not.

comment:4 Changed 3 years ago by yawning

I'd be ok with just calling it regardless on startup.

The cpu_init() routine should also probably do more error checking, and kill the tor process if anything fails, because it's a sign that things are Really Broken (I was wrong, we don't always terminate after leaking condition variables), but that might be better suited to another bug.

comment:5 Changed 3 years ago by Sebastian

Keywords: regression added

ok, this isn't as easy as I thought. We will need some refactorings.

comment:6 Changed 3 years ago by Sebastian

Status: newneeds_review

branch bug14901 in my repo. I ended up going for a simpler approach, because initializing the cpuworker system for a client is a bit silly.

comment:7 Changed 3 years ago by nickm

Looks solid; merged. Thanks for the fix!

comment:8 Changed 3 years ago by nickm

Milestone: Tor: 0.2.6.x-final
Resolution: fixed
Status: needs_reviewclosed
Note: See TracTickets for help on using tickets.