Opened 5 weeks ago

Closed 5 weeks ago

#22368 closed defect (fixed)

double-free of MyFamily lines

Reported by: arma Owned by:
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version: Tor: 0.3.1.1-alpha
Severity: Normal Keywords: regression memory-safety tor-relay
Cc: DeS Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Run a relay under valgrind with "myfamily moria1", and then ctrl-C it once it bootstraps. Upon exit, you'll get:

==17604== Invalid free() / delete / delete[] / realloc()
==17604==    at 0x4C29E90: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17604==    by 0x277E75: config_free_lines (confline.c:323)
==17604==    by 0x1F56F2: or_options_free (config.c:898)
==17604==    by 0x1F6583: config_free_all (config.c:907)
==17604==    by 0x157CCC: tor_free_all (main.c:3238)
==17604==    by 0x157DB0: tor_cleanup (main.c:3310)
==17604==    by 0x2614E5: hibernate_begin (hibernate.c:818)
==17604==    by 0x1584E9: process_signal (main.c:2686)
==17604==    by 0x1584E9: signal_callback (main.c:2663)
==17604==    by 0x5361A14: event_base_loop (in /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5.1.9)
==17604==    by 0x156E23: run_main_loop_once (main.c:2594)
==17604==    by 0x156E23: run_main_loop_until_done (main.c:2648)
==17604==    by 0x156E23: do_main_loop (main.c:2561)
==17604==    by 0x15A664: tor_main (main.c:3745)
==17604==    by 0x152628: main (tor_main.c:34)
==17604==  Address 0x668f9a0 is 0 bytes inside an unallocated block of size 16 in arena "client"

User DeS originally found this bug on #22255, with this stack trace:

==33656== Invalid free() / delete / delete[] / realloc()
==33656==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==33656==    by 0x1A4378: routerinfo_free (routerlist.c:3172)
==33656==    by 0x199BF6: router_rebuild_descriptor (router.c:2449)
==33656==    by 0x199CD2: router_get_my_routerinfo (router.c:2013)
==33656==    by 0x1D183E: channel_tls_process_netinfo_cell (channeltls.c:1679)
==33656==    by 0x1D183E: channel_tls_handle_cell (channeltls.c:1133)
==33656==    by 0x2137A0: connection_or_process_cells_from_inbuf (connection_or.c:2085)
==33656==    by 0x20ABE4: connection_handle_read_impl (connection.c:3451)
==33656==    by 0x153CB0: conn_read_callback (main.c:736)
==33656==    by 0x5363F23: event_base_loop (in /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5.1.9)
==33656==    by 0x154DDC: run_main_loop_once (main.c:2594)
==33656==    by 0x154DDC: run_main_loop_until_done (main.c:2648)
==33656==    by 0x154DDC: do_main_loop (main.c:2561)
==33656==    by 0x158594: tor_main (main.c:3745)
==33656==    by 0x1507C8: main (tor_main.c:34)
==33656==  Address 0x6453720 is 0 bytes inside a block of size 42 free'd
==33656==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==33656==    by 0x1995BC: router_build_fresh_descriptor (router.c:2327)
==33656==    by 0x199BE2: router_rebuild_descriptor (router.c:2445)
==33656==    by 0x199CD2: router_get_my_routerinfo (router.c:2013)
==33656==    by 0x1D183E: channel_tls_process_netinfo_cell (channeltls.c:1679)
==33656==    by 0x1D183E: channel_tls_handle_cell (channeltls.c:1133)
==33656==    by 0x2137A0: connection_or_process_cells_from_inbuf (connection_or.c:2085)
==33656==    by 0x20ABE4: connection_handle_read_impl (connection.c:3451)
==33656==    by 0x153CB0: conn_read_callback (main.c:736)
==33656==    by 0x5363F23: event_base_loop (in /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5.1.9)
==33656==    by 0x154DDC: run_main_loop_once (main.c:2594)
==33656==    by 0x154DDC: run_main_loop_until_done (main.c:2648)
==33656==    by 0x154DDC: do_main_loop (main.c:2561)
==33656==    by 0x158594: tor_main (main.c:3745)
==33656==    by 0x1507C8: main (tor_main.c:34)
==33656== 
==33656== Invalid free() / delete / delete[] / realloc()
==33656==    at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==33656==    by 0x1995BC: router_build_fresh_descriptor (router.c:2327)
==33656==    by 0x199BE2: router_rebuild_descriptor (router.c:2445)
==33656==    by 0x199CD2: router_get_my_routerinfo (router.c:2013)
==33656==    by 0x19A358: router_my_exit_policy_is_reject_star (router.c:1963)
==33656==    by 0x247025: dns_resolve_impl.constprop.9 (dns.c:720)
==33656==    by 0x249A68: dns_resolve (dns.c:614)
==33656==    by 0x2101BA: connection_exit_begin_conn (connection_edge.c:3292)
==33656==    by 0x17B4A0: connection_edge_process_relay_cell (relay.c:1648)
==33656==    by 0x17CCD8: circuit_receive_relay_cell (relay.c:328)
==33656==    by 0x1EF725: command_process_relay_cell (command.c:542)
==33656==    by 0x1EF725: command_process_cell (command.c:196)
==33656==    by 0x1D19A2: channel_tls_handle_cell (channeltls.c:1152)
==33656==  Address 0x6452e10 is 80 bytes inside a block of size 128 alloc'd
==33656==    at 0x4C2CE8E: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==33656==    by 0x5858E68: CRYPTO_realloc (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==33656==    by 0x58DF3B9: sk_dup (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==33656==    by 0x55D900D: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x55D13B3: SSL_set_cipher_list (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x29407E: tor_tls_session_secret_cb (tortls.c:1599)
==33656==    by 0x55AD7D5: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x55B1DAC: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x55BF863: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x2973A2: tor_tls_handshake (tortls.c:1901)
==33656==    by 0x216D7F: connection_tls_continue_handshake (connection_or.c:1420)
==33656==    by 0x217137: connection_tls_start_handshake (connection_or.c:1372)
==33656== 
==33656== Invalid read of size 8
==33656==    at 0x58E41E1: EVP_MD_CTX_cleanup (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==33656==    by 0x58E463D: EVP_MD_CTX_destroy (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)
==33656==    by 0x55BA0D0: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x55B789B: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x55D44DA: SSL_free (in /lib/x86_64-linux-gnu/libssl.so.1.0.0)
==33656==    by 0x295BD5: tor_tls_free (tortls.c:1794)
==33656==    by 0x204EA7: connection_free_ (connection.c:572)
==33656==    by 0x1536BD: conn_close_if_marked (main.c:908)
==33656==    by 0x1536BD: close_closeable_connections (main.c:700)
==33656==    by 0x153FE0: run_scheduled_events (main.c:1474)
==33656==    by 0x153FE0: second_elapsed_callback (main.c:2175)
==33656==    by 0x5363F23: event_base_loop (in /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5.1.9)
==33656==    by 0x154DDC: run_main_loop_once (main.c:2594)
==33656==    by 0x154DDC: run_main_loop_until_done (main.c:2648)
==33656==    by 0x154DDC: do_main_loop (main.c:2561)
==33656==    by 0x158594: tor_main (main.c:3745)
==33656==  Address 0x699aeaf2 is not stack'd, malloc'd or (recently) free'd

Once we've resolved this ticket, we should take a closer look at that last "Invalid read of size 8" stanza, and open a new ticket for it if needed.

Child Tickets

Change History (10)

comment:1 Changed 5 weeks ago by nickm

  • Keywords regression memory-safety tor-relay added

comment:2 Changed 5 weeks ago by arma

  • Keywords regression memory-safety tor-relay removed

Here's the patch I'm testing:

diff --git a/src/or/router.c b/src/or/router.c
index 642f415..f2741b7 100644
--- a/src/or/router.c
+++ b/src/or/router.c
@@ -2289,7 +2289,7 @@ router_build_fresh_descriptor(routerinfo_t **r, extrainfo_
        char *name = family->value;
        const node_t *member;
        if (!strcasecmp(name, options->Nickname))
-         goto skip; /* Don't list ourself, that's redundant */
+         continue; /* Don't list ourself, that's redundant */
        else
          member = node_get_by_nickname(name, 1);
        if (!member) {
@@ -2308,7 +2308,7 @@ router_build_fresh_descriptor(routerinfo_t **r, extrainfo_
            smartlist_add_strdup(warned_nonexistent_family, name);
          }
          if (is_legal) {
-           smartlist_add(ri->declared_family, name);
+           smartlist_add_strdup(ri->declared_family, name);
            name = NULL;
          }
        } else if (router_digest_is_me(member->identity)) {
@@ -2323,8 +2323,6 @@ router_build_fresh_descriptor(routerinfo_t **r, extrainfo_
          if (smartlist_contains_string(warned_nonexistent_family, name))
            smartlist_string_remove(warned_nonexistent_family, name);
        }
-    skip:
-       tor_free(name);
     }
 
     /* remove duplicates from the list */

comment:3 Changed 5 weeks ago by arma

  • Keywords regression memory-safety tor-relay added

(trac fail)

comment:4 Changed 5 weeks ago by arma

See #4998 for where the bug crept in.

I think that means bugfix on 0.3.1.1-alpha. Does that mean it's still a regression? What does regression mean anyway? :)

comment:5 Changed 5 weeks ago by arma

  • Status changed from new to needs_review

My bug22368 branch fixes it.

I reproduced the bug on my freeBogatov relay, then applied this branch and I couldn't reproduce it anymore.

comment:6 in reply to: ↑ description Changed 5 weeks ago by arma

Replying to arma:

==33656== Invalid read of size 8
==33656== at 0x58E41E1: EVP_MD_CTX_cleanup (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0)

[...]

Once we've resolved this ticket, we should take a closer look at that last "Invalid read of size 8" stanza, and open a new ticket for it if needed.

I think this weird stack trace is a side effect of the bug. I got another trace, when I was triggering the bug on my relay, that looked like this:

==35332== Invalid read of size 1
==35332==    at 0x198624: is_legal_nickname_or_hexdigest (router.c:3419)
==35332==    by 0x19CAFA: router_build_fresh_descriptor (router.c:2296)
==35332==    by 0x19CD47: router_rebuild_descriptor (router.c:2445)
==35332==    by 0x19CF1E: router_get_my_routerinfo (router.c:2013)
==35332==    by 0x19DEF4: check_descriptor_ipaddress_changed (router.c:2589)
==35332==    by 0x1528E1: check_descriptor_callback (main.c:1878)
==35332==    by 0x171BFF: periodic_event_dispatch (periodic.c:52)
==35332==    by 0x2EA12F: event_process_active_single_queue (in /home/tord/git/src/or/tor)
==35332==    by 0x2EA6FF: event_process_active (in /home/tord/git/src/or/tor)
==35332==    by 0x2EAE63: event_base_loop (in /home/tord/git/src/or/tor)
==35332==    by 0x154C81: do_main_loop (main.c:2594)
==35332==    by 0x155DA4: tor_main (main.c:3745)
==35332==  Address 0x72fd640 is 0 bytes inside a block of size 8 free'd
==35332==    at 0x4A06430: free (vg_replace_malloc.c:446)
==35332==    by 0x5594E1C: CRYPTO_free (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55D17C1: bn_expand2 (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55CE702: BN_div (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55D42C6: BN_nnmod (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55D728E: BN_mod_inverse (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55DE366: BN_MONT_CTX_set (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55DE5AF: BN_MONT_CTX_set_locked (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55EF6FA: ??? (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55F1879: int_rsa_verify (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55F1C38: RSA_verify (in /usr/lib64/libcrypto.so.1.0.1e)
==35332==    by 0x55F62DB: ??? (in /usr/lib64/libcrypto.so.1.0.1e)

So I think "once you start freeing memory and then using it again, just about anything could happen" is the right explanation for that other stack trace.

comment:7 follow-up: Changed 5 weeks ago by arma

Speaking of just about anything, it is distantly possible that relays who hit this bug will print little pieces of arbitrary memory, if they are valid nicknames or hexes, into the Family line of their descriptor. Good times.

comment:8 in reply to: ↑ 7 Changed 5 weeks ago by teor

Replying to arma:

Speaking of just about anything, it is distantly possible that relays who hit this bug will print little pieces of arbitrary memory, if they are valid nicknames or hexes, into the Family line of their descriptor. Good times.

Since the smallest valid nickname is 1 character, it discloses 1 byte with probability 62/256, 2 bytes with probability (62/256)2, ...

Unless it needs to be a valid continuation of a nickname list?

Last edited 5 weeks ago by teor (previous) (diff)

comment:9 Changed 5 weeks ago by arma

Here are two fun log lines from the relay when it triggered:

May 24 20:52:30.264 [warn] There is a router named """" in my declared family, but that isn't a legal nickname. Skipping it.
May 24 20:52:30.462 [warn] There is a router named ""\340\254U\013"" in my declared family, but that isn't a legal nickname. Skipping it.

comment:10 Changed 5 weeks ago by nickm

  • Resolution set to fixed
  • Status changed from needs_review to closed

This looks like a good and likely fix. Merged.

Note: See TracTickets for help on using tickets.