pt_prepare_proxy_list_for_config_read: Assertion unconfigured_proxies_n == 0 failed

changed milestone to %Tor: 0.2.3.x-final

added component::core tor/tor milestone::Tor: 0.2.3.x-final priority::very high resolution::fixed status::closed tor-bridge type::defect labels

Trac:
Summary: pt_prepare_proxy_list_for_config_read: Assertion +unconfigured_proxies_n == 0 failed to pt_prepare_proxy_list_for_config_read: Assertion unconfigured_proxies_n == 0 failed

chiiph said that step 3) click on settings in vidalia, then click ok. is equivalent to SAVECONF.

pt_prepare_proxy_list_for_config_read() is called every time we read the config (in bootstrap and in SIGHUP). I should see what's the connection between SAVECONF and config reading (that is options_act()). I can't reproduce this by simply SIGHUPing tor.

#0  0x00007f447415b475 in *__GI_raise (sig=<optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007f447415e6f0 in *__GI_abort () at abort.c:92
#2  0x0000000000417fcf in pt_prepare_proxy_list_for_config_read ()
    at transports.c:1221
#3  0x000000000047869a in options_act (old_options=0x23ffcd0) at config.c:1418
#4  0x0000000000479c00 in set_options (new_val=<optimized out>, 
    msg=<optimized out>) at config.c:743
#5  0x000000000047add4 in options_trial_assign (list=0x20b39d0, 
    use_defaults=0, clear_first=1, msg=0x7ffff13300e0) at config.c:2672
#6  0x0000000000497842 in control_setconf_helper (conn=<optimized out>, 
    len=<optimized out>, body=0x1c <Address 0x1c out of bounds>, 
    use_defaults=<optimized out>) at control.c:742
#7  0x0000000000498255 in handle_control_resetconf (body=<optimized out>, 
    len=<optimized out>, conn=<optimized out>) at control.c:789
#8  connection_control_process_inbuf (conn=0x1e59180) at control.c:3200
#9  0x0000000000482b49 in connection_handle_read_impl (conn=<optimized out>)
    at connection.c:2657
#10 connection_handle_read (conn=0x1e59180) at connection.c:2697
#11 0x000000000040bfd6 in conn_read_callback (fd=<optimized out>, event=8843, 
    _conn=0x6) at main.c:702
#12 0x00007f4474edc8fd in event_base_loop ()
   from /tmp/tor-browser_en-US/Lib/libevent-2.0.so.5
#13 0x000000000040a061 in do_main_loop () at main.c:1924
#14 0x000000000040a39d in tor_main (argc=<optimized out>, argv=0x7ffff1330558)
    at main.c:2619
#15 0x00007f4474147ead in __libc_start_main (main=<optimized out>, 
    argc=<optimized out>, ubp_av=<optimized out>, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffff1330548)
    at libc-start.c:228
#16 0x0000000000408889 in _start ()

(gdb) up
#1  0x00007f447415e6f0 in *__GI_abort () at abort.c:92
92      abort.c: No such file or directory.
        in abort.c
(gdb) up
#2  0x0000000000417fcf in pt_prepare_proxy_list_for_config_read ()
    at transports.c:1221
1221    transports.c: No such file or directory.
        in transports.c
(gdb) print managed_proxy_list
$1 = (smartlist_t *) 0x195e380
(gdb) print *managed_proxy_list
$2 = {list = 0x195eb60, num_used = 1, capacity = 16}
(gdb) print unconfigured_proxies_n
$4 = 1
(gdb) print *(managed_proxy_t *)*managed_proxy_list->list
$6 = {conf_state = PT_PROTO_COMPLETED, argv = 0x195ea50, conf_protocol = 1, 
  is_server = 0, process_handle = 0x1e57fc0, pid = 0, marked_for_removal = 1, 
  got_hup = 1, transports_to_launch = 0x195eb00, transports = 0x1e628c0}
(gdb) print *(*(managed_proxy_t *)*managed_proxy_list->list)->transports_to_launch
$8 = {list = 0x195eed0, num_used = 0, capacity = 16}

When I break on the function in gdb, it triggers a few times, and each time unconfigured_proxies_n is 0. And after those few times, everything goes smoothly. No assert.

I wonder if there's some race going on.

Progress!

I don't need TBB. I can run my Tor with an open control port, and then connect a Vidalia to it and it dies. Consistently. Now I can run my own Tor.

But if I break on pt_prepare_proxy_list_for_config_read(), it won't die. Fun.

(gdb) print (char )(*(managed_proxy_t *)*managed_proxy_list->list)->transports->list $4 = 0x234c010 "obfs2"

Vidalia is doing two setconfs in quick succession.

The first one causes pt_kickstart_proxy() to find that mp->got_hup and mp->marked_for_removal are true, so it increments unconfigured_proxies_n to 1. The second one finds unconfigured_proxies_n to still be 1, yet the mp in the list has conf_state PT_PROTO_COMPLETED so it doesn't call managed_proxy_destroy() on it, and then triggers the assert since it finished going through the list yet unconfigured_proxies_n is still 1.

Ok, I don't even need vidalia to trigger the assert.

Run your Tor, connect to the control port, authenticate, then paste "setconf socksport=9050" twice in quick succession.

Or just hup it twice in quick succession.

Trac:
Description: Tor dies with

Feb 11 08:34:46.668 [Error] pt_prepare_proxy_list_for_config_read(): Bug: transports.c:1221: pt_prepare_proxy_list_for_config_read: Assertion unconfigured_proxies_n == 0 failed; aborting.

Steps to reproduce:

grab your favorite obfsproxy tbb from the download section on https://www.torproject.org/projects/obfsproxy
unpack it and start it
click on settings in vidalia, then click ok.

to

Tor dies with

Feb 11 08:34:46.668 [Error] pt_prepare_proxy_list_for_config_read(): Bug: transports.c:1221: pt_prepare_proxy_list_for_config_read: Assertion unconfigured_proxies_n == 0 failed; aborting.

Steps to reproduce:

configure your tor to use a managed ClientTransportPlugin, e.g.

ClientTransportPlugin obfs2 exec /path/to/obfsproxy --managed

hup your Tor twice in quick succession

OK, as you have noticed, the SIGHUP code of src/or/transports.c is not elegant. If you can find a better logic for it, I'd be very glad. The current logic is the best I could think of when I was writing it.

This bug has to do with the way we count unconfigured proxies between config reads.

In the first SIGHUP, the managed proxy is already launched and configured. During the SIGHUP, we re-parse the config, find the proxy line and give it to pt_kickstart_proxy(). In pt_kickstart_proxy()`, we go into the

} else { /* known proxy. add its transport to its transport list */

branch, since we've already parsed this managed proxy when we launched tor. The proxy was there from before the SIGHUP, so it's already marked with got_hup and marked_for_removal. So we go into the next two if blocks and end up incrementing unconfigured_proxies_n. We need to do that, so that in the next run_scheduled_events() tick, we execute pt_configure_remaining_proxies() and examine if the proxy needs to be restarted or not. In this case, the proxy wouldn't need to be restarted, proxy_needs_restart() would have returned False, and unconfigured_proxies_n would have been reduced back to 0. Meanwhile,, the proxy has been in the PT_PROTO_COMPLETED state, since it was spawned and configured back when we launched tor.

So in this bug, we never do another tick of run_scheduled_events(), and instead we do a second SIGHUP. This means that we re-read the config, and call pt_prepare_proxy_list_for_config_read(). In this code, unconfigured_proxies_n == 1 from the previous SIGHUP, and since the proxy is in PT_PROTO_COMPLETED we don't decrement unconfigured_proxies_n. Instead, we reach the bottom of the function and hit the assert with 1 != 0.

Notice how unconfigured_proxies_n is only actually used in pt_proxies_configuration_pending(), to see whether we should configure managed proxies in run_scheduled_events(). It's also used in some logs, and to assert that the code does what I expect it to do.

My original stupid fix idea, was to remove unconfigured_proxies_n--; from pt_prepare_proxy_list_for_config_read() and simply set unconfigured_proxies_n to 0 in the end of the function, since we are planning on re-parsing the config file anyway. That might have worked.

Maybe a cleaner solution, would be to remove the unconfigured_proxies_n logic completely, and instead use a boolean, named need_to_configure_pt or something, for pt_proxies_configuration_pending()`. This might be cleaner, and still not too radical.

This one could use some help from other people too. As it is, connecting Vidalia to a Tor that uses pluggable transports will kill the Tor.

Trac:
Priority: major to critical

My public branch "bug5084" works for armadev. It should get squashed and merged.

Trac:
Status: new to needs_review

Merging; thanks for the help here, everybody!

Trac:
Status: needs_review to closed
Resolution: N/A to fixed

Trac:
Keywords: N/A deleted, tor-bridge added

Trac:
Component: Tor Bridge to Tor

closed

moved to tpo/core/tor#5084 (closed)

pt_prepare_proxy_list_for_config_read: Assertion unconfigured_proxies_n == 0 failed

Child items ...

Activity