chiiph said that step 3) click on settings in vidalia, then click ok. is equivalent to SAVECONF.
pt_prepare_proxy_list_for_config_read() is called every time we read the config (in bootstrap and in SIGHUP). I should see what's the connection between SAVECONF and config reading (that is options_act()). I can't reproduce this by simply SIGHUPing tor.
#0 0x00007f447415b475 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64#1 0x00007f447415e6f0 in *__GI_abort () at abort.c:92#2 0x0000000000417fcf in pt_prepare_proxy_list_for_config_read () at transports.c:1221#3 0x000000000047869a in options_act (old_options=0x23ffcd0) at config.c:1418#4 0x0000000000479c00 in set_options (new_val=<optimized out>, msg=<optimized out>) at config.c:743#5 0x000000000047add4 in options_trial_assign (list=0x20b39d0, use_defaults=0, clear_first=1, msg=0x7ffff13300e0) at config.c:2672#6 0x0000000000497842 in control_setconf_helper (conn=<optimized out>, len=<optimized out>, body=0x1c <Address 0x1c out of bounds>, use_defaults=<optimized out>) at control.c:742#7 0x0000000000498255 in handle_control_resetconf (body=<optimized out>, len=<optimized out>, conn=<optimized out>) at control.c:789#8 connection_control_process_inbuf (conn=0x1e59180) at control.c:3200#9 0x0000000000482b49 in connection_handle_read_impl (conn=<optimized out>) at connection.c:2657#10 connection_handle_read (conn=0x1e59180) at connection.c:2697#11 0x000000000040bfd6 in conn_read_callback (fd=<optimized out>, event=8843, _conn=0x6) at main.c:702#12 0x00007f4474edc8fd in event_base_loop () from /tmp/tor-browser_en-US/Lib/libevent-2.0.so.5#13 0x000000000040a061 in do_main_loop () at main.c:1924#14 0x000000000040a39d in tor_main (argc=<optimized out>, argv=0x7ffff1330558) at main.c:2619#15 0x00007f4474147ead in __libc_start_main (main=<optimized out>, argc=<optimized out>, ubp_av=<optimized out>, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffff1330548) at libc-start.c:228#16 0x0000000000408889 in _start ()
(gdb) up#1 0x00007f447415e6f0 in *__GI_abort () at abort.c:9292 abort.c: No such file or directory. in abort.c(gdb) up#2 0x0000000000417fcf in pt_prepare_proxy_list_for_config_read () at transports.c:12211221 transports.c: No such file or directory. in transports.c(gdb) print managed_proxy_list$1 = (smartlist_t *) 0x195e380(gdb) print *managed_proxy_list$2 = {list = 0x195eb60, num_used = 1, capacity = 16}(gdb) print unconfigured_proxies_n$4 = 1(gdb) print *(managed_proxy_t *)*managed_proxy_list->list$6 = {conf_state = PT_PROTO_COMPLETED, argv = 0x195ea50, conf_protocol = 1, is_server = 0, process_handle = 0x1e57fc0, pid = 0, marked_for_removal = 1, got_hup = 1, transports_to_launch = 0x195eb00, transports = 0x1e628c0}(gdb) print *(*(managed_proxy_t *)*managed_proxy_list->list)->transports_to_launch$8 = {list = 0x195eed0, num_used = 0, capacity = 16}
When I break on the function in gdb, it triggers a few times, and each time unconfigured_proxies_n is 0. And after those few times, everything goes smoothly. No assert.
Vidalia is doing two setconfs in quick succession.
The first one causes pt_kickstart_proxy() to find that mp->got_hup and mp->marked_for_removal are true, so it increments unconfigured_proxies_n to 1. The second one finds unconfigured_proxies_n to still be 1, yet the mp in the list has conf_state PT_PROTO_COMPLETED so it doesn't call managed_proxy_destroy() on it, and then triggers the assert since it finished going through the list yet unconfigured_proxies_n is still 1.
OK, as you have noticed, the SIGHUP code of src/or/transports.c is not elegant. If you can find a better logic for it, I'd be very glad. The current logic is the best I could think of when I was writing it.
This bug has to do with the way we count unconfigured proxies between config reads.
In the first SIGHUP, the managed proxy is already launched and configured. During the SIGHUP, we re-parse the config, find the proxy line and give it to pt_kickstart_proxy(). In pt_kickstart_proxy()`, we go into the
} else { /* known proxy. add its transport to its transport list */
branch, since we've already parsed this managed proxy when we launched tor. The proxy was there from before the SIGHUP, so it's already marked with got_hup and marked_for_removal. So we go into the next two if blocks and end up incrementing unconfigured_proxies_n. We need to do that, so that in the next run_scheduled_events() tick, we execute pt_configure_remaining_proxies() and examine if the proxy needs to be restarted or not. In this case, the proxy wouldn't need to be restarted, proxy_needs_restart() would have returned False, and unconfigured_proxies_n would have been reduced back to 0. Meanwhile,, the proxy has been in the PT_PROTO_COMPLETED state, since it was spawned and configured back when we launched tor.
So in this bug, we never do another tick of run_scheduled_events(), and instead we do a second SIGHUP. This means that we re-read the config, and call pt_prepare_proxy_list_for_config_read(). In this code, unconfigured_proxies_n == 1 from the previous SIGHUP, and since the proxy is in PT_PROTO_COMPLETED we don't decrement unconfigured_proxies_n. Instead, we reach the bottom of the function and hit the assert with 1 != 0.
Notice how unconfigured_proxies_n is only actually used in pt_proxies_configuration_pending(), to see whether we should configure managed proxies in run_scheduled_events(). It's also used in some logs, and to assert that the code does what I expect it to do.
My original stupid fix idea, was to remove unconfigured_proxies_n--; from pt_prepare_proxy_list_for_config_read() and simply set unconfigured_proxies_n to 0 in the end of the function, since we are planning on re-parsing the config file anyway. That might have worked.
Maybe a cleaner solution, would be to remove the unconfigured_proxies_n logic completely, and instead use a boolean, named need_to_configure_pt or something, for pt_proxies_configuration_pending()`. This might be cleaner, and still not too radical.