Opened 2 years ago

Closed 2 years ago

Last modified 13 months ago

#15245 closed defect (fixed)

SETCONF Log results in segfault when not running a relay

Reported by: anonym Owned by:
Priority: High Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor Version: Tor: 0.2.6.4-rc
Severity: Keywords: tor-client, regression, 2016-bug-retrospective
Cc: anonym@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

To be precise, we get a segfault when SETCONF:ing Log so its value changes in any way (but only when no running as a relay). Here's a backtrace:

#0  crash_handler (sig=11, si=0xbfffeb4c, ctx_=0xbfffebcc)
    at ../src/common/backtrace.c:121
#1  <signal handler called>
#2  0xb7bb9e8d in pthread_mutex_lock ()
   from /lib/i386-linux-gnu/libpthread.so.0
#3  0xb7cadd76 in pthread_mutex_lock () from /lib/i386-linux-gnu/libc.so.6
#4  0x8014323c in tor_mutex_acquire (m=m`entry=0x50)
    at ../src/common/compat_pthreads.c:125
#5  0x80142175 in threadpool_queue_update (pool=0x0, 
    dup_fn=dup_fn`entry=0x800f1e50 <worker_state_new>, 
    fn=fn`entry=0x800f1de0 <update_state_threadfn>, 
    free_fn=free_fn`entry=0x800f1d80 <worker_state_free>, arg=arg`entry=0x0)
    at ../src/common/workqueue.c:329
#6  0x800f247f in cpuworkers_rotate_keyinfo () at ../src/or/cpuworker.c:181
#7  0x800c9431 in options_act (old_options=0x801ebf48)
    at ../src/or/config.c:1731
#8  set_options (new_val=new_val`entry=0x80a52ee8, msg=msg`entry=0xbffff174)
    at ../src/or/config.c:653
#9  0x800cb4e7 in options_trial_assign (list=0x80a503c0, 
    use_defaults=use_defaults`entry=0, clear_first=clear_first`entry=1, 
    msg=msg`entry=0xbffff174) at ../src/or/config.c:2066
#10 0x800e98f8 in control_setconf_helper (conn=conn`entry=0x80a51978, 
    len=len`entry=5, body=<optimized out>, body`entry=0x80a51ae0 "Log\r\n", 
    use_defaults=0) at ../src/or/control.c:739
#11 0x800ee215 in handle_control_resetconf (body=<optimized out>, 
    len=<optimized out>, conn=<optimized out>) at ../src/or/control.c:786
#12 connection_control_process_inbuf (conn=conn`entry=0x80a51978)
    at ../src/or/control.c:3444
#13 0x800cfb0c in connection_process_inbuf (conn=conn`entry=0x80a51978, 
    package_partial=package_partial`entry=1) at ../src/or/connection.c:4584
#14 0x800d5cfb in connection_handle_read_impl (conn=0x80a51978)
    at ../src/or/connection.c:3340
#15 connection_handle_read (conn=conn`entry=0x80a51978)
    at ../src/or/connection.c:3381
#16 0x8002d5e1 in conn_read_callback (fd=21, event=2, _conn=0x80a51978)
    at ../src/or/main.c:777
#17 0xb7f4f522 in event_base_loop ()
   from /usr/lib/i386-linux-gnu/libevent-2.0.so.5
#18 0x8002dfbf in do_main_loop () at ../src/or/main.c:2104
#19 0x80031955 in tor_main (argc=argc`entry=3, argv=argv`entry=0xbffff724)
    at ../src/or/main.c:3078
#20 0x8002a1e3 in main (argc=3, argv=0xbffff724) at ../src/or/tor_main.c:30

Note that in #5, pool is NULL, and then we have tor_mutex_acquire(&pool->lock); where &pool->lock becomes the pointer 0x50, that we later use as an argument in pthread_mutex_lock() which of course leads to this segfault. That pool is NULL is because when we call cpuworkers_rotate_keyinfo() earlier in the stack, and the global variable threadpool is NULL (from initialization) because cpu_init() was never called. I set a breakpoint to verify, but from reading the code around all calls of cpu_init() it's clear it will only be called when the Tor client is acting as a relay since it's always wrapped inside a if (server_mode(...)).

So, in options_act(), we'll end up call cpuworkers_rotate_keyinfo() because transition_affects_workers is set, which is done like this:

  const int transition_affects_workers =
    old_options && options_transition_affects_workers(old_options, options);

Changing the Log option is something that probably rightfully will cause options_transition_affects_workers to return true, resulting in this variable being set, and hence doing all sorts of cpuworker-related stuff even though they aren't used since we're not running a relay. At least in the context where we end up calling cpuworkers_rotate_keyinfo(), this seems wrong, but if it's also wrong in the other places when code is run because transition_affects_workers, then it feels like we should instead do:

  const int transition_affects_workers =
    old_options && server_mode(options) &&
    options_transition_affects_workers(old_options, options);

Or something similar. My point is that we need to be running a relay for the concept of cpuworkers (and options transitions affecting them) to be relevant. Otherwise a server_mode() check is needed somewhere around that cpuworkers_rotate_keyinfo() call, and possibly at other places.

Child Tickets

Change History (6)

comment:1 Changed 2 years ago by nickm

  • Keywords tor-client regression added
  • Milestone set to Tor: 0.2.6.x-final
  • Priority changed from normal to major

comment:2 Changed 2 years ago by nickm

  • Status changed from new to needs_review

I think bug15245_026 in my public repository should work okay here. It fixes the bug for me, and is relatively minimal.

comment:3 Changed 2 years ago by dgoulet

Neither the commit message nor change file explain why threadpool could be NULL in there.

Why are we getting into "updating the cpuworkers' task" without the subsystem being initialized? Should we try to avoid that instead like putting cpuworkers_rotate_keyinfo in the server_mode() if block before?

comment:4 Changed 2 years ago by dgoulet

Ack!

comment:5 Changed 2 years ago by nickm

  • Resolution set to fixed
  • Status changed from needs_review to closed

Merged to 0.2.6 and forward!

comment:6 Changed 13 months ago by nickm

  • Keywords 2016-bug-retrospective added

Mark more tickets for bug retrospective based on hand-review of changelogs from 0.2.5 onwards.

Note: See TracTickets for help on using tickets.