Opened 4 years ago

Closed 4 years ago

#16695 closed defect (implemented)

Decouple generating controller events from sending them to controllers

Reported by: nickm Owned by:
Priority: Medium Milestone: Tor: 0.2.7.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: SponsorS TorCoreTeam201508 blob
Cc: Actual Points:
Parent ID: #16764 Points:
Reviewer: Sponsor:

Description

In my analysis of Tor's callgraph , I found that a lot of our callgraph's complexity comes from the fact that all the code that generates a controller event can try to call into the network stack. And since controller events can come from nearly anywhere in the code, that's quite a problem for modularity.

So, let's try to make the "blob" smaller by decoupling the logic here.

Child Tickets

Change History (10)

comment:1 Changed 4 years ago by nickm

Status: newneeds_review

Please review my branch decouple_controller_events.

comment:2 Changed 4 years ago by nickm

(I've confirmed that this change reduces the size of the "blob" by 125 functions.)

comment:3 Changed 4 years ago by nickm

Parent ID: #16764

comment:4 Changed 4 years ago by teor

Status: needs_reviewneeds_revision

Multithreading

If this code could be accessed by multiple threads simultaneously, two threads could both proceed past if (block_event_queue), then both increment using ++block_event_queue.

static int block_event_queue = 0;
if (block_event_queue) {
  tor_free(msg);
  return;
}

/* No queueing an event while queueing an event */
++block_event_queue;

I don't think that can happen if these statements are swapped around like this, but this code does allow for two simultaneous events to mutually block each other (and therefore both would be skipped).

static int block_event_queue = -1;
/* No queueing an event while queueing an event */
++block_event_queue;

if (block_event_queue) {
  tor_free(msg);
  goto done;
}

I think we need atomics or locks to write code that avoids both these issues.

Similarly, what happens if we are in the middle of queueing an event, then get a call to flush events?
I can imagine that, depending on the exact order of connection_write_to_buf in queued_events_flush_all, and smartlist_add in queue_control_event_string, we might miss some events.
Also, does SMARTLIST_FOREACH_* cope with its list size changing (from another thread) while iterating/deleting?

Error Handling

If struct event_base *b = tor_libevent_get_base(); returns NULL in queue_control_event_string, we will continue to add events to the queue, and never clear them.
Should we clear the queue if we can't register a callback?
Do we need to check the return values of tor_event_new and event_active?
(That said, if Libevent is failing, we probably have bigger issues.)

Can we keep tor_assert(event >= EVENT_MIN_ && event <= EVENT_MAX_); in send_control_event_string?

Typos

queue_control_event_string:
"attempt to avoid queueing something we shouldn't have tot queue"

Edited: deleted a few words in a last-minute edit, consistent formatting

Last edited 4 years ago by teor (previous) (diff)

comment:5 Changed 4 years ago by nickm

Re multithreading:

I'm pretty sure this stuff isn't reachable from other threads, but it would be nice to clean it up anyway in case I'm wrong.

Re error handling:

It can't hurt to assert that the return value is non-null, though we could never reach that point in the Tor code if the base were absent.

We should check the return value of tor_event_new, but event_active() shouldn't be able to fail.

I like the assertion idea too.

comment:6 Changed 4 years ago by nickm

Just added a bunch more stuff to this branch. Please review? :)

comment:7 Changed 4 years ago by nickm

Status: needs_revisionneeds_review

comment:8 Changed 4 years ago by yawning

Do we support tor_threadlocaL_set(foo, NULL)? Pthreads does not have a way to disambiguate no key being set vs NULL being set. MSDN says you must check GetLastError if TlsGetValue returns NULL.

"Don't do that, because it's silly" is a valid answer here, but this should be documented, and may be assertion worthy.

comment:9 Changed 4 years ago by isis

Not sure if this is correct in queue_control_event_string():

  int activate_event = 1;
  if (! flush_queued_event_pending && in_main_thread()) { 
    activate_event = 1;
    flush_queued_event_pending = 1;
  }

Since activate_event will always be true. Is this what we want? (I'm actually not entirely sure if you meant to always schedule the event because we just got it, but it seems maybe wrong since this also would ignore the in_main_thread() check.)

Also, full disclosure: I skipped over reviewing the windows parts of these commits.

The rest looks okay to me.

comment:10 Changed 4 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

Okay, I think I've taken care of these. Squashing and merging. Thanks!

Note: See TracTickets for help on using tickets.