assign_onionskin_to_cpuworker is too expensive
The function "assign_onionskin_to_cpuworker()" accounts for 20% of CPU usage on Moritz's profile. If that's so, the likeliest culprit is one of the functions getting inlined there: probably cull_wedged_cpuworkers, which does a linear walk over all the connections.
(I'm guessing that connection_get_by_type_state, which also does a linear walk over all the connections, doesn't show up in the profile because, on a busy server, the average cpuworker is always busy, so assign_onionskin_to_cpuworker mostly gets called from cpuworker.c to assign an onionskin to a cpuworker that just became idle.)
The easiest fix for this would be to call cull_wedged_cpuworkers only every N seconds, or every N invocations of assign_onionskin_to_cpuworker(). This is easy enough that I'm marking this one for 0.2.2.
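A rough sketch of the "every N seconds" variant, to make the idea concrete. This is not the actual patch: CULL_INTERVAL_SECONDS, maybe_cull_wedged_cpuworkers() and the stub below are hypothetical names made up for illustration, and the 60-second value is an arbitrary assumption. The stub only counts calls; in the real code it would be the existing cull_wedged_cpuworkers().

```c
#include <time.h>

/* Hypothetical interval: the ticket only says "every N seconds". */
#define CULL_INTERVAL_SECONDS 60

/* Counts how often the (stubbed) cull actually runs. */
static int n_cull_calls = 0;

/* Stand-in for the real cull_wedged_cpuworkers(), which walks every
 * connection. Here it only bumps a counter. */
static void
cull_wedged_cpuworkers_stub(void)
{
  ++n_cull_calls;
}

/* Run the cull at most once per CULL_INTERVAL_SECONDS.
 * Returns 1 if the cull ran, 0 if it was skipped.
 * Assumption: 'now' does not jump backwards between calls. */
static int
maybe_cull_wedged_cpuworkers(time_t now)
{
  static time_t last_culled = 0; /* 0 means "never culled yet" */

  if (last_culled != 0 && now < last_culled + CULL_INTERVAL_SECONDS)
    return 0; /* culled recently; skip the linear walk */

  last_culled = now;
  cull_wedged_cpuworkers_stub();
  return 1;
}
```

assign_onionskin_to_cpuworker() would then call maybe_cull_wedged_cpuworkers() with the current time instead of culling unconditionally, so the linear walk happens at most once per interval no matter how many onionskins arrive.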
If connection_get_by_type_state shows up in profiles, we can look into another data structure on connections.