Opened 4 months ago

Last modified 6 weeks ago

#29494 needs_revision enhancement

Optimize interaction between circuitmux and circuitpadding

Reported by: mikeperry
Owned by: mikeperry
Priority: High
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: wtf-pad, network-team-roadmap-2019-Q1Q2, 042-proposed
Cc: mikeperry
Actual Points:
Parent ID:
Points: 5
Reviewer:
Sponsor: Sponsor2

Description

In #29204, we realized that we needed to inspect the circuit queue to prevent the circuit padding system from queuing too much padding. For that ticket, we're going to do the simple fix -- we don't schedule padding if the circuit's cell queue already has a window full of cells on it. In that case, we just lie to circpad and tell the machine we sent padding even though we did not.
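A minimal sketch of that simple fix, assuming Tor's circuit_t keeps its outbound queue length in n_chan_cells.n and taking CIRCWINDOW_START as the "window" of cells (the helper name is hypothetical, and this shows the relay-side case only):

  /* Hedged sketch, not the actual #29204 patch: if a full window of
   * cells is already queued on this circuit, skip queueing real padding
   * and instead tell circpad that padding was "sent" so its state
   * machine keeps advancing. An origin circuit would also need the
   * right cpath layer passed to relay_send_command_from_edge(). */
  static void
  circpad_send_or_fake_padding(circuit_t *circ)
  {
    if (circ->n_chan_cells.n >= CIRCWINDOW_START) {
      /* Queue is saturated: lie to circpad rather than pile on. */
      circpad_cell_event_padding_sent(circ);
      return;
    }
    /* Otherwise, actually queue a padding (RELAY_COMMAND_DROP) cell. */
    relay_send_command_from_edge(0, circ, RELAY_COMMAND_DROP,
                                 NULL, 0, NULL);
  }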

However, we should determine whether we want a more complicated interaction, such as delaying our padding until the queue drains. We should also consider adding separate machine events for when cells are dequeued from circuitmux versus when they are sent locally; this has pluses and minuses for machine interaction.

Child Tickets

Change History (9)

comment:1 Changed 4 months ago by mikeperry

Ok, so for simplicity, lying to circpad seems like the right move. If the queue is full, it's going to be wrong anyway.

However, the second piece of this ticket is about when to tell circpad that a padding or non-padding cell was sent. Right now, we tell circpad that a padding cell was sent as soon as the callback to send it fires (via circpad_cell_event_padding_sent() from circpad_send_padding_cell_for_callback()). With respect to the wire, this is incorrect: the cell may still sit in a circuitmux queue for a long time before being sent, and that delay may cause the padding system to decide to send another padding cell before the one it just sent has even hit the wire.

Similarly, we call circpad_cell_event_nonpadding_sent() from circpad_deliver_sent_relay_cell_events() (via relay_send_command_from_edge_()) and from circpad_deliver_unrecognized_cell_events() (via circuit_receive_relay_cell()). These calls *also* happen before the cell is added to the circuitmux queue. This means that circuitpadding may schedule padding timers that fire before the actual data cell makes it through the queue.

In both cases, this will lead to more padding added to the circuit than we wanted. Worse, it means more padding added to *already busy, saturated, and/or stalled circuits*, which is...not a good look.

I think the right solution is to invoke these circpad_cell_event_*_sent() callbacks from channel_flush_from_first_active_circuit(), possibly right after the cell is popped from the queue. That is about as accurate as we can get: right after this, the cell goes into the outbuf and should be sent by KIST within ~10ms, give or take TCP congestion.
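A rough sketch of that placement (paraphrased, not the actual flush code; the is_padding test anticipates the packed_cell_t flag discussed in the next paragraph):

  /* Inside channel_flush_from_first_active_circuit(), sketch only:
   * fire the circpad "sent" events when the cell actually leaves the
   * circuitmux queue, not when it was generated. */
  packed_cell_t *cell = cell_queue_pop(queue);
  if (cell) {
    if (cell->is_padding)   /* proposed new bit, see below */
      circpad_cell_event_padding_sent(circ);
    else
      circpad_cell_event_nonpadding_sent(circ);
    /* ...then pack the cell onto the channel's outbuf as before... */
  }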

The difficulty with this is that circpad needs to know whether a given cell is padding. Since the cell has already been encrypted at that point, we can't just inspect the relay command field anymore; we need to add a bitfield to packed_cell_t. One bit should suffice, since all we need to record is is_padding or not. We have to set that bitfield in the relay_send_command_from_edge_() call, where the relay cell padding command is set; otherwise the flag should remain 0.
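Concretely, the struct change and flag-setting might look roughly like this (layout abbreviated from Tor's headers; treat it as a sketch rather than the final diff, and note the queued-copy variable name is made up):

  typedef struct packed_cell_t {
    TOR_SIMPLEQ_ENTRY(packed_cell_t) next;
    char body[CELL_MAX_NETWORK_SIZE];
    uint32_t inserted_timestamp;
    unsigned int is_padding : 1; /* new: 1 iff this packed cell carries
                                  * a RELAY_COMMAND_DROP padding cell */
  } packed_cell_t;

  /* In relay_send_command_from_edge_(), where the relay command is
   * still visible pre-encryption (hypothetical variable name): */
  queued_copy->is_padding = (relay_command == RELAY_COMMAND_DROP);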

comment:2 Changed 4 months ago by mikeperry

Draft branch: https://github.com/mikeperry-tor/tor/commits/bug29494. Tests fail due to event changes. I want to mull this over a little more, but figured I'd post it for reference.

comment:3 Changed 4 months ago by mikeperry

Thoughts since Friday:

  1. Because we decrement tokens early but don't check for empty bins until the padding event is sent, another padding or non-padding cell can decrement tokens from the same bin before the check runs, eventually driving that bin's token count negative. Right now this triggers a backtrace/fragile assert (see the sketch after this list).
  2. A further optimization is not to send padding if there is at least one cell already in the cell queue: the point of padding is to append cells to a burst, not to drop them into the middle of one that is sitting in the queue. However, we'll have to figure out how to update token counts properly in this case as well. And then there is the question of whether it ever makes sense to allow a string of padding cells to queue up. It actually might. But then how much? Perhaps this should be a third ticket.
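For item 1, one option is to clamp the per-bin decrement instead of asserting. A hedged sketch (type and field names approximate circuitpadding.c, and the guard itself is hypothetical):

  /* Sketch: refuse to decrement an already-empty bin, and log instead
   * of tripping the fragile assert/backtrace. */
  static void
  circpad_machine_remove_token_safe(circpad_machine_state_t *mi, int bin)
  {
    if (mi->histogram[bin] > 0) {
      mi->histogram[bin]--;
    } else {
      log_fn(LOG_PROTOCOL_WARN, LD_CIRC,
             "Token count for bin %d would go negative; clamping to 0.",
             bin);
    }
  }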

comment:4 Changed 4 months ago by mikeperry

Priority: Medium → High

comment:5 Changed 4 months ago by dgoulet

Ok, two comments on the commit. No show-stopper; overall the approach makes sense. I have no idea what the per-cell overhead here might be, so if you have a clue, it would be worth noting it here or adding a comment so we have an idea how much work this adds for each cell.

comment:6 Changed 4 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

comment:7 Changed 2 months ago by nickm

Cc: mikeperry added
Milestone: Tor: 0.4.1.x-final
Status: assigned → needs_revision

Tentatively calling this 0.4.1, needs_revision? Mike, please adjust if it isn't.

comment:8 Changed 2 months ago by mikeperry

Parent ID: #28634

This is not strictly needed for #28634 or 041, but it will change all of our padding vs non-padding relative timing delays, which may affect classifier accuracy. For apples-to-apples comparisons of future attacks and defenses wrt #28634, we should get this in at the same time that it lands.

So yes, I agree. 041.

comment:9 Changed 6 weeks ago by mikeperry

Keywords: 042-proposed added
Milestone: Tor: 0.4.1.x-final → Tor: unspecified
Parent ID: #28634

The machines in #28634 have been updated so that they are not sensitive to timing on the relay side -- they just need to send padding cells back-to-back, which will happen just fine with the existing callback placement. This can switch to 0.4.2 now.

Additionally, I want to expose the different optimization strategy options to the padding machines, as certain types of machines would be able to send much less padding if they were given options like "only send padding when the circuitmux queue is empty" (many machines really only care about making bursts longer, and inserting padding before that point just adds latency and wastes bandwidth). Other machines may want different strategies along the lines of my earlier comments on this ticket.
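For illustration, those strategies could be exposed as a per-machine enum along these lines (entirely hypothetical; nothing like this exists in circuitpadding.h today):

  /* Hypothetical per-machine scheduling strategies. */
  typedef enum {
    /* Current behavior: send padding whenever the timer fires. */
    CIRCPAD_SCHED_ALWAYS = 0,
    /* Only send padding when the circuitmux queue is empty, so that
     * padding lengthens bursts instead of landing mid-burst. */
    CIRCPAD_SCHED_QUEUE_EMPTY = 1,
    /* Delay scheduled padding until the queue drains rather than
     * dropping or faking it. */
    CIRCPAD_SCHED_DELAY_UNTIL_DRAIN = 2,
  } circpad_sched_strategy_t;

  /* A machine spec would then pick one, e.g.:
   *   machine->sched_strategy = CIRCPAD_SCHED_QUEUE_EMPTY; */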

So let's keep mulling this over for a bit more and see what researchers need.
