In #29204 (moved), we realized that we needed to inspect the circuit queue to prevent the circuit padding system from queuing too much padding. For that ticket, we're going to do the simple fix -- we don't schedule padding if the circuit's cell queue already has a window full of cells on it. In that case, we just lie to circpad and tell the machine we sent padding even though we did not.
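For reference, a minimal sketch of what that simple check could look like, assuming it lives in the padding send path and uses the circuit's n_chan_cells queue with CIRCWINDOW_START as the "window full of cells" threshold (the helper name is hypothetical, and the final patch may check something different):

```c
/* Hypothetical helper: returns 1 if we should lie to circpad instead of
 * queueing real padding.  circpad_cell_event_padding_sent() is the real
 * event call discussed below; the queue field and threshold are my
 * assumptions. */
static int
circpad_queue_full_so_lie(circuit_t *circ)
{
  if (circ->n_chan_cells.n >= CIRCWINDOW_START) {
    /* The queue already holds a window's worth of cells: skip the real
     * padding cell, but tell the machine we sent it so its state still
     * advances. */
    circpad_cell_event_padding_sent(circ);
    return 1;
  }
  return 0;
}
```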
However, we should determine if we want to have a more complicated interaction, such as delaying our padding until the queue drains. We should also examine creating alternate machine events for when cells are dequeued from circuitmux vs when they are sent locally. This has pluses and minuses for machine interaction.
Ok, so for simplicity, lying to circpad seems like the right move. If the queue is full, it's going to be wrong anyway.
However, the second piece of this ticket is about when to tell circpad that a padding cell or normal cell was sent. Right now, we tell circpad that a padding cell was sent as soon as the callback that sends it fires (via circpad_cell_event_padding_sent(), called from circpad_send_padding_cell_for_callback()). With respect to the wire, this is incorrect, because the cell may still wait around in a circuitmux queue for a long time before being sent. This delay may cause the padding system to decide to send another padding cell before the one it just sent even hits the wire.
Similarly, we call circpad_cell_event_nonpadding_sent() from circpad_deliver_sent_relay_cell_events() (via relay_send_command_from_edge_()) and from circpad_deliver_unrecognized_cell_events() (via circuit_receive_relay_cell()). These calls are also made before the cell is added to the circuitmux queue. This means that circuit padding may schedule padding timers for additional padding that fire before the actual data cell makes it through the queue.
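To make the ordering problem concrete, here is a rough sketch of the current flow for a non-padding cell (the call chain is paraphrased from the function names above, not copied from the source):

```c
/*
 * relay_send_command_from_edge_()
 *   -> circpad_deliver_sent_relay_cell_events()
 *        -> circpad_cell_event_nonpadding_sent()   <-- circpad is told "sent" here
 *   -> cell is packed and appended to the circuitmux queue
 *        ... may sit in the queue for a long time ...
 * channel_flush_from_first_active_circuit()
 *   -> cell is popped and written to the channel outbuf  <-- actually leaves here
 */
```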
In both cases, this will lead to more padding being added to the circuit than we wanted. Worse, it means more padding added to already busy, saturated, and/or stalled circuits, which is... not a good look.
I think the right solution is to call these circpad_cell_event_*_sent() callbacks from channel_flush_from_first_active_circuit(), possibly right after the cell is popped from the queue. This is about as accurate as we can get, because right after this the cell goes into the outbuf and should be sent by KIST within ~10ms, give or take TCP congestion.
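A sketch of what that could look like inside channel_flush_from_first_active_circuit(), right after the pop; the is_padding bit is the new packed_cell_t field proposed below, and the surrounding code is paraphrased rather than quoted from the flush loop:

```c
/* Inside the flush loop, after the cell is taken off the circuitmux
 * queue and before it is written to the channel outbuf (sketch). */
packed_cell_t *cell = cell_queue_pop(queue);
if (cell) {
  if (cell->is_padding) {
    /* The padding cell is really leaving the queue now. */
    circpad_cell_event_padding_sent(circ);
  } else {
    /* The non-padding cell is really leaving the queue now. */
    circpad_cell_event_nonpadding_sent(circ);
  }
  /* ... hand the packed cell to the channel as before ... */
}
```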
The difficulty with this is that circpad needs to know whether this cell is padding or not. Since the cell has already been encrypted, we can't just inspect the relay command field anymore. We need to add an additional bitfield to packed_cell_t; 1 bit should suffice, though, since all we need to know is whether it is padding. But we need to set that bitfield in the relay_send_command_from_edge_() call, where the relay cell padding command is set. Otherwise the bitfield flag should remain 0.
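A sketch of the struct change, assuming we just widen packed_cell_t by one bit. The field name is a guess, the existing members are written from memory and may not match the header exactly, and the plumbing that carries the flag from relay_send_command_from_edge_() down to where the packed cell is built is elided:

```c
/* Sketch only: packed_cell_t is the real struct that sits on circuitmux
 * queues; the is_padding member is the proposed addition. */
typedef struct packed_cell_t {
  TOR_SIMPLEQ_ENTRY(packed_cell_t) next;
  char body[CELL_MAX_NETWORK_SIZE];
  uint32_t inserted_timestamp;
  /** 1 iff this cell is circuit padding (RELAY_COMMAND_DROP).  Must be
   * set while the relay command is still visible, i.e. from the
   * relay_send_command_from_edge_() path, before encryption; 0 otherwise. */
  unsigned int is_padding : 1;
} packed_cell_t;
```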
Because we decrement tokens early but don't check for bins empty until the padding event is sent, we can have a situation where another padding or non-padding cell causes us to decrement tokens on the same bin before a check, eventually leading to a negative token count in that bin. Right now this will trigger a backtrace/fragile assert.
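One defensive option (purely a sketch; the function name is hypothetical, and this does not replace fixing the event ordering) is to clamp the bin instead of letting it underflow:

```c
/* Hypothetical guard around the per-bin token decrement. */
static void
circpad_bin_decrement_clamped(uint16_t *bin_tokens)
{
  if (*bin_tokens == 0) {
    /* Bin already drained by an earlier cell; re-run the bins-empty
     * check here instead of going negative and tripping the fragile
     * assert. */
    return;
  }
  --*bin_tokens;
}
```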
A further optimization is to not send padding if there is at least one cell in the cell queue, because the idea of padding is to append cells to a burst, not place them in the middle of one that is sitting in the queue. However, we'll have to figure out how to update token counts properly in this case as well. And then there is the question of whether it makes sense to allow a string of padding cells to get queued up. It actually might. But then how much? Perhaps this should be a third ticket.
Ok, two comments on the commit. No show-stoppers; overall the approach makes sense. I have no idea what the per-cell overhead here can be, so if you have a clue, maybe note it here or add a comment so we have an idea how much work this is adding for each cell.
This is not strictly needed for #28634 (moved) or 0.4.1, but it will change all of our padding vs non-padding relative timing delays, which may affect classifier accuracy. For apples-to-apples comparisons of future attacks+defenses wrt #28634 (moved), we should get this in at the same time as it lands.
The machines in #28634 (moved) have been updated so that they are not sensitive to timing on the relay side -- they just need to send padding cells back-to-back, which will happen just fine with the existing callback placement. This can switch to 0.4.2 now.
Additionally, I want to expose the different optimization strategy options to the padding machines, since certain types of machines would actually be able to send much less padding if they were given options like "only send padding when the circuitmux queue is empty" (many machines will really only care about making bursts longer, and inserting any padding before that point will just add latency and waste bandwidth). Other machines may want different strategies along the lines of my earlier comments on this ticket.
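As a strawman for what exposing that could look like (the enum, values, and field are hypothetical, not an existing circpad API):

```c
/* Hypothetical per-machine option describing how padding interacts with
 * the circuitmux queue. */
typedef enum {
  /** Schedule padding regardless of queue state (current behavior). */
  CIRCPAD_QUEUE_IGNORE = 0,
  /** Only send padding when the circuit's queue is empty, so padding can
   * extend a burst but never lands in the middle of one. */
  CIRCPAD_QUEUE_REQUIRE_EMPTY = 1,
  /** Delay padding (rather than dropping it) until the queue drains. */
  CIRCPAD_QUEUE_DELAY_UNTIL_DRAINED = 2,
} circpad_queue_strategy_t;

/* This could live as a small bitfield in circpad_machine_spec_t, e.g.:
 *   circpad_queue_strategy_t queue_strategy : 2;
 */
```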
So let's keep mulling this over for a bit more and see what researchers need.