Opened 3 weeks ago

Last modified 34 minutes ago

#23712 needs_review defect

sched: DESTROY cell on a circuit bypasses the scheduler

Reported by: dgoulet Owned by: dgoulet
Priority: Medium Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-sched, review-group-24
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:


If you look at circuitmux_append_destroy_cell(), it is the one appending a DESTROY cell to the cmux queue and then calls channel_flush_from_first_active_circuit() if no writes are pending that is if the outbuf is empty (also looks at the out queue but that is always empty #23709).

In the case the flush is triggered, the cell is immediately put in the outbuf and written to kernel by libevent which completely bypasses the scheduler. Maybe it is what we want that is go as fast as we can in destroying a circuit? Don't know but it has this effect on the scheduler where the channel is scheduled with a "wants_to_write" event from the connection subsystem and ultimately the channel gets scheduled with nothing in the queue because it is already on the outbuf. For KIST, this is not ideal because KIST should control the flow of data to the kernel.

It seems there are two places we queue cells into a cmux queue: circuitmux_append_destroy_cell() and append_cell_to_circuit_queue(). The latter triggers a "has waiting cells" for the scheduler which is what we want but the former just bypasses it.

I think it should simply trigger that notify to the scheduler instead of flushing it by itself.

Child Tickets

Change History (5)

comment:1 Changed 3 weeks ago by dgoulet

Status: newneeds_review

See branch: bug23712_032_01.

Ok this might look like a very simple fix but it is really not.

For KIST, this is what we _need_ to do else this process of appending a DESTROY cell bypasses the scheduler and can trigger KIST to try to flush an empty channel which could be one of the source of #23676. The following lists the possible channel state it can be when the destroy cell is appended and the scheduler notified:

  1. if in WAITING_FOR_CELLS, all good, we will get scheduled
  2. KIST can't let unscheduled channel in WAITING_TO_WRITE so we can't be calling "waiting for cells" on a "waiting to write" channel.
  3. KIST doesn't put channel in IDLE state
  4. PENDING state, can't escape the scheduler, like (ii).

For the Vanilla kernel:

  1. if in WAITING_FOR_CELLS, all good, we will get scheduled
  2. if in WAITING_TO_WRITE, it means we have data in the queue but can't put them in the outbuf because it is full. The flushed_some() function will call the "wants to write" event when the outbuf is flushed to the network and thus the DESTROY cell will ultimately get drained.
  3. PENDING state, can't escape the scheduler.
  4. if in the IDLE state, it means we have no more cell in the queue and the outbuf is full, the channel is put in WAITING_TO_WRITE state so (ii) applies to this case that is the "wants to write" event will trigger a flush of the destroy cell at some point when the outbuf is drained.

comment:2 Changed 3 weeks ago by dgoulet

One quick note here. When a DESTROY cell is sent, it is from circuit_about_to_free() which means that the "circuit_t" will get freed momentarily. However, we put a cell on the circuit queue for which we should be updating our cmux cell count (circuitmux_set_num_cells()) but that function takes a circuit object that we can't use... The circ ID is available but we can't do much with it with the current interface.

That is "OK" because for the scheduler to know if more needs to be flushed, it uses circuitmux_num_cells() which counts the cell + destroy cell.

comment:3 Changed 13 days ago by nickm

Owner: set to dgoulet
Status: needs_reviewassigned

setting owner

comment:4 Changed 13 days ago by nickm

Status: assignedneeds_review

comment:5 Changed 34 minutes ago by nickm

Keywords: review-group-24 added

review-group-24 is now open.

Note: See TracTickets for help on using tickets.