Opened 22 months ago

Last modified 5 weeks ago

#23711 assigned defect

sched: KIST writes to kernel and get a "wants to write" notification right after

Reported by: dgoulet Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-sched, 034-triage-20180328, 034-removed-20180328
Cc: Actual Points:
Parent ID: #23993 Points:
Reviewer: Sponsor:

Description

KIST scheduler does call a write to kernel contrary to the vanilla scheduler. This is done through channel_write_to_kernel() which calls connection_handle_write().

That last function will ultimately call connection_or_flushed_some() which triggers a scheduler_channel_wants_writes() because of this condition:

  datalen = connection_get_outbuf_len(TO_CONN(conn));
  if (datalen < OR_CONN_LOWWATER) {
    scheduler_channel_wants_writes(TLS_CHAN_TO_BASE(conn->chan));

That is OK if datalen > 0 but useless if datalen == 0. For KIST, it makes the channel go back in pending state and scheduled because it wants to write. But then if the outbuf or the cmux queue is empty, we end up scheduling a channel that actually does NOT need to write at all.

Could be the fix here is probably simple as:

  if (datalen > 0 && datalen < OR_CONN_LOWWATER) {

I suspect with KIST, the datalen will always be 0 because KIST in theory controls exactly what goes in the outbuf and what can be written to the kernel so when it triggers a connection write(), the entire outbuf should be drained (in theory). So the effect of this is that every write to the kernel from KIST triggers a useless "wants to write" event rescheduling the channel. Note that this only happens if the channel is in SCHED_CHAN_WAITING_TO_WRITE state.

Child Tickets

Change History (13)

comment:1 Changed 22 months ago by dgoulet

After some analysis, the Vanilla scheduler needs this call for the case where the connection outbuf was full and the circuit queue still has cells in it.

This way, once the outbuf is flushed onto the network, if we ever are under the low watermark (16k), the channel is rescheduled so to try to flush the cells into the outbuf which has room at that point.

This is a big house of cards because the channel *has* to be in SCHED_CHAN_WAITING_TO_WRITE if we ever want to drain the circuit queue because the scheduler_channel_has_waiting_cells() will never do anything unless the channel is in state SCHED_CHAN_WAITING_FOR_CELLS.

For KIST, this "wants to write" is useless outside of the scheduler because KIST actually schedules things and always put back a channel in state "waiting to write" in the channels pending list to be re-assess at the next tick and writes to kernel what it just flushed.

comment:2 Changed 22 months ago by dgoulet

Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final

This won't really get fixed until #23744 so postponing.

For now, I think it is fine, KIST recovers properly from that.

comment:3 Changed 21 months ago by dgoulet

Sponsor: SponsorV

Tagging all scheduler (and KIST) related tickets for SponsorV

comment:4 Changed 21 months ago by dgoulet

Owner: set to dgoulet
Status: newaccepted

Taking the scheduler related tickets.

comment:5 Changed 21 months ago by dgoulet

Parent ID: 23993

comment:6 Changed 21 months ago by dgoulet

Parent ID: 23993#23993

comment:7 Changed 18 months ago by dgoulet

Milestone: Tor: 0.3.3.x-finalTor: 0.3.4.x-final

Move 033 ticket I own to 034

comment:8 Changed 16 months ago by nickm

Keywords: 034-triage-20180328 added

comment:9 Changed 16 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:10 Changed 16 months ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: unspecified

These tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if time permits.

comment:11 Changed 7 weeks ago by gaba

Removing sponsor V as we do not have more time to include this tickets in the sponsor.

comment:12 Changed 7 weeks ago by gaba

Sponsor: SponsorV

Removing sponsor from tickets that we do not have time to fit in the remain of this sponsorship.

comment:13 Changed 5 weeks ago by gaba

Owner: dgoulet deleted
Status: acceptedassigned

dgoulet will assign himself to the ones he is working on right now.

Note: See TracTickets for help on using tickets.