Opened 6 years ago

Closed 4 years ago

#9262 closed enhancement (implemented)

Refactor cell scheduling to consider all connections at once

Reported by: nickm Owned by: andrea
Priority: High Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay circuitmux performance
Cc: robgjansen, nikita, bryan.ford@…, yawning Actual Points:
Parent ID: #12541 Points:
Reviewer: Sponsor:

Description (last modified by robgjansen)

Right now, our cell scheduling works by ensuring that every connection on which *any* circuit wants to write has at least one cell buffered for writing, and by putting more cells into the connection's write buffer every time a write makes it get low. The circuitmux logic is only used to choose which circuit on a given connection should have permission to write.

But this approach does a bad job in that it treats all connections equally: instead, it should look at all connections _which would like to write now_, and decide among them.

Reported by Rob Jansen.

Andrea is working on this one. As I understand it, the approach is to decouple the "Put a cell on this connection's output buffer" logic from the two things that trigger it: draining the output buffer a little in response to a libevent event, and making a circuit active on a connection which had no active circuits or buffered cells before. Instead, the main event loop should update a list of connections which would like to have cells put on them, and periodically (every X usec, or every time through the libevent loop, or something like that) queue cells onto the selected connections.

The details above will be a little tricksy. Andrea, could you explain a little more about your approach here? Alternatively I think I can dig up the conversation we had about this stuff, if you're okay with my posting it.

Child Tickets

Change History (19)

comment:1 Changed 6 years ago by nickm

Status: newassigned

comment:2 Changed 6 years ago by robgjansen

I'm happy to see this is being worked on! There's probably a simulation step hiding somewhere in here that we should consider when the time is right.

We already have a rough research prototype which with we've been experimenting. Let me know if that might be useful to you.

comment:3 Changed 6 years ago by robgjansen

Description: modified (diff)

comment:4 Changed 6 years ago by nickm

Status: assignedneeds_review

The first part of this is in andrea's repository, in branch cmux_refactor_pt_1

comment:5 Changed 6 years ago by nickm

I hacked around to make it compile and work under chutney, and to actually make cells get queued from the scheduler in my branch "cmux_refactor_pt_1_hacks". Some of the commits in that branch are good; some are necessary, and 0b3822dbe6367eb5eb is a total kludge.

Next, I need to figure out how to have some channels linger in the channels_pending state without having the scheduler busy-loop, and I need to get these hacks I did better reviewed and tested, and I need to stress-test the code even harder.

comment:6 Changed 5 years ago by andrea

Now see my cmux_refactor_squashed branch for this completed, mostly unit-tested and rebased against master.

comment:7 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-finalTor: 0.2.6.x-final

comment:8 Changed 5 years ago by robgjansen

Is there any documentation (other than the code) about the approach taken here?

comment:9 Changed 5 years ago by nikita

Cc: nikita added

comment:10 Changed 5 years ago by ford

Cc: bryan.ford@… added

comment:11 Changed 5 years ago by robgjansen

Parent ID: #12541

comment:12 Changed 5 years ago by arma

I rebased this onto 0.2.5.6-alpha, and pushed the result to my cmux-0256 branch.

I recommend doing a ./configure --enable-gcc-warnings and you'll find a bunch of complaints.

comment:13 Changed 5 years ago by robgjansen

First simulation results were just posted in #12889. Please take a look and discuss there.

comment:14 Changed 5 years ago by nickm

The new version of this is cmux_refactor_for_026_squashed in athena's public repository.

comment:15 Changed 5 years ago by yawning

Cc: yawning added

comment:16 Changed 5 years ago by robgjansen

Is cmux_refactor_for_026_squashed just a different merge than Roger's? Or did some important functionality also change?

comment:17 Changed 5 years ago by andrea

Gcc warnings fixed in cmux_refactor_for_026_squashed_2

comment:18 Changed 4 years ago by nickm

On #12889, Rob got good results with:

SchedulerLowWaterMark 100MB
SchedulerHighWaterMark 101MB
SchedulerMaxFlushCells 1000

on the branch cmux_refactor_configurable_threshold. So those are probably the values we should take?

Last edited 4 years ago by nickm (previous) (diff)

comment:19 Changed 4 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

Merged! Oh, finally, finally it is merged. Thanks to Yawning for helping me review, Rob for testing, and above all thanks to Andrea for doing the huge revision.

I did a couple of fixups on top of this, mainly affecting the configuration option thing. See 0e0dc7d787ef867a24b2e630bf1db72227b24fef and b1e1b439b892624b7dbe2f631a9ddb41b16ecdd6.

I had a sudden question about OOM issues; I'll add a new ticket for that.

Note: See TracTickets for help on using tickets.