Opened 7 months ago

Last modified 3 days ago

#29698 needs_revision defect

Edge case that causes improper circuit prioritization for one scheduling run

Reported by: pastly Owned by: dgoulet
Priority: Medium Milestone: Tor: 0.4.2.x-final
Component: Core Tor/Tor Version: Tor: 0.3.2.1-alpha
Severity: Normal Keywords: tor-sched, tor-cmux, kist, 041-deferred-20190530, 042-should, BugSmashFund
Cc: robgjansen Actual Points: 0.2
Parent ID: Points: 0.2
Reviewer: nickm Sponsor:

Description

The Problem, Shortly

A circuit that is very busy for a long time, then 100% idle for a long time, and then needs to send traffic again will be incorrectly deprioritized the first time it is scheduled.

The Problem, Illustrated

Consider a circuit that is very busy for a significant length of time (minutes). There's constant traffic flowing in one direction (or both, but let's just say one) on this circuit, so it earns a high cell_count EWMA value (and thus a low priority for scheduling). Assume it is the only circuit on its channel.

Now assume it suddenly stops sending traffic but stays open. It stays this way for a significant length of time (many tens of seconds), such that its cell_count EWMA value should be essentially zero. But the value hasn't actually been updated yet, because it is only updated when a cell is transmitted (see circuitmux_notify_xmit_cells).

At this point in time the relay is still servicing some number of low-traffic circuits on other channels. Maybe it has always been handling these circuits; it doesn't matter. What matters is that there are lots of low-traffic circuits needing scheduling, and because they are low-traffic, their cell_count EWMA values are relatively low (thus a high priority for scheduling).

Now what happens when that original high-traffic circuit stops being totally idle? What happens when it wants to send another 1000, 100, or even just 1 cell?

It gets put into KIST's channels_pending smartlist like any other circuit. In fact there are a bunch of low-bandwidth circuits in there with it. Observe what happens when KIST starts scheduling its collection of pending channels:

KIST loops over and over until its list of pending channels is empty. Each iteration it takes the channel with the current best-priority circuit, schedules one cell, updates the appropriate cell_count, and puts the channel back in the pending list if necessary.
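The loop just described can be sketched as a toy model (this is not Tor's actual KIST code; channel and cmux bookkeeping are reduced to a plain array of hypothetical toy_chan structs):

```c
#include <stddef.h>

/* Toy model of the scheduling loop: each round, pick the pending
 * channel whose circuit has the lowest cell_count (best priority),
 * "send" one cell, update the count, and implicitly drop the channel
 * from the pending set once its queue is empty. */
struct toy_chan {
  double cell_count;   /* EWMA-like priority value; lower is better */
  int cells_pending;   /* cells still waiting to be sent */
  int cells_sent;
  int first_sent_at;   /* round in which this channel's first cell went out */
};

static void
schedule_pending(struct toy_chan *chans, size_t n)
{
  int round = 0;
  for (;;) {
    struct toy_chan *best = NULL;
    size_t i;
    /* Find the best-priority channel that still has cells queued. */
    for (i = 0; i < n; i++) {
      if (chans[i].cells_pending > 0 &&
          (best == NULL || chans[i].cell_count < best->cell_count))
        best = &chans[i];
    }
    if (best == NULL)
      break;                   /* pending list is empty: done */
    best->cells_pending--;     /* schedule one cell */
    if (best->cells_sent++ == 0)
      best->first_sent_at = round;
    best->cell_count += 1.0;   /* stand-in for the EWMA update on xmit */
    round++;
  }
}
```

Feed this a channel with a stale cell_count of 1000 holding one cell, alongside two low-traffic channels at 2.0 holding five cells each, and the stale channel's single cell is scheduled dead last, even though its real (decayed) priority should have been the best.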

All those low-traffic circuits will be serviced first because they have low cell_count values (high priority) as compared to the outdated cell_count value for the original high-traffic circuit.

When the circuit finally gets to send its first cell after its long period of inactivity, its cell_count EWMA value is corrected to be near zero. That's fine. But it should have been updated before scheduling decisions were made, so that it would have been the first one scheduled.

A solution

Add a touch function to the circuitmux channel interface that tells the circuitmux (and whatever its policy is) to update its circuit priorities if desired.

Before entering the main scheduling loop, call this touch function on all the pending channels. In the case of the EWMA policy, the touch function would ultimately drill down to something like:

static void
ewma_touch(circuitmux_policy_data_t *pol_data)
{
  ewma_policy_data_t *pol = NULL;
  unsigned int tick;
  double fractional_tick;

  tor_assert(pol_data);
  pol = TO_EWMA_POL_DATA(pol_data);

  /* Rescale the active circuits' EWMAs if we have crossed into a new
   * tick since the last recalibration. (fractional_tick is unused
   * here; the getter just requires the out-parameter.) */
  tick = cell_ewma_get_current_tick_and_fraction(&fractional_tick);
  if (tick != pol->active_circuit_pqueue_last_recalibrated) {
    scale_active_circuits(pol, tick);
  }
}

(Which, you might observe, is essentially the first part of ewma_notify_xmit_cells(...).)
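The wiring from the scheduler down to ewma_touch might look roughly like the following sketch. All names here are hypothetical, modeled loosely on the existing circuitmux policy vtable; the actual patch may differ:

```c
/* Hypothetical sketch of the interface addition: the policy vtable
 * gains an optional notify_touch callback, and the scheduler calls
 * circuitmux_touch() on each pending channel's cmux before entering
 * its main scheduling loop. */

struct circuitmux_policy_data_t;  /* opaque to callers, as in Tor */

typedef struct circuitmux_policy_t {
  /* ... existing callbacks (notify_xmit_cells, pick_active_circuit,
   * etc.) would live here ... */
  void (*notify_touch)(struct circuitmux_policy_data_t *pol_data);
} circuitmux_policy_t;

typedef struct circuitmux_t {
  const circuitmux_policy_t *policy;
  struct circuitmux_policy_data_t *policy_data;
} circuitmux_t;

/* Ask the cmux's policy to refresh its circuit priorities, if it has
 * an opinion on the matter; policies without a notify_touch callback
 * are simply skipped. */
static void
circuitmux_touch(circuitmux_t *cmux)
{
  if (cmux && cmux->policy && cmux->policy->notify_touch)
    cmux->policy->notify_touch(cmux->policy_data);
}

/* Tiny stand-in policy that just counts how often it is touched,
 * purely to exercise the dispatch above. */
static int n_touches = 0;
static void
count_touch(struct circuitmux_policy_data_t *pol_data)
{
  (void)pol_data;
  n_touches++;
}
static const circuitmux_policy_t counting_policy = {
  .notify_touch = count_touch,
};
```

For the EWMA policy, notify_touch would simply point at ewma_touch; policies that don't care about recalibration leave the slot NULL.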

Child Tickets

Attachments (1)

EWMA vs time with and without touch.png (144.6 KB) - added by pastly 7 months ago.


Change History (11)

comment:1 Changed 7 months ago by pastly

Currently KIST will always handle every pending channel and will always send at least one cell, thereby updating the EWMA values. Technically a bug, but not super bad IMHO.

But if KIST were ever to stop handling every pending channel -- for example, if a global write limit were added on the maximum number of cells it is allowed to send across all channels -- then channels could "get forgotten about."

I'm attaching two graphs in a single image that show a single channel's best EWMA value over time. In both graphs, the middle 120 seconds is the period during which a global write limit and extra load are applied to the relay.

The left graph is without the touch function. The channel we are looking at gets "forgotten" during the middle 120 seconds. No cells are sent on it during this time, and its EWMA value never updates. It's stuck. Once the load and write limit are removed, the channel gets to send and its EWMA value is updated correctly.

The right graph adds the touch function. The channel gets its EWMA value updated periodically like all other channels, and it does get its fair chance to send data. The channel is not "forgotten."

Changed 7 months ago by pastly

comment:2 Changed 7 months ago by dgoulet

Keywords: kist added
Milestone: Tor: 0.3.5.x-final → Tor: 0.4.1.x-final

Fortunately for us, the fix seems simple enough. Moving it to 041 since we can address this without massive engineering.

comment:3 Changed 5 months ago by nickm

Keywords: 041-deferred-20190530 added

Marking these tickets as deferred from 041.

comment:4 Changed 5 months ago by nickm

Milestone: Tor: 0.4.1.x-final → Tor: 0.4.2.x-final

comment:5 Changed 6 weeks ago by nickm

Keywords: 042-should added

comment:6 Changed 3 weeks ago by ahf

Owner: set to dgoulet
Status: new → assigned

Distributing 0.4.2 tickets between network team members.

comment:7 Changed 10 days ago by dgoulet

Actual Points: 0.2
Keywords: BugSmashFund added
Points: 0.2
Reviewer: nickm
Status: assigned → needs_review

PR: https://github.com/torproject/tor/pull/1402
Branch: ticket29698_042_01

comment:8 Changed 10 days ago by nickm

Status: needs_review → needs_revision

Looks reasonable; I've left some comments on the patch.

One more thing I'd request: could we please have a regression test for this? That is, a test that fails without this patch and passes with it, to demonstrate that we really fixed the problem.

Also, is this a backport candidate?

Thanks!

comment:9 in reply to:  8 Changed 10 days ago by dgoulet

Replying to nickm:

Looks reasonable; I've left some comments on the patch.

One more thing I'd request: Could we please have a regression test on this? That is, a test that fails without this patch, and passes with it, to demonstrate that we really fixed the problem.

I've spent a bit of time trying to come up with a way to test this, and I've sort of failed to find one in a reasonable time frame. It would hit so many code paths in the cmux/ewma/channel layers that setting up such a test would require quite a bit of work.

We have virtually _no_ EWMA or cmux unit tests, so everything would need to be done with mocking and whatnot to do something proper. Also, the cmux and EWMA code is opaque, even to the tests, so there is more work there to expose what we need to test.

All in all, I can do something there, but there is potential for a much larger patch to support everything needed for testing, and that would turn the ticket into a much larger point size.

Still OK with this?

The other approach, I guess, would be to allocate myself some cycles and add a considerable amount of testing over the entire cmux/EWMA code. I would expect that to be around 2 or 3 days of work for something proper, which would include this touch() interface.

Also, is this a backport candidate?

Nope. It is not critical enough to justify a backport, imo.

comment:10 Changed 3 days ago by dgoulet

The outcome of a discussion on IRC with nickm is that I'll spend some cycles (a couple of days max) adding test coverage to the cmux subsystem, so we at least have a baseline to work with for future changes such as this one.

If that turns out to be a bit too aggressive code-wise, we'll then aim for 0.4.3 for this patch.
