Opened 8 years ago

Closed 6 months ago

#2536 closed enhancement (wontfix)

Disable outgoing token bucket and reduce token bucket refill interval

Reported by: karsten
Owned by: nickm
Priority: Medium
Milestone: Tor: 0.3.4.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: performance, tor-relay, review-group-34
Cc: tschorsch@…, bjscheue@…, arthuredelstein@…
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Florian Tschorsch and Björn Scheuermann have written a patch to a) disable the outgoing token bucket to avoid the "double door effect" and b) reduce the token bucket refill interval from 1 second to, say, 100 milliseconds or less.

Sebastian and I discussed the patch with Florian and Björn. See the latest version of the patch in branch tokenbucket in my public repository.

Child Tickets

Attachments (2)

0001-Add-Token-Bucket-proposal-idea.patch (8.3 KB) - added by karsten 8 years ago.
[PATCH] Add Token Bucket proposal idea.
xxx-creditbucket.txt (9.0 KB) - added by Flo 7 years ago.
Credit Bucket Proposal


Change History (39)

comment:1 Changed 8 years ago by nickm

Milestone: Tor: 0.2.3.x-final
Status: new → needs_review

Marking as needs_review, for 0.2.3.x.

Removing the outgoing token bucket entirely is not okay with me, for the reasons discussed on the or-dev thread, so I'm going to treat this as preliminary.

Is it possible to separate the more frequent refill interval part of this from the double-door part? They seem like they should be logically separable.

comment:2 in reply to:  1 Changed 8 years ago by Flo

Cc: tschorsch@… added

Removing the outgoing token bucket entirely is not okay with me, for the reasons discussed on the or-dev thread, so I'm going to treat this as preliminary.

In order to address Nick's comments, here is a round-up of a modification (http://archives.seul.org/or/dev/Feb-2011/msg00038.html) of our original proposal (http://archives.seul.org/or/dev/Jan-2011/msg00004.html) that we posted some time ago on or-dev:

We propose to continue using a token bucket on the incoming side to limit the incoming rate. It will also be easy to monitor (not limit!) the outgoing rate. We might use these two mechanisms as follows: every time traffic is generated inside our relay, we remove a corresponding amount of tokens from the bucket on the *incoming* side, thereby slowing down the rate at which data flows into the router.

It is easy to see that it will limit the outgoing traffic to the configured limit -- while at the same time it does not need any rate limiting mechanism on the outgoing side (it therefore avoids any double-door effects). It is also reasonably simple and will thus (hopefully) not cause undesired side effects.
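As an illustration only (the names below are hypothetical, not Tor's actual API), the modified scheme boils down to charging internally generated traffic against the same incoming-side bucket that gates reads:

```c
/* Sketch of the modified proposal above; all names are hypothetical,
 * not Tor's actual API. A single incoming-side token bucket both
 * limits reads and pays for internally generated traffic. */
typedef struct {
  long tokens;  /* incoming-side bucket fill level */
} in_bucket_t;

/* How many bytes may be read from the network right now? */
static long read_allowance(const in_bucket_t *b)
{
  return b->tokens > 0 ? b->tokens : 0;
}

/* Charge k bytes read from the network. */
static void note_read(in_bucket_t *b, long k)
{
  b->tokens -= k;
}

/* Charge k bytes generated inside the relay (e.g. directory replies):
 * removing tokens here slows down subsequent reads, so the outgoing
 * rate stays within the configured limit without an outgoing bucket. */
static void note_generated(in_bucket_t *b, long k)
{
  b->tokens -= k;
}
```

The outgoing side then needs only monitoring, not limiting, since every outgoing byte has already been paid for on the incoming side.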

If you agree with this modification, I would love to implement the changes.

 

Is it possible to separate the more frequent refill interval part of this from the double-door part? They seem like they should be logically separable.

Both enhancements are already separated in our implementation. They can be configured via two new parameters: DisableOutgoingTokenBucket and TokenBucketRefillInterval. Nevertheless, we recommend using both, since using smaller refill intervals alone will amplify the double-door effects.

 

comment:3 in reply to:  1 Changed 8 years ago by Flo

Just a small update concerning our patch:
We ran two identically configured relays (0.2.3.0-alpha-dev, non-exit, 500kbps), with and without our patch enabled, for a period of one week.
The standard Tor relay queued 16% of all relayed cells for one second or more.
In contrast, with our patch and a configured refill period of 10ms, only 2% of all relayed cells were queued for one second or more.
The results thus show the substantial impact of our patch.

Therefore we should push forward the discussion/review of our proposed modifications.

comment:4 Changed 8 years ago by Sebastian

So, did you write the patch that can guarantee that a configured maximum amount of traffic is never exceeded? That sounds like the next thing to review here going forward, because we cannot use any code that doesn't have that guarantee.

comment:5 Changed 8 years ago by nickm

I am skeptical about the "decrement the in-bucket when traffic is 'generated internally'" approach. It seems that it would require pervasive changes throughout the codebase. I bet that the "adaptively decrease the read-rate until the write-limit is (almost) never reached" approach would yield significantly more reliable and maintainable code.

As for separating stuff: I meant separate patches, so that the features can be merged and reviewed independently. If the short-refill-interval approach yields significant improvements, it shouldn't have to wait for the double-gating fixes to get revised. I just talked to Karsten and Sebastian, and they said they'd like to try to split up the patch series into independent patches. That'd be great!

comment:6 in reply to:  5 Changed 8 years ago by karsten

Replying to nickm:

As for separating stuff: I meant separate patches, so that the features can be merged and reviewed independently. If the short-refill-interval approach yields significant improvements, it shouldn't have to wait for the double-gating fixes to get revised. I just talked to Karsten and Sebastian, and they said they'd like to try to split up the patch series into independent patches. That'd be great!

Yes, Sebastian and I can help. Where do we find the latest branch?

comment:7 in reply to:  5 ; Changed 8 years ago by Flo

Replying to nickm:

I am skeptical about the "decrement the in-bucket when traffic is 'generated internally'" approach. It seems that it would require pervasive changes throughout the codebase. I bet that the "adaptively decrease the read-rate until the write-limit is (almost) never reached" approach would yield significantly more reliable and maintainable code.

Is it so? With my current knowledge of the code, I suspect that there are only two additional places where the mentioned modifications of our patch would need to be implemented: (1) In connection_buckets_decrement(), when counting bytes of directory traffic, we need to subtract the num_written bytes from the read bucket. (Btw: apart from that, a directory mirror answers requests only when capacity is available; see the usage of global_write_bucket_low.) (2) I am not exactly sure where it happens (probably in connection_edge_package_raw_inbuf), but we need to account for data that is less than the cells' payload. This difference needs to be subtracted from the read bucket too.

Please tell me if I missed something here. Nevertheless, this approach maintains the bandwidth limits strictly. In contrast, the adaptive approach seems more inaccurate.

As for separating stuff: I meant separate patches, so that the features can be merged and reviewed independently. If the short-refill-interval approach yields significant improvements, it shouldn't have to wait for the double-gating fixes to get revised. I just talked to Karsten and Sebastian, and they said they'd like to try to split up the patch series into independent patches. That'd be great!

We have evidence that decreasing the refill intervals alone amplifies the double-door effects and therefore queues cells even longer. On the other hand, it makes the traffic less bursty and smooths it. (Btw: this will probably also be the case with bufferevents enabled, which merely use shorter intervals.) Measurements confirming both are in progress.

Therefore, in order to gain maximum improvement, we advise enabling both features. Technically, though, they can be reviewed independently.

Replying to karsten:

Yes, Sebastian and I can help. Where do we find the latest branch?

Once we have agreed on a modification of the original proposal, I will finish it and upload it to my public git repository. At the moment, only the original patch is available.

Thanks to all of you for your help.

comment:8 in reply to:  7 ; Changed 8 years ago by karsten

Replying to Flo:

Replying to nickm:

I am skeptical about the "decrement the in-bucket when traffic is 'generated internally'" approach. It seems that it would require pervasive changes throughout the codebase. I bet that the "adaptively decrease the read-rate until the write-limit is (almost) never reached" approach would yield significantly more reliable and maintainable code.

Is it so? With my current knowledge of the code, I would suspect [...] I am not exactly sure [...]

Too many uncertainties. :) I think Nick has a point here. We should come up with a patch that we're 100% sure respects the configured bandwidth limit, even if it doesn't give us 100% of the performance improvement that we hope to achieve. We can still apply another patch that does what you have in mind, once that is tested more and there are fewer uncertainties.

As for separating stuff: I meant separate patches, so that the features can be merged and reviewed independently. [...]

Therefore, in order to gain maximum improvement, we advise enabling both features. Technically, though, they can be reviewed independently.

This is only about reviewing and merging.

Replying to karsten:

Yes, Sebastian and I can help. Where do we find the latest branch?

Once we have agreed on a modification of the original proposal, I will finish it and upload it to my public git repository. At the moment, only the original patch is available.

Great. Where would we find the original patch again? :) We can split the branch tomorrow and you can apply your changes to the separate branches.

Thanks to all of you for your help.

Thanks for being persistent in trying to get this patch into Tor. :)

comment:9 in reply to:  8 ; Changed 8 years ago by Flo

Replying to karsten:

Too many uncertainties.

Well, actually, I am certain ;-)

Great. Where would we find the original patch again? :) We can split the branch tomorrow and you can apply your changes to the separate branches.

Okay cool. Here is the git read-only url: git://github.com/tschorsch/tor.git

comment:10 in reply to:  9 Changed 8 years ago by karsten

Owner: set to karsten
Status: needs_review → assigned

Replying to Flo:

Okay cool. Here is the git read-only url: git://github.com/tschorsch/tor.git

I squashed, split, and rebased your branch to have one commit per change (plus one commit for a more general change). The result is in branch tokenbucket in my public repository. Note that I messed up the author information, but I assume you don't care about the Git metadata as long as the ChangeLog says that you are the author(s). Or maybe Sebastian can fix that if you care.

If this branch looks okay to you, please apply any further patches on top of it, and we'll merge them into the existing commits before merging everything into master.

I also removed your proposal, because proposals are now in a separate repository. I'm attaching a patch to this ticket for adding your proposal there.

Changed 8 years ago by karsten

[PATCH] Add Token Bucket proposal idea.

comment:11 in reply to:  description Changed 7 years ago by Flo

We can absolutely understand your concerns, and Nick always had a point there. We are just excited about the performance improvements obtained with our initial proposal, so we tried to find a way to move it forward. Nevertheless, you are right that it might be a good first step to implement a modification as you suggest, even though we expect that it will not overcome all of the performance issues under all circumstances.

In order to meet your comments, we drafted a different, alternative design that strictly maintains configured bandwidth limits.

Let us start from our original patch (cf. tokenbucket proposal). Thus, we have a regular token bucket on the reading side with a certain rate and a certain burst size. Let x denote the current amount of tokens in the bucket. On the outgoing side we need something appropriate that monitors and constrains the outgoing rate, but at the same time avoids holding back cells (cf. double door effects) whenever possible.
Here we propose something that adopts the role of a token bucket, but realizes this functionality in a slightly different way. We call it a "credit bucket". Like a token bucket, the credit bucket also has a current fill level, denoted by y. However, the credit bucket is refilled in a different way.

To understand how it works, let us look at the possible operations:

  • As said, x is the fill level of a regular token bucket on the incoming side and thus gets incremented periodically according to the configured rate. No changes here.
  • If x<=0, we are obviously not allowed to read.
  • If x>0, we are allowed to read up to x bytes of incoming data. If k bytes are read (k<=x), then we update x and y as follows:

x = x - k (1)
y = y + k (2)

(1) is the standard token bucket operation on the incoming side. Whenever data is admitted in, though, an additional operation is performed: (2) allocates the same number of bytes on the outgoing side, which will later on allow the same number of bytes to leave the onion router without any delays.

  • If y + x > -M, we are allowed to write up to y + x + M bytes on the outgoing side, where M is a positive constant. M specifies a burst size for the outgoing side. M should be larger than the number of tokens that get refilled during one refill interval; we suggest having M on the order of a few seconds' "worth" of data. Now, if k bytes are written on the outgoing side, we proceed as follows:

1) If k <= y then y = y - k

In this case we use "saved" credits, previously allocated on the incoming side when incoming data has been processed.

2) If k > y then x = x - (k-y) and y = 0

We generated additional traffic in the onion router, so that more data is to be sent than has been read (the credit is not sufficient). We therefore "steal" tokens from the token bucket on the incoming side to compensate for the additionally generated data. This will result in correspondingly less data being read on the incoming side subsequently. As a result of such an operation, the token bucket fill level x on the incoming side may become negative (but it can never fall below -M).

  • If y + x <= -M, then outgoing data will be held back. This may lead to double-door effects, but only in extreme cases where the outgoing traffic largely exceeds the incoming traffic, so that the outgoing burst size M is exceeded.

Aside from short-term bursts of configurable size (as with every token bucket), this procedure guarantees that the configured rate may never be exceeded (on the application layer, that is; as with the current implementation, an attacker may easily cause the onion router to arbitrarily exceed the limits on the lower layers). Over time, we never send more data than the configured rate: every sent byte needs a corresponding token on the incoming side; this token must either have been consumed by an incoming byte before (it then became a "credit"), or it is "stolen" from the incoming bucket to compensate for data generated within the onion router.

This modification can be implemented with moderate effort and requires changes only at the points where currently the token bucket operations are performed.
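Under the stated rules, a minimal C sketch of the credit bucket might look as follows (the struct and `cb_*` names are hypothetical, not Tor's actual API; note that in case 2 the pre-update value of y must be used when computing how many tokens to steal):

```c
/* Sketch of the credit-bucket rules above. The struct and cb_* names
 * are hypothetical, not Tor's actual API. */
typedef struct {
  long x;  /* incoming token bucket fill level (may drop to -M) */
  long y;  /* outgoing credit bucket fill level */
  long M;  /* outgoing burst constant */
} credit_bucket_t;

/* Periodic refill on the incoming side: a standard token bucket,
 * incremented by the configured rate and capped at the burst size. */
static void cb_refill(credit_bucket_t *b, long tokens, long burst)
{
  b->x += tokens;
  if (b->x > burst)
    b->x = burst;
}

/* Bytes we may read right now: up to x if x > 0, else nothing. */
static long cb_read_allowance(const credit_bucket_t *b)
{
  return b->x > 0 ? b->x : 0;
}

/* k bytes were read: (1) decrement the incoming bucket and
 * (2) allocate the same amount as outgoing credit. */
static void cb_note_read(credit_bucket_t *b, long k)
{
  b->x -= k;
  b->y += k;
}

/* Bytes we may write right now: y + x + M if positive, else nothing
 * (outgoing data is held back only once y + x <= -M). */
static long cb_write_allowance(const credit_bucket_t *b)
{
  long a = b->y + b->x + b->M;
  return a > 0 ? a : 0;
}

/* k bytes were written: spend saved credit first; any excess is
 * "stolen" from the incoming bucket, which may go negative. */
static void cb_note_write(credit_bucket_t *b, long k)
{
  if (k <= b->y) {
    b->y -= k;
  } else {
    b->x -= k - b->y;  /* use the pre-update value of y */
    b->y = 0;
  }
}
```

Every written byte is thus backed by an incoming-side token, either consumed earlier (credit) or stolen now, which is what enforces the configured rate over time.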

We feel that this is not the be-all and end-all solution, because it again introduces a feedback loop between the incoming and the outgoing side. We therefore still hope that we will be able to come to a both simpler and more effective design in the future. However, we believe that what we proposed above is a good compromise between avoiding double-door effects to the furthest possible extent, strictly enforcing an application-layer data rate, and keeping the extent of changes to the code small.

Comments are, as always, highly welcome.

comment:12 Changed 7 years ago by Bjoern

Cc: bjscheue@… added

comment:13 Changed 7 years ago by Sebastian

This new scheme sounds like a good idea to make progress. What happens to the current notion of BandwidthBurst? That is probably what M wants to be based on somehow, yes?

comment:14 in reply to:  13 Changed 7 years ago by Bjoern

Replying to Sebastian:

What happens to the current notion of BandwidthBurst? That is probably what M wants to be based on somehow, yes?

We need to distinguish two burst sizes, in the incoming and the outgoing direction. The incoming burst size is, just like before, given by the size of the token bucket on the incoming side; let me denote this size by T here. For the outgoing side, we have, analogously, a size S for the credit bucket. Then, there is a hard limit of M+S+T on the outgoing burst size: in the extreme case, the incoming bucket is full (x=T) and the credit bucket on the outgoing side is also full (y=S), so we may send out M+S+T bytes immediately. This may happen if (a) there has been lots of incoming traffic which "vanished" inside the router before (so that the credit bucket became full) and (b) there is currently no incoming traffic (so that the incoming bucket ran full, too), and (c) lots of outgoing traffic is spontaneously generated within the router.

To reliably avoid double-dooring, S and M should both be at least a few times larger than T. To be on the absolutely safe side, we could either have separate configuration parameters for M, S, and T, or we could have one maximum burst size parameter b and would internally use a fixed ratio between the sizes, e.g., T = 0.1*b, S=0.45*b, M = 0.45*b.

For good performance, it will of course be essential to configure a sufficiently large burst size. This should be communicated in the documentation.
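A sketch of the single-parameter variant (a hypothetical helper, not Tor's actual API), deriving T, S, and M from one configured maximum burst size b with the fixed ratios suggested above:

```c
/* Derive the three burst sizes from one configured maximum burst b,
 * using the fixed ratios suggested above. Hypothetical helper, not
 * Tor's actual API. */
static void burst_sizes(long b, long *T, long *S, long *M)
{
  *T = b / 10;           /* incoming token bucket size:  0.1  * b */
  *S = (b * 45) / 100;   /* outgoing credit bucket size: 0.45 * b */
  *M = (b * 45) / 100;   /* outgoing burst constant:     0.45 * b */
}
```

The hard limit on an outgoing burst is then M + S + T, i.e. at most the configured b (integer division may lose a few bytes for values of b not divisible by 100).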

comment:15 Changed 7 years ago by nickm

This sounds promising! I think an approach like this could totally work. Let's get it written up in a proposal form, and get some code written. Since I have been so remiss in reviewing this promptly, I'll take on writing the first draft of the proposal unless somebody else wants to do it first. I think I could get it done early next week. (If somebody else wants to take a shot at it instead then I would be grateful, but at this point I really ought to do something to make up for my tardiness.)

I wonder if we can improve this algorithm in cases where we know that a given read will generate no bytes written (as is the case when we read directory information), or when we find out early that we're going to have an imbalance (as when we pack or unpack a partial cell).

Another wrinkle that we need to address is that not all connections are rate-limited: by default, for example, the read and write limits do not apply to connections from localhost. I think the answer there is to make sure that even if a connection isn't counted towards the read or write bucket, its reads have the proper effect on the write bucket and its writes the appropriate effect on the read bucket.

We should also keep an eye out for ways to instrument the logic here so that we can find out how much this is helping us.

Finally, because of our transition to libevent during 2.3.x, we need to find a way to implement this both using bufferevent rate limiting and using Tor's rate limiting code. Big fun there.

Karsten, did your branch manage to separate the finer-grained-ticks part from the rest of the code? If so, we should really just open a new ticket for that: it ought to be mergeable independently.

comment:16 in reply to:  15 Changed 7 years ago by karsten

Replying to nickm:

Karsten, did your branch manage to separate the finer-grained-ticks part from the rest of the code? If so, we should really just open a new ticket for that: it ought to be mergeable independently.

To be honest, I can't say. I think the branch needs more work before merging parts of it. Florian might know better though.

comment:17 in reply to:  15 ; Changed 7 years ago by Flo

Thanks for your replies.

Replying to nickm:

This sounds promising! I think an approach like this could totally work. Let's get it written up in a proposal form, and get some code written. Since I have been so remiss in reviewing this promptly, I'll take on writing the first draft of the proposal unless somebody else wants to do it first. I think I could get it done early next week. (If somebody else wants to take a shot at it instead then I would be grateful, but at this point I really ought to do something to make up for my tardiness.)

If you haven't already started writing the proposal, I can come up with a draft. Just to make sure I got you right: you would like to split up both ideas -- (1) the new approach to token bucket refilling and (2) the reduction of refill intervals -- so that we also have two separate proposal documents in the end. Is that right?


Karsten, did your branch manage to separate the finer-grained-ticks part from the rest of the code? If so, we should really just open a new ticket for that: it ought to be mergeable independently.

One of my students has already started implementing our modified idea. At the moment we are debugging and testing it. As soon as this is finished, I am going to merge it into Karsten's current branch. Hopefully this can be done next week.

Concerning the other comments and notes (algorithm improvements, unlimited connections, libevent 2.3.x), we are going to discuss them and try to find a solution on that.

comment:18 in reply to:  17 Changed 7 years ago by nickm

Replying to Flo:

Thanks for your replies.

Replying to nickm:

This sounds promising! I think an approach like this could totally work. Let's get it written up in a proposal form, and get some code written. Since I have been so remiss in reviewing this promptly, I'll take on writing the first draft of the proposal unless somebody else wants to do it first. I think I could get it done early next week. (If somebody else wants to take a shot at it instead then I would be grateful, but at this point I really ought to do something to make up for my tardiness.)

If you haven't already started writing the proposal, I can come up with a draft.

That would be great.

Just to make sure I got you right: you would like to split up both ideas -- (1) the new approach to token bucket refilling and (2) the reduction of refill intervals -- so that we also have two separate proposal documents in the end. Is that right?

Right. Also, the token-bucket proposal needs to be really specific: ideally, described in spec-levels of detail, and to cover all of the weird corner cases described above.

Thanks!

Changed 7 years ago by Flo

Attachment: xxx-creditbucket.txt added

Credit Bucket Proposal

comment:19 Changed 7 years ago by Flo

Hi, as promised I just added the first draft of our credit bucket proposal. Please feel free to revise the document attached.

Soon I will also publish the corresponding code in my github repository as further basis of discussion.

Thanks in advance.

comment:20 Changed 7 years ago by nickm

Hi! Just added it as proposal 182 and sent it to tor-dev.

comment:21 in reply to:  20 Changed 7 years ago by Flo

Replying to nickm:

Hi! Just added it as proposal 182 and sent it to tor-dev.

Thanks for that.

Just a quick note: I created a new ticket (#3630) that deals only with the reduced refill intervals. By doing so, I hope to meet your suggestion of separating the two enhancements.

comment:22 Changed 7 years ago by nickm

Great; thank you!

comment:23 Changed 7 years ago by nickm

Keywords: performance added
Milestone: Tor: 0.2.3.x-final → Tor: 0.2.4.x-final

Ugh; it looks like this won't be merged by the end-of-November big-feature deadline for 0.2.3. Marking it 0.2.4 instead, though we should actually consider it in context with other pending performance ideas as well.

comment:24 Changed 7 years ago by karsten

Owner: changed from karsten to nickm

Re-assigning this ticket to Nick. There's not really anything I can do here.

comment:25 Changed 7 years ago by arma

How does this ticket relate to #4712? I think the refill-interval issue has its own tickets now, and this one might be redundant with #4712?

comment:26 Changed 6 years ago by nickm

Keywords: tor-relay added

comment:27 Changed 6 years ago by nickm

Component: Tor Relay → Tor

comment:28 Changed 6 years ago by nickm

Milestone: Tor: 0.2.4.x-final → Tor: 0.2.5.x-final

comment:29 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.???

comment:30 Changed 23 months ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:31 Changed 22 months ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:32 Changed 16 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:33 Changed 16 months ago by nickm

Severity: Normal
Status: assigned → needs_review

comment:34 Changed 10 months ago by arthuredelstein

Cc: arthuredelstein@… added

comment:35 Changed 7 months ago by teor

Milestone: Tor: unspecified → Tor: 0.3.4.x-final

Move all needs_review tickets without a release to 0.3.4

comment:36 Changed 7 months ago by nickm

Keywords: review-group-34 added

comment:37 Changed 6 months ago by dgoulet

Resolution: wontfix
Status: needs_review → closed

This is an 8-year-old proposal that doesn't apply to the current tor design. If you disagree, open a new ticket and explain why.
