Opened 7 years ago

Last modified 15 months ago

#4712 new task

Review and update any existing patches for proposal 182 ("credit buckets")

Reported by: karsten Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-relay network performance credit-bucket
Cc: tschorsch@… Actual Points:
Parent ID: Points: 5
Reviewer: Sponsor:

Description

We should review any existing patches for proposal 182 and update them for merging them into master. This is part of #4682.

Child Tickets

Change History (20)

comment:1 Changed 7 years ago by Sebastian

Milestone: Tor: unspecified

Wait, shouldn't the "evalue the design" step come before "evaluate the patches" step?

comment:2 Changed 7 years ago by Flo

Cc: tschorsch@… added

You can find our implementation of proposal 182 in my public github repository (git://github.com/tschorsch/tor.git), branch "creditbucket". It is based on Tor's master branch (commit d688a40a0e7c1e8417ecdc463821e50cd1762715). 

comment:3 Changed 6 years ago by arma

I opened #5336 to ask Kevin and Rob to do simulations on this.

In the meantime, here are some early notes on fixes the patch will need:

-  if (connection_counts_as_relayed_traffic(conn, now) &&
+  if (get_options()->CreditBucket) {
+   /*write_limit = y + x + M */
+   global_bucket = global_read_bucket + global_write_bucket
+       + (int)get_options()->OutgoingBandwidthBurst;
+   if (global_bucket < 0)
+     global_bucket = 0;
+  } else if (connection_counts_as_relayed_traffic(conn, now) &&
  • looks like it's doing the credit bucket thing even for conns that do not count as relayed traffic (i.e. that fall under BandwidthRate but not RelayBandwidthRate). If a Tor client (or relay also used as a client) turns on CreditBucket, it will start counting its own traffic and directory fetches in some parts of the credit buckets but not others.
     global_relayed_read_bucket -= (int)num_read;
     global_relayed_write_bucket -= (int)num_written;
+    
+    global_read_bucket -= (int)num_read;
+    if (get_options()->CreditBucket) {
  • similarly, here it ignores the changes to global_relayed_foo_bucket.
  • I want to check to make sure that write_bucket is still going negative and discouraging us from answering e.g. directory queries. This topic is messier since if most relays start declining to answer directory fetches, we need to change the algorithm for where clients should go (thus piling on even more load to the remaining ones, possibly causing a chain of problems).
  • "OutgoingBandwidthBurst 10 MB" is a huge default. It's probably better set to [Relay]BandwidthBurst.
  • That said, I wonder if there are situations where the write bucket gets dragged down and it basically lives right around "0-OutgoingBandwidthBurst", thus putting the relay right back into the situation that this patch is trying to solve. We might instrument the patch to warn if the write bucket spends too many seconds too close to that value.

comment:4 in reply to:  3 Changed 6 years ago by Flo

Replying to arma:

[... ...]

  • looks like it's doing the credit bucket thing even for conns that do not count as relayed traffic (i.e. that fall under BandwidthRate but not RelayBandwidthRate). If a Tor client (or relay also used as a client) turns on CreditBucket, it will start counting its own traffic and directory fetches in some parts of the credit buckets but not others.

[... ...]

  • similarly, here it ignores the changes to global_relayed_foo_bucket.

Maybe I didn’t get it right, but as far as I understand the difference between the global_relay_foo_bucket and global_foo_bucket is that from the first one the respective amount of relay-data is deducted only and from the second all sorts of data (including directory, client *and* relay traffic) is deducted. This results in an hierarchical structure that follows the function to limit the relayed data only and/or limit all data processed by the node. By saying that, I think your point (counting client traffic) already exists in Tor and is not something that comes with this patch. Meaning if I currently use my relay as client too, the amount of data I produce as client will be deducted from the global_foo_bucket.

  • "OutgoingBandwidthBurst 10 MB" is a huge default. It's probably better set to [Relay]BandwidthBurst.

Isn't 10 MB the current default BandwidthBurst size in Tor?

  • That said, I wonder if there are situations where the write bucket gets dragged down and it basically lives right around "0-OutgoingBandwidthBurst?", thus putting the relay right back into the situation that this patch is trying to solve. We might instrument the patch to warn if the write bucket spends too many seconds too close to that value.

This could happen in very extreme cases only, when the outgoing traffic largely exceeds the incoming traffic so that the outgoing burst size (OutgoingBandwidthBurst) exceeds. In general, which is the usual case, if the written data exceeds the amount of tokens in the write bucket (e.g. answering directory reqeusts) we "steal" data from the read bucket which compensates the additional generated data; i.e. we read less data.

In my opinion this is also the reason why it is important to consider the global_foo_bucket and not the global_relay_foo_bucket for this proposal. Nevertheless, I really appreciate instrumenting the patch in detail. And if I can help, just let me know.

comment:5 Changed 6 years ago by arma

(See also #2536 for earlier work on this ticket.)

comment:6 Changed 6 years ago by nickm

Milestone: Tor: unspecifiedTor: 0.2.4.x-final

We should at least get this to the point where we can merge and test it in 0.2.4.x.

comment:7 Changed 6 years ago by nickm

So for testing, I'm going to suggest we might actually be okay trying the simplest possible implementation of this proposal: the one that doesn't know about all the imbalances that can occur in read/write ratios [edge connections; directories; etc], and which doesn't implement RelayBandwidth vs GlobalBandwidth.

If I understand the proposal properly, then even that version of the patch should be sufficient to show us a reduction latency/bufferbloat/etc. If the initial results look promising, *then* let's polish it up. How does that sound?

comment:8 in reply to:  7 Changed 6 years ago by arma

Replying to nickm:

If I understand the proposal properly, then even that version of the patch should be sufficient to show us a reduction latency/bufferbloat/etc. If the initial results look promising, *then* let's polish it up. How does that sound?

That idea matches the text I wrote when I filed #5336. So, "yes please".

comment:9 Changed 6 years ago by nickm

(Looks like #5336 is in fine swing. So I'm going to call this sufficient for experimentation. We'll need to do more to make it suitable for merge if we think the idea is really promising. There's a lot more to do on that front.)

comment:10 Changed 6 years ago by nickm

Keywords: tor-relay added

comment:11 Changed 6 years ago by nickm

Component: Tor RelayTor

comment:12 in reply to:  7 Changed 6 years ago by arma

Replying to nickm:

So for testing, I'm going to suggest we might actually be okay trying the simplest possible implementation of this proposal: the one that doesn't know about all the imbalances that can occur in read/write ratios [edge connections; directories; etc], and which doesn't implement RelayBandwidth vs GlobalBandwidth.

If I understand the proposal properly, then even that version of the patch should be sufficient to show us a reduction latency/bufferbloat/etc. If the initial results look promising, *then* let's polish it up. How does that sound?

Ok, the initial results look promising.

Is the next step (post sponsorf year2 deadline) to polish up the patch?

I feel like the patch makes the issue look more complex than it needs to be. But the issue looks too complex for me to be sure. ;)

comment:13 Changed 6 years ago by nickm

Is the next step (post sponsorf year2 deadline) to polish up the patch?

Polish up or rewrite or revise, depending on how far we decide the patch is from the "right" thing for the codebase.

I feel like the patch makes the issue look more complex than it needs to be. But the issue looks too complex for me to be sure. ;)

Sadly, the patch's problem was _also_ that it didn't cover enough cases. We should probably reread through all the design discussions that led to the current patch/design status again, to see whether we might have overlooked that would let us do the right thing, simply.

comment:14 Changed 6 years ago by karsten

Parent ID: #4682

The November 1 deadline has passed. Removing the parent ticket relationship to #4682 in order to close that ticket.

comment:15 Changed 6 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: 0.2.5.x-final

comment:16 Changed 4 years ago by nickm

Milestone: Tor: 0.2.5.x-finalTor: 0.2.???

comment:17 Changed 21 months ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:18 Changed 20 months ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:19 Changed 15 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:20 Changed 15 months ago by nickm

Keywords: network performance credit-bucket added
Points: 5
Severity: Normal
Summary: Review and update any existing patches for proposal 182Review and update any existing patches for proposal 182 ("credit buckets")
Note: See TracTickets for help on using tickets.