Opened 8 years ago

Closed 7 years ago

#5336 closed task (implemented)

Do simulations of initial proposal 182 patch

Reported by: arma
Owned by:
Priority: Medium
Milestone:
Component: Metrics/Analysis
Version:
Severity:
Keywords:
Cc: robgjansen, kevin, tschorsch@…
Actual Points:
Parent ID: #4682
Points:
Reviewer:
Sponsor:

Description

We have a proposed patch in #4712 for proposal 182 (parent ticket #4682). The patch is missing some pieces (for example, it appears to do the wrong thing when RelayBandwidthRate is set), but I think that under constrained circumstances a simulation should still give us some intuition about whether the patch is on the right track, assuming the issues in #5334 are non-issues.

Child Tickets

Attachments (7)

task5336-2012-03-15.pdf (114.5 KB) - added by robgjansen 8 years ago.
shadow simulation results for task 5336
20120808-ec2-creditbuckets-combined.pdf (456.2 KB) - added by robgjansen 7 years ago.
client performance, credit caps
task5336-bwrate-2012-10-03.png (48.3 KB) - added by karsten 7 years ago.
task5336-bwburst-2012-10-03.png (55.1 KB) - added by karsten 7 years ago.
task5336-bwburst-abs-2012-10-05.png (37.5 KB) - added by karsten 7 years ago.
task5336-mem-2012-10-05.png (36.2 KB) - added by karsten 7 years ago.
task5336-combined.pdf (817.9 KB) - added by karsten 7 years ago.


Change History (33)

comment:1 Changed 8 years ago by arma

The "task5336a" branch in my git repo (git://git.torproject.org/~arma/git/tor) is vanilla master as of today.

The "task5336b" branch is the credit bucket patch, using the huge 10MB credit cap.

And the "task5336c" branch uses a more conservative credit cap (making it equal to our bandwidthburst).

It would be great to see comparisons between these three.

comment:2 Changed 8 years ago by arma

While I'm at it: Rob/Kevin, when you set the BandwidthRate and BandwidthBurst for your simulated relays, do you pick the smallest number out of the descriptor and set both rate and burst to that number? Or do you pull out both the Rate and the Burst and use them?

I imagine a simulated network that never has any extra space in its token buckets could behave quite differently from the real Tor network (where the fast relays often have significant cushion).

comment:3 Changed 8 years ago by Flo

Cc: robgjansen kevin tschorsch@… added; robgjansen kevin removed

comment:4 Changed 8 years ago by kevin

With ExperimenTor, I pull out both the BandwidthRate and BandwidthBurst options when sampling routers from a live Tor network configuration.

comment:5 in reply to:  4 Changed 8 years ago by robgjansen

Replying to kevin:

With ExperimenTor, I pull out both the BandwidthRate and BandwidthBurst options when sampling routers from a live Tor network configuration.

Shadow also uses both the BandwidthRate and BandwidthBurst from the server descriptors.
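
(Neither tool's parsing code appears in this ticket, so purely as an illustration: a minimal sketch of pulling both values out of a server descriptor, assuming the standard "bandwidth <rate> <burst> <observed>" line, all in bytes per second. The function name and example values below are made up.)

```python
# Hedged sketch: extract BandwidthRate and BandwidthBurst from a server
# descriptor's "bandwidth" line ("bandwidth <rate> <burst> <observed>",
# all in bytes per second). Not code from ExperimenTor or Shadow.

def parse_bandwidth_line(descriptor_text):
    """Return (rate, burst, observed) in bytes/s, or None if no bandwidth line."""
    for line in descriptor_text.splitlines():
        if line.startswith("bandwidth "):
            _, rate, burst, observed = line.split()[:4]
            return int(rate), int(burst), int(observed)
    return None

# Example: a relay advertising 5 MB/s rate, 10 MB/s burst, 3 MB/s observed.
rate, burst, observed = parse_bandwidth_line("bandwidth 5000000 10000000 3000000\n")
```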

comment:6 in reply to:  1 Changed 8 years ago by arma

Replying to arma:

The "task5336a" branch in my git repo (git://git.torproject.org/~arma/git/tor) is vanilla master as of today.

Kevin and I realized yesterday that the correct url is "git://git.torproject.org/arma/tor". The one I first mentioned won't work.

comment:7 Changed 8 years ago by arma

I just merged master into the task5336{a,b,c} branches and pushed new versions of them. That way they include the fix for #5373.

Changed 8 years ago by robgjansen

Attachment: task5336-2012-03-15.pdf added

shadow simulation results for task 5336

comment:8 Changed 8 years ago by robgjansen

I've attached a first set of results. The Tor model is as described in #4086 (where relay capacities in Shadow are based on their reported observed bandwidth in Tor).

Each of the task{a,b,c} branches was run directly, adding only the configs needed for my private test network.

Completed download counts may give us a sense of load on the network.
taska: 9482 320KiB (web), 43 5MiB (bulk)
taskb: 27635 320KiB (web), 188 5MiB (bulk)
taskc: 20076 320KiB (web), 201 5MiB (bulk)

Is there a reason that the taska counts should be so low (I usually expect somewhere in the 20k range for web download counts)? Did something change in a recent version of Tor? Or should I look closer at the logs and rerun taska?

comment:9 in reply to:  8 Changed 8 years ago by arma

Replying to robgjansen:

Is there a reason that the taska counts should be so low (I usually expect somewhere in the 20k range for web download counts)? Did something change in a recent version of Tor? Or should I look closer at the logs and rerun taska?

I opened #5397 to focus on this question (since it came up in #4486 too).

comment:10 Changed 7 years ago by arma

Rob: On #5397 you mention you aren't experiencing the issue anymore. Does that mean you have useful graphs for this ticket now? :)

comment:11 in reply to:  10 Changed 7 years ago by robgjansen

Replying to arma:

Rob: On #5397 you mention you aren't experiencing the issue anymore. Does that mean you have useful graphs for this ticket now? :)

I will when #6401 is no longer blocking simulation work.

comment:12 Changed 7 years ago by robgjansen

Shadow simulations are now running on EC2.

Changed 7 years ago by robgjansen

Attachment: 20120808-ec2-creditbuckets-combined.pdf added

client performance, credit caps

comment:13 Changed 7 years ago by robgjansen

I just uploaded a graph of client performance. See #6401 for a description of the Tor network model and the client model.

There are 3 experiments here, all run with tor-0.2.3.16-alpha:

Load distribution for vanilla Tor:

TYPE      #XFERS      GiB        %
im         34735    0.033    0.075
web        85779   26.178   59.376
bulk        1586    7.744   17.565
p2p       596397    9.100   20.641
perf50k     1896    0.090    0.205
perf1m       965    0.942    2.138
TOTAL     721358   44.088  100.000

Load distribution for 10 MiB credit cap:

TYPE      #XFERS      GiB        %
im         21724    0.021    0.058
web        55965   17.079   47.710
bulk        2530   12.354   34.509
p2p       347047    5.296   14.793
perf50k     1517    0.072    0.202
perf1m      1000    0.977    2.728
TOTAL     429783   35.798  100.000

Load distribution for bandwidthrate credit cap:

TYPE      #XFERS      GiB        %
im          7217    0.007    0.031
web        29426    8.980   40.783
bulk        2379   11.616   52.755
p2p        48498    0.740    3.361
perf50k      841    0.040    0.182
perf1m       651    0.636    2.887
TOTAL      89012   22.019  100.000

It looks like the credit caps are reducing overall network load, mostly from the web clients. Bulk load seems to be increasing. The effect seems greater with smaller credit caps.

comment:14 Changed 7 years ago by nickm

Are we able to say anything about the patch's effects on latency, memory usage, and whether nodes actually obey their bandwidth limits with the patch in place?

comment:15 Changed 7 years ago by arma

I have the same question as I had for #6341: both credit cap cases get their last byte faster than vanilla, but they end up doing fewer transfers. What's up with that?

comment:16 in reply to:  15 Changed 7 years ago by robgjansen

Replying to arma:

I have the same question as I had for #6341: both credit cap cases get their last byte faster than vanilla, but they end up doing fewer transfers. What's up with that?

I don't know enough about what the credit cap is supposed to be doing here to answer this. Can you give any intuition as to whether you would expect this to happen, given the desired functionality? And/or can you briefly explain what the patch does?

Also, note that a separate vanilla run was done in #6401 where the load mostly agrees with the vanilla run here. So is it reasonable to say the patch is causing the behavior?

comment:17 in reply to:  14 Changed 7 years ago by robgjansen

Replying to nickm:

Are we able to say anything about the patch's effects on latency, memory usage, and whether nodes actually obey their bandwidth limits with the patch in place?

For each Tor node we can track CPU utilization, memory, and input/output bytes (though I may have to clean up some loose ends in this Shadow ticket). I believe this will allow us to address your concerns, but I am not sure what you mean by latency.

I'd have to do additional experiments with this feature turned on for the relays. Is it safe to assume this is desired?

comment:18 Changed 7 years ago by robgjansen

  1. Print the heartbeat message every second instead of every minute with $ scallion --heartbeat-frequency=1 …
  2. The heartbeat message will contain the number of bytes each node sends and receives per second. Match that up with the relay bandwidth limits to determine if nodes are actually obeying their bandwidth limits. You probably have to either modify the parse() function in analyze.py or write a new script for this.

The per-node memory tracking is not working yet in Shadow, so we'll only be able to say things about overall memory consumption by looking at the data/dstat.log file.
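
(For illustration only, a rough sketch of steps 1-2 above. The actual heartbeat line format isn't shown in this ticket, so the regex below assumes a hypothetical line like "[nodename] heartbeat: sent=12345 recv=67890", one per node per second; adjust it to whatever scallion.log actually contains.)

```python
# Hedged sketch of checking per-second heartbeat byte counts against each
# relay's configured limits. The log-line format below is an assumption.
import re
from collections import defaultdict

HEARTBEAT_RE = re.compile(r"\[(?P<node>\S+)\] heartbeat: sent=(?P<sent>\d+) recv=(?P<recv>\d+)")

def per_second_sent_bytes(log_path):
    """Collect, per node, the list of bytes sent in each 1-second interval."""
    sent = defaultdict(list)
    with open(log_path) as log:
        for line in log:
            m = HEARTBEAT_RE.search(line)
            if m:
                sent[m.group("node")].append(int(m.group("sent")))
    return sent

def report_limit_violations(sent, limits):
    """limits: node -> (BandwidthRate, BandwidthBurst), both in bytes/s."""
    for node, samples in sent.items():
        rate, burst = limits[node]
        over_burst = sum(1 for s in samples if s > burst)
        print(f"{node}: {over_burst}/{len(samples)} seconds above BandwidthBurst ({burst} B/s)")
```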

Changed 7 years ago by karsten

Attachment: task5336-bwrate-2012-10-03.png added

Changed 7 years ago by karsten

Attachment: task5336-bwburst-2012-10-03.png added

comment:19 Changed 7 years ago by karsten

Status: new → needs_review

Replying to robgjansen:

  1. Print the heartbeat message every second instead of every minute with $ scallion --heartbeat-frequency=1 …
  2. The heartbeat message will contain the number of bytes each node sends and receives per second. Match that up with the relay bandwidth limits to determine if nodes are actually obeying their bandwidth limits. You probably have to either modify the parse() function in analyze.py or write a new script for this.

Done. I wrote my own script and made two graphs: the first graph compares bandwidth rates to median bandwidths, and the second graph compares bandwidth bursts to 99th percentiles. For me it looks like all three branches respect bandwidth rates quite well and do not respect bandwidth bursts as much as they should. I do not see major differences between the three branches. I wonder if there's a better way to visualize this.
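
(A minimal sketch of that comparison, not karsten's actual script; the per-node per-second byte counts and configured limits are assumed inputs.)

```python
# Compare per-relay traffic statistics to the configured limits:
# median of per-second bytes vs. BandwidthRate, 99th percentile vs. BandwidthBurst.
import numpy as np

def compare_to_limits(samples_by_node, limits):
    """samples_by_node: node -> list of bytes sent per second.
    limits: node -> (rate, burst) in bytes/s.
    Returns node -> (median - rate, p99 - burst); positive values mean the
    relay exceeded that limit at that statistic."""
    result = {}
    for node, samples in samples_by_node.items():
        rate, burst = limits[node]
        result[node] = (np.median(samples) - rate,
                        np.percentile(samples, 99) - burst)
    return result
```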

The per-node memory tracking is not working yet in Shadow, so we'll only be able to say things about overall memory consumption by looking at the data/dstat.log file.

I have the three dstat.log files. What do I do with them?

comment:20 in reply to:  19 Changed 7 years ago by robgjansen

Replying to karsten:

Replying to robgjansen:

  1. Print the heartbeat message every second instead of every minute with $ scallion --heartbeat-frequency=1 …
  2. The heartbeat message will contain the number of bytes each node sends and receives per second. Match that up with the relay bandwidth limits to determine if nodes are actually obeying their bandwidth limits. You probably have to either modify the parse() function in analyze.py or write a new script for this.

Done. I wrote my own script and made two graphs: the first graph compares bandwidth rates to median bandwidths, and the second graph compares bandwidth bursts to 99th percentiles. For me it looks like all three branches respect bandwidth rates quite well and do not respect bandwidth bursts as much as they should. I do not see major differences between the three branches. I wonder if there's a better way to visualize this.

It may make sense that the amount sent on the wire is slightly more than the 99th percentile of bandwidth sent in Tor (because control packets, packet header overhead, etc. are included in the amount sent on the wire but not in Tor's limits).

The per-node memory tracking is not working yet in Shadow, so we'll only be able to say things about overall memory consumption by looking at the data/dstat.log file.

I have the three dstat.log files. What do I do with them?

I believe the first few lines contain header info that explains the format of the CSV. One of the columns has a timestamp and another has the system memory usage. You should be able to draw a memory-over-time plot with those two columns and compare the branches in the same graph. (Note that this is total system memory usage, so this only works if nothing else is consuming memory on these machines - which should be the case if you used EC2.)
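
(A rough sketch of that memory-over-time comparison. dstat's CSV output starts with a few header lines, and which columns hold the epoch timestamp and used memory depends on how dstat was invoked, so the column indices, units, and file paths below are assumptions to adjust after inspecting the header.)

```python
# Hedged sketch: plot total system memory over time for the three branches
# from their dstat.log files. Column indices and paths are assumptions.
import csv
import matplotlib.pyplot as plt

TIME_COL = 0  # assumed: epoch timestamp column
MEM_COL = 1   # assumed: "used" memory column, in bytes

def load_memory_series(path):
    times, mem_mib = [], []
    with open(path) as f:
        for row in csv.reader(f):
            try:
                t, m = float(row[TIME_COL]), float(row[MEM_COL])
            except (ValueError, IndexError):
                continue  # skip dstat's header/comment rows
            times.append(t)
            mem_mib.append(m / 2**20)
    return times, mem_mib

for branch in ("task5336a", "task5336b", "task5336c"):
    t, m = load_memory_series(f"{branch}/data/dstat.log")
    plt.plot([x - t[0] for x in t], m, label=branch)
plt.xlabel("seconds since start")
plt.ylabel("used system memory (MiB)")
plt.legend()
plt.savefig("task5336-mem.png")
```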

comment:21 in reply to:  19 Changed 7 years ago by robgjansen

Replying to karsten:

Replying to robgjansen:

  1. Print the heartbeat message every second instead of every minute with $ scallion --heartbeat-frequency=1 …
  2. The heartbeat message will contain the number of bytes each node sends and receives per second. Match that up with the relay bandwidth limits to determine if nodes are actually obeying their bandwidth limits. You probably have to either modify the parse() function in analyze.py or write a new script for this.

Done.

Also, can you attach the performance graphs for this set of runs?

Changed 7 years ago by karsten

Attachment: task5336-bwburst-abs-2012-10-05.png added

Changed 7 years ago by karsten

Attachment: task5336-mem-2012-10-05.png added

comment:22 Changed 7 years ago by karsten

Replying to robgjansen:

It may make sense that the amount sent on the wire is slightly more than the 99th percentile of bandwidth sent in Tor (because control packets, packet header overhead, etc. are included in the amount sent on the wire but not in Tor's limits).

Makes sense. I attached another graph that shows cumulative fractions of the differences between 99th percentile and bandwidth burst. That graph shows that there's hardly any difference between the three branches.
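
(Sketch of that cumulative-fraction plot, not the script behind the attached graph; diffs_by_branch is a hypothetical placeholder mapping each branch to its per-relay list of "99th percentile minus BandwidthBurst" values.)

```python
# Hedged sketch: CDF of (99th-percentile bytes/s - BandwidthBurst) per relay,
# one curve per branch. diffs_by_branch is a placeholder to be filled in.
import numpy as np
import matplotlib.pyplot as plt

diffs_by_branch = {}  # branch name -> list of per-relay differences (bytes/s)

for branch, diffs in diffs_by_branch.items():
    xs = np.sort(diffs)
    ys = np.arange(1, len(xs) + 1) / len(xs)  # cumulative fraction of relays
    plt.step(xs, ys, where="post", label=branch)

plt.xlabel("99th percentile minus BandwidthBurst (bytes/s)")
plt.ylabel("cumulative fraction of relays")
plt.legend()
plt.savefig("task5336-bwburst-abs.png")
```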

I believe the first few lines contain header info that explains the format of the CSV. One of the columns has a timestamp and another has the system memory usage. You should be able to draw a memory-over-time plot with those two columns and compare the branches in the same graph. (Note that this is total system memory usage, so this only works if nothing else is consuming memory on these machines - which should be the case if you used EC2.)

Okay, I attached a graph for system memory usage, too. All three branches were run in newly created EC2 instances. I can't spot any difference between the branches.

Also, can you attach the performance graphs for this set of runs?

I didn't make any performance graphs yet. Making them now. Will attach them once I have them.

Changed 7 years ago by karsten

Attachment: task5336-combined.pdf added

comment:23 in reply to:  22 Changed 7 years ago by karsten

Replying to karsten:

Replying to robgjansen:

Also, can you attach the performance graphs for this set of runs?

I didn't make any performance graphs yet. Making them now. Will attach them once I have them.

Attached.

comment:24 Changed 7 years ago by robgjansen

To sum things up:

  • it appears the nodes are actually obeying their bandwidth limits
  • there is no noticeable degradation in system performance (memory and CPU usage)
  • client performance wins out in taskb (the 10 MiB cap)

Did I miss something?

comment:25 in reply to:  24 Changed 7 years ago by karsten

Status: needs_review → needs_information

Replying to robgjansen:

To sum things up:

  • it appears the nodes are actually obeying their bandwidth limits
  • there is no noticeable degradation in system performance (memory and CPU usage)
  • client performance wins out in taskb (the 10 MiB cap)

Did I miss something?

Your conclusion looks about right. Does that mean we're done with this ticket and can close it?

comment:26 Changed 7 years ago by arma

Resolution: implemented
Status: needs_information → closed

We've definitely done some simulations.

I remain skeptical about the results though -- not because I think they're wrong, but because I don't think we have a good handle on what exactly is going on.

In particular, I wonder if further answers to #5398 would change our opinion here.

But this ticket does answer the "does it break or obviously go bad" question with a negative. Closing.
