Opened 9 years ago

Closed 8 years ago

#2586 closed task (duplicate)

Compare circuit build timeouts to Torperf completion times

Reported by: karsten Owned by: karsten
Priority: Medium Milestone:
Component: Metrics/CollecTor Version:
Severity: Keywords: TorPerfIteration20110305
Cc: mikeperry, tomb Actual Points:
Parent ID: Points: 4
Reviewer: Sponsor:

Description

Our slow-ratio #1919 Torperf runs indicate that there could be a correlation between the Tor client's circuit build timeouts (CBTs) and Torperf's measured completion times. We should grep Tor's log for circuit build timeouts and combine them with Torperf's .data files. We could visualize the two data sets as a scatter plot with CBTs on the x axis and completion times on the y axis.

Child Tickets

Attachments (5)

torperf-circuit-build-timeouts.tar.gz (519.0 KB) - added by karsten 9 years ago.
#1919 log file parts containing circuit build timeouts
task2586.tar (945.0 KB) - added by karsten 8 years ago.
Data and source code for plotting scatter plot matrix
scattermatrix.png (381.3 KB) - added by karsten 8 years ago.
Scatter plot matrix
task2586-2.tar (945.5 KB) - added by karsten 8 years ago.
Data and source code for plotting CBTs and completion times
timematrix-timeouts.png (617.1 KB) - added by karsten 8 years ago.
CBTs and completion times

Change History (20)

comment:1 Changed 9 years ago by mikeperry

See also ticket #2551, where we specify that we also want to properly record (in a specified, well-formed, extensible format) both circuit completion times and, possibly, the BUILDTIMEOUT_SET control port event that specifies the current circuit build timeout.

This ticket is for the intermediate hack step of deciding if the CBT actually tells us anything visually about the torperf runs, to decide if it is worth including in the well-formed format of #2551.

Changed 9 years ago by karsten

#1919 log file parts containing circuit build timeouts

comment:2 Changed 9 years ago by karsten

I just attached the log file parts containing circuit build timeouts of our #1919 runs. Torperf's data files are available in the usual place, that is, here.

I could work on this task. If someone else wants to work on it (or parts of it), here's how I would start:

  • Write a script that reads Tor's log files and Torperf's .data files and writes new CSV files with a) the columns from the .data file and b) another column for the currently used circuit build timeout.
  • Extend filter.R from the #2543 code to include circuit build timeouts in the huge filtered.csv file.
  • Modify timematrix.R from the #2543 code to draw scatter plots of circuit build timeouts and Torperf completion times as described in the ticket description above.
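
The first step above could be sketched roughly as follows. This is only an illustration in Python, not the code attached to this ticket: it assumes the Tor log and the .data files have already been parsed into plain tuples, and the names `attach_timeouts`, `measurements`, and `timeout_events` are hypothetical. Each measurement gets the most recent timeout that took effect at or before its start time.

```python
from bisect import bisect_right


def attach_timeouts(measurements, timeout_events):
    """Append the circuit build timeout in effect at each measurement's start.

    measurements:   list of (start_ts, completion_secs) tuples (hypothetical layout)
    timeout_events: list of (ts, timeout_ms) tuples, one per timeout change
                    (hypothetical layout; parsing the Tor log is not shown)
    Returns a list of (start_ts, completion_secs, timeout_ms_or_None).
    """
    events = sorted(timeout_events)
    event_times = [ts for ts, _ in events]
    rows = []
    for start_ts, completion in measurements:
        # Index of the last event whose timestamp is <= start_ts, or -1 if none.
        idx = bisect_right(event_times, start_ts) - 1
        timeout = events[idx][1] if idx >= 0 else None
        rows.append((start_ts, completion, timeout))
    return rows
```

Measurements that start before the first recorded timeout get `None` in the new column, so they can be filtered out later instead of being assigned a guessed value.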

Note that if I worked on this, I wouldn't care much about the reusability of the new and changed code. This is a one-time analysis, of which I do several per week. I would attach the code to this ticket, but I would only clean it up and generalize it if the results turn out to be useful.

Any takers?

comment:3 Changed 9 years ago by karsten

Owner: karsten deleted
Status: new → assigned

comment:4 Changed 9 years ago by karsten

Keywords: TorPerfIteration20110305 added

comment:5 Changed 9 years ago by karsten

Owner: set to karsten

No takers so far. I'll work on this ticket.

Changed 8 years ago by karsten

Attachment: task2586.tar added

Data and source code for plotting scatter plot matrix

Changed 8 years ago by karsten

Attachment: scattermatrix.png added

Scatter plot matrix

comment:6 Changed 8 years ago by karsten

I added a scatter plot matrix of circuit build timeouts and completion times, including the data and source code.

So far, I don't see a clear correlation between the two variables. One would expect that completion times are lower for smaller timeouts, right? I don't see that effect for any one of the 15 Torperf runs.
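
As a numeric complement to eyeballing the scatter plots, a correlation coefficient could be computed per run. The following is a minimal Python sketch, not part of the attached code; it assumes the timeouts and completion times are available as two equally long lists, and `pearson` is a hypothetical helper name.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value near 0 would be consistent with the plots showing no clear linear relationship; it would not rule out an effect that is limited to the slowest requests.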

Mike, is there anything else I should do with these data? Can we close this ticket?

comment:7 Changed 8 years ago by mikeperry

Hrmm, I think the scatter plot is the wrong way to represent this. The circuit build timeout doesn't magically make tor fast. It instead should be altering the top quartile of results only, and making them more dense. It is a strategy for reducing the variance in tor performance.

What we should do is plot the circuit build timeout as a line on top of the timematrix results that also had the quantile lines on it. We should then check to see if the slopes of the circuit build timeout lines have any relation to the slopes of the quantile lines.

The quantile lines with the most slope over time for the higher quantiles seemed to be the slow and slowratio 50kb fetches. It is those two in particular we want to look at first.
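
One way to put numbers on such a slope comparison is an ordinary least-squares fit of each series over time. The Python sketch below is only an illustration under that assumption; `slope` is a hypothetical helper, and a single straight-line fit is a simplification that may hide changes within a run.

```python
def slope(times, values):
    """Least-squares slope of values over times (change in value per time unit)."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    denom = sum((t - mean_t) ** 2 for t in times)
    if denom == 0:
        raise ValueError("all timestamps are identical")
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return num / denom
```

Calling `slope` once on the timeout series and once on a quantile series for the same period gives two numbers whose sign and relative size can be compared across runs.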

comment:8 Changed 8 years ago by dchasteen

Added points from sprint planning meeting.

comment:9 Changed 8 years ago by dchasteen

Points: 3

Added points from sprint planning meeting.

comment:10 in reply to:  7 Changed 8 years ago by karsten

Points: 3 → 4

Replying to mikeperry:

> Hrmm, I think the scatter plot is the wrong way to represent this. The circuit build timeout doesn't magically make tor fast. It instead should be altering the top quartile of results only, and making them more dense. It is a strategy for reducing the variance in tor performance.
>
> What we should do is plot the circuit build timeout as a line on top of the timematrix results that also had the quantile lines on it. We should then check to see if the slopes of the circuit build timeout lines have any relation to the slopes of the quantile lines.
>
> The quantile lines with the most slope over time for the higher quantiles seemed to be the slow and slowratio 50kb fetches. It is those two in particular we want to look at first.

I tried to make a graph like the one you described and attached it. I think this graph has a couple of problems:

  • We haven't figured out yet what the quantile lines really mean. I think adding them to the graph is wrong.
  • Completion times and circuit build timeouts are both measured in time units, but they are on different scales, particularly for the 1MB and 5MB runs. ggplot doesn't allow different y scales for different variables, mostly because the authors say it's bad style.

I'm running out of ideas for how to visualize what you expect to see. :(

Bumping points from 3 to 4.

Changed 8 years ago by karsten

Attachment: task2586-2.tar added

Data and source code for plotting CBTs and completion times

Changed 8 years ago by karsten

Attachment: timematrix-timeouts.png added

CBTs and completion times

comment:11 Changed 8 years ago by mikeperry

Owner: changed from karsten to mikeperry
Status: assigned → accepted

Ok, well from this graph it seems clear that we can see some correlation between the circuit build timeouts and the completion times, but it is not always present during periods of poor performance.

You're right, I think we want to do away with R's notion of these quantiles and actually compute our own quantile lines using buckets as opposed to some linear fit... Or, maybe we can just use R's quantile function in a piecewise fashion (like for each day's worth of results), so it doesn't try to fit a line to the whole graph.
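
The bucketed variant could look roughly like the Python sketch below. It is an illustration only: it assumes measurements are (Unix timestamp, completion time) tuples, buckets them by UTC day, and uses a nearest-rank quantile, which differs from the interpolating default of R's quantile function. The names `daily_quantiles` and `nearest_rank_quantile` are hypothetical.

```python
from collections import defaultdict
from math import ceil

SECONDS_PER_DAY = 86400


def nearest_rank_quantile(sorted_values, q):
    """Nearest-rank quantile of a sorted, non-empty list, for 0 < q <= 1."""
    rank = max(1, ceil(q * len(sorted_values)))
    return sorted_values[rank - 1]


def daily_quantiles(measurements, qs=(0.25, 0.5, 0.75)):
    """Group (unix_ts, completion_secs) tuples by UTC day and compute quantiles.

    Returns {day_index: [quantile for each q in qs]}, where day_index is the
    number of whole days since the Unix epoch.
    """
    buckets = defaultdict(list)
    for ts, completion in measurements:
        buckets[int(ts // SECONDS_PER_DAY)].append(completion)
    result = {}
    for day in sorted(buckets):
        values = sorted(buckets[day])
        result[day] = [nearest_rank_quantile(values, q) for q in qs]
    return result
```

Connecting the per-day values for each quantile yields piecewise lines that follow the data day by day, without fitting one line to the whole graph.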

I can create a new ticket for this for next iteration, and also include my points from https://trac.torproject.org/projects/tor/ticket/2543#comment:9.

comment:12 Changed 8 years ago by mikeperry

Owner: changed from mikeperry to karsten
Status: accepted → assigned

Why did trac decide to reassign this ticket to me just because I wrote a comment... Maybe it knew I was asking for too much? ;)

comment:13 Changed 8 years ago by mikeperry

If you want to declare this ticket as closed, I can create a new ticket for improving this quantile line drawing in a future iteration.

comment:14 in reply to:  13 Changed 8 years ago by karsten

Replying to mikeperry:

> If you want to declare this ticket as closed, I can create a new ticket for improving this quantile line drawing in a future iteration.

We have ticket #2564 for calculating and visualizing quantiles of moving windows. I'm adding the idea of playing more with stat_quantile() there.

But did we find a good answer to the question of whether circuit build timeouts are correlated with Torperf completion times?

comment:15 Changed 8 years ago by karsten

Resolution: duplicate
Status: assigned → closed

Closing this ticket and opening #2690 as a follow-up ticket. Not sure if "duplicate" is the correct resolution for this case, but I had to pick something.
