Opened 9 years ago

Closed 8 years ago

#1766 closed task (implemented)

Project: "Improve TorPerf for more accurate measurements."

Reported by: nickm
Owned by: karsten
Priority: Medium
Milestone: Deliverable-Sep2010
Component: Metrics/CollecTor
Version:
Severity:
Keywords:
Cc: phobos
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by nickm)

We have a 30 September deliverable to make TorPerf more accurate. Andrew says there was a todo list for making torperf behave like a real Tor client. IIRC it said something about torperf ignoring entry guards and circuit build timeouts (CBT).

Step one here is to get more clarity about what the deliverable means, and what part of it is deliverable in September, then to edit this description as appropriate.

Child Tickets:

#1918: annotate torperf output with paths
#1919: Set up a few new torperfs with fixed sets of entry guards


Child Tickets

Ticket  Status  Owner      Summary                                                    Component
#1918   closed  Sebastian  annotate torperf output with paths                         Metrics/CollecTor
#1919   closed  Sebastian  Set up a few new torperfs with fixed sets of entry guards  Metrics/CollecTor

Change History (8)

comment:1 Changed 9 years ago by nickm

Description: modified (diff)

comment:2 Changed 9 years ago by karsten

Status: new → assigned

The current torperfs do not use entry guards and make sure they build a new circuit for each request (every 5, 30, or 60 minutes). We have one torperf running 0.2.1.24-dev which doesn't use circuit build timeouts at all and two torperfs running 0.2.2.8-alpha-dev and 0.2.2.10-alpha-dev which may use CBT, but without the improvements in 0.2.2.14-alpha.

I'd like to keep these three setups unchanged, so that we get some long-term data for comparison.

So, the question is: What additional torperfs do we want? Turning on entry guards doesn't seem like a good idea, because it makes our results highly dependent on the initially picked entry guards. Also, what would we be trying to find out by that?

We might add a fourth torperf running 0.2.2.14-alpha to compare the CBT stuff to the other torperfs.

What else?

comment:3 Changed 9 years ago by arma

I'm not sure.

Andrew, can you give us some guidance about what the sponsor expects here? Or is there flexibility to turn this into whatever we see fit?

My guess is that Andrew listed this item based on overhearing discussions about the dev summit. I bet the discussion went something like "we should keep in mind that torperf's results are an average of all user results, and that some users will have better or worse results based on what guards they pick", and so Andrew wanted us to make torperf more realistic.

So on the one hand, we might say that torperf is already doing a fine job of letting us track average expected performance of the network over time.

Here's my first go at brainstorming a way to make torperf's output more useful. It would be good to see what performance the, say, 90th percentile of users sees: the people who are unlucky enough to pick guards that are slower than the ones most other people pick. One approach would be to run 100 torperfs with entry guards enabled and order them by performance, on the assumption that whoever gets bad torperf results clearly has worse guards (as long as we have enough data points). Yuck. A better approach would be to annotate torperf output with which first hop was used, and then do some smart analysis to reconstruct what the torperf graphs *would* have looked like for various combinations of guards a client could have picked. I think that could yield some really useful results, if we can get the smart analysis right. That method would also let us simulate different guard selection algorithms, and reconstruct what the torperf output would have looked like if you'd picked that set.
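To make the reconstruction idea concrete, here is a minimal sketch in Python. It assumes torperf output has already been annotated so that each measurement is a (first_hop, seconds) pair. The function names, the choice of three guards per client, and the uniform guard selection are all assumptions for illustration; real guard selection is bandwidth-weighted, so this is not what Tor itself does.

```python
import math
import random
import statistics


def simulate_clients(measurements, guards_per_client=3, n_clients=1000, seed=0):
    """Return the sorted per-client median download times.

    measurements: iterable of (first_hop, seconds) pairs (hypothetical format).
    Each simulated client picks guards uniformly at random (an assumption)
    and "sees" only the measurements whose first hop is one of its guards.
    """
    rng = random.Random(seed)

    # Group the observed download times by the first hop that carried them.
    by_guard = {}
    for hop, secs in measurements:
        by_guard.setdefault(hop, []).append(secs)
    guards = sorted(by_guard)

    medians = []
    for _ in range(n_clients):
        chosen = rng.sample(guards, guards_per_client)
        # Pool all measurements that went through this client's guards.
        times = [t for g in chosen for t in by_guard[g]]
        medians.append(statistics.median(times))
    return sorted(medians)


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]
```

Calling `percentile(simulate_clients(data), 90)` would then give a rough estimate of what the unlucky 90th-percentile client sees. Swapping the `rng.sample` line for a different selection rule is one way to compare guard selection algorithms on the same data.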

comment:4 Changed 9 years ago by mikeperry

This sounds like something that can be done well with TorCtl/SQLSupport.py. We can just log all data, then run queries over it as needed. However, I'm pretty sure I can't get anything reasonable done for Sept 30. This would be a good March task for me though.
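As a rough illustration of the "log everything, query later" idea, the sketch below uses Python's standard sqlite3 module. It does not use or reproduce the TorCtl/SQLSupport.py schema: the `downloads` table, its columns, and the guard names are hypothetical, and the rows are made-up demo values rather than real measurements.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'downloads' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE downloads ("
        " first_hop TEXT,"      # relay used as first hop (hypothetical column)
        " file_bytes INTEGER,"  # size of the downloaded file
        " seconds REAL)"        # time to complete the download
    )
    demo_rows = [  # invented values, for illustration only
        ("guardA", 51200, 2.0),
        ("guardA", 51200, 4.0),
        ("guardB", 51200, 10.0),
        ("guardC", 1048576, 30.0),
    ]
    conn.executemany("INSERT INTO downloads VALUES (?, ?, ?)", demo_rows)
    return conn


def mean_seconds_for_guards(conn, guards, file_bytes):
    """Average download time over requests that used one of the given guards."""
    placeholders = ",".join("?" for _ in guards)
    query = (
        "SELECT AVG(seconds) FROM downloads "
        f"WHERE file_bytes = ? AND first_hop IN ({placeholders})"
    )
    return conn.execute(query, [file_bytes, *guards]).fetchone()[0]
```

With the data logged once, asking what a client with a particular guard set would have seen becomes a single query, for example `mean_seconds_for_guards(conn, ["guardA", "guardB"], 51200)`.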

comment:5 Changed 9 years ago by phobos

What about a second set of torperf nodes running more like a real 0.2.2.x client? The deliverable wants to see real-world performance as if a standard client was running measurements. Or, said another way, "what performance should a real user expect to see over time when downloading various file sizes: 50 KB, 1 MB, and 5 MB."

comment:6 Changed 9 years ago by mikeperry

One downside of this approach is that we can't see what effect CBT has on path selection for these paths. It's possible that clients with only really fast guards might decide that the fastest 80% of the paths of the network are much faster than clients with really poor guards would.

So the results of the SQL select won't be identical to the 10 independent Torperf results.

comment:7 Changed 9 years ago by mikeperry

Sorry, my reply was to arma's comment, not phobos's.

Phobos - the problem with just a single client is that we also need to have some way to determine just how typical that client is, and that depends largely on guard selection. However, with a bit of math, we can calculate the "most common" clients based on how many guards they have at which speeds (i.e., how common it is to have each combination of fast, medium, and slow guards), and choose some representative clients from that.
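The "bit of math" could look something like the following sketch, which treats each guard pick as an independent draw from a set of speed classes and computes the multinomial probability of every mix. The class names, the three-guard default, and the independence assumption are illustrative only; real guard selection is bandwidth-weighted and without replacement, and the per-class probabilities would have to come from actual network data.

```python
from itertools import combinations_with_replacement
from math import factorial


def guard_mix_probabilities(class_probs, n_guards=3):
    """Probability of each multiset of guard speed classes.

    class_probs: dict mapping a class name (e.g. "fast") to the probability
    that a single guard pick lands in that class. Values are assumed to sum
    to 1 and picks are assumed independent (a simplification).
    """
    classes = sorted(class_probs)
    result = {}
    for combo in combinations_with_replacement(classes, n_guards):
        # Multinomial coefficient: n! / (k_1! * k_2! * ...)
        coeff = factorial(n_guards)
        prob = 1.0
        for c in classes:
            k = combo.count(c)
            coeff //= factorial(k)
            prob *= class_probs[c] ** k
        result[combo] = coeff * prob
    return result


def most_common_mixes(class_probs, n_guards=3, top=3):
    """The `top` most likely guard mixes, most likely first."""
    mixes = guard_mix_probabilities(class_probs, n_guards)
    return sorted(mixes.items(), key=lambda item: item[1], reverse=True)[:top]
```

The mixes returned by `most_common_mixes` would be candidates for the "representative clients" to set up or to reconstruct from logged data.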

comment:8 Changed 8 years ago by Sebastian

Resolution: implemented
Status: assigned → closed

Closing this since its two child tickets are done.
