Opened 4 months ago

Closed 6 weeks ago

Last modified 6 weeks ago

#25774 closed project (implemented)

Record latency measurements using OnionPerf

Reported by: irl Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Ideas Version:
Severity: Normal Keywords:
Cc: chelseakomlo Actual Points:
Parent ID: Points:
Reviewer: irl Sponsor:

Description

OnionPerf already stores events during the circuit build that can be used to determine the circuit build latency. There may also be events that allow us to determine end-to-end latency. These are currently lost when converting the output into TorPerf's output format.

This was requested at the Rome Tor Meeting (2018).

Child Tickets

Attachments (5)

buildtimes.png (94.0 KB) - added by karsten 2 months ago.
latencies.png (111.9 KB) - added by karsten 7 weeks ago.
buildtimes-op-hk.png (123.3 KB) - added by karsten 7 weeks ago.
latencies-op-hk.png (133.2 KB) - added by karsten 7 weeks ago.
fractions-op-hk.png (149.7 KB) - added by karsten 7 weeks ago.


Change History (22)

Changed 2 months ago by karsten

Attachment: buildtimes.png added

comment:1 Changed 2 months ago by karsten

Status: new → needs_review

I just made a graph with circuit build times measured by all three OnionPerf instances in the past months:


If we were to put this on Tor Metrics, we'd probably make the source configurable and display either all sources at once or just a single one of them. And we'd probably stack the three graphs for the hops vertically, rather than horizontally, or alternatively use three colors for the three hops.

I think this already addresses the "circuit build latency" part in the ticket description. What exactly is meant by "end-to-end latency" there? Maybe there's data for that, too.
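For reference, a minimal sketch (not code from the ticket) of how per-hop build times could be read out of the TorPerf-formatted measurements, assuming the BUILDTIMES field holds cumulative seconds since circuit launch, one value per hop:

{{{
# Sketch only: derive per-hop circuit build times from one TorPerf-format
# measurement line. Assumption: BUILDTIMES lists cumulative seconds since
# the circuit was launched, one value per hop (verify against the torperf
# results format spec).
def per_hop_build_times(line):
    fields = dict(kv.split("=", 1) for kv in line.split())
    cumulative = [float(t) for t in fields["BUILDTIMES"].split(",")]
    previous = [0.0] + cumulative[:-1]
    return [b - a for a, b in zip(previous, cumulative)]

# e.g. "BUILDTIMES=0.52,0.78,1.31" -> [0.52, 0.26, 0.53] seconds for hops 1-3
}}}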

comment:2 Changed 2 months ago by chelseakomlo

Cc: chelseakomlo added

comment:3 in reply to: 1; Changed 7 weeks ago by irl

Reviewer: irl
Status: needs_review → needs_revision

Replying to karsten:

If we were to put this on Tor Metrics, we'd probably make the source configurable and display either all sources at once or just a single one of them. And we'd probably stack the three graphs for the hops vertically, rather than horizontally, or alternatively use three colors for the three hops.

This sounds good. This is not how I was expecting the graphs to look: the 2nd hop seems to happen faster than the 1st hop. Perhaps there are some overheads we incur on the 1st hop that regular clients wouldn't have, because we don't use guards. (I don't think we can fix this; it's just something we should work out and document.)

It looks like there is overlap in the ranges for the 2nd and 3rd hops, so I think having separate plots is necessary and putting the 3 on the same plot with different colours wouldn't be readable.

The plots that share the same row are easier to compare than those that share the same column, so we should consider what users may want to compare.

For parameters, the current parameters for the existing performance graphs are all applicable except that it probably does not make sense to have this for the onion measurements.

I think this already addresses the "circuit build latency" part in the ticket description. What exactly is meant by "end-to-end latency" there? Maybe there's data for that, too.

End-to-end latency is the latency to send data across the circuit. From looking at the TorPerf spec, this would be the difference between DATAREQUEST and DATARESPONSE divided by 2.
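A minimal sketch of that calculation, assuming DATAREQUEST and DATARESPONSE are timestamps in seconds taken from one TorPerf-formatted measurement (whether to halve is exactly the question discussed below):

{{{
# Sketch of the end-to-end latency calculation described above. Assumption:
# DATAREQUEST and DATARESPONSE are timestamps in seconds from the same
# measurement.
def end_to_end_latency(fields, halve=True):
    rtt = float(fields["DATARESPONSE"]) - float(fields["DATAREQUEST"])
    # Halving assumes both directions are equally fast; plain RTT avoids
    # that assumption.
    return rtt / 2 if halve else rtt
}}}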

comment:4 in reply to:  3 Changed 7 weeks ago by karsten

Replying to irl:

Replying to karsten:

If we were to put this on Tor Metrics, we'd probably make the source configurable and display either all sources at once or just a single one of them. And we'd probably stack the three graphs for the hops vertically, rather than horizontally, or alternatively use three colors for the three hops.

This sounds good. This is not how I was expecting the graphs to look: the 2nd hop seems to happen faster than the 1st hop. Perhaps there are some overheads we incur on the 1st hop that regular clients wouldn't have, because we don't use guards. (I don't think we can fix this; it's just something we should work out and document.)

Indeed, this might be related to us not using guards. And yes, we should investigate this more and document it.

It looks like there is overlap in the ranges for the 2nd and 3rd hops, so I think having separate plots is necessary and putting the 3 on the same plot with different colours wouldn't be readable.

I was thinking of a graph like our "Fraction of connections used uni-/bidirectionally" graph. There's some overlap in that graph, too. Anyway, we can decide later.

The plots that share the same row are easier to compare than those that share the same column, so we should consider what users may want to compare.

Yes, but we haven't used more than one graph in the same row on Tor Metrics. The reason is that all graphs have a configurable time frame on the x axis; imagine how this graph would look with years of data and 3 graphs next to each other. Also something we can decide later, when we're more sure what it is that we want to graph.

For parameters, the current parameters for the existing performance graphs are all applicable except that it probably does not make sense to have this for the onion measurements.

Right, it doesn't make much sense to have this for public vs. onion measurements. Nor does it matter whether we're fetching 50 KiB, 1 MiB, or 5 MiB files. Which basically leaves the source as parameter.

I think this already addresses the "circuit build latency" part in the ticket description. What exactly is meant by "end-to-end latency" there? Maybe there's data for that, too.

End-to-end latency is the latency to send data across the circuit. From looking at the TorPerf spec, this would be the difference between DATAREQUEST and DATARESPONSE divided by 2.

Aha! I'll make another graph for that. Do we want to do the division-by-2 step here, which includes the implicit assumption that both directions are equally fast, or do we want to use a round-trip metric of some kind? Just thinking about what would be more intuitive, that is, least confusing for users.

Thanks for your feedback so far!

Changed 7 weeks ago by karsten

Attachment: latencies.png added

comment:5 Changed 7 weeks ago by karsten

Replying to karsten:

Replying to irl:

End-to-end latency is the latency to send data across the circuit. From looking at the TorPerf spec, this would be the difference between DATAREQUEST and DATARESPONSE divided by 2.

Aha! I'll make another graph for that. Do we want to do the division-by-2 step here, which includes the implicit assumption that both directions are equally fast, or do we want to use a round-trip metric of some kind? Just thinking about what would be more intuitive, that is, least confusing for users.

I made a plot with round-trip latencies by onion vs. public server:


What do you think?

And was there another graph we discussed in Aberdeen that we could make from existing data? I vaguely recall we discussed three possible graphs, but I don't remember whether all three of them would use existing data.

comment:6 Changed 7 weeks ago by karsten

Status: needs_revision → needs_review

comment:7 in reply to:  5 Changed 7 weeks ago by irl

Replying to karsten:

Indeed, this might be related to us not using guards. And yes, we should investigate this more and document it.

I've made a separate ticket for this: #26597

I was thinking of a graph like our "Fraction of connections used uni-/bidirectionally" graph. There's some overlap in that graph, too. Anyway, we can decide later.

Ah ok, I had not considered that the colours would be transparent. That is quite readable and more easily comparable in the time domain too.

For parameters, the current parameters for the existing performance graphs are all applicable except that it probably does not make sense to have this for the onion measurements.

Right, it doesn't make much sense to have this for public vs. onion measurements. Nor does it matter whether we're fetching 50 KiB, 1 MiB, or 5 MiB files. Which basically leaves the source as parameter.

Agreed.

Aha! I'll make another graph for that. Do we want to do the division-by-2 step here, which includes the implicit assumption that both directions are equally fast, or do we want to use a round-trip metric of some kind? Just thinking about what would be more intuitive, that is, least confusing for users.

I think RTT is fine, actually. I guess people have heard of "ping", which uses RTT, not RTT/2. Also, if there are relays on xDSL lines, they will definitely have different latencies for upstream and downstream, so dividing by 2 would really just be an average.

Replying to karsten:

I made a plot with round-trip latencies by onion vs. public server:
What do you think?

This is really cool! I think using the transparent colours as in the connbidirect graph would be safe here to allow a more direct comparison between public/onion.

Can you make a couple of graphs, one for circ build times and one for rtt using the transparent colours just to see them?

And was there another graph we discussed in Aberdeen that we could make from existing data? I vaguely recall we discussed three possible graphs, but I don't remember whether all three of them would use existing data.

The third graph was, I think, scrapping the filesize parameters and instead using the percentile values. This is probably a different ticket if we choose to do it. (We might wait until I look at OnionPerf on LTE, where we would want to be more conservative on bandwidth usage).

comment:8 Changed 7 weeks ago by irl

Status: needs_review → needs_revision

Changed 7 weeks ago by karsten

Attachment: buildtimes-op-hk.png added

Changed 7 weeks ago by karsten

Attachment: latencies-op-hk.png added

comment:9 Changed 7 weeks ago by karsten

Status: needs_revision → needs_review

Replying to irl:

Replying to karsten:

Indeed, this might be related to us not using guards. And yes, we should investigate this more and document it.

I've made a separate ticket for this: #26597

Thanks!

Replying to karsten:

I made a plot with round-trip latencies by onion vs. public server:
What do you think?

This is really cool! I think using the transparent colours as in the connbidirect graph would be safe here to allow a more direct comparison between public/onion.

Can you make a couple of graphs, one for circ build times and one for rtt using the transparent colours just to see them?

Sure. I also changed them to display a single source, which is closer to what we'd have on Tor Metrics later on (including the choice to display all sources together):



And was there another graph we discussed in Aberdeen that we could make from existing data? I vaguely recall we discussed three possible graphs, but I don't remember whether all three of them would use existing data.

The third graph was, I think, scrapping the filesize parameters and instead using the percentile values. This is probably a different ticket if we choose to do it. (We might wait until I look at OnionPerf on LTE, where we would want to be more conservative on bandwidth usage).

Ah, yes. I'll think a bit about that, too, and open a new ticket for it.

comment:10 in reply to:  9 Changed 7 weeks ago by irl

Status: needs_review → needs_revision

Replying to karsten:

Sure. I also changed them to display a single source, which is closer to what we'd have on Tor Metrics later on (including the choice to display all sources together):

I think these both look great. Even with the overlap I think they are still readable. I think these two are ready to be turned into patches.

Changed 7 weeks ago by karsten

Attachment: fractions-op-hk.png added

comment:11 in reply to: 9; Changed 7 weeks ago by karsten

Status: needs_revision → needs_review

Replying to karsten:

Replying to irl:

The third graph was, I think, scrapping the filesize parameters and instead using the percentile values. This is probably a different ticket if we choose to do it. (We might wait until I look at OnionPerf on LTE, where we would want to be more conservative on bandwidth usage).

Ah, yes. I'll think a bit about that, too, and open a new ticket for it.

I experimented a bit with this and came to the conclusion that we should probably avoid mixing 50 KiB, 1 MiB, and 5 MiB measurements and instead focus on 1 MiB measurements only. Otherwise we might see confusing results where, for example, partial 5 MiB downloads are completed faster than full 1 MiB downloads on some days. Also, I could imagine that 1 MiB is still a reasonable size for LTE experiments, whereas 5 MiB is too large and 50 KiB too small.

Here's a possible graph:


(I didn't open a new ticket, because this is still very related to the other two graphs. I also finished some code for the first two graphs today, but I'd like to test that code some more tomorrow before asking for a first review.)
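As a rough sketch of the metric behind that third graph (not code from the task-25774 branch; assumes START and DATAPERC10 through DATAPERC90 are timestamps in seconds from a single 1 MiB measurement):

{{{
# Sketch only: elapsed time until a given percentage of the download
# completed, relative to START. Assumption: DATAPERCx fields are timestamps
# in seconds and are empty or missing for incomplete measurements.
def partial_download_times(fields):
    start = float(fields["START"])
    times = {}
    for perc in range(10, 100, 10):
        value = fields.get("DATAPERC%d" % perc)
        if value not in (None, "", "0"):
            times[perc] = float(value) - start
    return times  # e.g. {10: 1.4, 20: 1.9, ...} seconds since START
}}}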

comment:12 Changed 7 weeks ago by karsten

In addition to the third graph on partial downloads above, please also review commit 2761d1f in my task-25774 branch which implements the first two graphs on build times and latencies as discussed earlier.

comment:13 in reply to: 11; Changed 6 weeks ago by irl

Status: needs_review → needs_revision

Replying to karsten:

I experimented a bit with this and came to the conclusion that we should probably avoid mixing 50 KiB, 1 MiB, and 5 MiB measurements and instead focus on 1 MiB measurements only. Otherwise we might see confusing results where, for example, partial 5 MiB downloads are completed faster than full 1 MiB downloads on some days. Also, I could imagine that 1 MiB is still a reasonable size for LTE experiments, whereas 5 MiB is too large and 50 KiB too small.

My understanding for this one was that instead of performing three downloads, we would modify OnionPerf to perform only one download and derive values for smaller file sizes from the partial completion times.

Perhaps we need to modify OnionPerf to report different numbers, instead of percentiles. I think useful sizes would be 50K, 200K, 500K, 1M, 2M and 5M. To conserve bandwidth on the LTE probe(s) we could limit the downloads to 1M, although we'd have to see if we actually want to include those on Tor Metrics or just do a one-off analysis.

I am confused as to why exit circuits completed faster than onion circuits. Are you comparing DATAPERCx to DATARESPONSE or to START? I wonder if there's enough overhead in circuit setup that 1M is not enough time to see benefit from using an onion circuit instead.

Replying to karsten:

In addition to the third graph on partial downloads above, please also review commit 2761d1f in my task-25774 branch which implements the first two graphs on build times and latencies as discussed earlier.

I do not have an environment set up for metrics-web's statistics or Rserve, only for running enough to test Relay Search at the moment, so I have not run the code. The titles, descriptions and URLs all look good and I did not find any obvious errors in the Java, R or SQL.

I would not object to this being merged, but it's up to you if you want to wait for me to have a test environment set up.

comment:14 in reply to:  13 Changed 6 weeks ago by karsten

Status: needs_revision → merge_ready

Replying to irl:

Replying to karsten:

I experimented a bit with this and came to the conclusion that we should probably avoid mixing 50 KiB, 1 MiB, and 5 MiB measurements and instead focus on 1 MiB measurements only. Otherwise we might see confusing results where, for example, partial 5 MiB downloads are completed faster than full 1 MiB downloads on some days. Also, I could imagine that 1 MiB is still a reasonable size for LTE experiments, whereas 5 MiB is too large and 50 KiB too small.

My understanding for this one was that instead of performing three downloads, we would modify OnionPerf to perform only one download and derive values for smaller file sizes from the partial completion times.

Perhaps we need to modify OnionPerf to report different numbers, instead of percentiles. I think useful sizes would be 50K, 200K, 500K, 1M, 2M and 5M. To conserve bandwidth on the LTE probe(s) we could limit the downloads to 1M, although we'd have to see if we actually want to include those on Tor Metrics or just do a one-off analysis.

Ah, got it. I'd say let's postpone this topic then, as it requires making code changes to OnionPerf and setting up new measurements, which is out of scope for the current roadmap.

I'll finally open a new ticket for this before closing this one.

I am confused as to why exit circuits completed faster than onion circuits. Are you comparing DATAPERCx to DATARESPONSE or to START? I wonder if there's enough overhead in circuit setup that 1M is not enough time to see benefit from using an onion circuit instead.

I compared DATAPERCx to START to make the graph comparable to existing graphs. Keep in mind that onion circuits are twice as long as exit circuits, so downloads should still take longer even after circuit setup. I didn't compare to DATARESPONSE, though.
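To make the two possible baselines concrete, a hypothetical helper (not taken from metrics-web):

{{{
# Measuring from START includes circuit and connection setup, matching the
# existing graphs; measuring from DATARESPONSE would isolate the transfer
# itself. Assumes the same timestamp fields as above.
def partial_time(fields, perc, baseline="START"):
    return float(fields["DATAPERC%d" % perc]) - float(fields[baseline])

# partial_time(fields, 50, "START")        -> setup included
# partial_time(fields, 50, "DATARESPONSE") -> transfer only
}}}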

Replying to karsten:

In addition to the third graph on partial downloads above, please also review commit 2761d1f in my task-25774 branch which implements the first two graphs on build times and latencies as discussed earlier.

I do not have an environment set up for metrics-web's statistics or Rserve, only for running enough to test Relay Search at the moment, so I have not run the code. The titles, descriptions and URLs all look good and I did not find any obvious errors in the Java, R or SQL.

I would not object to this being merged, but it's up to you if you want to wait for me to have a test environment set up.

That's good enough. Having descriptions reviewed matters most to me, and it's good to know that there are no obvious errors in the code. I'll find any remaining bugs when deploying things. (FWIW, I don't have a full metrics-web setup here, either.)

Thank you! I'll merge and deploy now. Setting to merge_ready for the first and second graph only, the third graph will go into its own ticket.

comment:15 Changed 6 weeks ago by irl

Ok cool. All sounds good to me. (:

comment:16 Changed 6 weeks ago by karsten

New graphs are available now:

Closing this ticket as soon as I have created that other ticket for the third graph.

comment:17 Changed 6 weeks ago by karsten

Resolution: implemented
Status: merge_ready → closed

Created #26673 for the third graph. Closing. Thanks!
