Opened 16 months ago

Last modified 4 hours ago

#29370 accepted enhancement

Measure mode with arbitrary tgen traffic models

Reported by: irl Owned by: acute
Priority: Low Milestone:
Component: Metrics/Onionperf Version:
Severity: Normal Keywords: metrics-team-roadmap-2020, metrics-team-roadmap-2020-june
Cc: karsten, acute, robgjansen, jnewsome Actual Points: 0.1
Parent ID: #33321 Points: 1
Reviewer: Sponsor: Sponsor59

Description

onionperf measure currently runs using a Torperf-like measurement model by default. We should allow any tgen model to be given on the command line, and then our tgen client should use that model instead.

(Copied from https://github.com/robgjansen/onionperf/issues/9)

Child Tickets

Attachments (1)

0001-Removes-unused-traffic-model-argument.patch (1016 bytes) - added by acute 29 hours ago.


Change History (18)

comment:1 Changed 16 months ago by irl

Cc: acute added
Owner: changed from hiro to metrics-team
Status: new → assigned

Updating owner and CC

comment:2 Changed 10 months ago by irl

Resolution: invalid
Status: assigned → closed

comment:3 Changed 3 months ago by acute

Resolution: invalid
Status: closed → reopened

Moving all gitlab OP tickets back to Trac.

comment:4 Changed 4 weeks ago by gaba

Keywords: metrics-team-roadmap-2020 added
Points: 1
Sponsor: Sponsor59

comment:5 Changed 2 weeks ago by karsten

Parent ID: #33321

comment:6 Changed 4 days ago by karsten

Cc: robgjansen jnewsome added

What is the use case here? Is this about providing TGen traffic models that are otherwise used in Shadow, but for more complex user models than bulk file downloads? OnionPerf users are not expected to generate these model files on their own, neither from scratch nor by editing existing model files, right?

One technical problem I see is that TGen models generated by OnionPerf contain measurement details like peers (server addresses) or ports (SOCKS port of the local Tor client). Would the provided models contain placeholders for these parts that OnionPerf fills in before passing models to its TGen instances? How does Shadow solve this problem?
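The placeholder idea raised here could work along these lines. This is a minimal sketch: the `${…}` marker scheme and the attribute names (`peers`, `socksproxy`) are assumptions modeled on TGen-style options, not a confirmed OnionPerf or TGen format.

```python
from string import Template

# Hypothetical placeholder scheme: an externally supplied model file would
# use ${server_address} and ${socks_port} markers, and OnionPerf would fill
# them in before handing the model to its TGen client instance.
model_template = Template(
    '<node id="start" peers="${server_address}" '
    'socksproxy="127.0.0.1:${socks_port}"/>'
)

def fill_model(template, server_address, socks_port):
    """Substitute measurement-specific details into a model template."""
    return template.substitute(server_address=server_address,
                               socks_port=socks_port)

print(fill_model(model_template, "203.0.113.7:8080", 9050))
```

`Template.substitute` raises `KeyError` if a marker is left unfilled, which would catch incomplete models early.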

Related to this ticket, we might want to implement #33243 by providing a traffic model file rather than by adding more command line parameters. In fact, we could take out the --oneshot parameter if we provide a traffic model file for it.

Implementation note: OnionPerf already contains a -m/--traffic-model parameter that does not do anything right now and that we would implement here.

comment:7 Changed 2 days ago by karsten

Actual Points: 0.1

comment:8 Changed 2 days ago by robgjansen

The use case was to allow OnionPerf to measure traffic patterns other than the usual "file download" pattern (the model that OnionPerf generates for itself internally). So, for example, you could set up a tgen traffic model to act like a ping utility, sending a few bytes to the server side and back to the client to measure circuit round-trip times.
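A ping-like client model of the kind described here might be sketched as GraphML. The node ids and attribute names (`start`, `stream`, `pause`, `sendsize`, `recvsize`, `time`) are assumptions modeled loosely on TGen-style options, not a verified TGen configuration; a real model would need the attribute keys the installed TGen version actually understands.

```python
import xml.etree.ElementTree as ET

def build_ping_model(server, socks_port, interval_seconds=1):
    """Sketch a ping-like TGen-style client model as GraphML."""
    graphml = ET.Element("graphml")
    graph = ET.SubElement(graphml, "graph", edgedefault="directed")
    ET.SubElement(graph, "node", id="start",
                  peers=server, socksproxy="127.0.0.1:%d" % socks_port)
    # Send a few bytes and expect a few bytes back: one "ping".
    ET.SubElement(graph, "node", id="stream",
                  sendsize="64 bytes", recvsize="64 bytes")
    ET.SubElement(graph, "node", id="pause",
                  time="%d seconds" % interval_seconds)
    # start -> stream -> pause -> stream: one ping per interval, forever.
    for src, dst in [("start", "stream"), ("stream", "pause"),
                     ("pause", "stream")]:
        ET.SubElement(graph, "edge", source=src, target=dst)
    return ET.tostring(graphml, encoding="unicode")

xml_text = build_ping_model("203.0.113.7:8080", 9050)
print(xml_text)
```

The pause/stream cycle is what makes this a repeating ping rather than a one-off transfer.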

My initial idea was not to make OnionPerf generate these models, but rather to create them externally and have OnionPerf just "pass them through" to tgen. That means whoever generated the models would need to correctly set the server addresses and SOCKS ports, etc.

I'm not sure that Tor wants this feature for OnionPerf. An alternative could just be to wait until you have a specific model in mind that you've decided you want to start measuring, and then generate that model internally as we do now with the 5MiB downloads. In that case the pass-through feature would not be needed, and you wouldn't have to maintain something that you don't use.

Do you think the pass-through feature is actually useful for Tor, or does generating models internally make more sense?

(The tgen models in Shadow are generated with the correct addresses/ports during the phase when we generate the Shadow experiment configuration. This is currently done using the Shadow-Tor config generator here.)

comment:9 in reply to:  8 Changed 2 days ago by acute

Replying to robgjansen:

My initial idea was not to make OnionPerf generate these models, but rather to create them externally and have OnionPerf just "pass them through" to tgen. That means whoever generated the models would need to correctly set the server addresses and SOCKS ports, etc.

I'm not sure that Tor wants this feature for OnionPerf. An alternative could just be to wait until you have a specific model in mind that you've decided you want to start measuring, and then generate that model internally as we do now with the 5MiB downloads. In that case the pass-through feature would not be needed, and you wouldn't have to maintain something that you don't use.

Do you think the pass-through feature is actually useful for Tor, or does generating models internally make more sense?

Having a pass-through feature could be very useful for research. For example, evaluating Tor performance with clients on mobile or other bandwidth-constrained networks would require a model that minimises the bandwidth used; or, if a user wanted to fill the pipe for a congestion control experiment, a model with a larger file size would be needed.

If there is no appetite for implementing this feature, we could instead provide documentation that explains to users how to use their own model if they want to, and keep our own models (including oneshot) internal, as suggested here.

comment:10 in reply to: 8 Changed 2 days ago by karsten

Thank you, robgjansen and acute, for your comments above! The more I think about this feature, the more I come to the conclusion that we do not need it.

I'll start with addressing acute's thoughts:

Replying to acute:

Having a pass-through feature could be very useful for research. For example, evaluating Tor performance with clients on mobile or other bandwidth-constrained networks would require a model that minimises the bandwidth used; or, if a user wanted to fill the pipe for a congestion control experiment, a model with a larger file size would be needed.

I can see how different network environments would require different measurement models. But maybe we can identify how these models should differ and then add parameters to OnionPerf's command line that feed into the generated TGen models. For example, the file size of the downloaded file could easily be a --filesize parameter on the command line.
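The `--filesize` idea could be sketched like this. The flag name, default, and wiring are hypothetical, not part of OnionPerf's actual command line; the point is only that a command-line value would replace the hard-coded transfer size in the generated model.

```python
import argparse

# Hypothetical sketch of a --filesize option feeding into the internally
# generated TGen model; the flag name and default are assumptions.
parser = argparse.ArgumentParser(prog="onionperf-measure-sketch")
parser.add_argument("--filesize", default="5 MiB",
                    help="size of the file the generated TGen model downloads")
args = parser.parse_args(["--filesize", "50 KiB"])

# The parsed value would replace the hard-coded transfer size when
# OnionPerf writes out its TGen client model.
print("generated model would request a %s transfer" % args.filesize)
```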

If there is no appetite for implementing this feature, we could instead provide documentation that explains to users how to use their own model if they want to, and keep our own models (including oneshot) internal, as suggested here.

If researchers need to change parts of a model that cannot be configured using the OnionPerf command line interface, they will have to change the OnionPerf sources to do what they want. I'd say that that's still easier than editing a TGen XML file. If the missing piece is better documentation, we should provide that.

Now to robgjansen's thoughts:

Replying to robgjansen:

The use case was to allow OnionPerf to measure traffic patterns other than the usual "file download" pattern (the model that OnionPerf generates for itself internally). So, for example, you could set up a tgen traffic model to act like a ping utility, sending a few bytes to the server side and back to the client to measure circuit round-trip times.

The ping model is something we're considering implementing in #30798, so let's look at what that would entail:

  • We would need a different TGen client model that sends a few bytes every second or every configurable amount of time.
  • We would also need to update the analysis code in OnionPerf. We'd no longer be interested in elapsed seconds between start and reaching a certain stage of the download. We'd want to extract the time between sending a ping and receiving a pong, for every one of them.
  • In fact, we might want to parse the TGen server logs as well, to learn when a ping arrived and the pong was sent back. We have that information, which is not available to the typical ping application, so why not use it.
  • We would have to update the visualization code to extract and display different metrics than in the bulk download case.
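The RTT-extraction step in the list above could be sketched as follows. The event tuples stand in for parsed TGen log entries; their format is invented for illustration, and real code would parse the actual TGen client (and possibly server) logs.

```python
# Hypothetical parsed log events: (kind, ping sequence number, timestamp).
events = [
    ("ping-sent", 1, 10.000), ("pong-received", 1, 10.482),
    ("ping-sent", 2, 11.000), ("pong-received", 2, 11.517),
]

def round_trip_times(events):
    """Pair each ping send time with the matching pong receive time."""
    sent = {}
    rtts = {}
    for kind, seq, ts in events:
        if kind == "ping-sent":
            sent[seq] = ts
        elif kind == "pong-received" and seq in sent:
            rtts[seq] = ts - sent[seq]
    return rtts

print(round_trip_times(events))
```

Unmatched pings (lost pongs) simply produce no entry, which the analysis could count separately as losses.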

Thinking about different traffic models, what if we wanted to measure something like an HTTP POST rather than the HTTP GET? I'd assume that we'd have to provide a different TGen server model file as well, but I don't know for sure. And even if that's still possible by replacing just the TGen client model, there's probably another model that requires a custom TGen server model which we just haven't thought of yet.

All in all, it's more than just the TGen model. We'd have to write a fair amount of code in order to implement a useful ping model in OnionPerf.

My initial idea was not to make OnionPerf generate these models, but rather to create them externally and have OnionPerf just "pass them through" to tgen. That means whoever generated the models would need to correctly set the server addresses and SOCKS ports, etc.

I'm not sure that Tor wants this feature for OnionPerf. An alternative could just be to wait until you have a specific model in mind that you've decided you want to start measuring, and then generate that model internally as we do now with the 5MiB downloads. In that case the pass-through feature would not be needed, and you wouldn't have to maintain something that you don't use.

Do you think the pass-through feature is actually useful for Tor, or does generating models internally make more sense?

(The tgen models in Shadow are generated with the correct addresses/ports during the phase when we generate the Shadow experiment configuration. This is currently done using the Shadow-Tor config generator here.)

If there had been existing models that we could have plugged in easily, that would have been a good argument in favor of this feature. But it seems that we could just as well reuse the model-generating code from the Shadow-Tor config generator if we wanted to support these models in OnionPerf. And we would still have to write new analysis and visualization code to evaluate those new measurements.

The internally generated model also has the advantage that it's easier to use: all it takes to start a measurement is a (potentially quite long) command with several parameters, rather than a (still potentially long) command plus one or two model files. Describing the experiment would then be a matter of listing all software versions and the OnionPerf command used to start measurements.

My suggestions are that we:

  • make the current bulk transfer model more configurable by adding parameters like initial pause, transfer count, or filesize as part of #33432;
  • develop a ping model as internal model where OnionPerf generates TGen files, plus analysis and visualization code, as part of #30798, assuming there's need for developing such a model; and
  • remove the -m/--traffic-model parameter from the codebase and close this ticket as something we considered carefully but decided against.

Oops, this comment was longer than I had expected when starting to write it. Thanks for making it to the end, and thanks again for sharing your thoughts above. I'm open to discussing this more if there are aspects that I didn't acknowledge as much as I should.

comment:11 in reply to:  10 Changed 31 hours ago by robgjansen

Replying to karsten:

Thinking about different traffic models, what if we wanted to measure something like an HTTP POST rather than the HTTP GET? I'd assume that we'd have to provide a different TGen server model file as well, but I don't know for sure.

I think this only requires changes to the client side model, i.e., you would increase the sendsize and reduce the receivesize values.
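That client-side change might look like this in a sketched model node. The `sendsize`/`recvsize` attribute names are assumptions based on the discussion above and should be checked against the installed TGen's option table.

```python
import xml.etree.ElementTree as ET

# Hedged sketch: an HTTP-POST-like stream sends much more than it receives,
# so only the client model's send/receive sizes change, not the server model.
get_like = ET.Element("node", id="stream", sendsize="1 KiB", recvsize="5 MiB")
post_like = ET.Element("node", id="stream", sendsize="5 MiB", recvsize="1 KiB")
print(ET.tostring(post_like, encoding="unicode"))
```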

And even if that's still possible by replacing just the TGen client model, there's probably another model that requires a custom TGen server model which we just haven't thought of yet.

I designed TGen so that the server config is minimal: log level, how often to print a heartbeat message, and the port it should listen on. (See the start options table.) Otherwise it just responds to commands sent from the client.

More complicated models than we have been discussing are possible, though, through the use of Markov models. Creating Markov models for TGen is even more complicated than creating TGen config files, but also really really powerful. I did my best to document how to create the Markov models, but I'm hoping that we won't need them for OnionPerf. (I use them to generate traffic flows in Shadow that are based on actual traffic flows that we measured at Tor relays.)
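The Markov-model idea can be illustrated with a toy two-state chain. This only shows the concept of probabilistic state transitions driving traffic patterns; it is far simpler than TGen's actual Markov model format, and the state names and probabilities are invented.

```python
import random

# Toy illustration: states (e.g. bursting vs. idle) transition
# probabilistically, producing a traffic pattern over time.
TRANSITIONS = {
    "burst": [("burst", 0.7), ("idle", 0.3)],
    "idle":  [("burst", 0.2), ("idle", 0.8)],
}

def walk(start, steps, rng):
    """Record `steps` states of a random walk over TRANSITIONS."""
    state, path = start, []
    for _ in range(steps):
        path.append(state)
        r, acc = rng.random(), 0.0
        for nxt, p in TRANSITIONS[state]:
            acc += p
            if r < acc:
                state = nxt
                break
    return path

print(walk("burst", 10, random.Random(42)))
```

In TGen's real models the states additionally emit packet or stream events with timing distributions, which is what makes them powerful enough to mimic measured relay traffic.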

All in all, it's more than just the TGen model. We'd have to write a fair amount of code in order to implement a useful ping model in OnionPerf.

Agreed.

The internally generated model also has the advantage that it's easier to use: all it takes to start a measurement is a (potentially quite long) command with several parameters, rather than a (still potentially long) command plus one or two model files. Describing the experiment would then be a matter of listing all software versions and the OnionPerf command used to start measurements.

My suggestions are that we:

[snip]

Agreed with all of this too! I think that as much as we can, we should make OnionPerf a self-contained tool that is primarily useful for generating and visualizing Tor metrics data. But, we could document that other models are possible but unsupported. If some researchers want to use OnionPerf to help them answer some research problems, they could fairly easily adapt OnionPerf to their specific needs.

comment:12 Changed 31 hours ago by robgjansen

Sorry, I don't know if "self-contained" is the right phrasing here. I mean that we should focus our efforts on building OnionPerf for the immediate needs of Tor metrics, rather than for many general use-cases. Reducing our scope will help us create a tighter, more polished tool.

comment:13 in reply to: 10 Changed 29 hours ago by acute

If researchers need to change parts of a model that cannot be configured using the OnionPerf command line interface, they will have to change the OnionPerf sources to do what they want. I'd say that that's still easier than editing a TGen XML file. If the missing piece is better documentation, we should provide that.

That is exactly what I had in mind for doing this.

My suggestions are that we:

  • make the current bulk transfer model more configurable by adding parameters like initial pause, transfer count, or filesize as part of #33432;
  • develop a ping model as internal model where OnionPerf generates TGen files, plus analysis and visualization code, as part of #30798, assuming there's need for developing such a model; and
  • remove the -m/--traffic-model parameter from the codebase and close this ticket as something we considered carefully but decided against.

This all sounds good, but I'd add writing some documentation for using other models that we don't support, as proposed above.

Last edited 29 hours ago by acute

comment:14 in reply to:  13 Changed 29 hours ago by karsten

Agreed with what you write, robgjansen.

Replying to acute:

If researchers need to change parts of a model that cannot be configured using the OnionPerf command line interface, they will have to change the OnionPerf sources to do what they want. I'd say that that's still easier than editing a TGen XML file. If the missing piece is better documentation, we should provide that.

That is exactly what I had in mind for doing this.

Great!

My suggestions are that we:

  • make the current bulk transfer model more configurable by adding parameters like initial pause, transfer count, or filesize as part of #33432;
  • develop a ping model as internal model where OnionPerf generates TGen files, plus analysis and visualization code, as part of #30798, assuming there's need for developing such a model; and
  • remove the -m/--traffic-model parameter from the codebase and close this ticket as something we considered carefully but decided against.

This all sounds good, but I'd add writing some documentation for using other models that we don't support, as proposed above.

Yes, good point! Updated next steps are:

  • make the current bulk transfer model more configurable by adding parameters like initial pause, transfer count, or filesize as part of #33432;
  • develop a ping model as internal model where OnionPerf generates TGen files, plus analysis and visualization code, as part of #30798, assuming there's need for developing such a model;
  • write documentation for using other models we don't support as proposed above; and
  • remove the -m/--traffic-model parameter from the codebase and close this ticket as something we considered carefully but decided against.

comment:15 Changed 28 hours ago by gaba

Keywords: metrics-team-roadmap-2020-june added

Adding all these tickets to the OnionPerf roadmap for June.

comment:16 Changed 12 hours ago by karsten

Status: reopened → new

Thanks for the patch! I applied it together with another minor patch to also remove the --traffic-model parameter from the documentation.

I also added comments to #33432 and #30798 with references to this ticket.

What remains is to write documentation for using other models we don't support as proposed above. I think we can keep this ticket open until that is done. Setting ticket status to new for that last part.

comment:17 Changed 4 hours ago by acute

Owner: changed from metrics-team to acute
Status: new → accepted