As part of #5752 (moved) we need to know how many circuits we're making now, how many we're discarding early because a stream didn't work, etc.
This is a two-part project: first is a tool to automatically make a series of requests to Tor, in a repeatable way, and second is a Tor controller script, probably using Stem, that watches stream and circuit events (and maybe more), and tracks which streams get allocated to which circuits, how many total circuits are made, how quickly results return, and other statistics. Then we would change the underlying Tor, replay the same set of requests, and know what circuit behaviors to expect.
I expect we'll also discover that we don't export enough info via the control protocol to make good conclusions; in that case we'll also want to modify Tor to export this info.
Another tool that would be handy to build alongside this would be something that auto-generates web fetches at specified times. To be realistic, maybe it actually launches a `wget --mirror` or the like, to pull down the images on the pages only after the initial html arrives. Or maybe this is better as one of those Firefox extensions that instruments Firefox to make automated clicks.
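A minimal sketch of such a fetch driver, under some assumptions not in the thread: it shells out to `torsocks wget` (so torsocks, wget, and a running tor are assumed), and it uses `--page-requisites --level=1` rather than a full `--mirror` so the images come down only after the initial page, roughly as suggested above; the URLs and offsets are made-up placeholders.

```python
# Hedged sketch: replay a fixed, timed list of page fetches so runs are
# repeatable. Assumes torsocks and wget are installed and tor is running;
# the schedule below is a made-up placeholder.
import subprocess
import time

SCHEDULE = [                      # (url, seconds after start of run)
    ('https://example.com/', 0),
    ('https://example.org/', 30),
]

def wget_command(url, prefix='fetches'):
    # --page-requisites pulls the images/css a page needs; --level=1 keeps
    # wget from wandering past the initial page
    return ['torsocks', 'wget', '--quiet', '--page-requisites', '--level=1',
            '--directory-prefix=' + prefix, url]

def run_schedule(schedule):
    start = time.time()
    for url, offset in sorted(schedule, key=lambda pair: pair[1]):
        delay = start + offset - time.time()
        if delay > 0:
            time.sleep(delay)
        subprocess.call(wget_command(url))
```

Running `run_schedule(SCHEDULE)` while a controller script is attached to the control port would tie each fetch to the circuit activity it caused.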
It's likely that you'll have more fun (and make better progress) by ignoring Torflow and just using Stem (https://gitweb.torproject.org/stem.git) to hear the events. Or heck, just write your own little script to connect to the control port and pull down the events you want. Most of the work will be in deciding what to compute based on the events, rather than in parsing them.
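For the "write your own little script" route, the control-port side is small enough to sketch directly. This assumes a ControlPort at 9051 with no authentication configured (with cookie or password auth the AUTHENTICATE line needs the credential), and the helper names are mine, not from the thread.

```python
# Hedged sketch of talking to tor's control port with no library at all.
# Assumes ControlPort 9051 with no auth; helper names are illustrative.
import socket

def parse_event(line):
    """Split an asynchronous '650 CIRC 1 BUILT ...' line into (type, args)."""
    parts = line.split()
    if len(parts) < 2 or parts[0] != '650':
        return None
    return parts[1], parts[2:]

def watch_events(host='127.0.0.1', port=9051, limit=10):
    sock = socket.create_connection((host, port))
    reader = sock.makefile('r')

    def send(line):
        sock.sendall((line + '\r\n').encode('ascii'))
        reply = reader.readline().strip()
        if not reply.startswith('250'):
            raise RuntimeError('control port said: ' + reply)

    send('AUTHENTICATE ""')
    send('SETEVENTS CIRC STREAM')

    for _ in range(limit):  # asynchronous events arrive as 650-prefixed lines
        print(parse_event(reader.readline().strip()))

    sock.close()
```

As the comment says, the parsing is the easy part; deciding what to compute from the `CIRC` and `STREAM` arguments is where the work is.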
I'm not entirely sure what you're looking for, but while stem is functional it's still pretty rough around the edges. Event parsing will come early in Ravi's project, so it should be done sometime in early June; for now stem only provides the unparsed message objects. Here's an example for printing events...
```python
# Simple script to start a tor instance, attaches to it, and prints BW events
# for a few seconds.

import time

from stem.connection import connect_port, authenticate
from stem.control import BaseController
from stem.process import launch_tor, NO_TORRC

# controller class that simply prints the events that it receives
class EventPrinter(BaseController):
  def _handle_event(self, event_message):
    print event_message

# Start a tor instance that, hopefully, won't conflict with anything. We can
# connect to it and start using the instance when bootstrapping reaches 5%.
print "starting tor..."

tor_process = launch_tor(
  options = {'ControlPort': '2777'},
  torrc_path = NO_TORRC,
  completion_percent = 5,
)

with connect_port(control_port = 2777) as control_socket:
  controller = EventPrinter(control_socket)
  authenticate(controller)
  controller.msg('SETEVENTS BW')
  time.sleep(5)

tor_process.kill()
```
... and here's an example for doing something similar with TorCtl...
If anybody runs across a great developer who wants to get involved in Tor, get up to speed on Stem, and help us do research, this is a great bite-sized project -- write the tool to cause the series of requests to Tor, and the tool to hear (via the control port) how the streams were assigned to circuits, how they succeeded or failed, etc.
I'm marking as 'bounty' because we could do it as "trial" contract work for somebody.
Trac: Description changed from:

> As part of #5752 (moved) we need to know how many circuits we're making now, how many we're discarding early because a stream didn't work, etc.
>
> I think we could do this as a Tor controller that watches stream and circuit events.
>
> I expect we'll also discover that we don't export enough info via the control protocol to make good conclusions; in that case we'll want to modify Tor to export this info.
>
> [I'm not sure what component to put this ticket in, so I picked Torflow since it's already pretty good at parsing tor controller output.]

to the current description above.

Summary changed from "Write stream/circ event parser to track circuit use" to "Write tool to automate web queries to Tor; and use Stem to track stream/circ allocation and results".
Somewhat related, I'm planning to use Stem for the Torperf rewrite that fetches popular websites using Selenium/Firefox and that logs request times and circuit details for later analysis. That's a sponsor F deliverable which is due February 28.
Fine to do, but I don't think it blocks #5752 (moved), nor is it likely to give us any real data on how often users navigate between top-level sites in aggregate/on average (which is the real source of circuit creation under #5752 (moved)).
I think OnionPerf may already do much of what you want. It has a 'measure' mode to download data over Tor, a 'monitor' mode to log Tor control port events to file, an 'analyze' mode to process those log files into data files, and a 'visualize' mode to plot the results of the analysis.
Oh also, it will hopefully replace TorPerf one day, because the types of requests sent through Tor can be customized, allowing us to model much more complex behaviors than a single file of a specific size.