It's broken, and nobody should ever use it. It ends up restricting the available paths to an unacceptably small and hard-to-analyze fraction of the network total: probably something in the ballpark of 20%.
Instead, we should reuse the CBT code and math with end-to-end stream timing probes (for example, STREAM CONNECT requests to 127.0.0.1). I will bet my "got it by accident along the way" math degree that the resulting distribution is Pareto, which means we can accurately predict, to the percentile, how many paths we would like to reject at this phase using the same code CBT uses.
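To make the "reuse the CBT math" idea concrete, here is an illustrative sketch of what a Pareto-based cutoff looks like: fit a Pareto distribution to observed timing samples, then invert its CDF at the target quantile. The function name, the MLE fit, and the default quantile are all my own choices for illustration; this is not Tor's actual CBT implementation.

```python
import math

def pareto_timeout(samples, quantile=0.80):
    """Estimate a timeout cutoff at `quantile` by fitting a Pareto
    distribution to observed timing samples (build times or stream
    ping RTTs). Illustrative sketch, not Tor's CBT code."""
    # Pareto MLE: x_m is the minimum observed value; alpha comes from
    # the mean log-ratio of the samples to x_m.
    xm = min(samples)
    alpha = len(samples) / sum(math.log(x / xm) for x in samples)
    # Invert the Pareto CDF F(t) = 1 - (xm / t)^alpha at the quantile,
    # giving the cutoff below which that fraction of paths should fall.
    return xm * (1.0 - quantile) ** (-1.0 / alpha)
```

Inverting the fitted CDF at, say, the 80th percentile gives a timeout that should admit roughly the fastest 80% of paths, which is exactly the "predict to the percentile" property claimed above.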
logder argues that this approach is better than CBT, because CBT primarily measures CPU load, not latency, and the CREATE cell queues are different from the RELAY cell queues, which makes circuit building a poor performance metric for normal traffic. I don't see a good reason not to do both and pick the percentiles such that their product is still satisfyingly large. CPU-saturated relays should be avoided anyway, IMO.
I wrote some example end-to-end probing code for #7691 (moved). Could easily be generalized/adapted for stream timing data collection.
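As a rough illustration of what such a stream timing probe could look like, the sketch below times a SOCKS5 CONNECT through a local client (e.g. a Tor SOCKS port) to a 127.0.0.1 destination. The function name, default ports, and error handling are all assumptions for illustration; this is not the code attached to #7691.

```python
import socket
import struct
import time

def time_stream_connect(socks_host="127.0.0.1", socks_port=9050,
                        dest_host="127.0.0.1", dest_port=80, timeout=10.0):
    """Measure end-to-end stream setup time through a SOCKS5 proxy.
    Hypothetical helper mirroring the STREAM CONNECT probe idea above."""
    start = time.monotonic()
    s = socket.create_connection((socks_host, socks_port), timeout=timeout)
    try:
        s.sendall(b"\x05\x01\x00")            # SOCKS5 greeting, "no auth"
        if s.recv(2) != b"\x05\x00":
            raise OSError("SOCKS5 auth negotiation failed")
        req = (b"\x05\x01\x00\x01" +           # CONNECT to an IPv4 address
               socket.inet_aton(dest_host) +
               struct.pack(">H", dest_port))
        s.sendall(req)
        reply = s.recv(10)
        if len(reply) < 2 or reply[1] != 0x00:
            raise OSError("SOCKS5 connect failed")
        return time.monotonic() - start        # seconds until stream opened
    finally:
        s.close()
```

Collecting these durations over many circuits would yield exactly the kind of stream timing samples the Pareto-based cutoff needs.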
Also note that for purposes of things like #7691 (moved), we do not want any activity we always do over a circuit to cause it to be counted as "successfully used". Otherwise, path bias attackers can anticipate this activity and let it through before failing the circuit.
I've been brainstorming about how to best deploy something like this. I have the following high-level ideas about how we'd use it to best effect:
We create a LowLatencyPorts torrc option to list ports where interactive latency matters more than connection setup time. This would include such things as SSH, Skype, Mumble, Jitsi, etc.
We set two pairs of consensus parameters: one for these sets of ports, and another for predictively-built circuits. Each pair would have a CBT part and a Stream Ping Timeout (SPT) part.
If a stream request (or predictive build) comes in for a LowLatencyPort and we don't have any such circuits available, we apply the CBT value for the construction, and then run the ping and apply that. We then flag any such circuits that survive both the CBT timeout and the SPT timeout as acceptable for use on such ports.
For normal predictive circuits, we would apply CBT + SPT, but with higher (more lenient) values. For all other on-demand circuits, we'd apply CBT only (so you don't have to wait for an end-to-end ping before using them).
I'm thinking that the cutoffs would be something like CBT=85 SPT=85 for predictive circuits (yielding the "fastest" 72% of network paths for them), CBT=80 SPT=50 for LowLatencyPort circuits (yielding the "fastest" 40% of paths for them), and CBT=75 for all other on-demand circuits.
Of course, we'll want to tune LowLatency SPT cutoffs such that they can actually support voice traffic.
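The scheme above can be sketched as a small classification rule plus the path-retention arithmetic behind the quoted percentages. All parameter names, port numbers, and values below are invented for illustration; this is not Tor code, and the two filters are assumed roughly independent so their retained fractions compose multiplicatively.

```python
# Hypothetical port set and cutoff percentiles (fractions of paths kept).
LOW_LATENCY_PORTS = {22, 5060, 64738}          # e.g. SSH, SIP, Mumble

CBT_LOWLAT_PCT, SPT_LOWLAT_PCT = 0.80, 0.50    # strict pair for LowLatencyPorts
CBT_PREDICT_PCT, SPT_PREDICT_PCT = 0.85, 0.85  # lenient pair for predictive builds
CBT_ONDEMAND_PCT = 0.75                        # all other on-demand circuits

# Two independent filters keep the product of their individual fractions.
PREDICT_PATHS = CBT_PREDICT_PCT * SPT_PREDICT_PCT  # 0.85 * 0.85 ~ 0.72
LOWLAT_PATHS = CBT_LOWLAT_PCT * SPT_LOWLAT_PCT     # 0.80 * 0.50 = 0.40

def classify_circuit(cbt_pass, spt_pass, dest_port=None, predictive=False):
    """Which circuits face which tests: LowLatencyPort and predictive
    circuits must survive both CBT and SPT; other on-demand circuits
    only face CBT (no waiting on an end-to-end ping)."""
    if dest_port in LOW_LATENCY_PORTS:
        return "low-latency" if (cbt_pass and spt_pass) else "reject"
    if predictive:
        return "general" if (cbt_pass and spt_pass) else "reject"
    return "general" if cbt_pass else "reject"
```

The multiplication is where the "fastest 72%" and "fastest 40%" figures come from: 0.85 × 0.85 ≈ 0.72 and 0.80 × 0.50 = 0.40.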
I think this hack alone will get us to the SponsorF deliverable for voice.
All the researchers doing Tor anonymity analysis get really agitated when we add new path selection approaches that aren't based on global information. And assuming the congestion is inside the network, where you're connecting from shouldn't make a big impact. And finally, all these "local not global" approaches raise complex questions about an adversary who influences a target user's opinions to influence her paths.
So the first question is, how well can we approximate your above plans with probers (a la bwauths)? And the followup question is, how much information do we need to put into e.g. the consensus for it to work?
Also, you should know that Micah Sherr's 'virtual coordinate system' plan has some code somewhere, though I have so far failed to publicly pry it out of them.
> All the researchers doing Tor anonymity analysis get really agitated when we add new path selection approaches that aren't based on global information. And assuming the congestion is inside the network, where you're connecting from shouldn't make a big impact. And finally, all these "local not global" approaches raise complex questions about an adversary who influences a target user's opinions to influence her paths.
>
> So the first question is, how well can we approximate your above plans with probers (a la bwauths)? And the followup question is, how much information do we need to put into e.g. the consensus for it to work?
Tor latency depends heavily on local characteristics in many instances.
In fact, the consensus is exactly the right place to put a timeout if you want to allow a local/censorship adversary to look at the consensus, look at your traffic, and delay your circuits by just enough to make them fail when they want, so your routes go where they want.
This information needs to be computed locally so we can ensure that your Tor client completes a predictable percentage of the total available network paths, exactly like the CBT math does.
In fact, that math actually works in the current Tor implementation. To within +/- 1%, 80% of your circuits build before the circuit build timeout that your client learns, and you can observe this fact in your logs and in the BUILDTIMEOUT_SET values captured by Torperf.
> All the researchers doing Tor anonymity analysis get really agitated when we add new path selection approaches that aren't based on global information. And assuming the congestion is inside the network, where you're connecting from shouldn't make a big impact. And finally, all these "local not global" approaches raise complex questions about an adversary who influences a target user's opinions to influence her paths.
>
> So the first question is, how well can we approximate your above plans with probers (a la bwauths)? And the followup question is, how much information do we need to put into e.g. the consensus for it to work?
Interpreting your suggestion another way, as opposed to a globally constructed timeout: I guess you're suggesting we could alter path selection itself based on the measured congestion/queuing information of each relay.

While I admit that this would not have the opportunities for route manipulation that a consensus timeout-based approach would, the practical problem is that the consensus updates only every 2-4 hours for clients. I don't think this is frequent enough for us to measure or model node queuing delay.
I still think we can tune the local system such that, within a very close tolerance, it allows the fastest X% of paths. We can also code it to disable itself if it is consistently unable to maintain a prediction of this timeout such that the expected percentage of stream probes actually complete within it. We could also add this code to CBT, and have it revert to Tor's 1 minute timeout in such a case.
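The self-disabling rule can be sketched as a periodic sanity check: if the fraction of probes completing under the learned timeout drifts too far from the target percentage, fall back to the fixed 60-second timeout. Function name, tolerance, and window handling are assumptions for illustration.

```python
DEFAULT_TIMEOUT_MS = 60_000   # Tor's fixed 1-minute fallback timeout

def check_timeout_health(probe_times_ms, timeout_ms, target=0.80, tol=0.05):
    """If the observed completion fraction under the learned timeout is
    within `tol` of `target`, keep the learned timeout; otherwise give
    up on it and revert to the fixed default. Illustrative sketch."""
    completed = sum(1 for t in probe_times_ms if t <= timeout_ms)
    frac = completed / len(probe_times_ms)
    if abs(frac - target) > tol:
        return DEFAULT_TIMEOUT_MS   # prediction is off; disable ourselves
    return timeout_ms
```

Running this over a sliding window of recent probes gives the "consistently unable to maintain a prediction" signal described above.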
> Also, you should know that Micah Sherr's 'virtual coordinate system' plan has some code somewhere, though I have so far failed to publicly pry it out of them.
However, in general, I don't think a virtual coordinate system will work, because I suspect the major issue with Tor for low-latency applications is its own per-hop queuing delay variance, not topology.
Attached are the results of measuring ~1.6M circuits, sorted by guard and comparing their CBT and First-RTT values. Only guards that were measured at least 2k times each are included. The First-RTT distribution looks more stable than the CBT one. Parameters seem to differ between guards for both CBT and First-RTT.
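For reference, the per-guard comparison described above can be sketched as grouping samples by guard, applying the 2k-sample floor, and comparing spread via the coefficient of variation. The record layout and the choice of coefficient of variation as the stability metric are my assumptions; the attached data's actual format is unknown.

```python
from statistics import mean, stdev

def per_guard_stability(measurements, min_samples=2000):
    """Group (guard, cbt, first_rtt) samples by guard, keep guards with
    at least `min_samples` measurements, and return each guard's
    (cv_cbt, cv_first_rtt), where cv = stdev / mean. Sketch only."""
    by_guard = {}
    for guard, cbt, rtt in measurements:
        by_guard.setdefault(guard, []).append((cbt, rtt))
    result = {}
    for guard, rows in by_guard.items():
        if len(rows) < min_samples:
            continue   # mirror the "measured at least 2k times" filter
        cbts = [c for c, _ in rows]
        rtts = [r for _, r in rows]
        result[guard] = (stdev(cbts) / mean(cbts),
                         stdev(rtts) / mean(rtts))
    return result
```

A lower coefficient of variation for First-RTT than for CBT across guards would correspond to the "more stable distribution" observation.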
I cannot comment on the sponsor issue. However, I would like to mention a paper we published on the ticket's topic - see https://naviga-tor.github.io/ for details (pre-print, code, and data).