Hello, I like the way this is going. Your version has various improvements in terms of engineering logic (e.g. the way primary guards work and the circuit states).
I mainly focused on the circuit states and guard list details. I still need to think more about this in terms of concurrent circuits, and how the 'pending' flag works there.
BTW, the thoughtworks team has written lots of code (and unittests) for this feature. It would be great if we could reuse some of that work, and not let it all go to waste. Fortunately, I feel that Nick's current version shares enough features with the thoughtworks version that we might be able to reuse stuff.
I'm not sure if there's a gitlab for this version, or if I should wait for asn to make one.
Here are my comments:
ReachableAddresses is insufficient to model reachability
In [Section:FILTERED], we say that a member of FILTERED_GUARDS must have the following property:
It is not disabled because of ReachableAddress policy [XX??]
This works, but is not enough...
ReachableAddresses implicitly includes FascistFirewall and FirewallPorts (which are merged into ReachableAddresses during options processing anyway). We can also understand ClientUseIPv4 0 and ClientUseIPv6 1 as part of ReachableAddresses: although the current implementation doesn't actually merge them into ReachableAddresses, it acts as if it does.
ReachableDirAddresses and Directory Guards
We don't appear to model the distinction between ReachableDirAddresses and ReachableORAddresses. This doesn't matter for clients, because in 0.2.8 and later, clients only use begindir over ORPort for directory fetches (see #19704 (moved) for a ticket to deprecate ReachableDirAddresses and ClientPreferIPv6DirPort).
However, relays still choose directory guards for their fetches (ignoring ReachableDirAddresses, because they are relays). But do they use them? Some do, some just fetch straight from the authorities. But perhaps they're unnecessary for a relay. Perhaps we should just abolish directory guards entirely. (Almost all of a client's guards will cache directory info once they upgrade to 0.2.8, and relays can just ask a relay with a DirPort, or an authority.)
But does this open relays up to an attack where they eventually try a malicious relay?
Should all relays just fetch from the authorities? (Or the fallback directories?)
IP Version Preferences
There's no mention in the proposal of clients which can reach both IPv4 and IPv6 addresses, but prefer IPv6 addresses. We should use ClientPreferIPv6ORPort to tell us whether to prioritise using guards that support IPv6 over those that do not. [Section:FILTERED] would be a good place for this. I'd say it like this:
If ClientPreferIPv6ORPort is 1, and UseBridges is 0, it has an IPv6 ORPort
(this restriction must be ignored if fewer than {MEANINGFUL_RESTRICTION_FRAC} of {set:GUARDS} have an IPv6 ORPort),
That way, we make sure every guard is usable. But we also make sure that Tor still works on IPv4-only networks.
I considered modifying [Section:CONFIRMED] or [Section:PRIMARY] instead, but it got very complicated, very quickly. But perhaps we should consider using it as one of the ordering criteria as well, particularly if the current proportion of IPv6 guards is close to {MEANINGFUL_RESTRICTION_FRAC}.
No stable Tor release ever used ClientPreferIPv6DirPort for anything. It was introduced in 0.2.8.1-alpha, and effectively ignored in 0.2.8.4-rc when we switched to begindir for clients. We'll get rid of it in #19704 (moved).
Be careful about defining "public information"
When the proposal says: "(We do not randomize this, since it is public information.)", it's subtly wrong. While the information about which consensus a guard first went missing in is public, the information that this tor client was running and downloaded a consensus on that date is not.
In particular, if the guard went missing at the last hour on that date, it could be used to determine whether a user's Tor Browser was open in that particular hour. Combine several sets of information like this, and an adversary could generate a time profile.
No Reachable Guards
[XX Problem point: we have the potential, given a sufficientlyrestrictive filter, to have very few sampled guards in ourfiltered set.]
I suggest we warn the user, create a different state instance "D. Pathologically Restricted". We then make sure we initialise that state with enough usable guards. One way of doing this is to repeatedly re-sample, clearing it if necessary. This is ok on initialisation, because no-one knows our previous choices.
We could make it non-persistent to avoid leaking information about our previous restrictions, at the cost of potentially choosing an evil guard. Or we could make the opposite choice, and leak info, but avoid evil guards.
If we have multiple pathological environments, or all our previous guards go down in a pathological environment, then we could throw away our previous choices and start again. This exposes us to the risk of an evil guard, but avoids information leaks.
It's a tradeoff. I'm not sure which way to go. But we should at least allow users who only care about circumvention to use Tor in restrictive environments. If we don't do this, they will complain. Loudly.
Minor Issues
- Its bridge status matches the value of UseBridges.
What is a "bridge status"?
I suggest: The bridge status of a guard is 1 if it is defined using a Bridge line, and 0 otherwise.
There's no mention in the proposal of clients which can reach both IPv4 and IPv6 addresses, but prefer IPv6 addresses. We should use ClientPreferIPv6ORPort to tell us whether to prioritise using guards that support IPv6 over those that do not. [Section:FILTERED] would be a good place for this. I'd say it like this:
If ClientPreferIPv6ORPort is 1, and UseBridges is 0, it has an IPv6 ORPort
(this restriction must be ignored if fewer than {MEANINGFUL_RESTRICTION_FRAC} of {set:GUARDS} have an IPv6 ORPort),
That way, we make sure every guard is usable. But we also make sure that Tor still works on IPv4-only networks.
I considered modifying [Section:CONFIRMED] or [Section:PRIMARY] instead, but it got very complicated, very quickly. But perhaps we should consider using it as one of the ordering criteria as well, particularly if the current proportion of IPv6 guards is close to {MEANINGFUL_RESTRICTION_FRAC}.
Hmm, I'm not sure this will work, particularly for bridge clients, which only know either an IPv4 or an IPv6 address for their bridge. (Or a pluggable transport address, some of which are fake - but that is another matter.)
So we'll need to change the ordering in [Section:CONFIRMED] so either:
all guards with preferred addresses are higher priority than all guards without preferred addresses, or
within any grouping, guards with preferred addresses are higher priority than guards without preferred addresses. (I think this is the option we want.)
I think we might also want to make a similar modification to [Section:PRIMARY].