Opened 6 years ago

Closed 18 months ago

#5462 closed defect (fixed)

Clients should alert the user if many guards are unreachable

Reported by: mikeperry
Owned by:
Priority: Medium
Milestone: Tor: 0.3.0.x-final
Component: Core Tor/Tor
Version: Tor: 0.2.7
Severity: Normal
Keywords: tor-client, large-feature, path-bias, 026-triaged-1, 027-triaged-1-in, 028-triaged, prop259, tor-guard, tor-guards-revamp, tor-03-unspecified-201612
Cc: asn
Actual Points:
Parent ID:
Points: 3
Reviewer:
Sponsor: SponsorU-can

Description

If the user is behind a restrictive firewall, in a censored location, or is otherwise restricted in the number of guards they can use, the Tor Client should inform them of this fact.

Depending upon the rate of guard failure, tor should emit either a notice or a warn.

We should probably also perform a quick check to see if all guards are on a small subset of non-default ports, or perhaps just 80 or 443.
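
As a rough illustration of the severity logic the description asks for, here is a minimal standalone sketch. The names, the 30%/70% thresholds, and the 80/443 check are all invented for this example; none of this is actual Tor code:

{{{
/* Illustrative only: invented names and thresholds, not actual Tor code.
 * Choose a log severity based on the fraction of consensus-Running
 * guards we failed to reach, and flag the case where every reachable
 * guard is on port 80 or 443 (suggesting a restrictive firewall). */
#include <stdio.h>

#define NOTICE_THRESHOLD 0.30 /* >30% unreachable: notice */
#define WARN_THRESHOLD   0.70 /* >70% unreachable: warn */

void
alert_on_guard_reachability(int n_running, int n_unreachable,
                            int n_reachable_on_80_443)
{
  if (n_running == 0)
    return;
  double frac = (double)n_unreachable / n_running;
  int n_reachable = n_running - n_unreachable;
  if (frac >= WARN_THRESHOLD) {
    fprintf(stderr, "[warn] %d/%d guards listed as Running are "
            "unreachable. A firewall or censor may be restricting your "
            "connectivity.\n", n_unreachable, n_running);
  } else if (frac >= NOTICE_THRESHOLD) {
    fprintf(stderr, "[notice] %d/%d guards listed as Running are "
            "unreachable.\n", n_unreachable, n_running);
  }
  if (n_reachable > 0 && n_reachable_on_80_443 == n_reachable) {
    fprintf(stderr, "[notice] All reachable guards are on ports 80/443; "
            "other ports may be filtered.\n");
  }
}
}}}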

Child Tickets

Change History (38)

comment:1 Changed 6 years ago by arma

Summary: Clients should altert the user if many guards are unreachable → Clients should alert the user if many guards are unreachable

comment:2 Changed 6 years ago by arma

We've had this idea in mind for years now. But nobody has had a good plan for how to actually decide when to alert. Imo we need to *not* alert when the user simply doesn't have her network connection up.

comment:3 Changed 6 years ago by nickm

Agreed. So let's see. I'll babble a little and perhaps come up with a solution at the end of it:

We want to detect when we can get through to only a subset of possible guards that should be up. We want to distinguish this from the case where our network has come up. So, let's do it like this: let's keep track of the status of our last N connection attempts to servers listed as Running. If some sufficient fraction of those is "failed", and the most recent attempt is "succeeded", we might be just coming up: we can reattempt the connections to guards that had failed. If they start succeeding, we were only down; that's fine. If they still fail but the successful connections are "working", then we should warn.

Is that a plausible idea? It would need some nailing-down before it could be called an actual spec.
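
A minimal standalone sketch of this heuristic, with invented names, window size, and threshold (this is not actual Tor code or API):

{{{
/* Track the last N connection attempts to Running relays in a ring
 * buffer and flag the "network is up but most guards fail" condition.
 * ATTEMPT_WINDOW and FAIL_FRACTION_WARN are made up for illustration. */
#include <stdbool.h>

#define ATTEMPT_WINDOW 20        /* N: how many recent attempts we track */
#define FAIL_FRACTION_WARN 0.7   /* fraction of failures that worries us */

typedef struct {
  bool outcomes[ATTEMPT_WINDOW]; /* true = succeeded, false = failed */
  int next;                      /* next slot to overwrite */
  int count;                     /* slots filled so far */
} attempt_history_t;

void
record_attempt(attempt_history_t *h, bool succeeded)
{
  h->outcomes[h->next] = succeeded;
  h->next = (h->next + 1) % ATTEMPT_WINDOW;
  if (h->count < ATTEMPT_WINDOW)
    h->count++;
}

/* True when the most recent attempt succeeded (so the network looks up)
 * yet a large fraction of the window failed: the point at which, per the
 * comment above, we'd retry the failed guards and warn if they still
 * fail. */
bool
should_consider_warning(const attempt_history_t *h)
{
  if (h->count < ATTEMPT_WINDOW)
    return false;                /* not enough data yet */
  int failures = 0;
  for (int i = 0; i < ATTEMPT_WINDOW; i++) {
    if (!h->outcomes[i])
      failures++;
  }
  int last = (h->next + ATTEMPT_WINDOW - 1) % ATTEMPT_WINDOW;
  return h->outcomes[last] &&
         (double)failures / ATTEMPT_WINDOW >= FAIL_FRACTION_WARN;
}
}}}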

comment:4 Changed 6 years ago by nickm

Milestone: Tor: 0.2.4.x-final

Setting to 0.2.4.x. I'd take an obviously-right patch for 0.2.3.x, but I think this will need research and thought.

comment:5 Changed 6 years ago by mikeperry

Proposal 156 addresses the related ideas of detecting unreachable ports, which is a corner case we'll want to avoid mistaking for full Guard bias, if possible: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/156-tracking-blocked-ports.txt

comment:6 Changed 6 years ago by nickm

Keywords: tor-client added

comment:7 Changed 6 years ago by nickm

Component: Tor Client → Tor

comment:8 in reply to:  3 Changed 6 years ago by mikeperry

Replying to nickm:

We want to detect when we can get through to only a subset of possible guards that should be up. We want to distinguish this from the case where our network has come up. So, let's do it like this: let's keep track of the status of our last N connection attempts to servers listed as Running. If some sufficient fraction of those is "failed", and the most recent attempt is "succeeded", we might be just coming up: we can reattempt the connections to guards that had failed. If they start succeeding, we were only down; that's fine. If they still fail but the successful connections are "working", then we should warn.

Hrmm. I'm not sure I like the "last N connections" thing... It seems like a statistical check, and slightly more complicated than it needs to be?

What if we made this a periodic check of all entry guards whenever we detect network activity but the last check was some time ago? Ie: upon successful consensus download we could attempt to connect to every still-running guard in our guard list. That solves the "Is the network live?" question implicitly through valid consensus download, and allows us to check all of our guard nodes in a single pass, and alert the user if a large fraction of them are found to be unreachable. In this way, we can keep the code that does these checks mostly compartmentalized away from the rest of the guard usage code.

To avoid active attacks based on recognizing the obvious signature of directory activity, we could additionally/alternatively perform this check at other obvious signs of network liveness, limited to some semi-periodic interval.

We could also inform the user of the top 2 or 3 most common ORPort numbers for both the reachable and the unreachable sets of guards, and we can also state in this same logline if a large number of guards have been marked unreachable due to path bias.

If we implement a layer of entry guards for bridge users, we can actually do this same accounting for that layer, too. It bothers me that bridge users are basically already forced on to a very restricted set of entry points...

We should also ensure the following properties hold with guard list management:

  1. If we attempt to add a guard and it's unreachable but listed as up in the consensus, we should still keep it in our list of desirable guards.
  2. Any entry guard the consensus lists as running is not removed from our guard list, no matter how long we haven't been able to reach it. We want to keep a record of all the unreachable-but-running guards we've ever tried, I think.
  3. Anything else?
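
A sketch of what the consensus-triggered re-test pass could look like, with a hypothetical guard_t and a caller-supplied probe function; none of this is a real Tor interface:

{{{
/* Walk the guard list after a successful consensus download -- which
 * itself answers "is the network live?" -- and re-probe every guard the
 * consensus lists as Running. Per property (2) above, guards listed as
 * Running are never removed here, however long they've been unreachable. */
#include <stdbool.h>

typedef struct guard_t {
  bool listed_running;  /* consensus currently lists this guard as Running */
  bool reachable;       /* result of our most recent probe */
  struct guard_t *next;
} guard_t;

/* `probe' stands in for launching a real OR connection attempt and
 * waiting for the result. */
void
retest_running_guards(guard_t *guards, bool (*probe)(guard_t *),
                      int *n_tested_out, int *n_unreachable_out)
{
  int tested = 0, unreachable = 0;
  for (guard_t *g = guards; g; g = g->next) {
    if (!g->listed_running)
      continue;              /* not expected to be up; skip, never delete */
    g->reachable = probe(g);
    tested++;
    if (!g->reachable)
      unreachable++;
  }
  *n_tested_out = tested;
  *n_unreachable_out = unreachable;
  /* The caller compares unreachable/tested against a threshold and
   * decides whether to log a notice or a warn. */
}
}}}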

comment:9 Changed 6 years ago by nickm

I like the idea of killing the "last N connections thing" by optimistically retrying when we get a fresh consensus. (Don't we already mark guards as up when we get a consensus that lists them as Running, though? I thought we did something like that. We could change it from "mark them up" to "mark them as to-be-tested-for-upness", I guess?)

I don't like the "re-test all the guards" logic if the number of guards to test could be quite large, though. Can we limit it to some not-too-big number of our apparently-Running guards to try, or will a user with 20 guards they haven't been able to connect to launch 20 connections every time they get a consensus?

That solves the "Is the network live?" question implicitly through valid consensus download,

Hm. I think we need to have other triggers too, as you note, because it's quite possible for Tor to have a fresh consensus but no network yet (like, when unsuspending a laptop with Tor running on it, then connecting to a wireless network or something).

comment:10 in reply to:  9 Changed 6 years ago by mikeperry

Keywords: mumble-feature added

Replying to nickm:

I like the idea of killing the "last N connections thing" by optimistically retrying when we get a fresh consensus. (Don't we already mark guards as up when we get a consensus that lists them as Running, though? I thought we did something like that. We could change it from "mark them up" to "mark them as to-be-tested-for-upness", I guess?)

I don't like the "re-test all the guards" logic if the number of guards to test could be quite large, though. Can we limit it to some not-too-big number of our apparently-Running guards to try, or will a user with 20 guards they haven't been able to connect to launch 20 connections every time they get a consensus?

It sounds reasonable that in this case we should only test a random subset, to avoid DoSing ourselves in case this feature has weird bugs. But if we have more than N=20 guards that the dirauths think are up and we keep trying new ones because most are down, shouldn't this also be a cause for concern?
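
One way to cap the per-consensus probes at a fixed number is to sample a uniform random subset of the candidate guards, e.g. via reservoir sampling. A sketch, with an invented cap, and rand() standing in for Tor's own RNG:

{{{
/* Pick at most max_probes candidate indices uniformly at random, so a
 * client with many apparently-Running-but-unreachable guards doesn't
 * launch a probe storm on every consensus. Invented names and cap; a
 * real implementation would use a crypto-safe RNG, not rand(). */
#include <stdlib.h>

#define MAX_PROBES_PER_CONSENSUS 10

/* Fill chosen[] (size >= max_probes) with up to max_probes indices
 * drawn uniformly from 0..n_candidates-1; returns how many were chosen.
 * Usage: sample_guards_to_probe(n_running, MAX_PROBES_PER_CONSENSUS, buf) */
int
sample_guards_to_probe(int n_candidates, int max_probes, int *chosen)
{
  int kept = 0;
  for (int i = 0; i < n_candidates; i++) {
    if (kept < max_probes) {
      chosen[kept++] = i;          /* fill the reservoir first */
    } else {
      int j = rand() % (i + 1);    /* classic reservoir-sampling step */
      if (j < max_probes)
        chosen[j] = i;
    }
  }
  return kept;
}
}}}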

Related, we should also check if guards only appear to be up during our test and in the consensus, but not during normal use (to avoid active attacks).

That solves the "Is the network live?" question implicitly through valid consensus download,

Hm. I think we need to have other triggers too, as you note, because it's quite possible for Tor to have a fresh consensus but no network yet (like, when unsuspending a laptop with Tor running on it, then connecting to a wireless network or something).

Ok. CBT already has some function calls that tell us when we've recently read data off an orconn. We can use a similar callback for this.
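
A hypothetical shape for that trigger; this is not the real CBT interface, just an illustration of the "recent orconn activity, rate-limited to a semi-periodic interval" idea:

{{{
/* Invented plumbing: each time we read data off an orconn (the same
 * kind of signal CBT already consumes), note the time, and if the last
 * guard reachability check is old enough, schedule a new one. */
#include <time.h>

#define GUARD_RETEST_INTERVAL (6 * 60 * 60) /* invented: every 6 hours */

static time_t last_guard_retest = 0;

void
note_orconn_activity(time_t now, void (*schedule_guard_retest)(void))
{
  if (now - last_guard_retest >= GUARD_RETEST_INTERVAL) {
    last_guard_retest = now;
    schedule_guard_retest();
  }
}
}}}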

comment:11 Changed 6 years ago by nickm

Keywords: large-feature added; mumble-feature removed

This is important and we should do it. But I'm okay with kicking it out of 0.2.4 if it isn't done by the 10 Dec large-feature deadline: it's tricky, and reminds me a lot of CBT and PathBias and how it took a while to get them right.

comment:12 Changed 6 years ago by nickm

Milestone: Tor: 0.2.4.x-final → Tor: 0.2.5.x-final

Doesn't seem likely to get an implementation by the small-feature deadline. Deferring, but please move back if there is an implementation in the works.

comment:13 Changed 5 years ago by mikeperry

Keywords: path-bias added

comment:14 Changed 4 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.6.x-final

With luck the upcoming guard changes will make this easier; with unluck they will make them harder.

comment:15 Changed 4 years ago by nickm

Keywords: 026 added
Parent ID: #5456 → #11480

comment:16 Changed 4 years ago by nickm

Keywords: 026-triaged-1 added; 026 removed

comment:17 Changed 4 years ago by nickm

See also #13989; it is probably the same issue as this one. (Spotted by Mike)

comment:18 Changed 4 years ago by nickm

Milestone: Tor: 0.2.6.x-final → Tor: 0.2.7.x-final

comment:19 Changed 3 years ago by nickm

Status: new → assigned

comment:20 Changed 3 years ago by nickm

Keywords: 027-triaged-1-in added

Marking some tickets as triaged-in for 0.2.7 based on early triage

comment:21 Changed 3 years ago by isabela

Keywords: SponsorU added
Points: medium
Priority: major → critical
Version: Tor: 0.2.7

comment:22 Changed 3 years ago by nickm

Milestone: Tor: 0.2.7.x-final → Tor: 0.2.8.x-final

comment:23 Changed 3 years ago by nickm

Keywords: 028-triaged added

comment:24 Changed 3 years ago by nickm

Keywords: SponsorU removed
Sponsor: SponsorU

Bulk-replace SponsorU keyword with SponsorU field.

comment:25 Changed 3 years ago by nickm

Priority: Very High → Medium

comment:26 Changed 2 years ago by nickm

Milestone: Tor: 0.2.8.x-final → Tor: 0.2.9.x-final
Status: assigned → new

Turn most 0.2.8 "assigned" tickets with no owner into "new" tickets for 0.2.9. Disagree? Find somebody who can do it (maybe you?) and get them to take it on for 0.2.8. :)

comment:27 Changed 2 years ago by isabela

Sponsor: SponsorU → SponsorU-can

comment:28 Changed 2 years ago by nickm

Keywords: prop259 added

These are all prop259-related.

comment:29 Changed 2 years ago by asn

Parent ID: #11480
Severity: Normal

comment:30 Changed 2 years ago by mikeperry

Keywords: tor-guard added

comment:31 Changed 2 years ago by nickm

Keywords: tor-guards-revamp added

comment:32 Changed 2 years ago by isabela

Points: medium → 3

comment:33 Changed 2 years ago by asn

Milestone: Tor: 0.2.9.x-final → Tor: 0.2.???

Pushing this off 0.2.9 since we don't know how to do it right, and it's not in direct scope for prop259.

comment:34 Changed 2 years ago by asn

Cc: asn added

comment:35 Changed 21 months ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:36 Changed 20 months ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:37 Changed 18 months ago by teor

Status: new → needs_information

Was this implemented as part of proposal 217 in #19877?

comment:38 Changed 18 months ago by nickm

Milestone: Tor: unspecified → Tor: 0.3.0.x-final
Resolution: fixed
Status: needs_information → closed

Indeed so!

s/217/271/
