Opened 6 years ago

Closed 4 years ago

#13989 closed enhancement (fixed)

Freak out if we pick too many new guards in too short a time

Reported by: nickm
Owned by:
Priority: High
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.7
Severity: Normal
Keywords: tor-client, guards, unfrozen, prop241, 027-triaged-1-in, 028-triage, 028-triaged, prop259, tor-guard, tor-guards-revamp, isaremoved, tor-03-unspecified-201612
Cc: robgjansen, asn, amj703
Actual Points:
Parent ID:
Points: 4.5
Reviewer:
Sponsor: SponsorU-can


According to the sniper attack paper, we should never really have to pick more than 5 new guards in a 4 week period (I think that's the number). If we do, our network is probably down or filtered or our guards are under attack.

This is going to have some tricky issues. For example, what should we do if we hit this threshold? We could decline to pick circuits until some node we've been willing to use as a guard comes up again, unless the user explicitly tells us to, I guess.

As another issue, we don't currently store exactly when we added a guard, but a randomized version of that. So perhaps we need a fuzzier version of this test.
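A minimal sketch of the check described above, written in Python for brevity (Tor itself is C). The names `NewGuardWindow`, `max_new_guards`, `window`, and `fuzz` are hypothetical, and the defaults of 5 guards in 4 weeks are the tentative numbers from this description, not confirmed values. Because the stored add time is randomized, the sketch assumes the real add time lies within `fuzz` of the stored one and reports two counts: guards certainly added inside the window and guards possibly added inside it.

```python
from datetime import datetime, timedelta


class NewGuardWindow:
    """Track guard add times and flag too many new guards in a short period."""

    def __init__(self, max_new_guards=5, window=timedelta(weeks=4),
                 fuzz=timedelta(0)):
        self.max_new_guards = max_new_guards  # assumed threshold
        self.window = window                  # assumed look-back period
        self.fuzz = fuzz                      # assumed uncertainty of stored times
        self.added = []                       # stored (possibly randomized) add times

    def note_guard_added(self, stored_time):
        self.added.append(stored_time)

    def counts(self, now):
        """Return (certain, possible) numbers of guards added in the window."""
        cutoff = now - self.window
        # Certain: even the earliest possible real add time is inside the window.
        certain = sum(1 for t in self.added if t - self.fuzz >= cutoff)
        # Possible: the latest possible real add time is inside the window.
        possible = sum(1 for t in self.added if t + self.fuzz >= cutoff)
        return certain, possible

    def status(self, now):
        certain, possible = self.counts(now)
        if certain > self.max_new_guards:
            return "over"
        if possible > self.max_new_guards:
            return "maybe-over"
        return "ok"
```

Reporting "maybe-over" separately is one way to get the fuzzier test: the caller could warn on "maybe-over" and reserve any stronger reaction for "over".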

Change History (29)

comment:1 Changed 6 years ago by asn

I agree that hidden service operators should be able to specify their own limit of guard failures before they stop working. This feature can be disabled by default, and operators can enable it according to their own threat model and paranoia state.

FWIW, implementing this feature correctly and accurately might be hard given the current state of our guard data structures (e.g. #12466, #12450 and #12595).

comment:2 Changed 6 years ago by nickm

See also #5462; this is probably the same as that.

comment:3 Changed 6 years ago by asn

Some thoughts:

a) The sniper paper indeed has a table with some "too many new guards" thresholds. We should use it.
However, at the same time, that table was generated by reading past consensuses and simulations, not by running Tor.

Two days ago, I haxed my Tor and changed all the logs in entry_guard_register_connect_status() to log_warn() to see how often we use new guards. Over the past two days, with NumEntryGuards set to 1, I have used four guards: the first was my primary guard, two extras were caused by #12450 when booting my laptop without network, and one more was used for a reason I have not identified.

I also think that this feature should only be enabled if NumEntryGuards is 1, since the guard data structures are much more fragile when the client is using more than 1 guard (allowing bugs like #12466).

b) To make sure that our flaky guard data structures won't bite us here, I suggest we make a branch that just logs how many guards have been used over a day/week/month, and let testers run it for a while. I imagine that different use cases will see different numbers of guards. For example, someone who switches their network on and off a lot while travelling should see many more guards used than someone using Tor on their desktop with a stable network connection.

Till then, maybe you want to check your state file and see which guards you are currently using? Are you using your top ones, or are they marked as EntryGuardDownSince?

c) The thresholds to "freak out" should probably be consensus parameters? So that we can tweak them according to the current guard settings. And they should also be torrc parameters. We should probably have a torrc option on how much to freak out; that is, whether to stop creating circuits, or just to issue a warning.

d) We probably want to do guards_used_n++ right after we successfully connect to a guard, somewhere around entry_guard_register_connect_status(). We probably don't want to wait until the whole circuit is completed, otherwise the other hops can fail the circuit and we would blame the guard.
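Points c) and d) can be sketched roughly as follows, in Python for brevity. Everything named here (`FreakOutAction`, `resolve_threshold`, `GuardUseCounter`, `on_guard_connect`) is hypothetical and is not an existing Tor option or function; `on_guard_connect` merely stands in for a hook near entry_guard_register_connect_status(). The precedence of torrc over consensus and the default of 5 are assumptions, and the sketch counts distinct guards rather than incrementing a bare counter.

```python
from enum import Enum


class FreakOutAction(Enum):
    WARN = "warn"                    # only issue a warning
    STOP_CIRCUITS = "stop-circuits"  # also stop creating circuits


def resolve_threshold(torrc_value, consensus_value, default=5):
    """Assumed precedence: local torrc, then consensus parameter, then default."""
    if torrc_value is not None:
        return torrc_value
    if consensus_value is not None:
        return consensus_value
    return default


class GuardUseCounter:
    def __init__(self, threshold, action=FreakOutAction.WARN,
                 num_entry_guards=1):
        self.threshold = threshold
        self.action = action
        # Per the suggestion above, only active when NumEntryGuards is 1.
        self.enabled = (num_entry_guards == 1)
        self.guards_used = set()
        self.circuits_allowed = True
        self.warnings = []

    def on_guard_connect(self, guard_id, succeeded):
        """Call when a connection to a guard finishes, before circuit building."""
        if not self.enabled or not succeeded:
            return
        if guard_id in self.guards_used:
            return  # already counted; not a new guard
        self.guards_used.add(guard_id)
        if len(self.guards_used) > self.threshold:
            self.warnings.append(
                "used %d guards, threshold is %d"
                % (len(self.guards_used), self.threshold))
            if self.action is FreakOutAction.STOP_CIRCUITS:
                self.circuits_allowed = False
```

Counting at connection time rather than at circuit completion keeps failures of later hops from being attributed to the guard.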

comment:4 Changed 6 years ago by asn

I've been logging my guard use for the past few days, and now I have some more feedback. But first some stats: over the past 3 days, I have used 3 different circuit guards and 8 directory guards.

a) My top circuit guard has been up all the time. The reason I've used 3 different circuit guards is #12450. Worth noting that my internet connection is quite stable; if I were travelling instead, I would probably see #12450 happening much more. I will be travelling in a few weeks and I can do some more measurements then.

b) We use many directory guards because NumDirectoryGuards is 3 and that causes #12466 to happen a lot. I'm still undecided on whether directory guards should be counted as part of this ticket. We could potentially not count them because they are not very useful in e2e correlation attacks (?), but at the same time we end up revealing our IP address to many nodes of the network which is a bad thing.

c) Every day I see about 3-5 log messages like this:

[warn] Connected to new entry guard 'xxx'. Marking earlier entry guards up. 63/76 entry guards usable/new.
[warn] New entry guard was reachable, but closing this connection so we can retry the earlier entry guards.

This is part of the network-down detection: if the connection to a new guard succeeds, we assume that the network is back up and try all guards from the top again. In this case, the connection to the new guard is closed. It's still unclear whether these short connections should be counted as part of this ticket, for reasons similar to those in the point above.
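To compare the two counting policies from point c), the warnings can be tallied from a log file. A rough sketch in Python: the regular expressions match only the two messages quoted above, the function name `tally_new_guard_events` is hypothetical, and pairing each "closing this connection" line with the most recent "Connected to new entry guard" line is an assumption about log ordering.

```python
import re

# Patterns for the two log messages quoted in this comment.
NEW_GUARD_RE = re.compile(
    r"Connected to new entry guard '([^']+)'\. "
    r"Marking earlier entry guards up\. "
    r"(\d+)/(\d+) entry guards usable/new\.")
CLOSED_RE = re.compile(
    r"New entry guard was reachable, but closing this connection")


def tally_new_guard_events(lines):
    """Return (all_new, short_lived) lists of guard names seen in the log.

    all_new:     every guard reported as newly connected.
    short_lived: the subset whose connection was closed right away so that
                 earlier guards could be retried.
    """
    all_new, short_lived = [], []
    last_guard = None
    for line in lines:
        match = NEW_GUARD_RE.search(line)
        if match:
            last_guard = match.group(1)
            all_new.append(last_guard)
            continue
        if CLOSED_RE.search(line) and last_guard is not None:
            short_lived.append(last_guard)
            last_guard = None
    return all_new, short_lived
```

`len(all_new)` gives the count if short connections are included, and `len(all_new) - len(short_lived)` gives the count if they are excluded.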

comment:5 Changed 6 years ago by asn

Nick, please check the branch guard_monitor in my repo:

It just promotes some log messages to warn, so that you can better log your guard usage. I would suggest you run this branch for a few days and check your logs every once in a while.

I have personally added more logs around the code to better understand the various behaviors, but I will leave these additional logs to you. The ones I promoted are the most basic ones, I think.

comment:6 Changed 6 years ago by amj703

Cc: amj703 added

comment:7 Changed 6 years ago by nickm

Keywords: unfrozen added

comment:8 Changed 6 years ago by nickm

Keywords: prop241 added
Type: defect → enhancement

comment:9 Changed 6 years ago by nickm

Milestone: Tor: 0.2.6.x-final → Tor: 0.2.7.x-final

comment:10 Changed 6 years ago by nickm

Status: new → assigned

comment:11 Changed 6 years ago by nickm

Keywords: 027-triaged-1-in added

Marking more tickets as triaged-in for 0.2.7

comment:12 Changed 6 years ago by isabela

Keywords: SponsorU added
Points: medium/large
Priority: normal → critical
Version: Tor: 0.2.7

comment:13 Changed 5 years ago by nickm

Milestone: Tor: 0.2.7.x-final → Tor: 0.2.8.x-final

comment:15 Changed 5 years ago by nickm

Keywords: 028-triaged added

comment:16 Changed 5 years ago by nickm

Keywords: SponsorU removed
Sponsor: SponsorU

Bulk-replace SponsorU keyword with SponsorU field.

comment:17 Changed 5 years ago by nickm

Milestone: Tor: 0.2.8.x-final → Tor: 0.2.9.x-final
Status: assigned → new

Turn most 0.2.8 "assigned" tickets with no owner into "new" tickets for 0.2.9. Disagree? Find somebody who can do it (maybe you?) and get them to take it on for 0.2.8. :)

comment:18 Changed 5 years ago by isabela

Sponsor: SponsorU → SponsorU-can

comment:19 Changed 5 years ago by nickm

Priority: Very High → High

comment:20 Changed 5 years ago by nickm

Keywords: prop259 added

These are all prop259-related.

comment:21 Changed 5 years ago by mikeperry

Keywords: tor-guard added

comment:22 Changed 5 years ago by nickm

Keywords: tor-guards-revamp added

comment:23 Changed 4 years ago by isabela

Points: medium/large → 4.5

comment:24 Changed 4 years ago by asn

Parent ID: #12595
Severity: Normal

Adding #12595 as the parent of this ticket, since prop259 (or any other similar proposal) should aim to also solve this problem (by restricting the total number of guards we connect to).

comment:25 Changed 4 years ago by isabela

Keywords: isaremoved added
Milestone: Tor: 0.2.9.x-final → Tor: 0.2.???

comment:26 Changed 4 years ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:27 Changed 4 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:28 Changed 4 years ago by nickm

Parent ID: #12595 removed

comment:29 Changed 4 years ago by asn

Resolution: fixed
Status: new → closed

prop271 should mitigate this by greatly restricting the number of guards we will ever try. Closing this bug since prop271 got implemented and merged.
