Opened 7 years ago

Closed 7 years ago

Last modified 21 months ago

#7066 closed defect (fixed)

Guard disablement by path-bias detector must be disabled or removed

Reported by: rransom Owned by:
Priority: Immediate Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-client
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Currently, any three dirauths (or any two dirauths if one other one starts voting on the relevant consensus parameters) can configure Tor clients to stop using their entry guards after a very small number of failed circuit-build attempts. This is as bad for a client as having UseEntryGuards disabled entirely.

Tor 0.2.3.x users will be compromised by this without even a log message warning them that they are abandoning entry guards.

This feature must be disabled completely for now or removed. The current path-bias detector code is unsafe to use with any non-zero value of pb_disablepct.

Child Tickets

Change History (15)

comment:1 Changed 7 years ago by mikeperry

As part of Proposal 209, I suggested we remove the ability of 0.2.3.x clients from disabling Guards entirely and promote the logline to warn (and also add an intermediate warn), so we have visibility for when the condition will hit without anything happening.

However, if this bug is instead about ways that rogue consensus parameters can harm the Tor network, we should create a parent ticket or retitle it. There are probably quite a few of them.. That's why we moved to the "median-of-three-or-nothing" strategy, IIRC.

comment:2 Changed 7 years ago by nickm

Status: newneeds_review

Please review branch "neuter_pathbias" in my public repository.

comment:3 Changed 7 years ago by mikeperry

Status: needs_reviewneeds_revision

What, what?

I'm sorry if this seems harsh and abrasive, but why did I bother to bust my ass for the proposal deadline if everything I suggested was just going to be ignored in favor of reactionary ifdefing?

I suggested an alternate plan here (needs merge again, btw):
https://gitweb.torproject.org/user/mikeperry/torspec.git/blob/path-bias-tuning:/proposals/209-path-bias-tuning.txt

On the other hand, if you just want to disable random features that can mess up the network if the directory authorities go rogue on us, you should also ifdef out the circuit build timeout code, the bandwidth authority voting, EWMA, the perconnbwrate stuff, and probably about a dozen other features too.

Either way, I think your commit needs some revision.

comment:4 in reply to:  2 ; Changed 7 years ago by rransom

Replying to nickm:

Please review branch "neuter_pathbias" in my public repository.

The changes/ file is wrong -- the problem isn't that pb_disablepct could keep clients from building circuits at all (although it would have that effect on clients which use bridges); the problem is that pb_disablepct is a user-tracing backdoor which can be used to cause clients to cycle through all the available Guard nodes as entry nodes.

Other than that, looks good for 0.2.3.x (I grepped for “path_bias_disabled” on that branch and didn't find anything else that turns it on). There will be a merge conflict on 0.2.4.x.

comment:5 in reply to:  4 Changed 7 years ago by mikeperry

Replying to rransom:

Replying to nickm:

Please review branch "neuter_pathbias" in my public repository.

The changes/ file is wrong -- the problem isn't that pb_disablepct could keep clients from building circuits at all (although it would have that effect on clients which use bridges); the problem is that pb_disablepct is a user-tracing backdoor which can be used to cause clients to cycle through all the available Guard nodes as entry nodes.

Can you explain the attack vector here? Should we disable all of our consensus parameter support?

Other than that, looks good for 0.2.3.x (I grepped for “path_bias_disabled” on that branch and didn't find anything else that turns it on). There will be a merge conflict on 0.2.4.x.

I do not believe this is the right way to disable the guard rotation features for 0.2.3.x. Please see my proposal, particularly the last section "Implementation Notes: Differences between proposal and current source".

comment:6 Changed 7 years ago by arma

nickm asked me to opine on this one even though i haven't read all the components yet. here are my early thoughts:

  • we definitely should not drop guards that go below whatever threshold, in 0.2.3.
  • i see these warnings in 0.2.4 in practice when my network gets flaky. i assume that means we missed spots in the code where the code concludes that the circuit failed and it's the guard's fault, but actually the circuit failed and it's the network's fault. i believe these are likely bugs in the cbt stuff too. we should at some point aim to refactor all the code that cbt touches so we get our behavior wrong in fewer places.
  • since i see these warns in 0.2.4, and therefore assuming that we are counting wrong, i am nervous about putting the logs into 0.2.3. i won't object to doing so if others really want to, but we should expect an ongoing pile of "bug" reports culminating in a good chance that we'll decide to take them out later as a backport.
  • if we aren't going to fix the counting bugs in 0.2.3 (i don't think we are anytime soon), i wonder if, when we *do* turn things on in 0.2.4 to experiment with them, we will end up wishing we didn't trigger false warnings in 0.2.3. does that mean we'll want different param names or something?
  • i'm really scared of the "dos the network and get people to dump their guards" attack. based on tariq's work i'm already planning to argue that we should bump up the guard rotation period to 6 months or more. anything that reduces that period, especially in a way that's adversary-influenceable, produces a very real attack on anonymity, from a very realistic adversary. that said, the adversary mike describes is a very realistic one too. i should read the proposal in more detail before deciding which one i'm more scared by.
  • i don't yet have an opinion on rransom's worry about "three authorities can set it to screw you". on first glance it looks like we don't have any other consensus params for which three auths can get together to harm anonymity. so we may in fact want to set tighter bounds on what failure percentages should cause a guard drop -- but i can't really speculate about that until we've resolved more of the issues i raise here. i wouldn't be opposed to some scheme where 3 auths acting by themselves are insufficient to set a consensus param; that should be a separate ticket if we want it.

comment:7 in reply to:  6 Changed 7 years ago by arma

Replying to arma:

  • since i see these warns in 0.2.4, and therefore assuming that we are counting wrong, i am nervous about putting the logs into 0.2.3. i won't object to doing so if others really want to, but we should expect an ongoing pile of "bug" reports culminating in a good chance that we'll decide to take them out later as a backport.

After sleeping a bit, I am less enthusiastic about putting the warnings in 0.2.3. The reason is that we already know they trigger (meaning there are bugs in how they're counted). And the warnings themselves don't really help us find the bugs (otherwise I'd've given you more hints than "it's when my network is flaky"). So if we want to put the warnings in 0.2.3, we should also plan to backport whatever fixes it takes to make the warnings only trigger when they should. That's likely more than we should want to backport.

To clarify, 0.2.3 is stable now. We're doing one last release, real soon now, and then calling it stable and focusing on 0.2.4.

comment:8 Changed 7 years ago by mikeperry

What warnings do you see again? There's a difference between the LD_BUG lines saying "This is a place where we might have miscounted/Tor circuit creation did something strange", and the "Your guard failed too many circuits" warnings...

Also, for the record, in every case where we've seen these overcounting messages, I've immediately dropped everything to diagnose them and at least write code to ensure they did not impact results. In fact, to my knowledge, due to the path_state_t state transition checks, none of the LD_BUG lines actually indicate overcounting/undercounting anymore. They merely mean we detected a weird state transition where we *would* have miscounted, had we not performed the state check.

Finally, I don't get how the guard rotation issue is comparable to the path bias attack. Tariq's paper calls your client's Guard list "compromised" as soon as you get one malicious Guard. I'm having a hard time seeing what attacks a malicious guard could perform that are more severe than this route capture attack...

The path bias defense isn't perfect, but what it *does* do is impede deployment of a Tor-busting router module by real world adversaries (ie Cisco and Bluecoat) that can fully deanonymize Tor for targeted users. The device works like this: You turn on a module in your Cisco router/Bluecoat filter/whatever that blocks specific users' access to all Tor guards/bridges except those that you control/compromise/issue NSLs to get the keys for. You load those Guard keys onto the device, which is able to tag all Tor circuits to those guards so that they automatically fail, unless they also extend to exits (or hidden service end points) whose keys are also under you control.

I'm not saying this warrants turning on the guard dropping defense in 0.2.3.x. What I believe it *does* warrant is getting it deployed as soon as possible so that such Tor-busting router modules do *not* get created. I think this means using the log lines in 0.2.3.x to help us fine tune the defense for 0.2.4.x (and perhaps decide if it is worth upgrading the bw auths to measure ambient circuit failure if it is found to vary considerably). Otherwise, we won't have anything deployed for regular users at least until 0.2.5.x is stable, and by then, such Tor-busting router modules surely will exist (if they don't already).

Obviously, the real solution to this mess is upgrading our protocols to remove malleability. However, I'm not optimistic we're going to have any clear notion on how to proceed on that front for several major Tor releases, and then there's the issue of waiting for whole the network to upgrade (including malicious Guards), while also avoiding any downgrade attacks. Unfortunately, I'm a shitty cryptographer, so instead of working on that front, I opted to try to buy us time to get that work+transition completed...

comment:9 in reply to:  8 ; Changed 7 years ago by arma

Replying to mikeperry:

What warnings do you see again? There's a difference between the LD_BUG lines saying "This is a place where we might have miscounted/Tor circuit creation did something strange", and the "Your guard failed too many circuits" warnings...

The latter.

comment:10 in reply to:  9 Changed 7 years ago by arma

Replying to arma:

Replying to mikeperry:

What warnings do you see again? There's a difference between the LD_BUG lines saying "This is a place where we might have miscounted/Tor circuit creation did something strange", and the "Your guard failed too many circuits" warnings...

The latter.

I believe it was this one:

      log_notice(LD_PROTOCOL,
                 "Low circuit success rate %u/%u for guard %s=%s.",

comment:11 in reply to:  6 Changed 7 years ago by arma

Replying to arma:

  • i don't yet have an opinion on rransom's worry about "three authorities can set it to screw you". on first glance it looks like we don't have any other consensus params for which three auths can get together to harm anonymity. so we may in fact want to set tighter bounds on what failure percentages should cause a guard drop -- but i can't really speculate about that until we've resolved more of the issues i raise here. i wouldn't be opposed to some scheme where 3 auths acting by themselves are insufficient to set a consensus param; that should be a separate ticket if we want it.

I think the short-term hack here can simply be for all dir auths to vote for "off" on this param. Then there's little risk that just 3 of them can force it to be something unsafe.

That said: if there's a bug that makes the current counts wrong, then if we don't rip pb_disablepct out of 0.2.3, we'll need to rename the consensus param (use a new one) in the future after we've fixed that bug, or the 0.2.3.22 people will be rotating guards way more often than they should.

Actually, since the feature is already in 0.2.3.x for earlier values of x, maybe it's too late, and the pb_disablepct name is already likely to be a lost cause?

comment:12 Changed 7 years ago by arma

the proposed plan is to leave 0.2.3 code alone, make all the auths vote 0 on disablepct, change the logs in 0.2.4 to make it more helpful when the bug occurs, and once the bug is identified and fixed, rewrite history so disablepct never existed, and think about whether to fix the bugs in 0.2.3 or 0.2.4, and maybe eventually come up with a consensus param to do what disablepct does now.

(unless there is no bug, in which case we get to keep disablepct and everybody lives happily ever after)

also, when i say 'the bug' i mean 'all the bugs'

mike and nick say they're ok with this plan.

comment:13 in reply to:  12 Changed 7 years ago by arma

Replying to arma:

make all the auths vote 0 on disablepct

I mailed the dir auths to ask them to do so.

comment:14 in reply to:  12 Changed 7 years ago by arma

Resolution: fixed
Status: needs_revisionclosed

Replying to arma:

change the logs in 0.2.4 to make it more helpful when the bug occurs

I opened #7157 for this part of the plan.

Closing this ticket now as resolved.

comment:15 Changed 21 months ago by teor

Severity: Normal

Set all tickets without a severity to "Normal"

Note: See TracTickets for help on using tickets.