Opened 8 years ago

Last modified 6 weeks ago

#2681 new enhancement

brainstorm ways to let Tor clients use yesterday's consensus more safely

Reported by: arma
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: prop212, tor-client, small-feature, tor-dos-dirauth, low-bandwidth, sponsor4, sponsor8-maybe, 034-triage-20180328, 034-removed-20180328, 035-removed, sponsor8-removed
Cc:
Actual Points:
Parent ID: #2664
Points: 5
Reviewer:
Sponsor:

Description

Right now Tor clients won't use a consensus that's 25 hours old. But if the directory authorities don't agree on a consensus for a day, things can go bad. We need to investigate other tradeoffs in this space than the one we've currently picked.

For instance: if you got your directory consensus info when it was valid, but you haven't been able to get any new consensus, perhaps you should be more forgiving about the timestamp on the consensus you have. That's a slightly different scenario than believing a new consensus that's 48 hours old.

Another option is just to change 24 to 48, which probably doesn't put clients at much greater harm, but gives us a lot more breathing room for mistakes.

The implementation side of this will be tricky, because we'll need to make sure that clients can handle descriptors that are 36 hours out of date too. We started implementing that feature several times, but I think we've never finished it.
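
The two options in the description can be sketched as one policy check. This is a minimal illustration in Python, not Tor's actual code: the function name, the `fetched_while_valid` flag, and both tolerance constants are assumptions made up for the example, and it assumes staleness is measured from the consensus's valid-until time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tolerances, chosen only to mirror the numbers in this ticket.
STRICT_TOLERANCE = timedelta(hours=24)
RELAXED_TOLERANCE = timedelta(hours=48)


def consensus_is_usable(now, valid_until, fetched_while_valid):
    """Return True if a client may still build circuits with this consensus.

    now, valid_until: timezone-aware datetimes.
    fetched_while_valid: True if the client obtained the document before
    valid_until passed (the "I already had it" case), False if it was
    handed an already-expired document.
    """
    tolerance = RELAXED_TOLERANCE if fetched_while_valid else STRICT_TOLERANCE
    return now <= valid_until + tolerance


valid_until = datetime(2011, 3, 1, 12, 0, tzinfo=timezone.utc)
now = valid_until + timedelta(hours=36)
print(consensus_is_usable(now, valid_until, fetched_while_valid=True))   # True
print(consensus_is_usable(now, valid_until, fetched_while_valid=False))  # False
```

The point of the flag is that the same 36-hour-stale document is accepted when the client held it all along, and rejected when it arrives fresh from a mirror.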

Child Tickets

Ticket  Type         Status          Owner  Summary
#7241   task         closed                 Visualize how quickly the Tor network changes
#7986   enhancement  needs_revision         Lengthen the consensus validity interval

Change History (40)

comment:1 Changed 8 years ago by nickm

Milestone: Tor: 0.2.3.x-final

comment:2 Changed 7 years ago by nickm

Milestone: Tor: 0.2.3.x-final → Tor: unspecified

comment:3 Changed 6 years ago by mikeperry

Keywords: dirauth-dos-resistance proposal-needed added
Milestone: Tor: unspecified → Tor: 0.2.4.x-final

With each new dirauth we add into the kool kids klub, it becomes less likely we'll be able to contact at least half of them within 25 hours in the event of something like the crash in the #2664 description.

What's the simplest thing we can get done on the 0.2.4.x timescale to improve this situation? Can we just bump the consensus and descriptor freshness limits?

In terms of what limit to bump to: I think we need to be able to survive at least a 3 day weekend. If someone were to bring down the dirauths on xmas, thanksgiving, or NYE, we need to not lose the Tor network because a patch couldn't be written in time.

comment:4 Changed 6 years ago by mikeperry

Keywords: MikePerry201210d added

I'll see if I can write something up for the Oct 10th deadline for this. Nick's main concern is that we make sure clients still attempt to fetch a fresh consensus from the dir mirrors/dirauths before using the old one to build circuits.

comment:5 Changed 6 years ago by nickm

Keywords: tor-client added

comment:6 Changed 6 years ago by nickm

Component: Tor Client → Tor

comment:7 Changed 6 years ago by mikeperry

I pushed a proposal draft to my torspec.git remote mikeperry/tolerate-old-consensus.

The proposal is pretty simple, opting for the "just raise the freshness limit" route. I just did a bit of code review to round up all the defines involved in consensus and descriptor freshness and the functions that use them.

comment:8 Changed 6 years ago by arma

Initial thoughts:

  • s/Implementation Nodes/Implementation Notes/
  • It's good we're not trying to do this back in the era of normal descriptors. We throw those out after 24 hours, and we've had some concern in the past that it would be harder to move to a "after 24 hours but not if they're still referenced in a consensus" model.
  • While thinking about this I pondered trying to draw a distinction between "when I asked for a consensus they gave me this old one" and "I haven't been able to fetch a consensus for the past two days, but I still have this old one". The hope was that the former situation is scary ("under attack") but the latter is less scary ("undirected network problems"). But since clients fetch dir stuff via begin_dir these days, I don't think that distinction makes sense -- we can compare the time the relay says it is with the time on the consensus. But if they're much different, what do we do? "Log the possible attack and use it" is not so good.

I miss a discussion of the risk from using a 4-day-old consensus. Right now an adversary can give you his choice of 18 or so consensus documents, and you'll try a couple times to get something better, while using the one you've got. Now he can give you his favorite out of something like 120 consensuses. How much variance is there in them, and what are the characteristics between them that make them 'more vulnerable to attack' or less?
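
As a rough check on the "something like 120" figure: assuming the authorities publish one consensus per hour, the number of documents an adversary can pick from is the acceptance window divided by that interval. The helper name below is made up for illustration, and the sketch does not try to reproduce the "18 or so" estimate for the current window, which depends on how that window is counted.

```python
from datetime import timedelta


def consensuses_in_window(window, interval=timedelta(hours=1)):
    """Count how many consensus documents fall inside an acceptance window,
    assuming one new consensus per `interval`."""
    return window // interval


print(consensuses_in_window(timedelta(days=5)))  # 120
print(consensuses_in_window(timedelta(days=3)))  # 72
```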

We should also make sure clients are asking with the "only give me a consensus if the one you have is newer than this time" option, to save bandwidth all around. (Alas, that's another leak about old client state -- "I'm the client that got its last consensus 36 days ago".)

Overall, I like the idea of bumping up the disaster timeframe. 5 days seems as good as any other choice. I think since some of the logic we're touching is finicky, it'll be smartest to do some testing -- e.g. trigger the conditions in a test network and see what actually happens.

comment:9 Changed 6 years ago by mikeperry

Hrmm. I tried to ponder these imponderables, but I failed to both do that and get my other proposals done on time. Can we just set the limit at 3 days and call this 'small-feature' (or make a different ticket for that and call that one 'small-feature')?

Otherwise, once we start talking about checking consensus on the most current/correct consensus, we probably want something that tries to do multipath consensus hash verification. That seems like a /real/ proposal, as it would solve both this and other, perhaps more interesting attacks (such as https://lists.torproject.org/pipermail/tor-dev/2012-October/004063.html). One simple idea: Ask the k of fallback mirrors from #572 their current consensus hash, and make sure they all agree. They should all be authenticated by their identity key in the source code. Seems like this is a separate ticket for sure, though.
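
The "ask k mirrors and make sure they all agree" idea could look roughly like this. It is only a sketch under stated assumptions: the function names, the use of SHA-256, and the dict-of-answers shape are all hypothetical, and the part where each mirror's answer is authenticated by its identity key is left out entirely.

```python
import hashlib


def consensus_digest(consensus_bytes):
    """Hex digest of a consensus document (SHA-256 chosen for illustration)."""
    return hashlib.sha256(consensus_bytes).hexdigest()


def mirrors_agree(reported_digests, k):
    """Return True only if at least k mirrors answered and every answer
    is the same digest.

    reported_digests: dict mapping a mirror name to the digest it reported,
    or None if that mirror did not answer.
    """
    answers = [d for d in reported_digests.values() if d is not None]
    if len(answers) < k:
        return False
    return len(set(answers)) == 1


same = consensus_digest(b"consensus as seen today")
reports = {"mirror-a": same, "mirror-b": same, "mirror-c": same}
print(mirrors_agree(reports, k=3))  # True

# One mirror reporting a different document breaks agreement.
reports["mirror-c"] = consensus_digest(b"some other consensus")
print(mirrors_agree(reports, k=3))  # False
```

Requiring unanimity is the simplest reading of the comment; a real proposal would have to decide what a client does when one mirror is merely an hour behind.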

comment:10 in reply to:  9 Changed 6 years ago by nickm

Replying to mikeperry:

> Hrmm. I tried to ponder these imponderables, but I failed to both do that and get my other proposals done on time. Can we just set the limit at 3 days and call this 'small-feature' (or make a different ticket for that and call that one 'small-feature')?

I'm pretty leery of calling stuff "small" when our reason for doing so is that if we called it "big" we couldn't merge it on the timeframe we want. That's as many as four tens^W^W^W^W^W motivated reasoning, and that's terrible.

That said, this _does_ feel simple to me. If we get the proposal done soon and the code merged before the big feature deadline, we can try it out. (I don't want to push it to the end of the feature merge schedule, since this kind of thing is prone to having unexpected consequences that could mean more fixing would be needed.)

comment:11 Changed 6 years ago by mikeperry

Ok, which of the two would you prefer? If we're just changing the constant to 3-5 days, I think that proposal is "done" (modulo choosing the freshness duration. I picked 5 days, but 3 is also better than 24 hours).

If we're talking about creating mechanisms to verify consensus material is not targeted and is actually as current as it possibly can be, then we'd need a different (and substantially more complicated) proposal probably involving #572 in combination with some kind of query for the latest consensus creation time and ideally also some kind of "What's your latest consensus's hash" query.

I would like to write that second proposal, because I think it's a neat idea and helps address some other more serious route capture attacks involving dirauth key compromise, but I also probably can't get it done this week, nor will it be as straightforward as just changing these defines to be a bit more relaxed.

comment:12 Changed 6 years ago by nickm

I think the "change it to 3 days" proposal isn't so bad; how about you turn your proposal into a minimal version of that and send it to tor-dev.

The second one seems significantly more complex; it's interesting to consider it for 0.2.5, but it doesn't feel feasible for 0.2.4 right now.

comment:13 Changed 6 years ago by mikeperry

Ok, I created #7126 for the second one.

comment:14 Changed 6 years ago by asn

Just brainstorming here, but I wonder if some kind of metric on how quickly the Tor network changes would help us decide if 3 days is a better interval than 5 days.
By "how quickly the Tor network changes", I mean that if you take a consensus X from 3 days ago and a consensus Y from today, what's the percentage of routers in Y that are also in X (based on identity key)?

Such a metric could be a set of probability distributions that describe how likely it is for the Tor network to change by a specific amount in X days.
So, for example, the probability distributions would tell us stuff like "Based on previous data, the Tor network has 40% chance to change by 20%, in five days." or "The Tor network has 80% chance to change by less than 5%, in one day." or "The Tor network has 40% chance to change by 35%, in two months".
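
A minimal sketch of that metric, assuming each consensus is reduced to the set of relay identity fingerprints it lists. The function names and all sample values below are made up for illustration; real numbers would have to come from archived consensuses.

```python
def overlap_fraction(older_ids, newer_ids):
    """Fraction of routers in the newer consensus that also appear in the
    older one, matched by identity key."""
    newer = set(newer_ids)
    if not newer:
        return 0.0
    return len(newer & set(older_ids)) / len(newer)


def chance_change_below(change_samples, threshold):
    """Empirical probability that the observed change is below threshold.

    change_samples: change fractions (1 - overlap) measured over many
    pairs of consensuses that are the same number of days apart.
    """
    if not change_samples:
        raise ValueError("need at least one sample")
    below = sum(1 for change in change_samples if change < threshold)
    return below / len(change_samples)


consensus_x = {"A", "B", "C", "D"}  # three days ago (toy data)
consensus_y = {"B", "C", "D", "E"}  # today (toy data)
print(overlap_fraction(consensus_x, consensus_y))  # 0.75

one_day_changes = [0.02, 0.04, 0.03, 0.10, 0.01]  # invented samples
print(chance_change_below(one_day_changes, 0.05))  # 0.8
```

With these invented samples the second call reads as "the network has an 80% chance to change by less than 5% in one day", which is the shape of statement the comment is asking for.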

comment:15 Changed 6 years ago by mikeperry

Keywords: small-feature added; MikePerry201210d removed

Hrmm. Not sure if we should make implementing the proposal and any related metrics a new ticket or not. I'm probably going to personally ignore this until whatever the next deadline is, though. Let's just say 'small-feature' for now so we don't forget about it entirely.

comment:16 Changed 6 years ago by nickm

I'd like us to start playing with this soon, just because it's easy. If it causes problems, let's find out about it soon.

comment:17 Changed 6 years ago by nickm

This is indeed small-feature, though we should try to get it done early in the cycle. We should also look into ways to simulate a client not being able to get a new consensus, so we can see how clients degrade without having to wait for the authorities to fail for a while.

comment:18 Changed 6 years ago by nickm

Keywords: prop212 added; proposal-needed removed

comment:19 Changed 6 years ago by nickm

Added #7986 to represent the "just have a slightly longer interval" interim fix.

comment:20 Changed 6 years ago by nickm

Milestone: Tor: 0.2.4.x-final → Tor: 0.2.5.x-final

Bumping to 0.2.5, along with other remaining noncritical enhancements.

comment:21 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.???

comment:22 Changed 3 years ago by mikeperry

Keywords: tor-dos-dirauth added; dirauth-dos-resistance removed

Canonicalize dirauth-dos to tor-dos-dirauth

comment:23 Changed 3 years ago by arma

Keywords: low-bandwidth added
Severity: Normal

comment:24 Changed 3 years ago by nickm

Points: 5

comment:25 Changed 2 years ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:26 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:27 Changed 2 years ago by nickm

Keywords: sponsor4 added

Bulk-adding "sponsor4" keyword to items that would appear to reduce low-bw clients' directory bandwidth usage. But we shouldn't build these without measurement/proposals: see #21205 and #21209.

comment:28 Changed 2 years ago by nickm

Sponsor: Sponsor4

Setting "sponsor4" sponsor on all tickets that have the sponsor4 keyword, but have no sponsor set.

comment:29 Changed 21 months ago by nickm

Remove Sponsor4 keyword, now that Sponsor4 is the value of the Sponsor field.

comment:30 Changed 21 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:31 Changed 21 months ago by nickm

Keywords: sponsor8-maybe added

comment:32 Changed 20 months ago by nickm

Milestone: Tor: unspecified → Tor: 0.3.2.x-final
Sponsor: Sponsor4 → Sponsor8-can

comment:33 Changed 18 months ago by nickm

Milestone: Tor: 0.3.2.x-final → Tor: 0.3.3.x-final

comment:34 Changed 13 months ago by dgoulet

Milestone: Tor: 0.3.3.x-final → Tor: 0.3.4.x-final

comment:35 Changed 11 months ago by nickm

Keywords: 034-triage-20180328 added

comment:36 Changed 11 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:37 Changed 11 months ago by nickm

Milestone: Tor: 0.3.4.x-final → Tor: 0.3.5.x-final

comment:38 Changed 8 months ago by dgoulet

Milestone: Tor: 0.3.5.x-final → Tor: unspecified

This hasn't seen activity in so long, moving it out of 035.

comment:39 Changed 8 months ago by dgoulet

Keywords: 035-removed added

comment:40 Changed 6 weeks ago by gaba

Keywords: sponsor8-removed added
Sponsor: Sponsor8-can