Opened 12 years ago

Last modified 7 years ago

#547 closed defect (Fixed)

consensus with very few running routers

Reported by: weasel
Owned by:
Priority: Low
Milestone:
Component: Core Tor/Tor
Version: 0.2.0.9-alpha
Severity:
Keywords:
Cc: weasel, nickm, arma
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

When a majority of the authorities upgrade/restart just when it's time
to vote, they will build votes with only themselves marked as running.
This causes consensus documents with very few, if any, routers marked
as running. In such cases it'd probably be smarter to continue using an old
consensus.

Maybe the solution is to not vote when you have only been running for a
few (5 to 20?) minutes.

[Automatically added by flyspray2trac: Operating System: All]

Change History (12)

comment:1 Changed 12 years ago by nickm

Alternatively, we could have them vote, but not vote on the Running flag.

comment:2 Changed 12 years ago by nickm

12:01 < weasel> hah. I was going to suggest that also, but then figured that
                hey, if you've not been around recently then chances are your
                opinion of guard, fast, stable etc aren't all that good either
12:02 < nickm> Hmmm. I'd suggest that we decline to vote on lots of flags as a
               current cut. In a later voting method, we can add a way to
               say "I abstain from everything." How's that?

comment:3 Changed 12 years ago by arma

Based on dirserv_test_reachability()'s comments, it takes 1280 seconds before
we've measured everybody.

(We also launch a burst to measure every single router we know about at startup,
but it's not clear how effective that actually is.)
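
The 1280-second figure is what you would get from a round-robin schedule that splits the known routers into 128 buckets and tests one bucket every 10 seconds. That decomposition is an assumption here, not something stated in this ticket, and the names `NUM_BUCKETS`, `TEST_INTERVAL`, `bucket_for`, and `routers_due` are hypothetical. A minimal sketch of such a schedule:

```python
from typing import List

# Assumed parameters: 128 buckets, one bucket tested every 10 seconds.
NUM_BUCKETS = 128
TEST_INTERVAL = 10  # seconds between calls


def bucket_for(identity_digest: bytes) -> int:
    """Map a router's identity digest to a bucket (hypothetical rule)."""
    return identity_digest[0] % NUM_BUCKETS


def seconds_for_full_pass() -> int:
    """Time until every bucket has been tested once."""
    return NUM_BUCKETS * TEST_INTERVAL


def routers_due(routers: List[bytes], tick: int) -> List[bytes]:
    """Routers to test on the given tick, where tick counts the calls so far."""
    current = tick % NUM_BUCKETS
    return [r for r in routers if bucket_for(r) == current]
```

Under these assumptions an authority that has just started needs a full pass of `seconds_for_full_pass()` seconds before its Running opinions cover every router, which is why a warm-up period on the order of 20 to 30 minutes keeps coming up in this ticket.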

comment:4 Changed 12 years ago by nickm

So, the current flags are:

Authority Exit Fast Guard HSDir Running Stable V2Dir Valid BadExit Named Unnamed.

Which shall we decline to vote on if we haven't been around long enough?

comment:5 Changed 12 years ago by arma

Anything related to having descriptors, since we're probably in the process
of fetching newer ones from the other authorities.

To a first approximation, that's "all of them". To a second, it's "all but Authority".

comment:6 Changed 12 years ago by nickm

Ugh. Can we do better maybe? It's not uncommon to stop an authority and restart it immediately. If we do that,
the descriptors we have are still valid.

How about this:
If we've just come up after being down at least X hours, vote only on Authority.
If we've just come up after being down less than X hours, vote on everything but Running.
If we've been up for at least 30 minutes, vote on everything.

Assuming this is sensible, the question becomes: what is X, and how do we detect it? One solution is to look at
the time in our state file when we first start. Are there better ones?
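
The three rules above can be sketched as a single function. This is an illustration of the proposal, not the code that went into Tor: `flags_to_vote_on`, `DOWN_THRESHOLD`, and `WARMUP` are hypothetical names, the value of X is a placeholder, and `last_state_write` stands in for "the time in our state file when we first start".

```python
HOUR = 3600

# Placeholder for "X hours"; the comment leaves the value open.
DOWN_THRESHOLD = 1 * HOUR
# Uptime after which the authority votes on everything.
WARMUP = 30 * 60

# The flags listed earlier in this ticket.
ALL_FLAGS = frozenset({
    "Authority", "Exit", "Fast", "Guard", "HSDir", "Running",
    "Stable", "V2Dir", "Valid", "BadExit", "Named", "Unnamed",
})


def flags_to_vote_on(now: int, started_at: int, last_state_write: int) -> frozenset:
    """Return the set of flags a freshly (re)started authority should vote on.

    All arguments are Unix timestamps in seconds. last_state_write is the
    timestamp read from the state file at startup, used to estimate downtime.
    """
    uptime = now - started_at
    downtime = started_at - last_state_write

    if uptime >= WARMUP:
        # Up long enough to have tested reachability: vote on everything.
        return ALL_FLAGS
    if downtime >= DOWN_THRESHOLD:
        # Long outage: descriptors are probably stale, vote only on Authority.
        return frozenset({"Authority"})
    # Quick restart: descriptors are still valid, only Running is unknown.
    return ALL_FLAGS - {"Running"}
```

The uptime check comes first so that a long outage stops mattering once the authority has been back for 30 minutes.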

comment:7 Changed 12 years ago by arma

A value of X=1 seems plausible. Either you are restarting quickly, or you've probably
been down for a good while.

The cached-descriptors.new file gets written more often than state. Both are sort of
a kludge though.

Why are we trying to do better again? Eventually we will have enough stable v3
authorities that having a few of them not express preferences every so often should
be fine. This feels like complexity that we can do without.

comment:8 Changed 12 years ago by nickm

I'm thinking about the case where an important patch is released and a bunch of authorities reboot with
a new version at about the same time. But I guess if everybody resynchronizes quickly, that's fine.

Of course, if nobody votes on anything, that's a problem.

comment:9 Changed 12 years ago by arma

If there are no opinions at all, I agree that's a problem.

But if nobody has an opinion about Running, that's just as big of a problem.

And I don't see a way around that.

Maybe if none of the votes express opinions about Running, we refuse to make
a consensus, and hope that the previous one is still useful enough?

We could extend that to 'if none of the voters have opinions about Running,
Guard, Exit, Stable, Fast'.
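
A minimal sketch of that check follows, assuming each vote is reduced to the set of flags its authority was willing to vote on. The function name `should_build_consensus` and the vote representation are hypothetical. The extended rule is ambiguous, so the sketch picks one reading and refuses as soon as any required flag has no voter with an opinion.

```python
from typing import Dict, Iterable, Set


def should_build_consensus(
    votes: Dict[str, Set[str]],
    required: Iterable[str] = ("Running",),
) -> bool:
    """Return False if some required flag is covered by no vote at all.

    votes maps an authority nickname to the flags that authority voted on.
    """
    for flag in required:
        if not any(flag in flags for flags in votes.values()):
            # Nobody has an opinion on this flag: keep the previous consensus.
            return False
    return True
```

For example, if every authority restarted at once and all of them withheld Running, the function returns `False` and the caller would keep serving the previous consensus for as long as it remains usable.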

comment:10 Changed 12 years ago by arma

Looks like Nick snuck a patch for this into r12441, where we vote on everything
but Running for the first 30 minutes.

comment:11 Changed 12 years ago by nickm

flyspray2trac: bug closed.

comment:12 Changed 7 years ago by nickm

Component: Tor Relay → Tor