Right now only three dir auths put BadExit in their known-flags, so it takes any 2 of those 3 to give a relay the BadExit flag, which causes an exit relay to not be used by clients for exiting. This is a great convenience for the dir auth operators, since otherwise we'd have to get a majority of all nine (i.e. five) dir auth operators to declare that a relay shouldn't be used for exiting, and we'd be much less agile in response to detected bad behavior.
In comparison, all nine dir auths put Valid in their known-flags, so it takes a full 5 of the 9 to give a relay the Valid flag -- or said another way, it takes a full 5 of the 9 to take it away.
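To make the arithmetic concrete, here is a minimal sketch of the rule as described above: a flag is decided only by the dir auths that list it in their known-flags, and a strict majority of those is needed. The vote layout (`known_flags`, `relay_flags`) and the function name are made up for illustration and are not Tor's actual data structures.

```python
def flag_in_consensus(flag, votes):
    """Return True if a relay gets `flag`.

    `votes` holds one dict per dir auth (hypothetical layout):
      'known_flags': flags this authority votes about
      'relay_flags': flags it assigned to the relay in question
    Only authorities listing the flag in known_flags are counted.
    """
    listing = [v for v in votes if flag in v["known_flags"]]
    if not listing:
        return False
    in_favor = sum(1 for v in listing if flag in v["relay_flags"])
    # Strict majority of the authorities that vote about this flag.
    return in_favor > len(listing) / 2


def make_votes(total, listing, in_favor, flag):
    """Build `total` votes where `listing` auths know `flag` and
    `in_favor` of those assign it to the relay."""
    votes = []
    for i in range(total):
        known = {flag} if i < listing else set()
        assigned = {flag} if i < in_favor else set()
        votes.append({"known_flags": known, "relay_flags": assigned})
    return votes


# 2 of the 3 BadExit-voting auths is enough; 4 of 9 is not enough for Valid.
print(flag_in_consensus("BadExit", make_votes(9, 3, 2, "BadExit")))
print(flag_in_consensus("Valid", make_votes(9, 9, 4, "Valid")))
```

The six authorities that do not list BadExit simply do not count toward that flag's denominator, which is why 2 votes suffice there while Valid needs 5.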
In the context of malicious HSDir roles, this lack of agility is hurting us. We should explore ways to make !invalid more like !badexit.
One option is to have some dir auths just decide they won't vote about Valid (we add another config option just like AuthDirListBadExits). Then the decision about which relays get the Valid flag falls to a subset of the dir auths. Shazam, I think we're there.
I worry though that some of the steps we've taken to de-fang non-Valid relays won't just magically come along there. For example, we withhold the HSDir flag if we withhold the Valid flag (#16524 (moved)), but if 3 authorities vote about Valid, and two of them deciding to withhold Valid is enough for the relay to not be Valid, yet 7 of them remain voting yes on HSDir, then the relay will end up with the HSDir flag even if it doesn't have the Valid flag.
One fix there would be to teach all Tors to pretend that the HSDir flag doesn't count if there isn't a Valid flag.
Another fix would be to make a new consensus method that knows what's going on, and everybody agrees that if the consensus is going to say this relay isn't Valid, then the consensus should also say that the relay isn't HSDir, isn't Guard, etc. This fix seems more likely to be done right.
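As a rough illustration of that second fix, the sketch below tallies each flag among the authorities that vote about it, then strips the dependent flags whenever the result lacks Valid. The vote layout, the function names, and the contents of `DEPENDS_ON_VALID` are assumptions for the example (the thread only names HSDir and Guard, "etc."); this is not how any existing consensus method is implemented.

```python
# Hypothetical set of flags that should never appear without Valid.
DEPENDS_ON_VALID = {"HSDir", "Guard"}


def tally(votes):
    """Per-flag strict majority among authorities listing that flag."""
    all_flags = set().union(*(v["known_flags"] for v in votes))
    result = set()
    for flag in all_flags:
        listing = [v for v in votes if flag in v["known_flags"]]
        in_favor = sum(1 for v in listing if flag in v["relay_flags"])
        if in_favor > len(listing) / 2:
            result.add(flag)
    return result


def apply_valid_dependency(consensus_flags):
    """Drop flags that depend on Valid when the relay is not Valid."""
    if "Valid" in consensus_flags:
        return set(consensus_flags)
    return set(consensus_flags) - DEPENDS_ON_VALID


# Scenario from the comment above: 9 auths, only auths 0-2 vote about
# Valid, auths 1 and 2 withhold Valid (and HSDir), the other 7 say HSDir.
votes = []
for i in range(9):
    known = {"HSDir"} | ({"Valid"} if i < 3 else set())
    assigned = set()
    if i not in (1, 2):
        assigned.add("HSDir")
    if i == 0:
        assigned.add("Valid")
    votes.append({"known_flags": known, "relay_flags": assigned})

raw = tally(votes)                   # HSDir survives without Valid
fixed = apply_valid_dependency(raw)  # dependent flags removed
print(raw, fixed)
```

Doing this at consensus-generation time means every client sees the same flags, rather than each Tor version having to reinterpret HSDir on its own.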
Seems like we would have to relax the HSDir and Guard flag requirement to NOT require Valid if your dirauth has AuthDirListValid 0. Aren't we losing the concept of a "majority" of all dirauths? Here is an example:
Let's assume 3 out of 9 have Valid in their known-flags. This means that the other 6 dirauths will NOT vote on Valid, and thus will vote for HSDir and Guard without caring whether a relay is valid or not (because it's not their "job").
Now voting happens. We have 3 dirauths saying that X relays are invalid (flag majority 3/3), so the other dirauths do not put them in the consensus, as they are invalid with enough votes. Thus the rest are Valid.
This basically means that 2/3 dirauths (a majority) can choose which relays are Guard/HSDir or not, since they can simply boot any relay they want out of the consensus. Isn't this making the 6 other dirauths quite useless? Two colluding dirauths here can control the whole network (as with BadExit, but that's less scary than removing a node from the network).
As much as I want a way for us to remove invalid relays fast, this seems like insane pressure on a few dirauth operators and a not very fun addition to our network security?
I agree with your analysis, but I'm not sure we don't have that situation already with bwauths. I guess to a slightly lesser extent, because we have 5 of those - otoh, it's even easier to blame malice on bwauth bugs because there are so many. Going from 5 to 3 makes me really sad, tho.
Well, awesome. Let's have 5 dir auths vote about Valid, then? And do the new consensus param so everybody agrees to take away other flags when a relay doesn't deserve Valid?
Based on the last three months of blocking malicious HSDirs, I count 6 dir auths that respond to the reject rules in < 3 days. I think that, of those, 3 or 4 apply the rules within a few hours of a push to the dirauth-conf git.
With 5 dir auths having the Valid flag in known-flags, the majority is 3/5, which is one more than in the original proposal here. But we have at least 4 dir auths that respond fast, so I propose we instead have 6 dir auths with Valid in known-flags, which moves the required majority to 4/6 (which is good because we have 4 auths that are fast).
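For comparison, the thresholds discussed in this thread can be checked with a few lines. This is only a sketch: it assumes a strict majority of the auths listing Valid is required, takes the "4 fast auths" figure from the comment above, and assumes all of the fast auths are among those voting about Valid. The names are made up for illustration.

```python
FAST_AUTHS = 4  # auths reported above as applying reject rules quickly


def majority(voters):
    """Smallest strict majority of `voters`."""
    return voters // 2 + 1


def summarize(voter_counts):
    """Map number of Valid-voting auths -> (votes needed,
    whether the fast auths alone could reach that threshold)."""
    return {n: (majority(n), majority(n) <= FAST_AUTHS) for n in voter_counts}


table = summarize([3, 5, 6, 9])
for n, (needed, fast_enough) in table.items():
    print(f"{n} auths vote on Valid: need {needed}, "
          f"fast auths alone suffice: {fast_enough}")
```

The same number that measures agility (how few auths must act) also measures how few would need to collude, which is the tradeoff being weighed here.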
I'm not sure why it still weirds me out that 4 dir auths could choose who is out, whereas right now we need a majority of 5/9... The plus side is that 4 dir auths colluding is (maybe) unlikely.