#2664 new enhancement

DoS and failure resistance improvements

We had a near-catastrophe today when an IPv6 relay descriptor took out all of the Tor directory authorities. It took us ~10 hours to correct the issue. The maximum we have before the network breaks for everyone is 28 hours. We need to consider implementing procedures that reduce the turnaround time for diagnosing and fixing cases like this, and that also improve the network's ability to function if we can't bring the authorities back online within 28 hours.

This is the parent ticket for a series of child tickets created to remind us to write actual proposals and procedures.

Child Tickets

Ticket  Type         Status    Owner  Summary
#572    enhancement  closed    -      fallback-consensus file impractical to use
#2665   task         new       -      Create a dirauth DoS response procedure
#2666   task         closed    -      Create a nagios config for dirauths
#2671   task         assigned  nickm  Better communication for authority operators, core developers in emergency situations
#2681   enhancement  new       -      brainstorm ways to let Tor clients use yesterday's consensus more safely
#2693   enhancement  new       -      Design and implement improved algorithm for choosing consensus method
#4339   task         closed    -      Turn on the last part of proposal 110
#4483   defect       closed    teor   If k of n authorities are down, k/n bootstrapping clients are delayed for minutes
