In #17158 (moved), we added a list of default fallback directories to Tor. Most clients will use these fallback directories to bootstrap in preference to the authorities.
After #17840 (moved) is merged, tor clients can bootstrap over IPv6. If the fallback directory has an IPv6 address, IPv6 clients will use it.
Can we check fallback IPv4 and IPv6 addresses regularly using DocTor?
It would be nice to report a summary figure, something like:
"200/250 fallback directories (80%) are reachable"
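A minimal sketch of how such a summary line could be produced, assuming the reachability results have already been collected elsewhere. `summarize_reachability` is a hypothetical helper for illustration, not part of DocTor or Stem.

```python
def summarize_reachability(reachable, total):
    """Format a one-line summary of how many fallback directories responded."""
    if total <= 0:
        raise ValueError('total must be positive')

    percent = round(100.0 * reachable / total)
    return '%i/%i fallback directories (%i%%) are reachable' % (reachable, total, percent)


print(summarize_reachability(200, 250))
```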
Oops, forgot to ask - DocTor is a tool to help directory authority operators. I'm fine with adding a check for the fallback directories but who should be notified about it and what action will be taken? I'm against generating notices unless someone is volunteering to be responsible to resolve the issue.
In general, it's good to know how the figure is trending, and be transparent about it, so can it be logged to #tor-bots every hour?
It's an issue if the figure drops below, say, 50%. We need to update the list of fallback directories in the next point release of every supported tor version. (Much like we update GeoIP.)
That's something I can do, but for redundancy, we should also notify nickm as the Core Tor lead.
Hang on, am I confusing consensus-health with DocTor?
I don't know how the DirAuth infrastructure works.
consensus-health is a website, DocTor is automated alarms. Once upon a time they were a single Java codebase, but they're now separate. To add to the confusion, though, the email list DocTor notifies is still called consensus-health@. ;)
Ok. Sounds like this should be a new daily check that notifies consensus-health@ if we drop below 50%, then someone can make a ticket about it.
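The threshold rule for that daily check could look roughly like the sketch below. It assumes the counts come from some earlier reachability pass; `needs_notice` and its `threshold` parameter are hypothetical names, and sending the email is left out.

```python
def needs_notice(reachable, total, threshold=0.5):
    """Return True when the reachable fraction has dropped below the threshold."""
    if total <= 0:
        raise ValueError('total must be positive')

    return float(reachable) / total < threshold


# 120 of 250 is 48%, which is below the proposed 50% threshold.
if needs_notice(120, 250):
    print('Fewer than half of the fallback directories are reachable')
```

Exactly 50% does not trigger a notice here, since the proposal was to notify when the figure drops below 50%.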
Hi teor, where is the list of fallback directories? Sounds from the other ticket that it isn't merged into tor yet?
Without something to monitor, this ticket is unactionable.
See src/or/fallback_dirs.inc in master for the current list in 0.2.8.1-alpha.
There's a minor update on the ticket that we'll merge later in the alpha series, once the set of changes is significant enough.
Thanks teor! Done, but with a lot of expansion over what you asked. Stem now has a FallbackDirectory class with two methods for getting this information...
FallbackDirectory.from_remote() reads the latest fallback_dirs.inc from gitweb, providing the latest fallback directories in tor's master branch.
FallbackDirectory.from_cache() provides the latest fallback directories Stem has cached. This is only as up-to-date as your Stem release but is quicker and avoids relying on gitweb.
Advantages are...
Stem's descriptor.remote module now puts less load on the directory authorities since it uses fallback directories as well.
Much, much easier to add further scripts that take advantage of the fallback directories.
Running Stem's integ tests with the ONLINE target includes a test that exercises all the fallback directories, notifying us if any are down.
Here's an example script to check the performance of the fallback directories...
```python
import time

from stem.descriptor.remote import DescriptorDownloader, FallbackDirectory

downloader = DescriptorDownloader()

for fallback_directory in FallbackDirectory.from_cache().values():
  start = time.time()
  downloader.get_consensus(endpoints = [(fallback_directory.address, fallback_directory.dir_port)]).run()
  print('Downloading the consensus took %0.2f from %s' % (time.time() - start, fallback_directory.nickname))
```
```
% python example.py
Downloading the consensus took 5.07 from Doedel22
Downloading the consensus took 3.59 from tornoderdednl
Downloading the consensus took 4.16 from Logforme
Downloading the consensus took 6.76 from Doedel21
Downloading the consensus took 5.21 from kitten4
Downloading the consensus took 3.25 from kili
Downloading the consensus took 4.23 from wagner
Downloading the consensus took 3.30 from BabylonNetwork03
Downloading the consensus took 3.50 from kitten2
Downloading the consensus took 3.31 from coby
Downloading the consensus took 5.61 from GrmmlLitavis
Downloading the consensus took 5.05 from Doedel24
Downloading the consensus took 3.60 from BabylonNetwork02
Downloading the consensus took 3.61 from Unnamed
Downloading the consensus took 2.71 from Binnacle
Downloading the consensus took 30.80 from eriador
Downloading the consensus took 6.91 from Doedel26
Downloading the consensus took 3.30 from fluxe4
Downloading the consensus took 3.16 from PedicaboMundi
Downloading the consensus took 3.33 from kitten1
Downloading the consensus took 3.39 from fluxe3
```
Feel free to reopen if you need anything else.
Trac: Status: new to closed; Resolution: N/A to implemented
In #16774 (moved), we added the fallback directories to GETINFO defaults. Tor 0.2.8.1-alpha and later should be able to tell stem the fallback directories this way as well.
FYI, tor currently tries to connect to 3 fallback directories in the first few seconds, then tries an authority. It downloads from the first one that connects, and cancels the others. See #4483 (moved).
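To illustrate only the "download from the first one that connects, and cancel the others" part, here is a rough asyncio sketch. It is not how tor implements this: `fake_connect` simulates a connection with a sleep, `first_successful` is a hypothetical helper, and the staggered timing of the authority attempt is not modelled.

```python
import asyncio


async def fake_connect(name, delay, fails=False):
    """Simulated connection attempt: waits, then returns the name or fails."""
    await asyncio.sleep(delay)
    if fails:
        raise ConnectionError(name)
    return name


async def first_successful(attempts):
    """Run the attempts concurrently, return the first success, cancel the rest."""
    pending = {asyncio.ensure_future(attempt) for attempt in attempts}
    try:
        while pending:
            done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is None:
                    return task.result()
        raise ConnectionError('no directory connected')
    finally:
        # Whatever is still in flight is no longer needed.
        for task in pending:
            task.cancel()


attempts = [
    fake_connect('fallback-a', 0.2),
    fake_connect('fallback-b', 0.01),
    fake_connect('fallback-c', 0.1),
]
print(asyncio.run(first_successful(attempts)))
```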
Downloading the consensus took 30.80 from eriador
That's not good, can doctor please report any fallback directories that take a relatively long amount of time to serve a consensus (like doctor does for the authorities), and report any that take more than 10 seconds?
How can I get on a list that gets this output, or will it appear on IRC in #tor-bots?
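For reference, filtering the timings for slow fallbacks could be as small as the sketch below. The timings are three of the figures from the example run above; `slow_fallbacks` and its `limit` parameter are hypothetical names, not an existing DocTor check.

```python
def slow_fallbacks(timings, limit=10.0):
    """Return (nickname, seconds) pairs exceeding the limit, slowest first."""
    slow = [(name, seconds) for name, seconds in timings.items() if seconds > limit]
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# A few of the download times reported by the example script.
timings = {'Doedel22': 5.07, 'eriador': 30.80, 'Binnacle': 2.71}

for nickname, seconds in slow_fallbacks(timings):
    print('%s took %0.2f seconds to serve a consensus' % (nickname, seconds))
```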
Trac: Resolution: implemented to N/A; Status: closed to reopened
In #16774 (moved), we added the fallback directories to GETINFO defaults. Tor 0.2.8.1-alpha and later should be able to tell stem the fallback directories this way as well.
That's fine. But this implementation doesn't require an active tor instance. For DocTor and other scripts dealing with descriptors having a tor process is an unnecessary hassle.
That's not good, can doctor please report any fallback directories that take a relatively long amount of time to serve a consensus (like doctor does for the authorities), and report any that take more than 10 seconds?
I doubt Nick wants a ticket every time a fallback directory is sluggish. If you're interested in avoiding slow fallback directories, is there any reason not to simply run the script I gave above when picking them?
Sure, split off into #18398 (moved).
Thanks for implementing the IPv4 checks, the IPv6 checks are awaiting #17298 (moved).
Trac: Status: reopened to closed; Resolution: N/A to fixed
So I'm doing that when I pick them, but what if they become slow some time after the release?
Also, in #18812 (moved), we realised that we'd like to check that the fallback's current key matches the one in the source code.
So can you modify DocTor to call a fallback "failed" if:
it doesn't respond to an ORPort request, or
(almost all clients will connect to the ORPort and issue a begindir request)
the key doesn't match the one in the fallback list, or
it takes longer than 15 seconds to serve a consensus
(Are these doable? Is the amount of effort ok?
The current checks are still quite useful.)
It's ok to have a few fallbacks fail.
But I'd like to know when 25% of fallbacks are failing, so that we can update the list in the next point release.
How do I get that email/notification?
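To make the proposed criteria concrete, here is a sketch of the rule only. It assumes the three observations per fallback (ORPort response, fingerprint match, consensus download time) have been gathered by some other means; `is_failed` and `failing_fraction` are hypothetical names.

```python
def is_failed(orport_reachable, fingerprint_matches, consensus_seconds, limit=15.0):
    """Apply the three proposed criteria to one fallback's check results."""
    return (not orport_reachable
            or not fingerprint_matches
            or consensus_seconds > limit)


def failing_fraction(results):
    """results: list of (orport_reachable, fingerprint_matches, consensus_seconds)."""
    if not results:
        raise ValueError('no results to summarize')

    failed = sum(1 for result in results if is_failed(*result))
    return float(failed) / len(results)


# Made-up observations for four fallbacks, purely for illustration.
results = [
    (True, True, 3.2),    # healthy
    (False, True, 4.0),   # ORPort did not respond
    (True, False, 2.5),   # fingerprint changed
    (True, True, 30.8),   # too slow to serve a consensus
]

if failing_fraction(results) >= 0.25:
    print('At least 25% of fallbacks are failing')
```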
Trac: Status: closed to reopened; Resolution: fixed to N/A
Hmmm. We can ping the ORPort but that's about it. DocTor can exercise the DirPort, but nothing besides tor knows how to talk the ORPort protocol. Capability I'd love to have in Stem though. :)
So to be clear, are you asking for an ORPort ping? DirPort usage? Both?
the key doesn't match the one in the fallback list
Which key doesn't match? fallback_dirs.inc includes the address, dir_port, orport, fingerprint, and weight. Not spotting any keys.
it takes longer than 15 seconds to serve a consensus
Sure, can do. I probably won't be getting to this for a while though (pretty busy with nyx).
How do I get that email/notification?
Specify an address and we'll have DocTor send the notices there.
Please ping the IPv4 ORPort, download a consensus from the IPv4 DirPort, and, when #17298 (moved) is done, ping the IPv6 ORPort. (Downloading a consensus from the IPv6 DirPort will be unreliable, and should wait for #18394 (moved). But that's OK, because almost all clients use the ORPort.)
the key doesn't match the one in the fallback list
Which key doesn't match? fallback_dirs.inc includes the address, dir_port, orport, fingerprint, and weight. Not spotting any keys.
The fingerprint.
I guess stem can't check the fingerprint unless it speaks the ORPort protocol?
How do I get that email/notification?
Specify an address and we'll have DocTor send the notices there.
teor2345@gmail.com, and someone else as a backup. I think this should be nickm. He needs to know about failing fallbacks so we can decide whether to do a new point release, even if I'm the one that updates the fallback list.
The fingerprint.
I guess stem can't check the fingerprint unless it speaks the ORPort protocol?
What value are you hoping to get from this? Would checking that the fingerprint matches what's in the consensus do what you're after? Stem validates signatures of a few descriptor types but I don't think that's really what you're after here.
Clients will refuse to connect to a fallback if it's changed its fingerprint from the fingerprint in the hard-coded list.
So yes, comparing the fingerprint in the fallback list to the current one for that IPv4:ORPort and IPv6:IPv6ORPort (if present) would discover this kind of failure. (And the IPv6 check wouldn't need any IPv6 connectivity!)
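A sketch of that comparison, assuming both the hard-coded list and the consensus have already been parsed into plain dictionaries keyed by (address, ORPort). `mismatched_endpoints` is a hypothetical helper, and the addresses and fingerprints below are placeholders rather than real relays.

```python
def mismatched_endpoints(fallbacks, consensus):
    """
    fallbacks: dict of (address, or_port) -> fingerprint from the hard-coded list.
    consensus: dict of (address, or_port) -> fingerprint currently in the consensus.

    Returns the sorted endpoints whose consensus fingerprint differs from the
    hard-coded one, or which are missing from the consensus entirely.
    """
    return sorted(endpoint for endpoint, fingerprint in fallbacks.items()
                  if consensus.get(endpoint) != fingerprint)


# Placeholder data using documentation address ranges.
fallbacks = {
    ('192.0.2.1', 9001): 'A' * 40,
    ('192.0.2.2', 443): 'B' * 40,
    ('2001:db8::1', 9001): 'A' * 40,
}
consensus = {
    ('192.0.2.1', 9001): 'A' * 40,
    ('192.0.2.2', 443): 'C' * 40,   # relay changed its key
    ('2001:db8::1', 9001): 'A' * 40,
}

for address, or_port in mismatched_endpoints(fallbacks, consensus):
    print('Fingerprint mismatch for %s port %i' % (address, or_port))
```

Since the lookup only uses consensus data, the IPv6 entries can be checked without any IPv6 connectivity, as noted above.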