Preliminary analysis of the parent ticket suggests that most of the problematic requests are coming from an aggressively non-tor sources. One reasonable thing we could do to disfavor them is to give a 503 to directory requests that don't request compression, either with a ".z" suffix or an Accept-Encoding list.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
(The right way to see whether the request will be compressed is to look at the value of compression_methods_supported in directory_handle_command_get().)
Trac: Owner: N/Ato dgoulet Status: new to assigned
Thought 1: ORPort begindir requests from clients ask for compression, right? So this patch is a substantive and useful change, because it will resume answering real Tor clients who are trying to bootstrap from directory authorities, in a way that the #33029 (moved) patch broke (because all the dir auths running the #33029 (moved) patch are now returning 503 to nearly every consensus fetch attempt from non-relays, including ORPort-based begindir requests).
Thought 1b: Other effective ways to make us resume answering clients, but still survive the current flood, include "always answer microdesc-flavored requests" or "always answer ORPort-based requests". I think just favoring microdesc-flavored requests would not be enough to reenable the fetches from torflows though, because torflows set UseMicrodescriptors to 0. And "always answer ORPort-based requests" would not be enough to reenable the fetches for relays that haven't yet gotten into the consensuspublished their relay descriptor, because they would be fetching via the DirPort because they think they're a relay, yet the dir auths do not yet think they are a relay. A third edge case is DocTor and DepicTor (aka consensus-health), because right now with the defenses in place they're failing to fetch dir info from the authorities (#33067 (moved)). But consensus-health is doing its fetches wrong in all three ways -- it is fetching the vanilla consensus, uncompressed, from the dirport -- so it is going to have to change here no matter what. In sum: looking at whether compression is requested is the best choice of these three.
Thought 2: This commit looks fine to me, and it's a good start, but for me it is not enough.
Right now all it does is always answer compressed requests, which will implicitly drive bandwidth use toward compressed requests, because this strategy will leave less bandwidth available for uncompressed requests.
So it does not accomplish the "aggressively" part. With just this patch, moria1 will still max out its bandwidth use for whatever bandwidthrate I pick. So I will be forced to pick something tiny like 10MBytes/s because I know that whatever I pick, I will use that amount 24/7. And that situation will make the network more brittle, because moria1 won't be able to burst to handle larger directory loads if some other anomaly appears that requires dir auth responses to be able to scale suddenly.
So what I want in addition to the above is some way to say 503 to these unwanted requests even if I'm not at the edge of my bandwidth rate this very moment. I would be ok with a config option to just send 503 to uncompressed requests, and be done with it. In that case we are simply declaring that uncompressed requests are no longer supported by dir auths. Or I would also be ok with a config option that specifies what level of bandwidth to give to these requests, i.e., a bucket water level below which we send a 503, even if there are otherwise quite a few tokens in my bucket. That approach is a bit more complicated to build, but it would preserve the illusion that we're still willing to answer uncompressed requests. And it would more seamlessly handle the situation where the current ddos slowly fades away and/or returns later.
Thought 2b: We might worry that flat-out denying uncompressed requests will force this current ddoser to change, and we'll end up finding it harder to distinguish their requests and discard them. But I think with #33029 (moved) plus dgoulet's patch here, we'll already be denying the great majority of the uncompressed requests. With the "bucket water level" design I just described, we'll reject more but it won't be a substantive difference. Or said another way, if we're worried that the ddoser will change their approach, we should worry about that even if we just do #33029 (moved) plus dgoulet's patch, because we will definitely be making their directory fetches very unreliable. So if we're committed to that risk (which I think we have to be, unless we can think of something smarter), we might as well maximize the good side of the tradeoff, and defend ourselves, and our bandwidth use, better.
Ok so I talked with nickm and to keep this in 043 (and possibly backport), I went with a more simple approach with minimal code change.
I have added the AuthDirRejectUncompressedRequests 0|1 option that will unconditionally reject uncompressed requests if set. It would be off by default in 043 stable (and possible backports).
First, though, we should make sure that there aren't any current URLs for which existent Tor relays/clients don't use compression on a fetch, and they're compressing an authority.
Also I should check that this is only happening on GET requests.
Putting this on needs_revision for now, because of the possibility of needing an 042 backport. If we decide not to backport, then it can go into merge_ready IMO, especially if it has had testing on a dirauth.
This one is based on #33029 (moved) so I will wait until it gets merged to propose the backport version of this one. And it will make review much easier.
I think this change looks good, but I am not sure about allowing all compressed requests. I think we should check the write bucket for compressed requests, so that we prioritise requests in this order, by default:
voting
compressed requests
uncompressed requests
But if that's not possible, or it would take a lot of code changes, I'm happy to prioritise compressed requests (including voting) over uncompressed requests.
Also, please remove the binary files from the PR. Do we need to update the .gitignore to exclude *.a ?
Trac: Keywords: network-health 043-should deleted, network-health 043-should consider-backport-after-0434 added Reviewer: nickm, arma to teor Status: needs_review to needs_revision