Opened 7 months ago

Last modified 3 months ago

#33072 needs_revision defect

When under load, give 503 aggressively for dirport requests without compression

Reported by: nickm Owned by: dgoulet
Priority: Medium Milestone: Tor: 0.4.4.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: network-health 044-should consider-backport-after-0434 042-backport 043-backport
Cc: gk, dgoulet Actual Points:
Parent ID: Points:
Reviewer: teor Sponsor:

Description

Preliminary analysis of the parent ticket suggests that most of the problematic requests are coming from an aggressively non-tor sources. One reasonable thing we could do to disfavor them is to give a 503 to directory requests that don't request compression, either with a ".z" suffix or an Accept-Encoding list.

Child Tickets

Change History (28)

comment:1 Changed 7 months ago by nickm

Owner: set to dgoulet
Status: newassigned

(The right way to see whether the request will be compressed is to look at the value of compression_methods_supported in directory_handle_command_get().)

comment:2 Changed 7 months ago by dgoulet

Reviewer: nickm, teor, armadev

Branch: ticket33072_043_01
PR: https://github.com/torproject/tor/pull/1698

This is based on #33029 so the commit you are interested in is the last one.

Last edited 7 months ago by dgoulet (previous) (diff)

comment:3 Changed 7 months ago by dgoulet

Status: assignedneeds_review

comment:4 Changed 6 months ago by arma

Thought 1: ORPort begindir requests from clients ask for compression, right? So this patch is a substantive and useful change, because it will resume answering real Tor clients who are trying to bootstrap from directory authorities, in a way that the #33029 patch broke (because all the dir auths running the #33029 patch are now returning 503 to nearly every consensus fetch attempt from non-relays, including ORPort-based begindir requests).

Thought 1b: Other effective ways to make us resume answering clients, but still survive the current flood, include "always answer microdesc-flavored requests" or "always answer ORPort-based requests". I think just favoring microdesc-flavored requests would not be enough to reenable the fetches from torflows though, because torflows set UseMicrodescriptors to 0. And "always answer ORPort-based requests" would not be enough to reenable the fetches for relays that haven't yet gotten into the consensuspublished their relay descriptor, because they would be fetching via the DirPort because they think they're a relay, yet the dir auths do not yet think they are a relay. A third edge case is DocTor and DepicTor (aka consensus-health), because right now with the defenses in place they're failing to fetch dir info from the authorities (#33067). But consensus-health is doing its fetches wrong in all three ways -- it is fetching the vanilla consensus, uncompressed, from the dirport -- so it is going to have to change here no matter what. In sum: looking at whether compression is requested is the best choice of these three.

Thought 2: This commit looks fine to me, and it's a good start, but for me it is not enough.

Right now all it does is always answer compressed requests, which will implicitly drive bandwidth use toward compressed requests, because this strategy will leave less bandwidth available for uncompressed requests.

So it does not accomplish the "aggressively" part. With just this patch, moria1 will still max out its bandwidth use for whatever bandwidthrate I pick. So I will be forced to pick something tiny like 10MBytes/s because I know that whatever I pick, I will use that amount 24/7. And that situation will make the network more brittle, because moria1 won't be able to burst to handle larger directory loads if some other anomaly appears that requires dir auth responses to be able to scale suddenly.

So what I want in addition to the above is some way to say 503 to these unwanted requests even if I'm not at the edge of my bandwidth rate this very moment. I would be ok with a config option to just send 503 to uncompressed requests, and be done with it. In that case we are simply declaring that uncompressed requests are no longer supported by dir auths. Or I would also be ok with a config option that specifies what level of bandwidth to give to these requests, i.e., a bucket water level below which we send a 503, even if there are otherwise quite a few tokens in my bucket. That approach is a bit more complicated to build, but it would preserve the illusion that we're still willing to answer uncompressed requests. And it would more seamlessly handle the situation where the current ddos slowly fades away and/or returns later.

Thought 2b: We might worry that flat-out denying uncompressed requests will force this current ddoser to change, and we'll end up finding it harder to distinguish their requests and discard them. But I think with #33029 plus dgoulet's patch here, we'll already be denying the great majority of the uncompressed requests. With the "bucket water level" design I just described, we'll reject more but it won't be a substantive difference. Or said another way, if we're worried that the ddoser will change their approach, we should worry about that even if we just do #33029 plus dgoulet's patch, because we will definitely be making their directory fetches very unreliable. So if we're committed to that risk (which I think we have to be, unless we can think of something smarter), we might as well maximize the good side of the tradeoff, and defend ourselves, and our bandwidth use, better.

comment:5 Changed 6 months ago by arma

Reviewer: nickm, teor, armadevnickm, teor, arma

comment:6 Changed 6 months ago by dgoulet

Status: needs_reviewneeds_revision

comment:7 Changed 6 months ago by dgoulet

Status: needs_revisionneeds_review

Ok so I talked with nickm and to keep this in 043 (and possibly backport), I went with a more simple approach with minimal code change.

I have added the AuthDirRejectUncompressedRequests 0|1 option that will unconditionally reject uncompressed requests if set. It would be off by default in 043 stable (and possible backports).

comment:8 Changed 6 months ago by teor

Reviewer: nickm, teor, armanickm, arma

I've been tracking these changes, they look good to me.

comment:9 Changed 6 months ago by nickm

This is looking okay to me, modulo CI issues.

comment:10 Changed 6 months ago by nickm

First, though, we should make sure that there aren't any current URLs for which existent Tor relays/clients don't use compression on a fetch, and they're compressing an authority.

Also I should check that this is only happening on GET requests.

comment:11 Changed 6 months ago by nickm

I think that neither of those concerns above is going to materialize, though, on further inspection.

comment:12 Changed 6 months ago by nickm

Status: needs_reviewneeds_revision

Putting this on needs_revision for now, because of the possibility of needing an 042 backport. If we decide not to backport, then it can go into merge_ready IMO, especially if it has had testing on a dirauth.

comment:13 Changed 6 months ago by dgoulet

This one is based on #33029 so I will wait until it gets merged to propose the backport version of this one. And it will make review much easier.

comment:14 Changed 5 months ago by dgoulet

I need to wait for the 042 backport to be able to do an 042 and onward branch ...

comment:15 Changed 5 months ago by dgoulet

Status: needs_revisionneeds_review

0.4.3:
Branch: ticket33072_043_02
PR: https://github.com/torproject/tor/pull/1804

0.4.4+:
Branch: ticket33072_044_01
PR: https://github.com/torproject/tor/pull/1805

comment:16 in reply to:  14 Changed 5 months ago by teor

Replying to dgoulet:

I need to wait for the 042 backport to be able to do an 042 and onward branch ...

Nick merged #33029 to 0.4.2 and later over the weekend. Is there anything else blocking an 0.4.2 branch?

comment:17 Changed 5 months ago by teor

Keywords: consider-backport-after-0434 added
Reviewer: nickm, armateor
Status: needs_reviewneeds_revision

I think this change looks good, but I am not sure about allowing all compressed requests. I think we should check the write bucket for compressed requests, so that we prioritise requests in this order, by default:

  • voting
  • compressed requests
  • uncompressed requests

But if that's not possible, or it would take a lot of code changes, I'm happy to prioritise compressed requests (including voting) over uncompressed requests.

Also, please remove the binary files from the PR. Do we need to update the .gitignore to exclude *.a ?

comment:18 Changed 5 months ago by teor

Keywords: 042-backport 043-backport added
Milestone: Tor: 0.4.3.x-finalTor: 0.4.4.x-final

comment:19 Changed 5 months ago by dgoulet

With this, and #33029, everything coming from dirauth and relays is served as in the authority ignores the write limit.

So vote are _always_ served and anything coming from relays.

Thus this patch only applies to client requests. If compressed, they are served but if not, we look at the write limit and send back 503. Before this, dirauth would just answer everything regardless of write limit.

Is that sufficient or you still think we should prioritize. There is a challenge with prioritizing because that means we need to spool together the inbound requests and order it?

comment:20 Changed 5 months ago by teor

Sounds like it will solve our current problem with uncompressed requests.
So we could merge as it is.

But what if the DoS switches to compressed requests?

When we are rejecting uncompressed requests, can we look at the write limit for compressed client requests?
Seems like a simple change that could be really useful.

I don't think we need to be worried about the DoS switching to relay requests. That's a significant amount of error time for an attacker.

comment:21 in reply to:  20 Changed 5 months ago by dgoulet

Replying to teor:

Sounds like it will solve our current problem with uncompressed requests.
So we could merge as it is.

But what if the DoS switches to compressed requests?

In that case, we serve everything. This ticket was meant to improve the current situation instead of finding a long term fix and thus yes if the DoS switches, we have another problem.

When we are rejecting uncompressed requests, can we look at the write limit for compressed client requests?

That would mean a specific write limit for compressed client requests? Is this what you mean? Or if uncompressed requests, we should only reject if limit is too low?

Right now, this only looks at the global write bucket. It would require significant changes to have limits per "type" of requests (client vs relay, compressed vs uncompressed, ...).

Another reason for this is that only very old clients will request uncompressed and thus this patch would refuse them access to directory documents. But could also create some thundering herd maybe?

comment:22 Changed 5 months ago by teor

I'm not suggesting any complicated new features.

The current logic is a bit complicated. And the code does not match the man page.

At the moment, "always answer compressed" overrides AuthDirRejectRequestsUnderLoad. I don't think that's useful or necessary. And it's not documented.

Here is what the code does now:

  • compressed requests: always answer
    • includes tor authorities, relays, clients: almost always compressed
  • if AuthDirRejectUncompressedRequests is:
    • 1 - reject uncompressed: all uncompressed requests will be rejected
    • 0 - accept uncompressed: uncompressed requests are handled based on other options (see below)
  • if AuthDirRejectRequestsUnderLoad is:
    • 1 - reject under load: depends on the write bucket load:
      • too much load: we will reject uncompressed requests (usually bots)
      • room to answer: we will answer uncompressed requests (usually bots)
    • 0 - accept all: we will answer uncompressed requests (usually bots)

I don't think that's the design we want.

So I suggest we delete these lines:

  if (c_method != NO_METHOD) {
      /* Always answer compressed request. */
      return false;
    }

Here is what the code does after we delete those lines:

  • if AuthDirRejectUncompressedRequests is:
    • 1 - reject uncompressed: all uncompressed requests will be rejected
    • 0 - accept uncompressed: uncompressed requests are handled like compressed requests, based on other options (see below)
  • if AuthDirRejectRequestsUnderLoad is:
    • 1 - reject under load:
      • relay or authority always answer
      • client: depends on the write bucket load:
        • too much load: we will reject requests from clients
        • room to answer: we will answer requests from clients
    • 0 - accept all: we will answer all requests

comment:23 Changed 4 months ago by teor

Keywords: 044-should added; 043-should removed

This ticket is in 0.4.4, so it can't be 043-should.

Last edited 4 months ago by teor (previous) (diff)

comment:24 Changed 4 months ago by dgoulet

Status: needs_revisionneeds_review

Got it! I agree. I made the changes in both branches/PR and also addressed your other comments. Thanks!

The 042 branch I totally forgot. All 3 branches require much changes from original patch so I would ask you to just be careful with review here since they were _not_ trivial backports :S.

0.4.2:
Branch: ticket33072_042_01
PR: https://github.com/torproject/tor/pull/1858

0.4.3:
Branch: ticket33072_043_02
PR: ​https://github.com/torproject/tor/pull/1804

0.4.4+:
Branch: ticket33072_044_01
PR: ​https://github.com/torproject/tor/pull/1805

Last edited 4 months ago by dgoulet (previous) (diff)

comment:25 Changed 4 months ago by teor

Status: needs_reviewneeds_revision

It looks like the code, function comments, and changes files are out of sync in these branches.

These lines are only deleted in maint-0.4.3:

  if (c_method != NO_METHOD) {
      /* Always answer compressed request. */
      return false;
    }

Are there tests that give a different result when this code is deleted?
(I don't want to block on tests, so maybe we could do them later in master.)

Also, I'm a bit confused about how you want me to merge here. There's no forward merges in these PRs. So when we backport, we'll need to do an "ours" merge forward, to avoid conflicts. Is that ok?

comment:26 in reply to:  25 Changed 4 months ago by dgoulet

Replying to teor:

It looks like the code, function comments, and changes files are out of sync in these branches.

These lines are only deleted in maint-0.4.3:

  if (c_method != NO_METHOD) {
      /* Always answer compressed request. */
      return false;
    }

Are there tests that give a different result when this code is deleted?
(I don't want to block on tests, so maybe we could do them later in master.)

Wait...hmmm ok I will re-assess all 3 branches, this is getting confusing. Sorry about that.

Also, I'm a bit confused about how you want me to merge here. There's no forward merges in these PRs. So when we backport, we'll need to do an "ours" merge forward, to avoid conflicts. Is that ok?

Yeah... What about cherry-picking the commits and being done with it? I agree it is not good but the backport is surprisingly non trivial on each version :(

comment:27 Changed 4 months ago by teor

I don't think cherry-picking helps resolve conflicts between your 0.4.2, 0.4.3 and master branches. We'll see those conflicts when we try to merge 0.4.2 and 0.4.3 forward to master.

I think an "ours" merge is the best way to resolve this issue.

I can try to help with the branches, if you'd like?

comment:28 Changed 3 months ago by nickm

Parent ID: #33018
Note: See TracTickets for help on using tickets.