Opened 10 months ago

Closed 10 months ago

Last modified 10 months ago

#28367 closed enhancement (duplicate)

RFE additional DOS mitigations for exits

Reported by: starlight Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: unspecified
Severity: Normal Keywords: tor-dos
Cc: Actual Points:
Parent ID: #24797 Points:
Reviewer: Sponsor:

Description

A relay I operate recently experienced a DOS state resulting from intense scanning behavior. The scanner initiated huge quantities of outbound connections on an exit, such that the interface's maximum configured socket count (62k) was fully consumed and normal client activity was squashed to zero. Load was so intense that it was difficult to SSH in, NTP complained it could not reach time servers, and numerous attempts were required to successfully open a daemon control socket (via loopback, not sure why). I was able to mitigate the attack without restarting any daemons; nothing broke and the node resumed normal operation. Clearly a recoverable resource exhaustion scenario.

To limit the impact of this category of activity, two relatively simple mitigations come to mind:

1) create a configurable limit on the number of OR + DIR + exit_edge connections on each interface, which may be set lower than the absolute resource limits; this will prevent a DOS situation from rendering the overall system inaccessible and hopefully permit unimpaired creation of daemon control port connections; the setting will interact with the maximum number of in-flight DNS queries when a local resolver is configured, and this ought to be documented

2) create an outbound exit_edge connection rate limit set to some reasonable value to constrain scanning
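
To make mitigation 1 concrete, here is a minimal sketch of a connection cap that sits below the hard socket limit, so that some sockets always remain for control connections, SSH and the like. It is an illustration only, not tor code: the class name `ConnectionBudget`, its fields, and the 60,000 soft limit are hypothetical; only the 62k figure comes from the report above.

```python
class ConnectionBudget:
    """Hypothetical cap on relay connections, kept below the hard socket limit."""

    def __init__(self, hard_limit, soft_limit):
        if not 0 < soft_limit < hard_limit:
            raise ValueError("soft_limit must be positive and below hard_limit")
        self.hard_limit = hard_limit  # absolute resource limit for the interface
        self.soft_limit = soft_limit  # cap for OR + DIR + exit_edge connections
        self.in_use = 0               # relay connections currently open

    @property
    def reserved(self):
        # Sockets relay traffic can never consume, left for control ports etc.
        return self.hard_limit - self.soft_limit

    def try_open(self):
        # Refuse new relay connections once the soft cap is reached.
        if self.in_use >= self.soft_limit:
            return False
        self.in_use += 1
        return True

    def close(self):
        # Return one connection to the budget.
        if self.in_use > 0:
            self.in_use -= 1


# Assumed numbers: 62k hard limit as in the report, 60k soft cap chosen arbitrarily.
budget = ConnectionBudget(hard_limit=62_000, soft_limit=60_000)
```

With these assumed numbers a scanner could consume at most 60,000 sockets, leaving 2,000 that relay traffic cannot touch. A real implementation would also have to account for in-flight DNS queries, as noted in item 1.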

NOTES:

file handle limit 128k

nf_conntrack_max = 65536

Change History (5)

comment:1 Changed 10 months ago by catalyst

Milestone: Tor: 0.4.0.x-final

comment:2 Changed 10 months ago by teor

Keywords: tor-dos added
Milestone: changed from Tor: 0.4.0.x-final to Tor: unspecified
Parent ID: #24797
Resolution: duplicate
Status: changed from new to closed
Version: changed from Tor: 0.3.4.9 to Tor: unspecified

Tor will use all available file handles for connections. If your system does not support that many connections, then you should reduce the number of file handles that tor can use.

To reduce the number of file handles, use ulimit -n (limit) or the equivalent daemon launcher option.

You may also want to set DisableOOSCheck 0 in your torrc, which causes tor to terminate connections at around 90% of the limit, rather than failing.
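
As a rough illustration of the two suggestions above, the sketch below shows how a process can lower its own file-handle limit (the programmatic counterpart of ulimit -n) and what "around 90% of the limit" works out to for a given value. The helper names `oos_threshold` and `lower_fd_limit` are hypothetical, and the 90% figure is taken from this comment rather than from tor's source, so treat the arithmetic as an approximation of tor's behavior.

```python
import resource  # Unix-only standard library module


def oos_threshold(fd_limit, percent=90):
    """Connection count at which a check set at `percent` of the limit would trigger."""
    return fd_limit * percent // 100


def lower_fd_limit(new_soft):
    """Lower this process's soft RLIMIT_NOFILE and return the resulting soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        # The soft limit may not exceed the hard limit.
        new_soft = min(new_soft, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

For example, `oos_threshold(65536)` gives 58982, so with a 65536 handle limit tor would start terminating connections at roughly that count if the check is enabled.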

Socket limits will be better documented in 0.3.5:
https://gitweb.torproject.org/tor.git/tree/doc/tor.1.txt#n300

The rest of this ticket is a duplicate of #24797, which stalled in needs_revision.

comment:3 Changed 10 months ago by starlight

An obvious objection to ulimit -n as a control is that this is simplistic with respect to multi-homed systems and may not always result in resilient behavior. Port limits operate with respect to IP addresses rather than at global daemon level. If ulimit -n is saturated, it will not be possible to open new control connections.

comment:4 Changed 10 months ago by starlight

Another point to think about is rate limiting of connections. Scanners generally operate by extending a number of circuits to an exit and then rapidly opening streams / edge_connections on each, so an effective way to mitigate this form of behavior is to have a rate limit that curtails or kills circuits that rapidly initiate connections while leaving calmer circuits untouched. The first-priority, flesh-and-blood users who browse the web can continue unharassed while bots get squelched.
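
One way to picture such a per-circuit limit is a token bucket kept for each circuit, sketched below. This is not how tor does (or would necessarily do) it: the class names, the `rate` and `burst` parameters and their values are all hypothetical, and a real implementation would also need to expire buckets for closed circuits and decide whether to refuse the stream or kill the circuit.

```python
class StreamBucket:
    """Token bucket for one circuit: one token is spent per new stream."""

    def __init__(self, rate, burst, now):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # maximum tokens the bucket can hold
        self.tokens = burst   # a new circuit starts with a full bucket
        self.last = now       # time of the last refill

    def allow(self, now):
        # Refill according to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class CircuitStreamLimiter:
    """Keeps an independent bucket per circuit, so calm circuits are unaffected."""

    def __init__(self, rate=1.0, burst=10.0):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow_stream(self, circuit_id, now):
        bucket = self.buckets.get(circuit_id)
        if bucket is None:
            bucket = StreamBucket(self.rate, self.burst, now)
            self.buckets[circuit_id] = bucket
        return bucket.allow(now)


limiter = CircuitStreamLimiter(rate=1.0, burst=3.0)
# A scanning circuit opens five streams in the same instant.
scanner = [limiter.allow_stream(1, now=0.0) for _ in range(5)]
# A calm circuit opens one stream every two seconds.
calm = [limiter.allow_stream(2, now=t) for t in (0.0, 2.0, 4.0)]
```

Here `scanner` is `[True, True, True, False, False]` while every entry of `calm` is `True`: the burst allowance absorbs normal browsing, and only the circuit that opens streams faster than the refill rate is refused.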

comment:5 in reply to: 4 Changed 10 months ago by teor

Replying to starlight:

An obvious objection to ulimit -n as a control is that this is simplistic with respect to multi-homed systems and may not always result in resilient behavior. Port limits operate with respect to IP addresses rather than at global daemon level. If ulimit -n is saturated, it will not be possible to open new control connections.

You can open new control connections if you set ulimit -n to a level your system can handle, and also set DisableOOSCheck 0:

To reduce the number of file handles, use ulimit -n (limit) or the equivalent daemon launcher option.

You may also want to set DisableOOSCheck 0 in your torrc, which causes tor to terminate connections at around 90% of the limit, rather than failing.

Replying to starlight:

Another point to think about is rate limiting of connections. Scanners generally operate by extending a number of circuits to an exit and then rapidly opening streams / edge_connections on each, so an effective way to mitigate this form of behavior is to have a rate limit that curtails or kills circuits that rapidly initiate connections while leaving calmer circuits untouched. The first-priority, flesh-and-blood users who browse the web can continue unharassed while bots get squelched.

You're right: we should work out a way of rate-limiting exit connections as well.

Until we do that, I suggest using a firewall to rate-limit the number of new outbound connections. It's not as targeted as limiting inbound connections per IP address, but it will help.
