Our existing OOS code kills low-priority OR connections. But really, we need to look at all connections that an adversary might be able to create (especially dir and exit connections), or else an adversary will be able to open a bunch of those, and force us to kill as many OR connections as they want.
This problem is the reason that DisableOOSCheck is now on-by-default.
So, what's the best strategy here? We'd like to emphasize connections that are getting lots of usage, but only real usage. The existing code kills whatever OR connections have the fewest circuits, and leaves everything else alone. But if DirPort is open, or if we're an exit, that can be really bad.
My first thought was to treat directory connections and exit connections as if they each carried one circuit, and then to rank them by circuit count along with the OR connections. But maybe that's vulnerable too? An attacker could just start a bunch of clients, open two circuits from each, and get an exit to kill off all its exit connections. Probably not so good.
Should we look at last-written time, or queue age, or something else? There may be cleverness needed.
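For concreteness, here is a rough sketch of the "count dir and exit connections as one circuit and rank everything together" idea. This is not Tor's real `connection_t` or OOS code; the struct, fields, and names are made up for illustration, and the weakness described above still applies, since an attacker's two-circuit clients would outrank all of these.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified connection record -- not Tor's real
 * connection_t.  Just enough to illustrate the ranking idea. */
typedef enum { CONN_OR, CONN_DIR, CONN_EXIT } conn_kind_t;

typedef struct {
  int id;
  conn_kind_t kind;
  int n_circuits;   /* only meaningful for OR connections */
} conn_t;

/* Effective "circuit count": dir and exit connections are treated as
 * if they carried exactly one circuit, so they get ranked alongside
 * OR connections instead of being invisible to the OOS logic. */
static int
effective_circuits(const conn_t *c)
{
  if (c->kind == CONN_DIR || c->kind == CONN_EXIT)
    return 1;
  return c->n_circuits;
}

static int
compare_by_usefulness(const void *a_, const void *b_)
{
  const conn_t *a = a_;
  const conn_t *b = b_;
  return effective_circuits(a) - effective_circuits(b);
}

int
main(void)
{
  conn_t conns[] = {
    { 1, CONN_OR, 12 }, { 2, CONN_DIR, 0 },
    { 3, CONN_EXIT, 0 }, { 4, CONN_OR, 3 },
  };
  size_t n = sizeof(conns)/sizeof(conns[0]);
  size_t n_to_kill = 2;  /* however many sockets we need back */

  /* Least "useful" connections sort first; kill from the front. */
  qsort(conns, n, sizeof(conn_t), compare_by_usefulness);
  for (size_t i = 0; i < n_to_kill && i < n; i++)
    printf("would kill conn %d (effective circuits: %d)\n",
           conns[i].id, effective_circuits(&conns[i]));
  return 0;
}
```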
normally, one would use IP reputation to deal with spamming attacks. however, for obvious reasons, I can see why that might be frowned upon in these circles.
therefore, some other unforgeable proof of work is required. one could implement a custom proof-of-work protocol, but it seems more useful to me to measure the bandwidth used. this incurs negligible overhead for legitimate users, and has the added benefit that attackers are forced to encrypt their data in order to increase their measured bandwidth usage. additionally, if attackers have vastly more bandwidth than you, they can simply mount a traditional DoS attack anyways.
tl;dr just sort connections by recently used valid data traffic (rough sketch below).
directory connections are scored poorly by this metric, but:

- if the connection is legitimate, there will be data flowing down it soon after fetching directory information anyways. this works better with the new ORPort-only architecture, but for legacy clients I guess we could just sum together the bandwidth used by an IP address and use that somehow
- AIUI directory connections are only absolutely necessary during the very first startup. at any later time, if a directory connection cannot be made or is suddenly terminated, cached data can be used temporarily until a connection can be re-established. therefore, prematurely terminating directory connections is not a huge problem, and is much better than rejecting new connections which may require relay service.
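here's the kind of thing I mean, as a rough sketch: keep a decayed counter of valid bytes per connection and kill the connections with the least recent valid traffic first. the struct, the once-a-second halving, and the "only count bytes that pass validity checks" hook are all assumptions, not Tor's actual code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Rough sketch of "sort by recently used valid traffic".  None of this
 * is Tor's real code: the struct, the decay constant, and counting
 * only traffic that passes validity checks are assumptions. */
typedef struct {
  int id;
  double recent_valid_bytes;  /* exponentially-decayed counter */
} conn_t;

/* Called whenever traffic on this connection passes validity checks;
 * garbage an attacker blasts at us should never reach this. */
static void
note_valid_bytes(conn_t *c, size_t nbytes)
{
  c->recent_valid_bytes += (double)nbytes;
}

/* Called periodically (say once a second) so old traffic stops
 * counting: halve every counter as a crude decay. */
static void
decay_counters(conn_t *conns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    conns[i].recent_valid_bytes *= 0.5;
}

static int
compare_by_recent_traffic(const void *a_, const void *b_)
{
  const conn_t *a = a_;
  const conn_t *b = b_;
  if (a->recent_valid_bytes < b->recent_valid_bytes) return -1;
  if (a->recent_valid_bytes > b->recent_valid_bytes) return 1;
  return 0;
}

int
main(void)
{
  conn_t conns[] = { {1, 0}, {2, 0}, {3, 0} };
  size_t n = sizeof(conns)/sizeof(conns[0]);

  note_valid_bytes(&conns[0], 50000);
  note_valid_bytes(&conns[2], 1200);
  decay_counters(conns, n);

  /* Under socket pressure, kill from the front (least recent traffic). */
  qsort(conns, n, sizeof(conn_t), compare_by_recent_traffic);
  printf("first kill candidate: conn %d\n", conns[0].id);
  return 0;
}
```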
The equivalent IPv6 netblock would be a /32, the minimum regional internet registry allocation block size.
We could identify the /16s or /32s with the largest numbers of connections, and kill connections from those netblocks first, using one of the other "usefulness" heuristics.
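As a rough sketch of that grouping step (the flat table and key derivation here are illustrative only, not Tor's real address or connection handling): bucket IPv4 peers by /16 and IPv6 peers by /32, then count connections per bucket.

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>

/* For IPv4, the /16 key is the top two address bytes; for IPv6, the
 * /32 key is the top four bytes.  We tag the family so the two kinds
 * of key can't collide in the same table. */
typedef struct {
  int is_ipv6;
  unsigned char prefix[4];
  int n_conns;
} block_count_t;

#define MAX_BLOCKS 1024
static block_count_t blocks[MAX_BLOCKS];
static int n_blocks = 0;

static void
count_address(const char *addr_str)
{
  int is_ipv6 = (strchr(addr_str, ':') != NULL);
  unsigned char prefix[4] = {0};
  struct in_addr a4;
  struct in6_addr a6;

  if (is_ipv6) {
    if (inet_pton(AF_INET6, addr_str, &a6) != 1) return;
    memcpy(prefix, a6.s6_addr, 4);             /* /32 */
  } else {
    if (inet_pton(AF_INET, addr_str, &a4) != 1) return;
    memcpy(prefix, &a4.s_addr, 2);             /* /16 (network byte order) */
  }

  for (int i = 0; i < n_blocks; i++) {
    if (blocks[i].is_ipv6 == is_ipv6 &&
        !memcmp(blocks[i].prefix, prefix, 4)) {
      blocks[i].n_conns++;
      return;
    }
  }
  if (n_blocks < MAX_BLOCKS) {
    blocks[n_blocks].is_ipv6 = is_ipv6;
    memcpy(blocks[n_blocks].prefix, prefix, 4);
    blocks[n_blocks].n_conns = 1;
    n_blocks++;
  }
}

int
main(void)
{
  const char *addrs[] = { "203.0.113.5", "203.0.113.77", "198.51.100.1",
                          "2001:db8::1", "2001:db8:ffff::2" };
  for (size_t i = 0; i < sizeof(addrs)/sizeof(addrs[0]); i++)
    count_address(addrs[i]);

  /* The OOS code would then start killing inside the busiest blocks,
   * using one of the other "usefulness" heuristics to pick victims. */
  for (int i = 0; i < n_blocks; i++)
    printf("block %d (%s): %d connections\n",
           i, blocks[i].is_ipv6 ? "IPv6 /32" : "IPv4 /16",
           blocks[i].n_conns);
  return 0;
}
```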
I think that might work, but I don't see why it would be any better than using only bandwidth consumed. In fact, I think it would have the same issue I mentioned on the mailing list: over-killing NATed clients, potentially the ones most in need of anonymity! IIRC, Cleanfeed is known to proxy all connections through a small number of IPs; I wouldn't be surprised if China, Iran and company did the same.
Filtering based on bandwidth used is reputation-neutral, has zero false positives, and has near-zero added cost.
in other words, using netblocks alone it's impossible to distinguish "real" clients behind some mobile network's carrier-grade NAT from a bunch of regular-looking clients that an attacker runs on a VPS somewhere.
hm... upon further consideration though, perhaps it would be possible to use a memory-hard proof-of-work algorithm here. even phones under $100 have at least 2 GB of RAM, so completing an occasional 1 GB POW should only momentarily slow the device. it should be easy on battery life too, unlike a CPU POW.

I did a quick calculation: an attacker would need server RAM s = c × f × n, where c is the challenge difficulty (memory × time), f is the challenge frequency, and n is the number of connections to be held. if c = 1 GB × 3 sec, f = 1 / 10 min, and n = 200, then s = 1 GB, or around $5/month per 200 connections, which seems sufficiently expensive to deter this particular attack.

however, there are a number of downsides to this plan. not only does it require additional protocol design (time which could be spent doing something else, like IPv6 support), but I hear the iOS Tor people are limited to 15 MB, so even if the device has 10 GB of RAM that won't help. I figure "you must reopen the Tor app every ten minutes to maintain your connection" is not a good solution.
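a quick sanity check of that arithmetic, with the same illustrative numbers (none of these are measured or proposed Tor parameters):

```c
#include <stdio.h>

/* Sanity check of the s = c * f * n estimate above, using the numbers
 * from the comment: a 1 GB challenge taking 3 seconds, repeated every
 * 10 minutes, for 200 held connections. */
int
main(void)
{
  double challenge_gb = 1.0;        /* memory pinned per challenge */
  double challenge_secs = 3.0;      /* time each challenge takes */
  double c = challenge_gb * challenge_secs;  /* GB-seconds per challenge */
  double f = 1.0 / (10.0 * 60.0);   /* challenges per second, per connection */
  double n = 200.0;                 /* connections the attacker holds */

  double s = c * f * n;             /* average GB of RAM kept busy */
  printf("attacker RAM needed: %.2f GB for %.0f connections\n", s, n);
  return 0;
}
```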
hm... perhaps we could use both: clients that require long-running connections for things like IRC must submit proofs of work (either CPU or memory), and iOS clients just have to live with occasionally re-establishing their connections if the relay is under DoS.