There are classes of relay cells that don't count in the circuit windows. Specifically, only data relay cells count in the circuit windows.
So for example, a malicious website could arrange for the client to open 2200 streams to it, and then the website could close them all at once. 2200 relay end cells would work their way back towards the client. The #9063 (moved) fix would get upset and kill the circuit.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
A) A website just causes Alice to launch 2200 stream connections. Then she gets 2200 connected cells heading her way.
B) If Alice is using the leaky pipe design and has two circ-window-maxes heading her way, but she's also uploading stuff and triggers a sendme cell to return to her, that brings us to 2001.
Actually, this is all clients. Bob pointed out that it is their Guards who have to upgrade for them to be vulnerable, not them.
Pretty bad stuff.
Perhaps we should remove #9063 (moved) and save it for Gaurds who actually experience OOMing?
Trac: Summary: #9063 (moved) enables Guard discovery of 0.2.4.x clients in about an hour by websites to#9063 (moved) enables Guard discovery in about an hour by websites Priority: major to critical
Hm. Targetted OOM, or guard discovery. Not a great choice.
We might need to do the 0.2.5 version of a #9063 (moved) fix that we were talking about, where we defined a maximum space allocatable for cells, and do some kind of OOM killer on on clogged-looking circuits if we run out of space for new queued cells. Not great, and not delightful to contemplate for an 0.2.4 backport, but the OOM situation isn't greater either.
This doesn't make discovering a client's entry guards significantly easier. It does make failing a client's circuits from the exit/destination end significantly easier, and as soon as the path-bias detector is turned on, that becomes a full remote user-tracing attack.
Here are a few sketch oom-killer algorithms. Let's try to find flaws in them and find something better.
Algorithm 1.
have a max number of allocated cells configured.
when that number is reached, consider all circuits and kill those with the longest queues until we are at X% of the limit.
Algorithm 2.
as in alg 1 above.
when he hit the limit, consider the connections with the largest number of queued cells on circuits aiming for them.
kill those connections until we are at X% of the limit.
Algorithm 3.
1, 2) as in alg 2 above.
3) kill the biggest circuits on those connections until we are at X% of our limit.
Algorithms 1T, 2T, 3T:
As algorithms 1, 2, 3, but instead of longest queue, look at longest flush time, where flush time is the queue length divided by the connection's observed bandwidth.
As algorithms 1T 2T and 3T, but define a queue's flush time as some function that accounts for the number of other circuits targetting the same connection. If we were using round robin, I would say: N/BW where bw is the conn observed bandwidth and N is the sum of all queue lengths on that connection, each queue length clippedd to be no longer than the queue under consideration.
Instead of closing circuits or connections in response to OOM, you could check whether the relay has hit its rate limit, and if it hasn't, send a bunch of cells along the circuit/connection. (Perhaps the next relay will have more room for them.)
"Maximum number of cells in entrys's circuit queue = 8 * (1000_data + 10_sendmecirc + 20_sendme_stream + N * (1_connected + 1_end)), N is number of maximum allowed streams for client"
"so 10k is actual number for client with 100 streams limit"
(quoting from irc)
If we disable the leaky pipe feature, 2k cells is enough, with 100 stream limit.
Of course, this isn't counting drop cells, which we might want to add in at some point to help with website fingerprinting.
As algorithms 1T 2T and 3T, but define a queue's flush time as some function that accounts for the number of other circuits targetting the same connection. If we were using round robin, I would say: N/BW where bw is the conn observed bandwidth and N is the sum of all queue lengths on that connection, each queue length clippedd to be no longer than the queue under consideration.
Is there any reason not to simplify this down to the circuit level only? Anything based on connections is easily subverted by a sybil attack. And you can't consider the time-independent queue length alone because of the sybil attack.
The goal is to drive up the cost of the attack while making sure you are not killing an honest client's circuit. If you consider a function of the queue length and the waiting time of the longest waiting cell for each circuit, then a malicious client will have to read cells at a rate at least as fast as the slowest honest client on every sybil in order to cause the relay to target the honest client's circuit. It can be shown that the longer the attacker's queue becomes, the faster it must read from it to avoid selection because it needs to keep the waiting times of the cells in the malicious circuits lower than those of all honest circuits.
I guess authenticated SNEDMEs, where the exit inserts a 1 byte nonce into every 100th cell and won't increase the package window unless the SENDME contains the nonce, is also off of the table since like the queue length limit defense, it doesn't handle non-data cells.