Step 0 is to determine whether any of the Gbit+ Tor relays (especially Guard+Exit nodes) ever come close to running out of TCP sockets.
Step 1 is to find some way to measure stream failures from the bwauths, compute a stream_error value, and use it. See #4708 (moved) for more details on that.
We are seeing some instances of potential socket exhaustion on Moritz's nodes, but it looks like #4710 (moved) is going to block implementing detection properly. It also seems to be a transient condition for his nodes. Most of the time they operate with only ~8-10K connections per interface.
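
For Step 0, one rough way a relay operator could check per-interface socket counts is to tally the rows of Linux's /proc/net/tcp by local address. The sketch below is only an illustration under stated assumptions, not what the relays or bwauths actually do: the function names, the 28232-port budget (the size of the usual Linux default ephemeral range, 32768-60999) and the 0.8 threshold are all hypothetical, it covers IPv4 only, and it assumes a little-endian host. The real per-address limit also depends on the remote endpoints, so treat the budget as a crude proxy.

```python
import socket
import struct
from collections import Counter

TCP_LISTEN = "0A"  # state code used for listening sockets in /proc/net/tcp


def count_tcp_sockets(proc_net_tcp_text):
    """Count non-listening IPv4 TCP sockets per local address.

    Expects the text of /proc/net/tcp. Assumes a little-endian host,
    where the hex address field is the IPv4 address in reversed byte order.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] == TCP_LISTEN:
            continue
        hex_ip = fields[1].split(":")[0]
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        counts[ip] += 1
    return counts


def near_exhaustion(counts, port_budget=28232, threshold=0.8):
    """Return the addresses whose socket count is at or above
    threshold * port_budget. Both defaults are assumptions."""
    limit = threshold * port_budget
    return {ip: n for ip, n in counts.items() if n >= limit}
```

On a relay this could be fed with the contents of /proc/net/tcp, for example `near_exhaustion(count_tcp_sockets(open("/proc/net/tcp").read()))`. Because the condition appears to be transient, a single snapshot is likely to miss it; sampling periodically and keeping the peak count would be more informative.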