The change in eed326f0 still makes us lose a feature we were using: if your network has gone away, but the kernel hasn't told your tcp connection to fail, then we'd notice when we timed out waiting for a response to our create_fast cell. Now we no longer do.
This is one of those weird edge cases that e.g. made Nathan extra happy with Tor's network state detection on Android.
I wonder if we want to keep the "mark it down" behavior when we're closing the circuit not due to a destroy cell.
Looks better yes. I wonder if there are other edge cases.
As for the attack you describe, yes. I think that used to be possible too by answering with a created-fast and then just dropping any extend or begindir cells?
I guess another option here is what you were heading toward originally: we dump the guard if it ever can't process one of our create requests. That seems a poor plan though, especially in light of the current circuit create load?
Looks better yes. I wonder if there are other edge cases.
There likely are. Can you think of any likely way to find them better than "read the code hard" or "merge and see what happens"?
As for the attack you describe, yes. I think that used to be possible too by answering with a created-fast and then just dropping any extend or begindir cells?
That's also true.
I guess another option here is what you were heading toward originally: we dump the guard if it ever can't process one of our create requests. That seems a poor plan though, especially in light of the current circuit create load?
Go ahead and merge, I say. And since it's a consensus param, you can merge into 0.2.4 too if you want. (If we later discover a bug, we can throw away this consensus param and try again with another one.)
I'm a tiny bit nervous putting this into 0.2.4 right away. If 0.2.4 client uptake goes faster than server uptake, it might make more load than we want. (Also, a lot of my work to make ntor go faster is targeted for 0.2.5, which will have uptake even more slowly.)
Do you think this matters? If so, let's hold off on an 0.2.4 merge.
In either case, I've merged the branch (now called "prop221_squashed_024" into master). I'm leaving this ticket in 0.2.4 for possible merge to the maint-0.2.4 branch.
Trac: Milestone: Tor: 0.2.5.x-final to Tor: 0.2.4.x-final
Looking at the prop221 branch: I think this means that we should backport fec12af65366e0789bfdd4affef0b7ff6ca1f921 and ad71d42f4b10b6f06355f44f5c82d6c7859a08da ("Implement proposal 221: Stop sending CREATE_FAST" and a fixup commit.) but not eed326f074634d00cc8455d3b9d854550f8a2955 and 8911ee01b5679ccfa76a5353859d21daf964e780 ("circuit_build_failed: distinguish "first hop chan failed", "CREATE failed"" and a squash commit.)