The problem here is that nothing in our spec unambiguously prevents the components of versions being negative, and so the if ((i = (a-b))) return i; pattern we use in tor_version_compare() potentially underflows.
This is bad when we may have -ftrapv or ubsan enabled: both of those turn signed underflow into a crash. (And it's still undefined behavior in any case, which we should really try to prevent.)
My branch bug21278_024_v2 tries to fix this, with two approaches:
tor_version_compare() now uses unsigned arithmetic to produce the same results while avoiding undefined behavior. This should mean -- if I coded it right -- that we don't have any visible behavior differences form before (except "not crashing").
dirserv_get_status_impl() now rejects incoming descriptors with negative versions, while leaving voting unchanged. Changes to this function operate at a single authority, and don't require a change in the consensus method number.
Additionally, I found two more cases where we use the if ((i = (a-b))) return i; pattern to implement a comparison function. I believe that they are both safe, but somebody should look them over. The fixes for those are in my bug21278_024_v2_extra branch, on top of my bug21278_024_v2 branch.
Trac: Status: assigned to needs_review Summary: Fix TROVE-2017-001 to Avoid signed integer underflow when comparing versions (Fix TROVE-2017-001)
I pushed a bug21278_024_v3 branch that has an extra commit to better document the new function.
That said, I think your commits 2c768c2c0 and 54e2e027 are only for dir auths, right? So there isn't any point in backporting them earlier than 0.2.9?
Speaking of which, for 4e720ad7 which looks like the main fix here, the changelog stanza says This bug is harmless, except when Tor has been build with --enable-expensive-hardening, which would turn it into a crash. Am I remembering correctly that only recent Tor branches have put expensive-hardening on by default? That is, the earlier fix for this TROVE (i.e. disabling ftrapv) only went into 0.2.9.x and 0.3.0.x? Why backport a fix for a harmless bug so far back? :)
Additionally, I found two more cases where we use the if ((i = (a-b))) return i; pattern to implement a comparison function. I believe that they are both safe, but somebody should look them over. The fixes for those are in my bug21278_024_v2_extra branch, on top of my bug21278_024_v2 branch.
I'm a big fan of what 0a2abb37 is doing (modulo the fixes that teor suggests). Since it's more of a belt-and-suspenders thing, is it more suited for 0.3.0?
Whereas 557385874 is dir auth only (I think?), so there's no need for it to go earlier than 0.2.9. But you never know when sha1 collisions are going to drop, so I think putting it in 0.2.9 is reasonable if you want to do that. (On the other hand, once sha1 goes bad, "hey directory authorities, either start using the default value of --enable-fragile or upgrade to 0.3.0.x" seems like a totally doable statement too.
Am I remembering correctly that only recent Tor branches have put expensive-hardening on by default? That is, the earlier fix for this TROVE (i.e. disabling ftrapv) only went into 0.2.9.x and 0.3.0.x?
This bug affects everybody who has trapv or ubsan turned on. In 0.2.9.1-alpha, we turned trapv on by default, which caused this bug to affect 0.2.9.1-alpha through 0.2.9.8.
This bug will still affect all older versions if they have --enable-expensive-hardening turned on.
I have a bug21278_024_v4 that has only the minimal fix for the integer issue. I propose that it go into 0.2.4.
I have a bug21278_redux_029 that blocks the bogus versions at the directory level, and includes a changes file and roger's function documentation. I propose that it go into 0.2.9.
I agree that it's okay to merge bug21278_024_v2_extra to 0.2.9. I have a bug21278_extra_029 branch to port those forward. I'm okay with taking that in 0.2.9 or 0.3.0.
I have a bug21278_024_v4 that has only the minimal fix for the integer issue. I propose that it go into 0.2.4.
This is ok, as anyone on a private network can stop their own relays misbehaving.
make check passes for me on macOS 10.12 i386 and x86_64.
I have a bug21278_redux_029 that blocks the bogus versions at the directory level, and includes a changes file and roger's function documentation. I propose that it go into 0.2.9.
This refuses to compile for me with:
src/or/routerparse.c:5555:32: error: comparison of unsigned enum expression < 0 is always false [-Werror,-Wtautological-compare] router_version->status < 0 ||
The 64-bit arch compiles and passes make check test-network-all.
I don't know what the extra newline is doing in f1c2cea165, but that's a nitpick.
The changes file in 1ff289a745 includes trailing whitespace.
I agree that it's okay to merge bug21278_024_v2_extra to 0.2.9. I have a bug21278_extra_029 branch to port those forward. I'm okay with taking that in 0.2.9 or 0.3.0.
I think 029, in case there is a security issue here.
make check test-network-all passes for me on macOS 10.12 i386 and x86_64.