I have fixes in my branch stack_fixes_032, with each issue getting a separate commit. I have not re-run this branch under STACK.
There seem to be two kinds of issues here:
We checked to see whether something that can't possibly be negative is negative. Sometimes this means that we had done the arithmetic wrong; sometimes it means that the check was erroneous.
We check for NULL-ness of a pointer that can't possibly be NULL. This usually means that the NULL check was redundant.
I'd like a second opinion on all the arithmetic fixes, especially to know if you think they represent a real bug that needs a separate ticket.
There is one warning in test you might want to look at while you are at it (see attached output). What about the code in ext? Is it worthwhile looking at it, too?
Thanks for looking at test, gk. Yes, I do think it's worthwhile looking at ext -- it's not our code, but we're shipping it, so we ought to make sure it's right.
Thanks for looking at test, gk. Yes, I do think it's worthwhile looking at ext -- it's not our code, but we're shipping it, so we ought to make sure it's right.
The code changes look good to me! Possible minor documentation improvements below. Feel free to merge without another round of review after any doc changes.
Spelling ("happend") in changes file.
In d8f0417799f8eaa0865cd8ba24e04f7689c99d78 maybe clarify in the commit message what the STACK warning is about? (Is it a potential signed underflow?)
In 8922986c3190bc3cb0b9036d7e024817c307bd6a "swapping over into unsigned integers" seems like it's mostly an issue for 32-bit platforms; should that be mentioned in the commit message?
In 370916f75e2f00ea4b6f0f1f1ccf3df04504aab7 maybe clarify in the commit message that new_element() can't return NULL because tor_malloc_zero() can't return NULL? Similarly for 4b2f8c35f69744632e14684e76dc5484202dc7ae.
Actually, gk said that the tortls one didn't work, so I'm not merging that. I've made a new branch with the changes you suggested, and merged it to master. I removed the #24590 (moved) commit, since I merged that separately.
Putting back in accepted for the tortls issue. (gk, do I understand correctly that I never figured out how to make stack stop complaining about that?)
Trac: Status: merge_ready to accepted Milestone: Tor: 0.3.2.x-final to Tor: 0.3.3.x-final
It looks like STACK is complaining about the comparison on line 498 if (earliest_start_time >= now) being simplified based on the assumption that the earliest_start_time computation doesn't overflow or underflow, and therefore the algebraic equivalences hold (cancelling the now from the comparison). The added tor_assert(cert_lifetime <= INT_MAX) adds some constraints to cert_lifetime but apparently that's not enough. STACK doesn't seem to know any constraints on now, so maybe as far as it's concerned, the subtraction could still underflow. (or the additions could overflow)
Perhaps the comparison on line 498 should be if (cert_lifetime < min_real_lifetime + start_granularity), with the original earliest_start_time computation moved into an else clause.
Hm maybe that should be if (cert_lifetime <= min_real_lifetime + start_granularity || cert_lifetime > now) because the subtraction for earliest_start_time could still underflow. (also fixed the comparison operator that I had wrong before)
Then again you could also start getting mixed signed/unsigned comparison warnings so maybe we need to use a signed intermediate.
I've tried the first approach you suggested, and didn't get signed/unsigned warnings, so lt's try it out.
gk, does my stack_again_032 branch fix the warning for you?
Looks good now. Just the warning in test is still there (see comment:4 and the respective attachment).
That looks like another instance of tor_malloc_zero() (called from new_fake_channel()) always returning non-NULL (assuming the line numbers on maint-0.3.2 are still correct), so this conditional in test_channel_flushmux() is always true:
762 if (ch) 763 circuitmux_free(ch->cmux);
Even better, there's also
718 ch = new_fake_channel(); 719 tt_assert(ch);
so that's another reason why it can't be NULL, along with other unconditional dereferences.
The test_channel_flushmux() test no longer exists in 0.3.3, since channel queues no longer exist. (See #23709 (moved).)
If we want, we can remove all the redundant checks for whether new_fake_channel() has failed -- I have those in a branch called one_more_stack_fix, but I suggest that we don't actually take them, since these don't actually seem to cause stack warnings.
The one_more_stack_fix changes probably aren't necessary on master. I think STACK wouldn't complain about the tt_assert(ch) calls because they don't come after dereferences of ch. The redundant check in the cleanup of test_channel_flushmux() on 0.3.2 probably should still be deleted if we would like to make 0.3.2 STACK-clean.
for (i=0; i<N_CONSENSUS_FLAVORS; ++i) { consensus_waiting_for_certs_t *waiting = &consensus_waiting_for_certs[i]; if (!waiting->consensus) continue; if (networkstatus_check_consensus_signature(waiting->consensus, 0)>=0) { char *waiting_body = waiting->body; // 2049 if (!networkstatus_set_current_consensus( waiting_body, networkstatus_get_flavor_name(i), // 2052 NSSET_WAS_WAITING_FOR_CERTS, source_dir)) { tor_free(waiting_body); } }
What's I think is happening here is that the compiler sees that waiting->body is computed, and so realizes that "i" must be in range 0 <= i < N_CONSENSUS_FLAVORS. This could be used to eliminate the assertion and default case in networkstatus_get_flavor_name() when it's inlined.
The trivial workaround here is to compute the flavor name first as in my stack_em_up branch. This doesn't actually change anything or fix anything real, so I'm not doing a changes file on it.
You're saying STACK is warning about the tor_fragile_assert() in networkstatus_get_flavor_name() being eliminated as dead code as a consequence of a buffer overflow not occurring when looping over N_CONSENSUS_FLAVORS? Is the problem partly that STACK doesn't see tor_fragile_assert() as intrinsically unreachable because by default it's nonfatal?
Do we get this STACK warning if we build with ALL_BUGS_ARE_FATAL? If not, I'm inclined to leave it as is and recommend that people run STACK with that defined.
Do we get this STACK warning if we build with ALL_BUGS_ARE_FATAL? If not, I'm inclined to leave it as is and recommend that people run STACK with that defined.
No, that warning does not show up anymore. However, a new one pops up (with commit 29784d80faab3d8307916c7a945e89fe082e3db0) in that case:
The patch in stack_em_up on the other hand solves this problem without creating new warnings. I think that's preferable to keeping a list of false positives (because ALL_BUGS_ARE_FATAL wouldn't be defined in any release build anyway).
I think ALL_BUGS_ARE_FATAL is good to use with STACK because we use it with other static analysis tools for similar reasons. (It helps tools figure out what code is intentionally dead and decreases the noise level.)
I kind of understand why the new error is happening with ALL_BUGS_ARE_FATAL but I can't describe it well yet.
These tickets were marked as removed, and nobody has said that they can fix them. Let's remember to look at 033-removed-20180320 as we re-evaluate our triage process, to see whether we're triaging out unnecessarily, and to evaluate whether we're deferring anything unnecessarily. But for now, we can't do these: we need to fix the 033-must stuff now.
Trac: Milestone: Tor: 0.3.3.x-final to Tor: unspecified