7802 tweaks per code review
7802 accidentally got merged with 8024 before being reviewed; joint code review by nickm and myself followed. Nothing so objectionable as to merit reverting the merge, but a few points noted (full IRC transcript will be attached):
18:45 < athena> okay, in circuitbuild.c we've got some pretty straightforward functions to query parameters that look just fine, then a long block of code which is probably where we should be focusing most attention 18:45 < nickm> Right. The "rate" configuration things are all percentages, but I don't see anything that keeps you from configuring a value below zero or above 100 in config.c. We should note that. NOTE. 18:45 < nickm> (I'm going to grep for NOTE strings.) 18:45 < athena> okay, yeah, those should be checked and clamped, i think
18:48 < nickm> I don't like the name "pathbias_check_use_rate" -- "check" is pretty vague. NOTE. 18:48 < nickm> The return; at the end of that function offends me. NOTE. 18:48 < athena> yeah 18:49 < nickm> I think the comment for the next function (pathbias_mark_use_success) may be wrong; the "stat" in the first sentence should probably be 'state'. and the word "mark" should appear before "it". 18:49 < nickm> NOTE.
18:53 < nickm> ooh. In mark_use_success, we check path_state < PATH_STATE_USE_ATTEMPTED. We should make sure that a comment on path_state_t says that its values must never be reordered or else stuff might break. NOTE 18:54 < athena> did you mean to say pathbias_should_count? i'm not seeing a pathbias_should_close 18:54 < nickm> sorry, should_count 18:55 < athena> okay 18:55 < athena> looks like it ignores circuits that have some purposes, that are one-hop or checks some config options 18:55 < athena> do circuit purposes ever change during the lifetime of a circuit 18:55 < athena> ? 18:56 < nickm> I believe circuit purposes can change under some circumstances, though the rules are nontrivial 18:56 < athena> hmm 18:56 < nickm> maybe we should cache the result of this function if we've ever called it before? 18:56 < nickm> maybe we should warn if it changes? 18:57 < nickm> this isn't a new problem, if it's a problem 18:57 < athena> yeah, we need a detection for that if we can't be sure it doesn't happen 18:57 < nickm> NOTE. Let's ask mike?
19:04 < athena> well, the code moves the pathbias_send_usable_probe() to USE_ATTEMPTED; it seems nontrivial to evaluate
the effects of doing that
19:05 < athena> let's note that and move on, and ask mike about it when he's around
19:05 < nickm> right.
19:05 < athena> probably means we should settle on 'revert' for now if there's something like that in there that we
don't understand
19:06 < nickm> oh wait
19:06 < athena> ?
19:06 < nickm> read the description of 7802 again.
19:06 < nickm> he's splitting PATH_STATE_BUILD_SUCCEEDED into two states, where previously the distinction was not
based on state but based on timestamp_dirty
19:07 < athena> right
19:07 < athena> and the old probe did test timestamp_dirty
19:07 < athena> okay, this one's fine, then
19:07 < nickm> so that block that moved isn't the build_succeeded block; it's the build_succeeded && timestamp_dirty
block
19:07 < nickm> great
19:07 < nickm> and it now sets the state to ALREADY_COUNTED so we don't do this again. Sounds good.
19:08 < nickm> next two changes are function renames.
19:08 < athena> yeah
19:08 < nickm> Then the function now known as pathbias_get_close_success_count.
19:08 < nickm> Hm. path_state >= USE_FAILED and path_state >= BUILD_SUCCEEDED in the same function. I wish those were
macros or inline functions or something.
19:08 < nickm> NOTE above
19:09 < athena> yeah, making assumptions about enums like that always makes me feel icky
19:10 < nickm> (If you add a new value, you need to add it in the right position or there's hell to pay)
19:10 < athena> yeah
19:11 < nickm> so the change in pathbias_get_closed_count is probably okay though I think.
19:12 < athena> yeah, agree
19:12 < nickm> agree?
19:12 < nickm> great
19:18 < nickm> pathbias_act_on_failure_rate? 19:18 < nickm> It doesn't return a boolean; it just messes with stuff and acts accordingly 19:18 < athena> it does, though 19:18 < nickm> oh, it does? 19:18 < athena> it returns -1 if it rejects the guard for a high failure rate 19:18 < nickm> Do we ever look at its return value? 19:18 < athena> or falls to the return 0 at the bottom of the function otherwise 19:18 < athena> dunno 19:19 < nickm> Looks like we call it in 2 places, and never check the return value there. 19:19 < athena> doesn't look like it 19:19 < athena> yeah, your suggestion then 19:19 < nickm> ok. NOTE let;s rename it to pathbias_act_on_failure_rate. 19:20 < athena> should we also get rid of the return value? 19:20 < nickm> Also NOTE that in the case where it sets any persistent value on a guard, it does need to call entry_guards_changed I think 19:20 < nickm> yes we should. (NOTE)
19:21 < nickm> Hm. In the later part of the function that does scaling... 19:22 < nickm> ah, never mind. 19:22 < nickm> that looks okay to me if it does to you, since guard->use_{successes,attempts} are doubles. 19:23 < athena> yeah 19:23 < athena> it's just using a slightly awkward way of passing a rational 19:23 < nickm> We should check whether we enforce that scale_factor is at least 1, and mult_factor is less than or equal to scale_factor. 19:23 < athena> but i'm not sure why, since if those are doubles we're already using floating point anyway 19:23 < athena> could just be a single function returning another double? 19:23 < nickm> (NOTE) 19:23 < nickm> I think it should be. 19:24 < nickm> I think he's doing it this way because consensus parameters can only be ints 19:24 < nickm> (int32_t, I think) 19:24 < athena> also make sure that we never divide by zero 19:24 < athena> oh, right 19:24 < nickm> Right. (NOTE wrt the refactoring though) 19:24 < athena> so it'd just be a pain to extend that to allow double, then 19:25 < nickm> yeah, but we can have the function return a double, and let the user configure a double themselves if they want to override it in torrc 19:25 < nickm> (It would be crazy to override the numerator but not the denominator) 19:25 < athena> yeah 19:26 < athena> agreed; should be one function calculating the ratio of two int32_t consensus params or pulling a double out of the config file 19:26 < nickm> Right. Let's do it like that. NOTE
19:26 < nickm> on to pathbias_check_close_rate. That should probably get renamed too. 19:27 < athena> pathbias_act_on_close_rate? 19:27 < nickm> yeah. That one does have its return value checked. 19:27 < nickm> (lots of duplicate code here too I think) 19:27 < nickm> Does this look like duplicate code to you? Does a later commit fix that? 19:28 < athena> hold on, lemme look 19:29 < athena> yeah, it's kinda duplicated 19:29 < athena> but it's calling different functions in the log messages (get_use_success_count() vs. get_close_success_count()) 19:30 < athena> getting rid of the duplication would mean passing some function pointers around, probably 19:30 < athena> and i'm not sure where it gets scale_factor from 19:31 < nickm> Or having one block at the top of the function that says if (counting_closed) { success_count=get_close_success_count() } else {success_count=get_use_success_count();} and avoid the function pointers 19:31 < athena> yeah, okay 19:31 < athena> then should refactor to have less duplicate code i think 19:32 < nickm> The scaling block is pretty different 19:32 < nickm> isn't scale_factor coming from pathbias_get_scale_factor ? 19:32 < athena> hmm 19:32 < nickm> I hope everything scaled in that block is also a double, or our idea of having one function that returns a double is going to break. 19:33 < athena> yeah 19:34 < athena> well, we could split the scaling out into separate rescale_<use,close> functions, and handle them like the success_count thing? 19:35 < nickm> Seems good. Also Given that there are two sets of values that get scaled independently, we should make sure that their documentation says which block is which, clearly. 19:35 < nickm> NOTE on the refactor and NOTE on the documentation
19:41 < nickm> I don't like the check on the return value of pathbias_check_close_rate(). That function only returns -1 the first time it decides to disable the guard, right? 19:41 < nickm> maybe it should call it, then check whether guard->path_bias_disabled is true? (NOTE)
19:45 < nickm> Hm. I wonder what initializes node->use_attempts and node->use_successes and the other doubles if this part isn't called. 19:46 < nickm> If they all come from a malloc_zero() or a memset, that's a little tricky. Does memset(&dbl, 0, sizeof(double)) get you a good zero? 19:46 < athena> not on things that aren't IEEE-754 in general 19:47 < nickm> hm. we proably have other places where we do this. 19:47 < athena> yeah 19:47 < athena> it is theoretically a portability issue, but it's gotta be pretty rare to run into a machine like that 19:48 < athena> the only examples of non-754 fp i can think of off-hand are VAX and some older crays 19:48 < nickm> We could treat it like systems where memset(&ptr, 0, sizeof(void*)) doesn't set ptr to NULL. 19:48 < athena> yeah 19:48 < nickm> NOTE. let's do that.
19:57 < nickm> athena: this code looks okay to me, but I wonder if it wouldn't be better to change the check to call one of the pathbias_act_on_foo_rate() functions instead to eliminate duplicated code. What do you think? 19:58 < athena> probably 19:59 < nickm> athena: I think this one isn't super-urgent, but we should still NOTE it. agree?
20:06 < athena> 24b9b9f791defcb6156c587a035fde58c35a46d9 next? 20:06 < nickm> yes let's 20:07 < nickm> The two blocks look duplicated, but there are only two 20:08 < athena> yeah 20:08 < athena> it could still be made into two calls to a single duplication-free function, though 20:08 < nickm> agreed. 20:08 < nickm> NOTE 20:08 < athena> a2db17a1aab77c4183f589815800a779a5f24524 ? 20:08 < nickm> hang on. So, if even one stream is detached, we roll back to USE_ATTEMPTED? 20:09 < nickm> What if there are multiple streams and only one is detached? 20:09 < nickm> It doesn't seem too harmful, but we should ask mike IMO. 20:09 < athena> hmm 20:09 < athena> yeah 20:09 < nickm> NOTE let's ask him
20:11 < nickm> a2db17a1aab77c4183 ? 20:11 < athena> yeah 20:14 < nickm> doesn't seem crazy. I don't get it though. 20:14 < nickm> I wonder if the comment in circuit_launch_by_extend_info before the line that's being removed is now inaccurate. NOTE
20:21 < nickm> so, the next commit does the common code refactoring you mentioned before when it adds pathbias_count_circs_in_states 20:22 < nickm> also NOTE that the pairs of arguments to pathbias_count_circs_in_states are good things to document in the documentation for path_state_t 20:22 < athena> yeah, agree 20:22 < nickm> As for the part of this change that changes how scaling happens... how does that look to you? 20:24 < athena> it looks correct to me 20:25 < nickm> I personally would kind of prefer it if we didn't add things to {circ,use}_{attempts,successes} until we had the corresponding changes in the other scaled variables. then this commit wouldn't be needed. that's a potentially bigger refactoring though. NOTE it for later?