From the list of best practices defined in #29219 (moved), when we can write tooling to do so, we should write tooling to find and measure violations of these practices.
Some of these measures (function length, file length, layer violations via includes) already have code for them; others will be harder. We should probably aim for the lower-hanging fruit here.
Nick, what do you mean that some of these measures already have code for them? Do we have code in Tor to count function and file length? I could not find that in scripts/, but we do have the #include checking script.
On this note, and since we are chasing low-hanging fruit here, I played a bit with static analysis tools today. The thing that worked out best, because it's open source and does some of the things that we want it to do, is GNU complexity (...): https://www.gnu.org/software/complexity/
Based on the best practices list of #29219 (moved), complexity can calculate: file length, function length, nesting level count, and functions-per-file count. As far as I know we don't have any of these right now, so being able to do them would be an improvement. Also, it's a GNU tool, so it's FOSS and something we can use as an optional part of our scripts toolkit.
As an example, here is complexity's output for scheduler_kist.c:
In the CLI I set the score threshold to 1 so that it ignores truly trivial functions, but that can be set to 0 if wanted.
You can see the following:
It tells us that there is a function with a bad nesting depth of 5.
It tells us how big each function is (nc-lns counts only code).
It tells us how big the file is (307 LoC).
It tells us that the file contains 26 functions.
It has a weird but configurable algorithm to calculate function score.
The only weird thing is that it miscalculates the line at which a function starts; I think this has something to do with the way we comment our code, but I haven't figured it out yet (it does not seem related to macros). It also does not work very well with MOCK_IMPL etc.
I can also pass it all the .c files of Tor, and it will give me a project-wide summary, which I have attached to this ticket.
Does this seem like a plausible way forward? That is, having a script that runs this in a useful way and parses the output to give some controlled information? Or do we care about other best practices from #29219 (moved) more? This seems likely to be the lowest-hanging fruit, but I'm open to more.
Also, what do we want to do: throw an error if we have a nesting score above 7 (for example), or just report the information? A rough sketch of the error-throwing approach is below.
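To make that concrete, here is a hypothetical sketch of such a wrapper. Everything in it is an assumption for illustration: the score-line layout (score, ln-ct, nc-lns, file(line): name) is guessed from the sample output attached above, the limit of 7 is just the example number, and the script name, regex, and helper are mine rather than part of any existing tooling.

```python
#!/usr/bin/env python
# Hypothetical sketch: run GNU complexity over some .c files and exit
# nonzero if any function's score exceeds a limit.  The score-line layout
# ("score  ln-ct  nc-lns  file(line): name") is assumed from the sample
# output attached to this ticket; adjust the regex if it differs.
from __future__ import print_function

import re
import subprocess
import sys

SCORE_LIMIT = 7  # the example number from the comment above

# One score line per function, e.g.:
#    12    40    35   scheduler_kist.c(88): kist_scheduler_schedule
SCORE_LINE = re.compile(r'^\s*(\d+)\s+\d+\s+\d+\s+(\S+\(\d+\):\s*\S+)')

def check(paths):
    # --threshold=1 ignores truly trivial functions, as discussed above.
    proc = subprocess.Popen(['complexity', '--threshold=1'] + paths,
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    failures = []
    for line in out.decode('utf-8', 'replace').splitlines():
        m = SCORE_LINE.match(line)
        if m and int(m.group(1)) > SCORE_LIMIT:
            failures.append((int(m.group(1)), m.group(2)))
    for score, where in failures:
        print('function score %d exceeds %d: %s'
              % (score, SCORE_LIMIT, where), file=sys.stderr)
    return 1 if failures else 0

if __name__ == '__main__':
    sys.exit(check(sys.argv[1:]))
```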
I think that the function/file length thing I mentioned was just ad hoc one-liners, like find src/ -name '*.[ch]' | xargs wc -l | sort -n, and whatever I used for cyclomatic complexity in #6313 (moved).
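(For comparison, a dedicated function-length counter doesn't have to be much longer than a one-liner. The toy below is a naive brace-depth heuristic sketched here for illustration, not the ad hoc script mentioned above; it miscounts braces inside strings and comments, and it treats any top-level braced block as a "function", so it's only a sketch of the idea.)

```python
# Toy function-length counter for a C file: track brace depth and report
# the length of each top-level { ... } block.  Not robust against braces
# in strings/comments or K&R-style definitions.
from __future__ import print_function

import sys

def function_lengths(path):
    depth = 0
    start = None
    with open(path) as f:
        for lineno, line in enumerate(f, 1):
            opens, closes = line.count('{'), line.count('}')
            if depth == 0 and opens:
                start = lineno
            depth += opens - closes
            if depth == 0 and start is not None:
                yield start, lineno - start + 1
                start = None

if __name__ == '__main__':
    for start, length in function_lengths(sys.argv[1]):
        print('%s:%d: %d lines' % (sys.argv[1], start, length))
```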
I think you should feel free to go with whatever fruit seems lowest-hanging here. complexity seems like a good start for measurement, but I don't immediately see how to use it for enforcing rules. What do you think there?
OK, I fixed the CI issues and also added a squash commit that slightly improves the error message in case of problems. We can always do better, but this might be OK for a minimum viable product here.
LGTM, except that I think one of the requested changes is important for correctness. I've added a commit for that, plus another commit for the message change, in a new PR: https://github.com/torproject/tor/pull/782
I did a squashed branch, but the CI for the original pull request failed, so I will only merge if the CI for the squashed branch works:
https://github.com/torproject/tor/pull/786
Almost there. The problem with your fix was that you added a problem line for hs_cell.c:hs_cell_parse_introduce2, but there was already one in the exceptions file. Since the second one was further down, it took priority (having two entries for the same problem in the exceptions file is currently undefined behavior).
Perhaps we can soft-solve this problem by editing the right exception line, and then, as part of #29746 (moved), figure out what to do about duplicate exception lines?
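One possible starting point for #29746 (moved) would be a small checker that flags duplicate entries before they can shadow each other. This is only a sketch under assumptions: it assumes each exceptions-file entry is one whitespace-separated line ending in a numeric value, which may not match the actual format, so the key extraction would need adjusting.

```python
# Hypothetical duplicate detector for the exceptions file.  Two entries
# count as duplicates when everything except the trailing numeric value
# matches, so two lines for the same function are caught.
from __future__ import print_function

import sys
from collections import Counter

def find_duplicates(path):
    with open(path) as f:
        keys = [' '.join(line.split()[:-1])
                for line in f
                if line.strip() and not line.startswith('#')]
    return [k for k, n in Counter(keys).items() if n > 1]

if __name__ == '__main__':
    dups = find_duplicates(sys.argv[1])
    for d in dups:
        print('duplicate exception:', d, file=sys.stderr)
    sys.exit(1 if dups else 0)
```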
Hmm, no, travis runs with python 2.7.6 by default, so we should fix the script so it works with python3.
Thanks for catching that.
It seems like checkIncludes.py is also a python3 script, but it runs with python2 in jenkins/travis as well, thanks to the from __future__ import print_function trick.
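For reference, the trick is just a one-line header (a minimal illustration, not the actual contents of checkIncludes.py):

```python
# With this import, print() behaves as the python3 built-in function even
# under python 2.7, so the same script runs on both interpreters; without
# it, python2 would parse a multi-argument print(...) as printing a tuple.
from __future__ import print_function

print("file:", "scheduler_kist.c", "has", 307, "lines")
```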
I can't merge this today: my day should end soon, I'm on leave tomorrow, and Appveyor still has 3.5 hours to go. I'll be back Monday, or anyone can merge in the meantime?