Define guidelines for variable naming
Starting with Java, but this should be extended to the other languages used in Metrics' products. Two excerpts from comments from ticket #22983 (moved) that are concerned with variable naming and try to define some guidelines:
=== #22983 (moved) comment 44: I'd say there are also very many non-issues here, which I know are different in the existing older code base, but seem to make the code less readable to me, when I started to become familiar with the old codebase. For example, there are almost line filling variable names (not in this patch) that differ only in one to five letters. And there is the word 'Descriptor' all over, which often feels like cluttering when reading the code for the first time. The following is an example: (all following diffs are from the 664f540 commit)
- public static List<Descriptor> parse(byte[] raw, File file)
...
+ public static List<Descriptor> parse(byte[] rawDescriptorBytes,
+ File descriptorFile) ...
Maybe, 'raw' alone is too terse, but 'rawBytes' seems fine whereas in 'rawDescriptorBytes' the word part 'Descriptor' overwhelms the important information. The method 'parse' receives raw bytes and tries to find a descriptor.
Here some other examples from the current patch&fixup round (we could recycle them for the guide lines, so I try to be verbose):
Renaming of isValid, here:
- public boolean isValid(String line);
+ public boolean validateLine(String line);
makes the code less readable. For example:
- -> null != line && !line.isEmpty() && !validator.isValid(line))
...
+ -> null != line && !line.isEmpty() && !validator.validateLine(line))
validateLine doesn't say that the result of the validation is returned (as a boolean). In addition, isValid(line) is more 'readable' as it 'translates' (e.g., read aloud) to "is valid line", whereas validateLine(line) results in "validate line line", which even without the duplication of line doesn't hint what happens. Similarly, sanitizeLine(line) vs. sanitize(line) (where I had clean(line) initially, but I don't mind the renaming) and postProcessLines(lines) vs. postProcessing(lines).
Why rename logBytes to rawDescriptorBytes? logBytes seems fine in a log descriptor implementation. If I read this code for a first time I would wonder if rawDescriptorBytes is inherited because of its generic name.
(Instead of renaming 'extension' to 'fileExtension' I'd suggest 'fileType', because that's what it is, i.e., not only an extension, which could be mistaken to be a string.)
=== reply from comment 45 ===== Variable names
Thanks for writing down your thinking about variable names in the given detail. It helps a lot, not only for this ticket but also as a guide for future tickets. Let's go through the examples:
-
Leaving out the somewhat redundant "descriptor" from
rawDescriptorBytes
anddescriptorFile
is fine with me. It's indeed obvious what's meant here. -
I quite strongly disagree with
isValid(line)
as a name for a method that takes a line, tries to validate it, and returns whether that was successful. To me,isX()
is the name for a getter, not the name for a method that does something with a given parameter. If this were aLine
class with a methodisValid()
that returns whether the line is valid or not, that would be something different. But that's not the case here. For another example, considerFile.delete()
which deletes the file and returns whether that was successful. We wouldn't argue about renaming that method toisDeleted()
just because it returnstrue
orfalse
. As a general rule I'd say that the name of a method that performs something should be the verb describing the action, whereasis
is typically reserved for getters. Ah, and in this case it's up to the documentation to say whatvalidate(line)
returns, though that's relatively obvious. -
Leaving out "line" in
sanitizeLine(line)
and friends is okay, too. -
You're probably right about keeping
logBytes
rather thanrawDescriptorBytes
.