Comments - like documentation - are a liability. Most projects tend to treat them as supporting artifacts, but they are actually unidentified risks. Did you account for the time spent either rewriting documentation/doxygen/comments as you go (frequent chunks of small work) or an after the fact round of document backfilling (single chunk of large work) in your schedule?
No one even reads those documents. The client claims they are to assist future developers - internally or another vendor - but would you trust a Word document or a .cpp file when everything is on fire and you need to figure out a system's behavior? Same thing applies for code comments.
There is one source of truth in software, the code. You can pile on the design documents, wikis, and architecture diagrams but at the end of the day, what is coded is what gets executed. Make the code easy to understand and reason about. And then, if you really want to be nice to future developers, throw in a single page overview of the codebase (README) that gives someone an entry point.
Edit: I will just add the caveat to preempt the inevitable - this does not apply to public facing documentation of an API/framework/library etc.
"would you trust a Word document or a .cpp file when everything is on fire and you need to figure out a system's behavior?"
There's a bigger philosophical problem here. If your development cycle is code -> comment -> code -> comment ... (where you write code and then comment), then your criticism is fair. But the right way is to comment first. When you describe what you want in words, then you or anyone else can go back and evaluate if the code matches the words.
(In my experience, 90% of bugs stem from developers thinking one thing and implementing something different, and these could be avoided if you had a normative document)
It's like writing a paper: you outline, sketch out what you want to say, and then actually start crafting sentences. The outline should guide your paper, and if there's a problem you first update the outline.
Test-first is also broken, because you end up crafting code that suits the test coverage.
https://github.com/shtylman/node-int/issues/1#issuecomment-1... is an example of an issue which came from an implementation without 100% test coverage. I claim that unless you know every single way in which a function is used, and even if you did, you can't design tests to cover 100% of use cases.
Not all code is readable. Some of it always ends up being an ugly hack. If that ugly hack is not documented, I want to start punching previous developers.
I wish people would stop taking these purist approaches. There is an obvious middle ground to this - document the non-obvious and your public facing APIs.
Perfect. There's always going to be an edge case, and with code there is always going to be some logic that can't be understood at first glance. Tests are great, but comments help too. As long as comments stay updated with code changes they are useful.
Comments can indeed lie to you, but code can also be doing something that was not actually intended.
Having a comment to clarify intent is a saving grace when code looks correct but is doing the wrong thing. At least when a comment is conflicting with the code you know at once that you need to start asking questions...
It all comes down to: does the benefit outweigh the cost?
A well-maintained suite of automated tests provides an enormous benefit. A brittle set of Selenium scripts costs a project more than paying a tester to manually run scenarios.
Yeah, I agree, it's a question of cost vs benefit.
A bunch of tests that check trivial cases is not helpful at all, a well-maintained suite of tests is very useful.
What I mean is that a bunch of documentation that only states the obvious is not helpful at all. A set of well-maintained (!) code comments can make navigating unfamiliar code much, much easier.
I can prove that tests don't function. They are exactly the opposite of comments in that way. Change the functionality? Tests fail. I know they fail. I fix them.
This is one of the reasons why I'm a fan of doctests in python. Generally if the tests start to fail the documentation surrounding it needs to be updated as well, which makes it a bit easier to actually keep the documentation up to date.
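To make the doctest point concrete, here is a minimal sketch. The `clamp` function is a made-up example, not something from this thread; the point is only that the examples in the docstring are executed, so documentation that drifts from the code shows up as a test failure.

```python
import doctest


def clamp(value, low, high):
    """Limit value to the inclusive range [low, high].

    (Hypothetical helper, used only for illustration.)
    The examples below are run by doctest, so they fail loudly if the
    behaviour changes and this docstring is not updated to match.

    >>> clamp(5, 0, 3)
    3
    >>> clamp(-2, 0, 3)
    0
    >>> clamp(2, 0, 3)
    2
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    # Execute every example embedded in this module's docstrings.
    print(doctest.testmod())
```

If someone later changes `clamp` to exclude the upper bound, the first example fails and the docstring has to be revisited along with the code.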
This. Comments and documentation are pointless unless they are audited and vetted somehow, and then there is the extra cost associated with maintaining them. Managers who don't know better think it's just a bit of extra work that the developer does up front, but that is pointless without the above.
I couldn't disagree with this more strongly--giving somebody a blob of code (no matter how elegant) is a recipe for disaster.
Let's assume good comments--not useless, lying, or broken ones. I won't contest that bad comments are at least as bad as useless, as you no doubt would not contest that bad code is at least as bad as useless.
When I read a comment, I parse it directly. When I read code, no matter how neat it is, I often trigger a twitch of "How would I rewrite this?", which itself gets in the way of understanding.
The more code there is, the harder it is going to be to digest at first glance, and if your codebase could be adequately described by a single page of text I would wonder why I need to work on it instead of rewriting it entirely.
Hint: if you only have one page of docs, and especially if you have no test suite, I cannot work on the code in a meaningful way safely.
If you don't value safety, if you don't value engineering, then by all means omit comments. It's just a crutch for people who can't see the code, man.
I might also point out that that same philosophy implies that we should just stick with assembly. Especially if you use any language with dynamic types/duck typing (Ruby, Javascript, etc.) seeing "at a glance" what exactly is the proper form of data to pass into and out of a function can be hard. Documentation here helps.
EDIT:
Note also that commenting in libraries, or in mathematical or geometric routines, can be really helpful. The code doesn't lie, sure, but if you aren't experienced/schooled/clever enough to recognize what is going on it won't help you.
A great example of this is the fast inverse square root hack:
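For anyone who hasn't seen it, a rough sketch of the trick follows. The well-known version is C from the Quake III Arena source; this is my own port to Python using `struct` for the bit reinterpretation, so the function name and layout are assumptions for illustration only. Without the comments, the magic constant and the shift are close to unreadable.

```python
import struct


def fast_inverse_sqrt(x):
    """Approximate 1 / sqrt(x) for a positive, finite 32-bit float x."""
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Shifting the bit pattern right roughly halves the exponent, and
    # subtracting from the magic constant negates it, which gives a
    # first guess at x ** -0.5.
    bits = 0x5F3759DF - (bits >> 1)
    # Reinterpret the adjusted integer as a float again.
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    # One Newton-Raphson step to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)
```

With a single refinement step the result is typically within a fraction of a percent of the true value, which is the whole point of the hack: a cheap approximation, not an exact answer.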
The problem domain of the source code dictates the value of the comments. If the code is highly algorithmic or data structure driven it's more useful to include a pointer to a book / paper rather than littering the code with comments which will never do as good a job at exposition as a book / paper.
On the other hand arcane business logic needs to be documented because there is no way to recreate that from a mental model.
Generally I would agree with this--the problem is that citing papers/books/sites can result in brittle, broken external references. Ideally both a brief explanation and a link to the source material would be helpful.
I do value safety and engineering - in my mind, an ideal deliverable would be the source code, an automated test suite, and a 1-page document explaining the overall design of the code.
Comments don't provide safety - tests do. Comments are not good engineering - well designed code that is easy to reason about and follow is.
None of these things are mutually exclusive, and all non-trivial code should ship with a combination (obviously excluding "incorrect comments"). I think inline comments should be relatively unneeded, and should only describe why you did something in an unusual or non-intuitive way. Otherwise, your code should require few comments, because what it does should be obvious.
Regardless, you sure as hell better document and/or test your interfaces. The nice thing about tests is that they can automatically be verified for correctness; documentation and comments cannot.
The thing about comments is that they don't get maintained. So the comments end up diverging from the implementation. The tests all pass even if the comments are wrong, so nobody cares unless you are particularly diligent about code reviews of EVERY commit.
Also this should go without saying but I still see comments like this:
// loop over the collection
for (var foo in bar) {
...
}
When I write code, the first thing I do is write out the code inside comments - including stuff like "Loop over the collection:". Once I've written the comments, I start filling in the code. I find that whilst this generates "unhelpful" comments, it leads to much cleaner and well designed code. There's another advantage too. If your IDE or text editor changes the colour of comments, as it should, to a colour different to that of the code, then you can ignore all the code and simply read the comments. I find this makes skimming through code much faster.
Granted, this won't work in all situations. If you can't trust your colleagues to keep the comments updated (and if you don't review commits), then this is going to cause problems - however if you're writing for yourself or with a small group of people you trust, I strongly recommend giving it a go.
It's a huge anti-pattern, because then people train themselves to think that slamming out a lot of trivial and redundant comments is a good use of their time. Worse, then they get used to writing lots of trivial comments and the really important more complex or subtle comments go unwritten.
Yes, thank you, sorry for too big of a mental shortcut.
By bad tests I mean trivial tests (much like trivial comments that just state the obvious), or ones that work on an assumption that need not be generally true and thus changes of the code that are correct from a logic standpoint might still make them fail.
Then such bad tests don't mislead, they are just a waste of energy and provide no assurance. At least if the test fails and the code is good, you will know to fix the test. Nothing like that with comments written in human natural language.
// set $foo to $bar
$foo = $bar;
// print out value of $foo
echo $foo;
// is $bar the same as $foo? if so, return true, otherwise return false
if ( $bar === $foo ) { // check if the same
return true; // great, it's true!
} else { // here we check if it isn't true
return false; // nope, not true, let's return false
}
I have nothing against comments where something needs to be documented, but not everything needs comments as most code is essentially self-documenting.
I expected it to be a given that these kinds of comments are unnecessary. It's better to comment on the "why" than the "how" or "what". Comments are useful when they help explain a piece of code's integration into the rest of the system. Why is it there in the first place?
99% of the time when you are writing a single line comment you should instead be either renaming a variable or using the extract method refactoring using some abbreviated form of the comment in the new name.
For the remaining cases it's more important to make sure that you're documenting the "why" of something in the code. If you find yourself often in the situation where making changes to the code requires modifying the comments to keep them up to date that's a big sign that you aren't writing comments appropriately.
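A small sketch of what I mean by turning the comment into a name. Everything here (the `Order` class, the thresholds, the function names) is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical order record, used only for this illustration.
    total: float
    customer_years: int


# Before: a one-line comment explains what the condition means.
def discount_before(order):
    # loyal customer with a large order
    if order.customer_years >= 5 and order.total > 100:
        return order.total * 0.1
    return 0.0


# After: the comment's wording becomes the name of an extracted function.
def is_loyal_customer_with_large_order(order):
    return order.customer_years >= 5 and order.total > 100


def discount_after(order):
    if is_loyal_customer_with_large_order(order):
        return order.total * 0.1
    return 0.0
```

The behaviour is identical, but the second version has no comment that can drift out of date. What is still worth a comment is the "why": for instance, where the 5-year and 100 thresholds come from.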
1) When I'm not all that familiar with the language. This allows me to grok the purpose and actions of code quicker.
2) When the action being performed contains more lines than fit in the screen. Knowing a bit more context going in can make what follows more clear.
For 1, it's a matter of who you intend to read/modify the code after you. For 2, that's probably more a case for splitting that code into multiple functions.
From that, I think comments are best used by poor and great programmers, but the middle of the road may not get much benefit out of them. If you don't know how to split your complex bits of code into multiple chunks, or if you expect most people coming after you to not be as experienced as you, comment.
I started writing code in a more literate style recently (where the code is a document with markdown interspersed with the code segments) and found that writing what I want to do first makes the development process much easier than just delving into code. Feels like writing a paper in many ways (you write the outline, flesh with your thoughts on how it should work, and then start writing code)
I put this tool together on a whim (although the github rendering is somewhat strange because I took liberties with the fenced code blocks): https://github.com/niggler/voc
I feel like I'm seeing a lot of discussion about what should be commented. Just like with tests, this can be hard to put your finger on. Even more so, the level of detail a comment needs to have for it to be useful/useless will vary for each developer reading/working on the code. In my opinion it's better to be overly helpful than it is to leave it all difficult to understand.
Also, it's obvious that outdated comments are terrible. It is my expectation that if comments are going to be in the code at all, they should be maintained just like the rest of the code.
Commenting methods can often be redundant. But I find it is almost always a good idea to comment classes that contain any degree of complexity. Too often when digging into a new project I see a class like "ItemHelper", and have to read a few dozen lines to realize something that could be summarized like: "Converts X-data to a format that Item can parse".
It makes it so much easier to get introduced to a code base, and comments like that seldom need to be updated.
I prefer a far more stilted comment style otherwise you have trouble remembering the code between the comments and tend to scroll a lot more. I mean /* The error cases here shouldn't happen, but check anyway */ is really not that helpful.
Also, all the long descriptions before function calls are mostly redundant. If I want to know what a function does I can go there and look, but it really should have a sufficiently descriptive name that after the first time I don't need to check again most of the time. Unless something non-obvious is going on.
I think it is helpful to know if the check is just a sanity check or if there's an already known set of conditions that can lead to that particular error.
The function comments I also find very useful. Read a few of them and see how much information they carry. Preconditions for the function, reasons why it does what it does, assumptions it makes... These functions are used from throughout the code and it's important to document them well.
I'm purposefully not quoting specific parts of the file, because of course if you look at each and every one of them, you'll find a few that could be improved. But the OP asked for a well commented code base and if PostgreSQL is not one, then I don't know what would be.
I could argue that comments are best for exceptional behavior so if it's normal to do a lot of sanity checks there is little reason to comment on each one. However, my point was saying something like:
if (error) //sanity check
Saves space and gets the same point across without padding the line count.
I actually rather like that comment. When reading code I find it useful to figure out how each error condition can be hit, so it's nice to know that something is just a sanity check and shouldn't ever actually happen.
At first brush those comments seem overly verbose and obvious:
/* Do we have any named arguments? */
...
/* If so, we must apply reorder_function_arguments */
if (has_named_args)
{
args = reorder_function_arguments(args, func_tuple);