Comment Your Code

Nov 17, 2017

There’s a disturbing thread that pops up every once in a while where People On
The Internet say that comments are bad and the only reason you need them is
because you and/or your code aren’t good enough. I’m here to say that’s bullshit.

Code Sucks

They’re not entirely wrong… your code isn’t good enough. Neither is mine or
anyone else’s. Code sucks. You know when it sucks the most? When you haven’t
touched it in 6 months. And you look back at the code and wonder “what in the
hell was the author thinking?” (and then you git blame and it’s you… because
it’s always you).

The premise of the anti-commenters is that the only reason you need comments is
because your code isn’t “clean” enough. If it were refactored better, named
better, written better, it wouldn’t need that comment.

But of course, what is clean and obvious and well-written to you, today, while
the entire project and problem space are fully loaded in your brain… might not
be obvious to you, six months from now, or to the poor schmuck that has to debug
your code with their manager breathing down their neck because the CTO just ran
into a critical bug in prod.

Learning to look at a piece of code that you understand and figure out how
someone else might fail to understand it is a difficult skill to master. But
it is incredibly valuable… one that is nearly as important as the
ability to write good code in the first place. In industry, almost no one codes
alone. And even if you do code alone, you’re gonna forget why you wrote some
of your code, or what exactly this gnarly piece of late night “engineering” is
doing. And someday you’re going to leave, and the person they hire to replace
you is going to have to figure out every little quirk that was in your head at
the time.

So, throwing in comments that may seem overly obvious in the moment is not a bad
thing. Sometimes it can be a huge help.

Avoiding Comments Often Makes Your Code Worse

Some people claim that if you remove comments, it makes your code better,
because you have to make your code clearer to compensate. I call BS on this as
well, because I don’t think anyone is realistically writing sub-par code and
then excusing it by slapping a comment on it (aside from // TODO: this is a
temporary hack, I'll fix it later). We all write the best code we know how,
given the various external constraints (usually time).

The problem with refactoring your code to avoid needing comments is that
it often leads to worse code, not better. The canonical example is factoring
out a complicated line of code into a function with a descriptive name. Which
sounds good, except now you’ve introduced a context switch for the person reading
the code. Instead of the actual line of code, they have a function call… they
have to scroll to where the function is defined, remember and map the arguments
from the call site to the function declaration, and then map the return value
back to the call site.

In addition, the clarity of a function’s name is only applicable to very trivial
comments. Any comment that is more than a couple words cannot (or should not)
be made into a function name. Thus, you end up with… a function with a
comment above it.
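To make that concrete, here’s a rough sketch of the same line written both ways. Everything in it is made up for illustration: the helper names, the paths, and the “import job splits on whitespace” constraint are assumptions, not code from a real project.

```go
package main

import (
	"fmt"
	"strings"
)

// Version A: the line is pulled out into a named helper. The name says what
// it does, but the reason still has to live in a comment.

// csvNameFor strips spaces because the (hypothetical) import job splits
// file paths on whitespace.
func csvNameFor(username string) string {
	return strings.ReplaceAll(username, " ", "") + ".csv"
}

func reportPathExtracted(dir, username string) string {
	return dir + "/" + csvNameFor(username)
}

// Version B: the line stays where it is used, with the same comment.
func reportPathInline(dir, username string) string {
	// Strip spaces: the (hypothetical) import job splits file paths on
	// whitespace.
	name := strings.ReplaceAll(username, " ", "") + ".csv"
	return dir + "/" + name
}

func main() {
	fmt.Println(reportPathExtracted("/data", "Jane Doe"))
	fmt.Println(reportPathInline("/data", "Jane Doe"))
}
```

Both versions return the same path, and both still need the comment. The only thing the extraction added is a second place to look.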

Indeed, even the existence of a very short function may cause confusion and more
complicated code. If I see such a function, I may search to see where else that
function is used. If it’s only used in one place, I then have to wonder if this
is actually a general piece of code that represents global logic… (e.g.
NameToUserID) or if this function is bespoke code that relies heavily on the
specific state and implementation of its call site and may well not do the right
thing elsewhere. By breaking it out into a function, you’re in essence exposing
this implementation detail to the rest of the codebase, and this is not a
decision that should be taken lightly. Even if you know that this is not
actually a function anyone else should call, someone else will call it at some
point, even where not appropriate.

The problems with small functions are better detailed in Cindy Sridharan’s Medium post.

We could dive into long variable names vs. short, but I’ll stop and just
say that you can’t save yourself by making variable names longer. Unless your
variable name is the entire comment that you’re avoiding writing, then you’re
still losing information that could have been added to the comment. And I think
we can all agree that usernameStrippedOfSpacesWithDotCSVExtension is a terrible
variable name.

I’m not trying to say that you shouldn’t strive to make your code clear and
obvious. You definitely should. It’s the hallmark of a good developer. But
code clarity is orthogonal to the existence of comments. And good comments are
also the hallmark of a good developer.

There are no bad comments

The examples of bad comments often given in these discussions are trivially
bad, and almost never encountered in code written outside of a programming 101
class.

// instantiate an error
var err error

Yes, clearly, this is not a useful comment. But at the same time, it’s not
really harmful. It’s some noise that is easily ignored when browsing the
code. I would rather see a hundred of the above comments if it means the dev
leaves in one useful comment that saves me hours of head banging on keyboard.

I’m pretty sure I’ve never read any code and said “man, this code would be so
much easier to understand if it weren’t for all these comments.” It’s nearly
100% the opposite.

In fact, I’ll even call out some code that I think is egregious in its lack of
comments - the Go standard library. While the code may be very correct and well
structured… in many cases, if you don’t have a deep understanding of what the
code is doing before you look at it, it can be a challenge to understand
why it’s doing what it’s doing. A sprinkling of comments about what the logic
is doing and why would make a lot of the Go standard library a lot easier to
read. In this I am specifically talking about comments inside the
implementation, not doc comments on exported functions in general (those are
generally pretty good).

Any comment is better than no comment

Another chestnut the anti-commenters like to bring out is the wisdom, often
illustrated with a pithy image, that comments go stale and stop matching the code.

But, that was a problem 20 years ago, when code reviews were not (generally) a
thing. But they are a thing now. And if checking that comments match the
implementation isn’t part of your code review process, then you should probably
review your code review process.

Which is not to say that mistakes can’t be made… in fact I filed a “comment
doesn’t match implementation” bug just yesterday. The saying goes something
like “no comment is better than an incorrect comment” which sounds obviously
true, except when you realize that if there is no comment, then devs will just
guess what the code does, and probably be wrong more often than a comment would
be wrong.

Even if this does happen, and the code has changed, you still have valuable
information about what the code used to do. Chances are, the code still does
basically the same thing, just slightly differently. In this world of
versioning and backwards compatibility, how often does the same function get
drastically changed in functionality while maintaining the same name and
signature? Probably not often.

Take the bug I filed yesterday… the place where we were using the function was
calling client.SetKeepAlive(60). The comment on SetKeepAlive was
“SetKeepAlive will set the amount of time (in seconds) that the client should
wait before sending a PING request”. Cool, right? Except I noticed that
SetKeepAlive takes a time.Duration. Without any other units specified for the
value of 60, Go’s duration type defaults to… nanoseconds. Oops. Someone had
updated the function to take a Duration rather than an int. Interestingly, it
did still round the duration down to the nearest second, so the comment was
not incorrect per se, it was just misleading.

Why?

The most important comments are the why comments. Why is the code doing what
it’s doing? Why must the ID be less than 24 characters? Why are we hiding this
option on Linux? etc. The reason these are important is that you can’t figure
out the why by looking at the code. They document lessons learned by the devs,
outside constraints imposed by the business, other systems, etc. These comments
are invaluable, and almost impossible to capture in other ways (e.g. function
names should document what the function does, not why).
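As a sketch of what a why comment buys you, take the ID length question. The limit below matches the “less than 24 characters” example, but the reason given for it (a fixed-width field in some downstream billing system) is invented purely for illustration, as are the names maxIDLen and validateID.

```go
package main

import "fmt"

// maxIDLen is 23 because the (hypothetical) downstream billing system stores
// IDs in a fixed 24-byte field and reserves the last byte for a terminator.
// Raising this limit will silently truncate IDs on their side.
const maxIDLen = 23

// validateID reports an error if id will not fit in the billing system's
// ID field. len counts bytes, which is what the fixed-width field cares about.
func validateID(id string) error {
	if len(id) > maxIDLen {
		return fmt.Errorf("id %q is %d bytes, max is %d", id, len(id), maxIDLen)
	}
	return nil
}

func main() {
	fmt.Println(validateID("user-42"))
	fmt.Println(validateID("123456789012345678901234"))
}
```

Without the comment on maxIDLen, the code tells you the limit is 23 and nothing else. You’d have no way to know whether it’s safe to change.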

Comments that document what the code is doing are less useful, because you can
almost always figure out what the code is doing, given enough time and
effort. The code tells you what it is doing, by definition. Which is not to
say that you should never write what comments. Definitely strive to write the
clearest code you can, but comments are free, so if you think someone might
misunderstand some code or otherwise have difficulty knowing what’s going on,
throw in a comment. At the least, it may save them a half hour of puzzling through
your code, at best it may save them from changing it or using it in incorrect
ways that cause bugs.

Tests

Some people think that tests serve as documentation for functions. And, in a
way, this is true. But they’re generally very low on my list of effective
documentation. Why? Well, because they have to be incredibly precise, and thus
they are verbose, and cover a narrow strip of functionality. Every test tests
exactly one specific input and one specific output. For anything other than the
most simple function, you probably need a bunch of code to set up the inputs and
construct the outputs.

For much of programming, it’s easier to describe briefly what a function does
than to write code to test what it does. Oftentimes my tests will be multiple
times as many lines of code as the function itself… whereas the doc comment on
it may only be a few sentences.
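Here’s a small sketch of that imbalance, using a made-up Clamp function. To keep it runnable as a single file, a plain checkClamp function plays the role of a table-driven test. In a real project this would live in a _test.go file and use the testing package.

```go
package main

import "fmt"

// Clamp returns v limited to the range [lo, hi]. It assumes lo <= hi.
func Clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// checkClamp stands in for a table-driven test. It returns the number of
// failing cases and prints a line for each one.
func checkClamp() int {
	cases := []struct {
		name      string
		v, lo, hi int
		want      int
	}{
		{"below range", -5, 0, 10, 0},
		{"at lower bound", 0, 0, 10, 0},
		{"inside range", 7, 0, 10, 7},
		{"at upper bound", 10, 0, 10, 10},
		{"above range", 42, 0, 10, 10},
	}
	failures := 0
	for _, tc := range cases {
		if got := Clamp(tc.v, tc.lo, tc.hi); got != tc.want {
			fmt.Printf("%s: Clamp(%d, %d, %d) = %d, want %d\n",
				tc.name, tc.v, tc.lo, tc.hi, got, tc.want)
			failures++
		}
	}
	return failures
}

func main() {
	fmt.Println("failing cases:", checkClamp())
}
```

The doc comment is one line. The check is a couple dozen lines, and it still only pins down five specific inputs.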

In addition, tests only explain the what of a function. What is it supposed to
do? They don’t explain why, and why is often more important, as stated above.

You should definitely test your code, and tests can be useful in figuring out
the expected behavior of code in some edge cases… but if I have to read tests
to understand your code in general, then that’s a red flag that you really need to
write more/better comments.

Conclusion

I feel like the line between what’s a useful comment and what’s not is difficult
to find (outside of trivial examples), so I’d rather people err on the
side of writing too many comments. You never know who may be reading your code
next, so do them the favor you wish was done for you… write a bunch of
comments. Keep writing comments until it feels like too many, then write a few
more. That’s probably about the right amount.