On documenting code

November 2, 2016

When I write code, I try to make it clear what it does without needing
comments: I use meaningful function and variable names, keep functions from
getting too big or too small, avoid anything too ‘clever’, and so on.
Recently I’ve begun to feel that’s not really enough. I feel that I can make
improvements to make it clearer to others what is going on in my code, and most
importantly, to me when I have to fix a bug six months later.

Let’s take a look at a couple of popular open source projects and see how they
document their code. Just to make sure nobody gets offended, I should note that
these are just random examples; I’m not picking on anyone or any project in
particular. I’ve specifically looked for shorter functions as examples, as it’s
easier to see and explain what’s going on.

Example 1

In the Ruby community it’s popular for code to be self-documenting and to use
tests as documentation, as the language itself is quite clean and testing is
ingrained into the community. Ruby is the main language I’ve used throughout my
career, so my way of writing and documenting code is pretty similar to this.

This is from Bundler,
the most popular dependency management tool for Ruby. This returns a hash where
the value is some sort of checksum, and the key, called name, is - I assume -
the name of a Ruby Gem/library. I’m also assuming versions_path is a
filename, and lines returns a subset of lines from that file. I don’t know
what the second entry of the line contains, or why it is ignored.

Without knowing what the class as a whole does (for which there also isn’t any
documentation), you can’t know exactly what this does. Even looking around the
code that uses this, it’s not exactly clear what it does and what edge cases
there are. This isn’t meant to be a public API, but that’s not really the point.

This could be improved with a short sentence explaining what this method does,
how it should be used, what class variables it uses, what it returns and why.
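
As a rough illustration of that suggestion, here is a hypothetical Ruby method
with the kind of comment I have in mind. This is not Bundler’s actual code: the
ChecksumIndex class, the file format and the stated purpose are all assumptions
made up for the sketch.

```ruby
# Hypothetical example, not Bundler's actual code.
#
# ChecksumIndex reads a plain-text index file with one gem per line, in the
# assumed format "name versions checksum".
class ChecksumIndex
  # path - String path to the index file.
  def initialize(path)
    @path = path
  end

  # Returns a Hash mapping each gem name to its checksum, so a caller can
  # compare it with a previously stored checksum (an assumed use case).
  #
  # Reads the file at @path. The second field on each line (the version
  # list) is skipped on purpose, because only the checksum is needed here.
  # Blank lines are ignored.
  #
  # Example:
  #   # the file contains the line "rack 1.0,1.1 abc123"
  #   ChecksumIndex.new(path).checksums  # => { "rack" => "abc123" }
  def checksums
    File.readlines(@path, chomp: true).each_with_object({}) do |line, result|
      next if line.strip.empty?

      name, _versions, checksum = line.split(" ", 3)
      result[name] = checksum
    end
  end
end
```

The comment answers the questions I had to guess at above: what the file looks
like, what the hash contains, and why one field is thrown away.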

Example 2

This is from
Ruby on Rails,
specifically the ORM ActiveRecord. Just looking at the code, it’s about on par
with the previous example. In Rails it’s standard for functions suffixed with a
question mark to return booleans, so it’s pretty clear what abstract_class?
and table_exists? return. attribute_types isn’t so clear: it must act
like a hash seeing how it’s used, but it’s not obvious what it contains or
what this would return without knowing the domain logic (in ActiveRecord
‘attribute’ means a database column). There isn’t anything too
complicated going on, but it’s not entirely clear without looking around the
class.

But this has documentation! The comment explains what it does and the edge
cases. The example reinforces what it does and makes it clear exactly what it
returns. It doesn’t explain the why however, i.e. why does this return an empty
array for an abstract class? I should note that this is a public API, and Ruby
on Rails uses the comments in the code to automatically generate documentation.

Example 3

This is Go, it’s from
go-github,
a client for the GitHub API. I feel that Go has a bit of an advantage over Ruby
in terms of legibility: it has types in arguments for starters, and the built-in
go fmt tool really helps enforce coding standards. On the other hand, it is
popular to use single character variable names (u here is the path of a URL)
and the multiple return values in this example aren’t entirely clear (although the
language does support named return values).

The documentation of this could be improved, for example explaining clearly
what each return value is and why (I assume *Response is returned so you can
log the HTTP response in case of an error, but is there any other use?). It’s
also not clear what sort of errors could be returned without digging into
NewRequest and parseBoolResponse.

Example 4

This is some of the Erlang code for WatchSumo.
It’s a simple utility function which takes an Erlang timestamp (from
calendar:universal_time()) and returns it in ISO 8601 format (e.g.
1994-11-05T13:15:30Z).

It’s pretty clear it converts some sort of timestamp, but unless you are a master
of Erlang format strings, the only way you’ll know what format it returns is by
running it. Even if the function had a clearer name like
calendar_universal_time_to_iso8601, I’d still have to Google “iso8601” to
know what format it returns. In this case, the best way to document it would be with
an example.
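
To show what I mean, here is a small Ruby sketch of the same idea (Ruby rather
than the original Erlang, and the to_iso8601 name is just an assumption for
illustration). The example in the comment tells the reader the output format
without them having to decode the format string.

```ruby
# Formats a Time as an ISO 8601 timestamp in UTC.
#
# time - a Time in any zone; it is converted to UTC before formatting.
#
# Example:
#   to_iso8601(Time.utc(1994, 11, 5, 13, 15, 30))
#   # => "1994-11-05T13:15:30Z"
#
# Returns a String.
def to_iso8601(time)
  time.getutc.strftime("%Y-%m-%dT%H:%M:%SZ")
end
```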

More examples

(If you have any more good examples, send them in, and I’ll update the list)

Takeaways

So what can we learn from this? Self-documenting code usually explains the
“what”, but it doesn’t explain the “why”. Usually I add a comment explaining
why something happens if it feels a bit weird, but I don’t think that’s
enough. What feels weird to me probably doesn’t match what feels weird to
someone else, and these examples show that.

Here are some guidelines I want to follow in relation to commenting code:

The class/module/file should be commented explaining at a high level what it
does and why.

The function should be commented explaining briefly what it does, but mainly
why. Complicated logic and public APIs should go into more detail.

Arguments should be commented explaining what they are and what they are used
for (especially when using a dynamically typed language).

Return values should be commented explaining what they are and how they can
be used. Take care when returning values from other functions to ensure they
are clear (like errors in the third example).

Any variables used or changed outside the scope of the function should be
commented (but prefer writing functional code and avoid global state).

Examples of usage should be included where it will help make it clearer what
the code is for, especially public APIs.

Link to other documentation with a “see also” where you will just be
repeating what is in other comments (without too much indirection).
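
Put together, the guidelines above might look something like this. It’s only a
sketch: RetryPolicy, delay_for and MAX_DELAY are hypothetical names invented
for the example, and the comment layout is just one possible style.

```ruby
# RetryPolicy decides how long to wait between retries of a failed request.
# It exists so that every caller backs off the same way instead of each
# one inventing its own delays. (Hypothetical module, for illustration only.)
module RetryPolicy
  # Upper bound in seconds for any delay, so that a long outage never makes
  # a caller sleep for minutes.
  MAX_DELAY = 30.0

  # Returns the delay before the next retry. The delay doubles with each
  # attempt so that a struggling server gets progressively more room to
  # recover.
  #
  # attempt - Integer number of the retry about to be made (1 for the first).
  # base    - Float delay in seconds used for the first retry.
  #
  # Example:
  #   RetryPolicy.delay_for(3)  # => 2.0
  #
  # Returns a Float number of seconds, never more than MAX_DELAY, suitable
  # for passing to sleep.
  # Raises ArgumentError if attempt is less than 1.
  #
  # See also: MAX_DELAY.
  def self.delay_for(attempt, base: 0.5)
    raise ArgumentError, "attempt must be >= 1" if attempt < 1

    [base * (2**(attempt - 1)), MAX_DELAY].min
  end
end
```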

Be careful though: religiously writing comments just for the sake of writing
comments is just as bad.
Comments should be added to make code clearer. If with comments it’s still not
clear, maybe you just need to rewrite the code.