This is pretty bad code. As it stands, the code is unreviewable and
well-nigh undebuggable, because there is no way of telling what it is
intended to do.

The code was found with no comments whatsoever. Comments by
themselves will never transform bad code into good code. However,
comments can be seen as the first step along a road that leads to
better code, because some comments as to intent would give us a
chance of judging whether the code was doing what it was intended to
do.

Let’s try to reverse-engineer this code. We can see that it always
returns either the integer 1 or the integer 0. These can be
considered the canonical boolean values, i.e. the canonical
representations of the concept of truth and the concept of falsity.
This is arguably useful, because the language’s built-in true/false
checker accepts a goodly number of non-canonical representations of
truth and falsity. For example, the integer 0, the string "0", and
the string "0000" (among other things) all represent falsity.

So to a first approximation, this routine takes a node whose value is
some (possibly non-canonical) representation of truth or falsity, and
returns the canonical representation. To this level of approximation,
if(node.getValue()) will behave the same as
if(node.getBoolValue()). There is an advantage to using
getBoolValue(), because it always returns an integer, which can safely
be used in arithmetical expressions, whereas getValue() might not.
Similarly, getBoolValue() might look better in a printout.

But wait, we haven’t finished analyzing the code. Apparently the
author decided that it would be nice to allow the string "false" as
yet another representation of falsity. This means it is no longer
true that if(node.getValue()) will behave the same as
if(node.getBoolValue()).

So far, so good, I guess. Using a five-letter string to
represent falsity seems a little bit weird, but if that is what the
author wants to do, he’s allowed to do it. Maybe that functionality
is needed as part of a user interface somewhere.

I doubt that many
users are going to type "false" when they could just type 0. Also,
accepting "false" but not accepting "False" or "FALSE" doesn’t seem
like the greatest user-interface design.

Alas, we still have not fully analyzed the code. I suspect that the guy who
wrote it never fully analyzed it, either. There remain cases that
have not been considered. In particular, an uninitialized node (i.e. a
node with the value nil) will be treated as true by getBoolValue(),
even though it is treated as false by the language. So this is yet
another way in which if(node.getValue()) will behave
differently from if(node.getBoolValue()).

Maybe this was intended. It is easy to find cases where “silence
implies consent”, i.e. cases where an uninitialized variable should
be treated as true. (For example, in a modeling program, when adding
an electrical switch in series, where previously there was a solid
connection instead of a switch, the default value of the switch should
be on, not off.)

It strikes me as an unpleasant design to combine these three ideas
into one function: Canonicalizing the representation of truth and
falsity, adding a new representation of falsity, and changing the
behavior of uninitialized variables. If I were doing it, I would
put these ideas in separate functions. But if the author really
wants to smoosh them together, he’s allowed to do so.
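
To illustrate what "separate functions" might look like, here is a
minimal sketch in C++. It is not the original routine. The function
names, the use of a null pointer to model nil, and the simplified truth
test (nil and numeric zero are false, everything else is true) are all
assumptions made for the sake of illustration.

```cpp
#include <cstdlib>
#include <string>

// Idea 1: canonicalize. Returns 1 or 0 according to a simplified model of
// the language's truth test. A null pointer stands in for nil (assumption).
// nil and numeric zero ("0", "0000") are false; everything else is true.
int canonicalBool(const std::string* value) {
    if (value == nullptr) return 0;                 // nil is false
    const char* begin = value->c_str();
    char* end = nullptr;
    const double num = std::strtod(begin, &end);
    const bool isNumber = (end != begin) && (*end == '\0');
    if (isNumber && num == 0.0) return 0;           // "0", "0000", "0.0"
    return 1;
}

// Idea 2: additionally accept the exact string "false" as falsity.
// Note that "False" and "FALSE" are NOT accepted, as in the text.
int boolAllowingFalseString(const std::string* value) {
    if (value != nullptr && *value == "false") return 0;
    return canonicalBool(value);
}

// Idea 3: "silence implies consent": nil counts as true.
int boolDefaultingToTrue(const std::string* value) {
    if (value == nullptr) return 1;
    return canonicalBool(value);
}
```

With the ideas separated, each caller picks the behavior it needs, and
a change to one idea cannot silently break callers that depend on
another.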

The point remains: Some comments would make it a whole lot easier
to figure out what the author intended. This in turn would make
it a whole lot easier for ordinary folks to use this routine. They
would be able to use it without taking the time to reverse-engineer
it.

Once upon a time I was actually using this routine. I considered it a
convenient way to treat uninitialized nodes as representing truth.

However, the author changed the function. The new version is shown
in example 2.

How did this mess come about? It was not an accident.
The author made a point of not putting comments
in his code ... and publicly attacking others if they dared
to write code with comments. See item 20 below.

Perhaps the author, to this day, has never thought clearly enough
about his code to realize that the code is not self-documenting. In
this case, the process of writing down the intent of the code would
have helped him clarify his thinking. Or perhaps the author changed
his mind about the intent. Or perhaps the intent was rock-solid all
along, but he didn’t understand his own code well enough to tell
whether the first version (or any other version) implemented the
intent or not.

I had to change my code; I had to rip out all references to this
function, because it no longer did what I wanted. (The fact that
the author changed this without asking anybody – and without even
telling anybody – just added to the unpleasantness.)

You might be tempted to say that the change was a step toward
simplification, in the sense that the code now implemented only two
ideas instead of three. It canonicalized the representation, and
added the string "false" as a new representation of falsity.

Ah, if only that were true.

This code is so bad that there are still cases, heretofore unanalyzed
cases, where if(node.getValue()) does one thing and
if(node.getBoolValue()) does another. So I still have no idea
what is the intent of this code, no way to know whether it is
functioning correctly, and no way to know whether/how it can be used
safely. (I don’t really care anymore. This is now of theoretical
interest only, because I rewrote all my code so that it no longer
depends on this misbegotten getBoolValue() function.)

Countless times people have come to my office saying, “I need to do
blah-de-blah; is there any chance you have a program to do that?”
Sometimes I am able to help out. I may not have exactly what was
asked for, but I’ve got something similar that can be modified. And I
can find it.

This is called reusability. When you are able to re-use
software, productivity goes up by a huge factor.

Reusability is related to modifiability and extensibility. These
ideas are particularly relevant to “open” software. Note the
contrast:

Some software is open, but only in some narrow legalistic
sense. It may be legal to modify the software, but it is impractical,
because the software is so badly written and so badly
documented.

Some code is open in spirit, i.e. open in the broadest,
grandest sense. That means that it is practical (as well as
legal) for a wide range of people to maintain it, extend it, and/or
modify it to serve new purposes.

Everybody seems to agree that critical thinking is important.

Some people claim that critical-thinking skills are hard to
learn and nearly impossible to teach.

I say the first step is simply
check your work. This is something that is supposed to be
taught in first grade and every grade thereafter. The second step
along this road is to check the other guy’s work.

Commenting the code is a way of saying to the world that you believe
in critical thinking. Software that needs to be reliable will be
subject to a code review. Good documentation makes the review
go more smoothly, and makes it more likely that the review will
accomplish its intended purpose.

Bottom line: Software with good documentation is more reliable, more
reviewable, more maintainable, more reusable, and more extensible than
software without.

Commit log messages: Let’s assume you are using some
sort of source-code management (SCM) system. This might also be
called a document control system or version control system. You can
(and should) use such a thing to document how the current state of the
project differs from the preceding state. This introduces the concept
of the lineage of a file, consisting of the current version of
the file, all previous versions, and the commit log messages. The
commit log messages are internal to the lineage, but external to the
current version of the file. (This means that you can’t classify
commit log messages as simply “internal” or “external” without
specifying whether you are talking about the file or the lineage.)
See item 18 for more discussion of commit log messages.

On numerous occasions I’ve seen code that looked simple, but wasn’t.
The code was only a few lines long, but a dozen pages of algebra were
required in order to demonstrate that those were the correct few
lines. In such a situation, internal documentation is a hopeless
task, because within comments you cannot typeset diagrams or complex
equations. The solution is to write some external documentation and
bundle it with the code.

The comment doesn’t tell us anything we didn’t already know. It is
obvious from reading the code that we are fetching the
“interval-sec” property and storing it in “b”. But that leaves
all sorts of open questions and issues. The main overarching issue
is, we would like to know the meaning of “interval-sec”. For
example, we might like to know how “interval-sec” got into the
property tree, and/or we might like to know what “b” is going to do
with it.

Note that the name “interval-sec” doesn’t tell us much about the
meaning. There are lots of intervals in the world. It turns out that
this particular interval is the inverse of some mouse-event repeat
rate ... but you could never figure that out by reading the code. You
could grep all day for “mouse” or “event” or “repeat” or
“rate” and never find this bit of code.

In order to be useful, a comment doesn’t even need to be a complete
sentence. Any comment that mentioned “mouse” or “repeat” would
have greatly facilitated maintaining and extending the code in
example 4. I’m sure the guy who wrote the code knew it had
something to do with “mouse” and “repeat”, so it wouldn’t have
been difficult for him to mention it.
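
To make this concrete, here is a minimal sketch in C++ of what a more
useful comment might look like. The PropertyTree type, the function
name, and the fallback parameter are hypothetical stand-ins, not the
code under discussion. The only facts taken from the text are the
property name and its connection to the mouse-event repeat rate.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a property tree: property name -> numeric value.
using PropertyTree = std::map<std::string, double>;

// Mouse-event repeat interval.
//
// "interval-sec" is the time, in seconds, between repeated mouse events;
// that is, the inverse of the mouse-event repeat rate.
// If the property is absent, the caller-supplied fallback is returned.
double mouseRepeatIntervalSec(const PropertyTree& props, double fallbackSec) {
    const auto it = props.find("interval-sec");
    return it != props.end() ? it->second : fallbackSec;
}
```

Now a search for "mouse", "event", "repeat", or "rate" lands on this
code, which is most of what a maintainer needs.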

4.
Here is yet another example of the same sort of
nonsense. This is not at all contrived; it was found “in the
wild”, written by somebody who was trying to do the right thing.

The comment in example 5 was found in the file
environment_mgr.cxx. The apparent goal of this comment was to
explain the overall purpose of the file. That’s a commendable goal.
However the comment doesn’t tell us much beyond what we could have
inferred from the filename.

If you need further evidence that the comment in example 5 is
uninformative, consider the fact that the same comment was found
verbatim in another file, namely environment_ctrl.cxx:

Sufficiently wizardly programmers could presumably ascertain instantly
that the purpose of the jj++ statement was to solve a fencepost
problem, so for them the comment is useless: they understand this line
of code, with or without the comment. At the opposite extreme,
low-skilled wannabe programmers would have no clue what a fencepost
error is, so for them the comment is useless: they don’t understand
this line of code, with or without the comment. However, in between
these two extremes there is a huge population of ordinary programmers
who know what a fencepost error is, or can at least look it up
(e.g. in reference 1), yet might not have instantly
recognized that this particular jj++ pertained to a fencepost
situation.
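
For readers in that middle group, a one-word hint is enough. The
following C++ fragment is a hypothetical illustration of the idea, not
the code the comment was originally attached to. The function name and
the fence scenario are invented for this sketch.

```cpp
// Number of posts needed for a straight fence with the given number of
// sections. Assumes sections >= 1.
int postsForFence(int sections) {
    int jj = sections;  // one post at the start of each section ...
    jj++;               // ... plus one to close the last section (fencepost)
    return jj;
}
```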

6.
As a rough estimate, doing a good job of writing the
comments takes just as long as writing the code. When you are
planning a project, budget enough time for the code and the
comments. This is approximately double the budget for the code
alone.

As an obvious corollary, if you are sure that nobody is ever going to
re-use the code or even look at it, you shouldn’t bother commenting it
at all. It would be a waste of time.

The converse corollary is that if you think the code is going to be
re-used even once, the comments are a break-even proposition.
If the number of re-uses exceeds one, then the comments pay off very
handsomely.

True story: One of the first nontrivial programs I ever wrote was a
communication program, sort of a primitive version of Kermit or
Telnet. It was written in assembly language. I started writing it
after dinner one day, and finished it before midnight. I never
dreamed that anyone but me would ever use it ... but five years later
it was still in use by significant numbers of people, including my
friends and employees. They wanted to add some features to the
program, but they couldn’t. They discovered that the source file
contained exactly zero comments, and teased me about this mercilessly.
The program was a lose/lose/lose proposition: It was good enough that
nobody wanted to stop using it, yet it was limited enough that they
could rightfully complain about it, and the code was so inscrutable
that extending it would have been harder than writing a replacement
from scratch.

7.
If you’re a bad programmer, you shouldn’t comment your code.
Nobody is going to use your code, let alone re-use it, so in
accordance with item 6 it would be a waste of time to comment
it.

8.
If you’re a good programmer, choosing whether or not to
comment your code is tantamount to a self-fulfilling prophecy:

If you don’t comment your code, there is little chance that
it will be re-used, so in accordance with item 6 you can say to
yourself, “See, it wasn’t reused, so it would have been a waste of
time to comment it”.

If you do comment the code, there’s a good
chance that it will be re-used, possibly many times, so in accordance
with item 6 you can say to yourself, “See, it was re-used, so
it’s a good thing I commented it”.

In general, given my choice of self-fulfilling prophecies, I choose
the prophecy that has the happiest outcome. In particular, I comment
my code.

9.
Good software contains an element of redundancy, in
the sense that understanding the comments is a backup for
understanding the code, and vice versa. This twofold redundancy does
not merely make misunderstandings twofold less likely; it makes them
many, many times less likely. Do the math: If there is a 5% chance
of misunderstanding the code, and a 5% chance of misunderstanding the
comments, then in the best case (where the two outcomes are
independent), the chance of misunderstanding them both is only 0.25%,
i.e. twenty times less.

The independence mentioned in the previous paragraph explains why
the comments should give a second viewpoint on what the code is
doing, rather than mindlessly parroting what the code says, as
mentioned in item 2.

Well-written works of natural language contain a great deal of
redundancy. We should expect a goodly amount of redundancy in
well-written works of computer language.

10.
It is particularly important to document the
interfaces. For example, when writing a subroutine, document the
meaning of the arguments and the meaning of the return value.
Document the assumptions that are made and the restrictions that
apply.

Usually the interface should be documented in the header file,
not just in the implementation file. To say the same thing in the
language of C or C++, the interface should be documented in the .h
file, not just in the .c file. If you want to cut-and-paste the same
comments into the .c file, that’s fine, but the .h file should be
considered the “master” copy. One reasonable scheme is to put a
remark in the .c file telling people to read the explanations in the
.h file. The upside of this scheme is that you only need to maintain one
copy of the documentation. The downside is that it is slightly less
convenient for readers to refer back and forth from file to file.

The interface documentation should be in the header file, because that
is the publicly exported definition of the interface. If you do
things right, incomparably more people will be reading the .h file
than reading the .c file. Of course at some point you need to switch
to external documentation, as discussed in section 2.2.
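
As a rough sketch of this scheme, the following C++ listing shows a
header with the "master" copy of the interface documentation and an
implementation file that points back to it. The file names and the
function intervalFromRate are hypothetical. The two files are shown in
one listing so that the example compiles as a single unit.

```cpp
#include <stdexcept>

// ---- interval.h (hypothetical header: master copy of the interface docs) ----

// Convert an event repeat rate to the interval between events.
//
// Arguments:
//   rateHz   repeat rate, in events per second. Must be greater than zero.
// Returns:
//   the interval between successive events, in seconds.
// Restrictions:
//   throws std::invalid_argument if rateHz is not positive (this includes NaN).
double intervalFromRate(double rateHz);

// ---- interval.cpp (hypothetical implementation file) ----
// See interval.h for the documentation of the interface.

double intervalFromRate(double rateHz) {
    if (!(rateHz > 0.0)) {
        throw std::invalid_argument("rateHz must be greater than zero");
    }
    return 1.0 / rateHz;
}
```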

11.
The names of symbols (such as variables and functions) can
often help explain the meaning of the symbol, but they are rarely
sufficient explanation, except in unusually-simple cases. See
section 4 for a discussion of this point.

12.
This whole document could be considered an essay on
programming style or commenting style, in the broad, strategic sense.
However, often people speak of “commenting style” in a much
narrower, tactical sense. They talk about style issues such as how
much comments should be indented and whether //⋯ comments
are better than /* ⋯ */ comments.

These tactical style issues are not trivial, but they are, relatively
speaking, less important than the strategic issues. The important
thing is that comments convey meaning. If your style is so bad that
it interferes with intelligibility, you need to change your style.
Otherwise, I’m not going to worry about it; not right now, anyway.

If you are modifying a program, you should probably adhere to the
existing style, other things being equal, because a mixture of
inconsistent styles can make things harder to read. However this
remains a tactical issue, of minor importance compared to the broader
strategic issues.

13.
As mentioned in the introduction, good comments are
not a substitute for good code, and good code is not a substitute for
good comments.

To express the same idea another way: Good comments are not an excuse
for bad code, and good code is not an excuse for bad comments (or no
comments).

Here’s a situation that comes up fairly regularly: If you find yourself
writing a long comment to explain the limitations of the code,
consider the possibility that it might be easier to write some
less-limited code than it is to document the limited code ... not to
mention the fact that fixing the code results in better functionality.

Fixing the code is better than
documenting the code’s limitations.

14.
If you find that the code and the comments are in
conflict with each other, you immediately know something’s wrong, but
usually you don’t immediately know whether you need to fix one, or the
other, or both. You need to do some serious engineering. Look
around. See who calls this code, and why. Figure out what this code
should do to best serve the needs of the overall project.

15.
In some fraction of the conflict cases, the code is
right and the comment is wrong. Note that a wrong comment might be
worse than no comment ... or it might not. Often it is easier to
repair a slightly-wrong comment than to create a comment where there
was none before. For example, if a long, detailed comment is wrong as
to one detail, it is straightforward to repair that one detail.

Of course, if a comment is maliciously deceptive, then it can
be considered harmful and worse than useless. However, this is
rare.

With rare exceptions, a comment is never truly harmful. Even
if the comment is not 100% correct, it provides a clue as to the
programmer’s state of mind at the time.

To repeat, it is common to find comments that have so little value
that it was not worth the trouble of writing them ... but once a
comment has been written, the value is rarely less than zero.
Returning to the classic lame example presented in example 1 –
namely ii++; /* increment ii */ – the comment is worthless but
it isn’t actually harming anyone. If you see such a thing, either
ignore it, delete it, or (preferably) replace it with something
better.

If the code contains worthless comments, the worthless comments are
not the problem. They may be symptomatic of a lack of more-valuable
comments, but they are not, by themselves, a problem.

Don’t pick a fight over worthless comments. Doing so would just make
you look childish. Remember, the coolness of a person can be judged by
the size of the problems that annoy them.

17.
If a file contains a huge number of comments, you
should consider the possibility that external documentation is needed,
perhaps a user guide or an application programming manual.

Be sure that the comments contain a reference to the external
documentation, so it can be found when needed. Distribute the code
and documentation together, as a bundle.

18.
Commit log messages should be used to document how
(and perhaps why) the new state of the project differs from the
immediately preceding state.

Commit log messages are not a substitute for comments. Information
that needs to be in the file should be in the file, not hidden in the
commit log. (This is obvious with modern SCMs such as git, which
commit multiple files at the same time, but it remains true and
important even if you are committing just one file.)

It’s a question of incremental versus cumulative, i.e. a question of
differential versus integral. A glance at the comments in the files
should tell you what you need to know about the current version of the
project. The commit log messages pertain to the difference between
the current version and the previous version.

19.
It is OK to have comments that have only ephemeral
value. For example, it is OK to have comments that explain the
relationship between the current version of the file and preceding
version. (This should not be a substitute for good commit-log
messages, but could well duplicate some of the function of the
commit log messages. Redundancy is OK.)

If you see a comment that has outlived its usefulness, delete it.

In some well-written software, each file contains its own changelog at
the top of the file, summarizing all changes going back years. In the
days before modern SCMs, this was a particularly good idea.

Nowadays this is not necessary. What’s worse, it is likely
that the early changelog messages have long since outlived their
usefulness, and are just wasting space. The changelog function is
better left in the SCM’s hands. Anybody who wants to see the early
commit messages can ask the SCM for them.

On the other hand, one
could take the redundant approach. Comments in the file that
duplicate the changelog messages are not a serious problem.

20.
Believe it or not, I once heard somebody
arguing against comments. The argument went something like this:

“The code should document itself.
Comments are an admission that the code isn’t understandable by
itself.”

This idea is wrong in so many ways that I hardly know where to start.

For one thing, “admissions” are not directly related to truth.

In Galileo’s day, there were a lot of people who did not
admit that there were moons around Jupiter.

In truth, there were
moons around Jupiter, whether anyone admitted it or not.

Applying this idea to the subject at hand, we see that:

If the code is bad, it’s bad whether you admit it or not;
removing comments is not going to make it better.

If the code is
good, adding comments is not going to make it worse.

Furthermore, having a problem and not admitting it is just
terrible from a teamwork point of view. Remember what Lyndon Johnson
said: “While you’re saving your face, you’re losing your ass.” If
you have a problem, the best thing is to fix the problem. If you have
a problem that cannot immediately be fixed, the best thing is to
admit the problem – and document the problem – because that is the
first step toward eventually fixing it.

Note that the argument cited at the top of this item is a rather
direct denial of the view expressed in item 13. Also
it shows an amazing degree of non-understanding of the importance of
redundancy, as discussed in item 9.

Also: It is always important to distinguish what should be
from what is. Some of the people who loudly assert that
their code “should be” self-documenting are not reliably
capable of writing code that actually is self-documenting. See
section 1 for an example of this.

21.
Although the behavior described in
item 20 is pretty bad, I’ve occasionally seen worse.
Sometimes people write software that is very obscure. That includes
obscure code with no comments, uninformative comments, obscure
comments, or even misleading comments.

There are various different motives that (separately or in
combination) lead to the same behavior. These include:

There are some people who are selfish. They like to write
software, but don’t like for other people to write software.
Sometimes they are programming for pay, and this is a ploy for
improving their job security. Sometimes they are programming for fun,
and this is a ploy for “owning” a piece of supposedly-open software.
In either case, if they make it obscure enough, no one else will want
to work on it.

There are some people who are insecure. They figure that if
they can understand their software but you can’t, that makes them
smarter than you.

Needless to say, no matter what the motive(s) may be, this sort of
behavior counts as really bad teamwork. Rather than creating job
security, it will get the perpetrator kicked off any well-managed
software project. Almost every software manager on earth is aware of
these ploys. The rule is, if somebody writes code that other people
cannot understand, it reflects badly on the author, not on the others.

In the software business, there exists the idea of conditional
compilation. That means it is possible to write some code that is
visible in the file but will not be included in “this” version of
the executable program. Some languages have fancy features expressly
for the purpose of conditional compilation, such as the #if
⋯ #endif features in the C and C++ languages.

There are four possibilities, depending on form and function:

Sometimes the comment-features of a language are used for
comments in the obvious way. This may include code in the comments.

Sometimes the conditional-compilation features of a language are used
for comments. Such comments may include code that is never intended
to be compiled, and indeed may include non-code verbiage.

Sometimes the comment-features are used to serve the purpose
of conditional compilation. Code that is sometimes useful and
sometimes not can be moved temporarily into a comment. This is called
“commenting out” the code. This is widespread in languages that
don’t have a proper conditional-compilation feature.

Sometimes the conditional-compilation features are used
for conditional compilation in the obvious way.

One of the most common uses for conditional code pertains to debugging
aids, also known as scaffolding, also known as test-harness code. In
the early phases of development, the scaffolding is very useful, but
when the code is fully tested the scaffolding is no longer needed.
However, it is better to conditionalize the scaffolding than to
delete it, because it will be needed again as soon as somebody tries
to modify or extend the code.

Let’s be clear: Leaving the scaffolding in place, subject to
conditional compilation, makes the code more open, more maintainable,
more reusable, more extensible, et cetera.

Usually, my preference is to use the “official” conditional
compilation features of the language ... especially when
conditionalizing a long snippet, or when there are multiple snippets
in multiple places, all of which are subject to the same condition
(for instance if a debugging variable is declared in one place, set in
another place, and printed out in another place).

On the other hand,
if the snippet is self-contained and only two or three lines long, I
see no harm in just commenting it out.
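
The two approaches can be sketched side by side in C++. Everything
named here (the ENABLE_SCAFFOLDING macro, the call counter, and the
sumOfSquares function) is hypothetical, chosen only to show a debugging
variable that is declared in one place, set in another, and printed in
a third, next to a short snippet that is simply commented out.

```cpp
#include <cstdio>

// Hypothetical switch: build with -DENABLE_SCAFFOLDING=1 to turn the
// debugging aids on. They are off by default.
#ifndef ENABLE_SCAFFOLDING
#define ENABLE_SCAFFOLDING 0
#endif

#if ENABLE_SCAFFOLDING
static int g_callCount = 0;      // scaffolding: declared in one place ...
#endif

// Sum of the squares of the integers 1 through n (0 if n < 1).
int sumOfSquares(int n) {
#if ENABLE_SCAFFOLDING
    ++g_callCount;               // ... set in another place ...
#endif
    int total = 0;
    for (int ii = 1; ii <= n; ++ii) {
        total += ii * ii;
    }
    // Short, self-contained snippet: commenting it out is good enough.
    // std::printf("n=%d total=%d\n", n, total);
    return total;
}

// Prints the call count when scaffolding is enabled; otherwise does nothing.
void reportScaffolding() {
#if ENABLE_SCAFFOLDING
    std::printf("sumOfSquares called %d times\n", g_callCount);  // ... printed in a third
#endif
}
```

Because all three pieces of scaffolding share one condition, flipping a
single macro turns them on or off together, which is hard to guarantee
when each piece is commented out by hand.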

As mentioned in item 11, the names of symbols (such as variables
and functions) sometimes help to explain the meaning of the symbol.
An example of this is discussed in section 4.2.

On the other hand, names alone are rarely sufficient
explanation (except in the almost-simplest cases).

In sufficiently-simple situations, the meaning of the symbol
will be clear from context, even if the name is something meaningless
like “X”, and coming up with a more meaningful name would not be
worth the trouble.

At the other extreme, in complicated or even
moderately-complicated situations, it is impractical for the name to
convey the entire meaning. As explained in reference 2, a name is not supposed to be a description.

We can draw useful analogies to natural language: A titmouse is not a
mouse, chocolate turtles are not made from turtles, milk of Magnesia
is not made from milk, and buckwheat is not a form of wheat. As
Voltaire famously remarked, the Holy Roman Empire was neither holy,
nor Roman, nor an empire. The point here is that you should not
expect the name of a thing to tell you what you need to know about the
nature of the thing. This is true in computer languages as well as
natural languages.

In natural language documents, we use dictionaries, glossaries, and
legends to explain the meaning of each word. By the same token, the
comments should include a glossary, explaining the meaning of each
symbol.

Sometimes an entire sentence – or even multiple paragraphs – will be
needed to explain the meaning of a function including its uses and
restrictions. Comments are the appropriate place for such an
explanation. It would be absurd to encode all of that into a
sentence-long or paragraph-long function name.

In particular, it seems silly to require the name of a variable
to encode the type of the variable. If types are important
to you, use a type-safe language. If you are using a type-safe
language, it is silly to ask the programmer to keep track of
something that the compiler can keep track of more easily and
more thoroughly.

Similarly, it is usually not advantageous for the name of a variable
to encode the units of measurement. In most cases a good
technique is to pick a consistent set of units (such as SI) and use it
for all internal calculations, converting (if necessary) from/to other
units only when doing I/O, as discussed in section 5.2.
Quantities with non-standard units may require special comments, maybe
even a special name. Or, better yet, use a type-safe language or
something similar, so that the computer can do most of the work,
automatically keeping track of the units and doing conversions
automatically whenever necessary. This is better because the computer
can enforce the rules, whereas names and comments do not enforce
anything; they may help the programmer live within the rules, but they
do not actually prevent mistakes.

Symbols that are very local in scope can have short names.
For example, in a small local context it would be reasonable to use
“T0” to denote a starting temperature. In typical cases there is no
advantage in using a longer name such as “starting_temperature” or
“starting_temperature_in_degrees_kelvin”.

Symbols that are
widely exported need to have longer and more systematic names, to
avoid conflicts.

Namespaces are an important tool, allowing symbols to be called by
short names when appropriate and longer names when appropriate.
Example: Atmosphere::ISA::T0 is smarter than Atmosphere_ISA_T0,
because the former can be shortened when appropriate, when working
within the Atmosphere::ISA namespace.

Just to reiterate: The rule is, you should make the code as
self-documenting as possible, and then comment it. Since the code is
never fully self-documenting, the comments fill in some of the gaps,
and even if the code were perfectly self-documenting, redundancy is
useful.

One way to make the code more self-documenting is to avoid what we
call “magic numbers”; that is, unnamed numerical constants. See
reference 3 for a discussion of magic numbers and why they
should be avoided.

Let’s look at a real-world example. I emphasize that this is not at
all a contrived example; this is some code I found “in the wild” in
a low-level library routine, written by an experienced programmer who
was trying to do the right thing (but didn’t quite succeed).

In example 8, the numbers 0,1,2,3,4, and 5 are perfect examples of
magic numbers. They are the sort of magic numbers that should be
avoided, for all the reasons cited in reference 3.

When somebody tries to read this section of code, all sorts of
hard-to-answer questions arise. Why are runway and taxiway lights
apparently not included in the category of ground lights? How did the
number 0 (as opposed to 5 or 137) come to be associated with VASI
lights? Are runways 1 and taxiways 2, or vice versa? What about
approach lights? Are they missing entirely, or are they included in
one of the six categories mentioned here? Why are we checking the
visibility? What is the policy regarding visibility? Why are we
implementing high-level policy in a low-level library routine anyway?
Is this a check against flight visibility, or against ground
visibility, or something else entirely?

A subset of these questions can be answered with the help of the
comments. Based on the comments, it appears that the numbers 1 and 2
are associated with runways and taxiways. This comment is crucial,
because it allows us to search the file. Trying grep -i
runway.*light doesn’t help. Trying grep -i taxiway.*light
doesn’t help, either. But grep -i taxi.*light hits paydirt; it
finds the place where successive children of the lightSwitch object
are associated with various types of lights.

This is a good-news bad-news story:

The comments are few and shallow.

The comments are helpful;
without them the code in this example would be considered really,
really terrible.

The comments merely describe the mechanics of the code. As
such, they are in the least-helpful category, as discussed in
item 2.

In this case the code is so weak that commenting on
the mechanics actually adds information.

We still lack any comments on the higher-level strategy and
intent. Commenting on the mechanics cannot take the place of
commenting on strategy and intent.

This code, as it stands, would irritate the reviewer during a code
review. Code is easier to review if each passage, by itself, can
be seen to be manifestly correct. In this case, the reviewer would
have to do a lot of work to make sure that the magic numbers were
being used correctly. This is the sort of work no reviewer (indeed no
human) should need to do; the computer should do it. The code
should be written in such a way that the compiler can guarantee
at compile-time that such numbers are being used consistently.

By the same token, this code would be tricky to modify. A programmer
who disturbed the order in which the children were added to the
lightSwitch object would cause bugs in places far from the site
of the disturbance.

Many of these questions and problems would go away if the unnamed
numerical constants were replaced with named constants, such as
VASI_CHILD(0), RWY_CHILD(1), TAXI_CHILD(2), et cetera.
Using an enum{} would make this even simpler and better.
Then in every (!) place where the code indexes into the lightSwitch
object, these names should be used. This includes (!) the place where
the children are initially added to the object. This would instantly
eradicate a wide class of bugs, and would make other bugs much easier
to debug. Note that grepping for every occurrence of
RWY_CHILD is likely to be much more rewarding than grepping
for every occurrence of 1.
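The enum approach might look something like the following sketch. The LightSwitch struct is only a stand-in for the real lightSwitch object, and the enumerator names and their ordering are assumptions made for illustration; they are not taken from the original library.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical child indices. The ordering here is illustrative only.
enum LightChild : std::size_t {
    VASI_CHILD,
    RWY_CHILD,
    TAXI_CHILD,
    GROUND_CHILD,
    NUM_LIGHT_CHILDREN   // keep this last; it counts the children
};

// Stand-in for the lightSwitch object: one slot per child.
struct LightSwitch {
    std::array<std::string, NUM_LIGHT_CHILDREN> children;
    std::array<bool, NUM_LIGHT_CHILDREN> enabled{};   // all false initially
};

// The children are added by name, so the setup code and the code
// that uses the indices cannot drift apart if the enum is reordered.
LightSwitch make_light_switch() {
    LightSwitch sw;
    sw.children[VASI_CHILD]   = "VASI lights";
    sw.children[RWY_CHILD]    = "runway lights";
    sw.children[TAXI_CHILD]   = "taxiway lights";
    sw.children[GROUND_CHILD] = "ground lights";
    return sw;
}

// Callers say which child they mean; no bare integers appear.
void set_enabled(LightSwitch& sw, LightChild which, bool on) {
    sw.enabled[which] = on;
}
```

A call such as set_enabled(sw, TAXI_CHILD, true) can be checked by reading that one line, and passing a bare integer where a LightChild is expected is rejected by the compiler.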

Learn to use a modern programming language. By that I mean something
with objects (aka classes) and type safety. It should also have a
reasonable library including containers (such as strings, vectors,
associative arrays etc.) and math functions, so you don’t need to
re-invent or even re-write all that stuff.

Reference 4 discusses at length the advantages of
using a modern programming language.

It is a good practice, in most situations, to pick a consistent set of
“preferred” units (such as SI) and use it for all internal
calculations. If some external source feeds you data in some other
units, convert to the chosen internal units as soon as possible on
input. By the same token, convert from the chosen internal units to
external units as late as possible on output.

Here is a good technique for doing this. Consider the following example:

In this code, the “mph” variable represents one mile per hour
measured in the chosen internal units (in this case SI units). So,
when we write 80*mph, it says what it means and means what it says: 80
miles per hour.
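A minimal sketch of this kind of code, assuming SI internal units, is shown here. The names mph and the expression 80*mph come from the text; the other names (meter, second, mile, hour, speed_limit) are chosen for illustration.

```cpp
// Internal units are SI: lengths in meters, times in seconds.
constexpr double meter  = 1.0;
constexpr double second = 1.0;

// Each external unit is expressed once, in terms of the internal units.
constexpr double mile = 1609.344 * meter;   // international mile (exact)
constexpr double hour = 3600.0 * second;
constexpr double mph  = mile / hour;        // one mile per hour, in m/s

// Data arriving in external units is converted immediately on input.
constexpr double speed_limit = 80 * mph;    // about 35.76 m/s
```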

When converting something for output, you have to think for a
millisecond, at least until you form the habit of doing things
the right way. Here’s an example:

When using this system, the “/” means “in”, so the expression
“myspeed/mph” is pronounced “myspeed in mph” – i.e. myspeed
measured in mph. You will quickly get used to thinking of it in this
way.

Dividing myspeed by mph makes sense in terms of the factor-label
method, especially when you see (A) in the context of (B) in this
example. Also the scaling is correct, in the sense that if mph is a
“big” unit, the number that gets printed is smaller, since we need
fewer such units to do the job.
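The output side can be sketched as follows, again assuming SI internal units. The function format_speed and the kph constant are hypothetical names introduced here.

```cpp
#include <sstream>
#include <string>

constexpr double mph = 1609.344 / 3600.0;   // one mile per hour, in m/s
constexpr double kph = 1000.0 / 3600.0;     // one kilometer per hour, in m/s

// Convert from internal units to external units as late as possible:
// right at the point where the number is formatted for a human.
std::string format_speed(double speed) {
    std::ostringstream out;
    out << speed / mph << " mph (" << speed / kph << " km/h)";
    return out.str();
}
```

Note that the same internal value is printed in two different external units just by dividing by two different unit constants. Since kph is a smaller unit than mph, the second number comes out larger.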

I don’t like to spend much time discussing misconceptions and bad
ideas, so let me just mention in passing that there are other ways of
handling the unit-conversion task that are an order of magnitude
clumsier and uglier.

Being fastidious about the units is absolutely mandatory. If you need
a reminder of why, consider the Mars probe that was lost, at a cost of
328 million dollars, due to a mixup in the units. See reference 5.

The method suggested here does not solve all the world’s problems.
For example, it cannot convert degrees Fahrenheit to degrees Celsius,
and it cannot convert dBm to milliwatts. However, it works extremely
well over a very wide range, whenever one thing is proportional to
another.
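For the cases that are not simple proportionalities, an ordinary conversion function does the job. Here is a small sketch; the function names are made up for this illustration.

```cpp
#include <cmath>

// Affine, not proportional: there is an offset as well as a scale factor,
// so no single multiplicative constant can do this conversion.
constexpr double celsius_from_fahrenheit(double deg_f) {
    return (deg_f - 32.0) * 5.0 / 9.0;
}

// Logarithmic, not proportional: dBm is 10*log10(power / 1 mW).
double milliwatts_from_dbm(double dbm) {
    return std::pow(10.0, dbm / 10.0);
}
```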

Remember that units are not the same as dimensions, as discussed in
reference 6. Also remember that dimensional
analysis is just a simplified form of scaling analysis, as discussed
in reference 7 and reference 8.

Let’s compare and contrast the two snippets of code in
example 9 and example 10. The first one tries to open a file, and assumes
the attempt will succeed. The second one takes into account that the
attempt might fail. It prints an error message explaining what it was
trying to do, along with details of what went wrong.
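The checked version could be sketched along the following lines. This is not the original listing; the function name open_for_reading and its signature are assumptions. It also assumes that fopen sets errno on failure, which POSIX guarantees but the C standard does not.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Try to open a file for reading. Returns true on success.
// On failure, `message` says what we were trying to do (open this
// particular file) and what went wrong (the system's own explanation).
bool open_for_reading(const std::string& filename,
                      std::FILE*& file, std::string& message) {
    errno = 0;
    file = std::fopen(filename.c_str(), "r");
    if (file == nullptr) {
        message = "Could not open input file '" + filename + "': "
                + std::strerror(errno);
        return false;
    }
    return true;
}
```

The caller can print the message, log it, or pass it up the chain. The point is that the filename and the reason are both captured at the one place where both are known.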

It is bad practice to ignore possible errors. It is a very common
practice, but that doesn’t make it any less bad.

There are multiple reasons for this, as we now discuss. Suppose users
are using the code that you’ve written.

Bad scenario: One day some user says “The code doesn’t work.” You have
no idea what the problem is.

Good scenario: One day the user sees an error message,
realizes that he did something wrong, and fixes the problem himself.
You never even hear about it.

Error messages are necessary for the users, as part of the user
interface.

Bad scenario: Later that day, another user says “The code doesn’t work.”
You still have no idea what the problem is. Maybe there’s a bug,
maybe there isn’t. You have to identify a bug before you can
begin to fix it.

Good scenario: In cases where the user can’t fix the problem
himself, he comes to you and says “The code gave me the following
error message.” You know exactly where to start looking for the bug.

Error messages are necessary to preserve the programmer’s sanity.

All in all, good error detection and good error messages can improve
your productivity – and your team’s productivity – by a huge factor.

In some cases, especially for relatively serious errors, throwing an
exception is good practice. You might want to use a return code for
trivial oddities, and throw an exception for more serious problems.
Exceptions, when properly used, can result in code that is easier to
write, easier to read, and more reliable. See reference 4 for more about exceptions. Part of the story
here is that if a status code is ignored, it gets lost, whereas if an
exception is ignored, it gets passed up the chain until somebody deals
with it.
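The difference between an ignored status code and an ignored exception can be seen in a small sketch. All the names here are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Status-code style: returns false on failure.
// Nothing forces the caller to look at the result.
bool set_volume_status(int percent, int& volume) {
    if (percent < 0 || percent > 100) return false;
    volume = percent;
    return true;
}

// Exception style: a failure cannot be silently dropped.
void set_volume_checked(int percent, int& volume) {
    if (percent < 0 || percent > 100) {
        throw std::out_of_range("volume must be in 0..100, got "
                                + std::to_string(percent));
    }
    volume = percent;
}

// An intermediate layer that does no error handling at all.
void apply_settings(int percent, int& volume) {
    set_volume_status(percent, volume);    // a failure here is lost
    set_volume_checked(percent, volume);   // a failure here propagates
}
```

If apply_settings is called with a bad value, the status code vanishes without a trace, whereas the exception travels up to whoever called apply_settings, even though apply_settings itself did nothing to help.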

Note that not every “unusual” situation should be treated the same.
Consider a file-not-found situation for example. There are many
possibilities, including the following:

Suppose the user’s custom configuration file cannot be found.
Assuming you have chosen intelligent defaults, the program can
continue, simply using the default configuration. In this
case file-not-found does not generate an error; it does not
even generate a warning.

Sometimes file-not-found should generate a warning. The
program then takes some remedial action and continues.

Sometimes file-not-found is a fatal error. No remedial
action is possible. No continuation is possible.
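The three possibilities above can be sketched as an explicit policy that the caller chooses. The enum and function names are assumptions made for this illustration.

```cpp
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>

enum class MissingFilePolicy { UseDefaults, Warn, Fatal };

// Returns true if the file can be opened. Returns false if it cannot
// but the program is allowed to continue. Throws if it cannot and the
// file is indispensable.
bool check_file(const std::string& path, MissingFilePolicy policy) {
    std::ifstream in(path);
    if (in) return true;
    switch (policy) {
    case MissingFilePolicy::UseDefaults:
        break;   // not an error, not even a warning
    case MissingFilePolicy::Warn:
        std::cerr << "Warning: cannot open '" << path
                  << "'; continuing without it.\n";
        break;
    case MissingFilePolicy::Fatal:
        throw std::runtime_error("Cannot open required file '" + path + "'");
    }
    return false;
}
```

The decision about how serious the situation is belongs to the caller, who knows what the file is for. The low-level check merely carries out the chosen policy.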

This explains another way in which paying attention to error codes
results in better software: As part of the process of paying
attention, you should ask yourself whether you can take some remedial
action, in which case a situation that could have been a disastrous
error becomes no error at all. This is the finest form of what we
call defensive programming. In other words:

Ignoring serious errors does not reduce the number of errors.
It just makes them harder to fix.

The best way to minimize the number
of error messages is to minimize the number of errors.

Here is an example of doing things reasonably well. This is taken
from real-world code. The inner HTML within the <canvas>
element will be shown if and only if the browser does not
support the <canvas> feature. It will show an explanatory
message on a pink background. The <noscript> element plays a
similar role for scripting: its content is shown only if the
browser is not running scripts. (Content placed inside a <script>
element that has a src attribute is never displayed, so it cannot
serve as a fallback.)

<canvas id="atomCanvas" width="500" height="500">
<div style="background-color:pink;">
Hmmm, it looks like this browser does not support
the HTML5 &lt;canvas&gt; feature.
</div>
</canvas>
<script type="text/javascript" src="./foo/bar.js"></script>
<noscript>
<div style="background-color:pink;">
Hmmm, this browser does not seem to be supporting Javascript.
Additional information may be available via
the browser's "error console" or some such.
</div>
</noscript>

Good error checking and good error messages are particularly important
for educational software. If students never made mistakes, they
wouldn’t be students.

Good error checking and good error messages are also super-important
for research-grade software. Researchers can be relied upon to use
software (and everything else) in ways that were never anticipated.
It’s part of the job description.

Error messages should be as specific as possible. For example,
suppose I am applying a design-rule checking program to a circuit
design with several gazillion transistors. It is no help at all if
the program says “this layout violates the design rules”. It is
much better if the program says “There is a parasitic lateral-PNP at
location (xx, yy) involving blockID foo and blockID bar....” Note
that in a research situation, it is entirely possible that I was
intending to create a lateral-PNP.

Generating good error messages is sometimes super-easy, but sometimes
not. Commonly a low-level library function will know the details of
what went wrong, while a mid-level routine will know more about what
it was trying to do, and why. In such a case, it may be worthwhile to
proceed as follows: The low-level routine throws an exception.
The mid-level routine catches the exception, adds some additional
information, and then re-throws.
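Here is a sketch of that pattern. The function names are hypothetical. For simplicity the mid-level routine throws a new exception whose message includes the original one, rather than literally re-throwing the same object.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Low-level routine: knows the details of what went wrong,
// but has no idea why anybody wanted a number.
double parse_number(const std::string& text) {
    std::size_t used = 0;
    double value = 0.0;
    try {
        value = std::stod(text, &used);
    } catch (const std::exception&) {
        throw std::invalid_argument("'" + text + "' is not a number");
    }
    if (used != text.size()) {
        throw std::invalid_argument(
            "trailing junk after number in '" + text + "'");
    }
    return value;
}

// Mid-level routine: knows what it was trying to do, and adds that
// information before passing the problem up the chain.
double read_runway_length(const std::string& field) {
    try {
        return parse_number(field);
    } catch (const std::invalid_argument& e) {
        throw std::invalid_argument(
            std::string("while reading runway length: ") + e.what());
    }
}
```

The message that finally reaches the user says both what the program was doing and exactly what was wrong with the input.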

Note that when using a modern object-oriented programming language,
you can return a fairly complex object as a result status, and you
can throw a fairly complex object as an exception.
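As a sketch of what such an object might look like, here is a hypothetical exception class that carries the location of a design-rule violation as data, not just as text. The class, its fields, and the check_spacing function are invented for this illustration.

```cpp
#include <stdexcept>
#include <string>

// An exception that carries structured details in addition to
// the usual human-readable message.
class DesignRuleViolation : public std::runtime_error {
public:
    DesignRuleViolation(const std::string& rule, double x, double y)
        : std::runtime_error(rule + " violated at (" + std::to_string(x)
                             + ", " + std::to_string(y) + ")"),
          rule_(rule), x_(x), y_(y) {}

    const std::string& rule() const { return rule_; }
    double x() const { return x_; }
    double y() const { return y_; }

private:
    std::string rule_;
    double x_;
    double y_;
};

// Throws if two features at location (x, y) are closer than allowed.
void check_spacing(double x, double y, double spacing, double min_spacing) {
    if (spacing < min_spacing) {
        throw DesignRuleViolation("minimum spacing", x, y);
    }
}
```

A handler can print e.what() for the user, and can also use e.x() and e.y() to do something smarter, such as highlighting the offending location in a layout viewer.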