Comments

I was chatting informally with a group of developers last week when one used a phrase guaranteed to set me off.

“I write self-commenting code,” he revealed proudly.

Uh huh. I have yet to see a program of any useful size that, stripped of comments, is self-documenting. C is not a language like English or Swedish where there’s so much information conveyed that even in a noisy room, where one might catch only 70% of the words, the meaning still comes across.

Computer languages are inherently dense and precise: miss a single character and the program won’t run correctly. Mix up “Identifier” and “identifier” and, if you’re lucky, the compiler will complain. Programmers with less good fortune will get a clean compile and spend days or weeks looking for a hard-to-find bug.

Usually these folks will go on at length about their use of long variable names. Hey, I’m all in favor of making variables as long as they need to be to clearly express an idea. But length, in this case, isn’t always an asset. I find it awfully hard to read something like:

The C constructs (operators et al) are lost in the morass of names. And reading a single statement split across many lines confuses the eyes.

Older compilers may treat only the first 31 characters of a name as significant (the minimum that C89 guarantees for internal identifiers), so it’s dangerous to get enamored with exceedingly long names.

Some developers subscribe to the “comment nearly every line” school of thought. Their code looks like:

for (i=0; i<...; i++){
++Array_Pointer; // point to next element in Array
*Array_Pointer=0; // Set Array element to zero
} // end for

Much better is to eliminate those annoying and not particularly informative comments and prefix the entire snippet with:

// Array is a sparse matrix; empty
// elements are denoted by zero so
// here we initialize all elements to
// “empty” (a zero).

The second style conveys the sense of the code while the first gives plenty of detail and no context.
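
As a sketch of the second style, the fragment below wraps the loop in a small function and replaces the per-line remarks with one block comment. The names `clear_sparse_row` and `ROW_LENGTH` are invented for illustration; the original snippet gives neither a function nor a size.

```c
#include <stddef.h>

#define ROW_LENGTH 8   // hypothetical size; not from the original snippet

// The row belongs to a sparse matrix; empty elements are denoted by
// zero, so here we initialize every element to "empty" (a zero).
void clear_sparse_row(int *row, size_t length)
{
    size_t i;

    for (i = 0; i < length; i++) {
        row[i] = 0;
    }
}
```

The loop itself needs no commentary; the block comment supplies the one thing the code cannot, which is what a zero means in this data structure.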

Others write the code first and add the comments later. They don’t want to “waste time” with documentation while endlessly fiddling with a routine to get it to work. But the comments should be the function’s design. If the developer doesn’t know enough about the design before cranking code, just how does he start pounding C into the editor? Is it a random walk? “Uh, hmm, I dunno, let’s try:”

void main(void){

“Oh boy, now what? How about maybe initializing something… or should we set up a queue?”

The idea must be that if they type enough C, a function, a structure and a clear idea will emerge. That might indeed happen… eventually. But not efficiently.

There’s a spec somewhere - perhaps only in the developer’s head - which describes in English what a function should do in a human-friendly manner. The code is a translation of that spec to cryptic and unforgiving computerese. So I figure the way to write a function is to create all of the comments first. The header, and even all of the individual little snippets of English spread throughout the code. Then the function’s design is done.

After that, anyone can fill in the code.
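
A minimal sketch of that workflow, using a hypothetical averaging routine (`average_samples` and its parameters are assumptions, not from the article). The comments were written first, as the design; the statements beneath each one were filled in afterward.

```c
#include <stddef.h>
#include <stdint.h>

// average_samples: return the mean of `count` sensor readings.
//
// The readings are 16 bits wide, so the running sum is kept in 32
// bits, which is safe for buffers of up to 65,537 readings. An empty
// buffer has no meaningful mean; report 0 rather than divide by zero.
uint16_t average_samples(const uint16_t *samples, size_t count)
{
    uint32_t sum = 0;
    size_t i;

    // Nothing to average: bail out before the division below.
    if (count == 0) {
        return 0;
    }

    // Accumulate in a wider type than the samples themselves.
    for (i = 0; i < count; i++) {
        sum += samples[i];
    }

    // Truncating division is acceptable here; callers only need
    // the integer part of the mean.
    return (uint16_t)(sum / count);
}
```

Read only the comments and the design is already there: the edge case, the overflow limit, and the rounding decision were all settled before any statement was typed.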

Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. Contact him at jack@ganssle.com. His website is www.ganssle.com.

No such animal... self commenting code. But on a similar note,
I'm for auto syntax correcting compilers. Suppose I write:

fr(idx=0; idx<10; idx++)

{ etc(idx); }

or

whle(idx<25)

{ etc(idx); }

The compiler should be smart enough to see that "fr" is really
"for" and "whle" is really "while". It would be interesting
to see a compiler with an auto syntax correcting feature. What
do you think?

- Steve King

ok...

'self-commenting-code' is pretty much an oxymoron. Comments,
in general, are supposed to elucidate what the code is doing.

In general, I first look at the header file...if I find nothing
of use there, then I cringe and start plowing through the
1000's of lines of 'C'.

In my opinion, it is better to have:

1. Clearly defined specifications....(notice, I did not say complete!)

2. Decent design document, (Word with some of those Visio pictures
are great).

3. Some header files that show some of the entity relationships,
and have some comments on the rules and constraints.

- Ken Wada

Mr. King's idea of a "correcting compiler" is interesting. However
I am not sure I want the compiler changing my source
files. Perhaps such a compiler needs 2 modes, a "correcting"
mode for use in development and a "production" mode which does
not change the source code.

He can have almost the same thing with modern code editors. By
coloring keywords, the typos of "for" and "while" would become
quickly apparent, since they would appear as variable or
function colors instead of keyword colors. An even greater
advantage is coloring the comments different than the code.
If you accidentally put some code in a multi-line comment it
would be readily apparent.

I think Mr. Wada's 3 points are excellent.

- Bob Bailey

The compiler should be smart enough to see that "fr" is really
"for" and "whle" is really "while".

I see that your form does not like the greater than or less than
symbol... and anything between them is deleted!

- Steve King

To make sure that the code is self-commenting, the computer
has to understand the components and the result.

Take one popular example.

"Portable"

"Computer" ->

Portable Computer

Did the computer recognize this?

Don’t think so.

PS

Documentation and tools are often driven by word processors,
compilers and portability, rather than by the need for good
program documentation.

- Martin

We were taught, many years ago, to provide concise comments in code only in areas where explanation was necessary. Why?

If you have had to maintain code that has changed significantly, but its verbose and unnecessary comments were left unchanged, you need not ask why!

- Martin Allen

Donald Knuth (yes, that Knuth) developed something he called
literate programming. The idea is to merge writing programs and
documentation. It has a cult-like following. CWEB is a tool for C,
and there are tools for languages like ML. The thing is, most of us
are not great writers -- we "hack out code" -- and as a result the
concept never seemed to catch on.

- Alwyn E. Goodloe

The closest thing I have seen to something "self-documenting"
and usable is doxygen. If the comments are written
with the proper syntax that the tool understands, it works
very nicely. One additional feature is that it can
generate user documentation along with diagrams. A very useful
tool, which has a very good price as well - free. :)
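
For illustration, a comment block in the Javadoc-style syntax that doxygen accepts might look like the sketch below. The function `clamp_percent` is made up for the example; `@brief`, `@param` and `@return` are standard doxygen commands.

```c
/**
 * @brief Clamp a value to the range 0..100.
 *
 * Used where a percentage arrives from an untrusted source and must
 * be forced into range before it is displayed.
 *
 * @param value  The raw percentage, possibly out of range.
 * @return The value limited to 0 through 100 inclusive.
 */
int clamp_percent(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > 100) {
        return 100;
    }
    return value;
}
```

The tool turns the block into reference documentation for the function; the comment still has to be kept in step with the code by hand.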

- Arvind V

There is nothing like moving to a new company and inheriting
a substantial "self-documented" program to make one a believer
in the proper use of documentation. Self-documented code
is particularly cruel when your predecessor left you with a
revision that does not work, things that worked were changed,
and self-documentation does not tell you why. Comparing files
will highlight the changes, but are the changes part of
a new feature, a bug fix, or did he just find a cleaner way
to re-write working code? Self-documented code will never answer
these vital questions. You can trace pointers, functions,
variables, etc., but you can only guess at why it was done
the way that it was done and not some other way.

- Steve Wise

Steve, I agree that the compiler ought to be smart enough to
get "for" from "fr" given the context, but a developer who
is slamming out code so sloppily that he depends on the compiler
to mop up that kind of error is probably making far less
benign mistakes than misspelling language keywords. He will
be made aware of some of those mistakes at compile and link
time, some during debugging, and some by customers who find
his bugs for him. He deserves a slap on the wrist for poor
workmanship and lack of attention to detail.

I agree with Jack that many programmers make the mistake of overcommenting,
especially because it obscures structure, the
discernment of which is vitally important to a reader understanding
the code. Comment briefly, and comment why. Don't
waste my time with comments as to what or how.

Excessive comments are also the most prone to rot when the code
changes beneath them. As Norm Schryer said, "If the code
and the comments disagree, then both are probably wrong."

I want to emphasize something Jack implies, but doesn't come
right out and say: comments are documentation, and as such should
be good English -- complete sentences, not broken speech.
Even between /* and */, tenets of good technical writing
apply, and correct grammar always leads to greater understanding.

- Daniel Daly

No such thing as self-commenting code? I beg to differ! This
is from the IOCCC, 2001 (cheong.c, probably need to view
this in a fixed font):

#include <stdio.h>
int l;int main(int o,char **O,
int I){char c,*D=O[1];if(o>0){
for(l=0;D[l ];D[l
++]-=10){D [l++]-=120;D[l]-=
110;while (!main(0,O,l))D[l]
+= 20; putchar((D[l]+1032)
/20 ) ;}putchar(10);}else{
c=o+ (D[I]+82)%10-(I>l/2)*
(D[I-l+I]+72)/10-9;D[I]+=I<0?0
:!(o=main(c/10,O,I-1))*((c+999
)%10-(D[I]+92)%10);}return o;}

- Dan McCarty

As much as self documenting code may be a pipe dream (I certainly
haven't been able to write without comments and I've tried),
writing code lucidly is something that doesn't happen
near often enough. I mean what about

- well written or placed comments

- descriptive variables

- choosing a while();, for();, or do {} while; that makes the
code more comprehensible.

Style counts. Maybe we just ought not call it 'self documenting'

- Pat Thomson

"So I figure the way to write a function is to create all of
the comments first. The header, and even all of the individual
little snippets of English spread throughout the code. Then
the function’s design is done.

After that, anyone can fill in the code."

Thank You, Thank You, Thank You!

I've been preaching this approach to my students (Mechanical
Engineers learning to build mechatronic systems) for years.
Now I can point them to a non-academic source that agrees. Maybe,
I'll have heard the last of 'It's all done, I just need
to go back and comment it'

naw, that's wishful thinking, but this will be useful
nonetheless.

- Ed Carryer

Hi Jack

Interesting timing from my POV...

I am working on a software project, which will also end up with
a book describing how the code works. My list of things
to do [in order] is:

- design the code [mainly data structures]

- write book [mostly]

- write/test the code

- edit/finish book

I assume that this approach would receive the "Ganssle Seal of
Approval"?

- Colin Walls

I've seen code that's largely self-documenting. Consider the
CPP (C preprocessor, not C++) module in the LCC project. It
is very sparsely commented, yet easy to understand and maintain
despite being rather complex (it even includes a hand-coded
lexical analyzer and parser). On the other hand, one of
the originators of Unix told me he wrote this code, so I guess
it shouldn't come as a great surprise that he could do a
real bang-up job at it.

Of course, whether code is self-documenting depends on the purpose
and experience of the reader. If I intend to be noodling
around inside the code, fixing bugs and adding new features,
I'd rather the code be written so as not to need comments.
On the other hand, if I want to simply use a library, I don't
want code comments, just documentation. If you have a system
that can extract documentation from comments, such as
doxygen or MATLAB does, that's great (as long as you keep them
up to date). But these really aren't code comments at that
point.

- Gerald Williams

My contributions to your article:

#define elementsof(x) (sizeof(x) / sizeof(x[0]))

To reference the number of elements in an array without having to spell out the number:

for (i=0; i < elementsof(my_structure_array); i++) { do_something(); }

and cross-referencing the header files:

#include <limits.h> // CHAR_BIT, ...
#include <stdio.h> // printf(), getc(), ...

So that the casual observer knows what comes from where.
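
A minimal sketch of the macro in use; the table `baud_rates` and the function `count_fast_rates` are hypothetical names added for illustration. Note that the macro only works on a true array: applied to a pointer (including an array passed as a function parameter) it silently yields the wrong count.

```c
#include <stddef.h>

#define elementsof(x) (sizeof(x) / sizeof((x)[0]))

// Hypothetical lookup table; adding an entry needs no other edits.
static const long baud_rates[] = { 9600, 19200, 38400, 57600, 115200 };

// Count the table entries above the given threshold.
size_t count_fast_rates(long threshold)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < elementsof(baud_rates); i++) {
        if (baud_rates[i] > threshold) {
            count++;
        }
    }
    return count;
}
```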

- M. David Gelbman

Your code needs to communicate two things to the reader:

1) What the code is doing
2) Why it's being done this way

Self-documenting code is the best way to accomplish #1, but it doesn't address #2. Your product spec addresses #2, but typically from a higher view than the code level. That's why, in addition to writing code that makes it easy to figure out what's going on, you also need comments to explain why you're doing what you're doing, what you already tried but didn't work, why it makes sense to break a certain coding rule, what gotchas future programmers should look out for, etc.
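
As a small sketch of the distinction (the converter parameters and the name `adc_to_millivolts` are hypothetical): the identifiers carry the "what", and the one comment carries the "why".

```c
#include <stdint.h>

#define ADC_FULL_SCALE  1023u   // hypothetical 10-bit converter
#define VREF_MILLIVOLTS 3300u   // hypothetical 3.3 V reference

uint32_t adc_to_millivolts(uint16_t counts)
{
    // Multiply before dividing. Dividing first (counts / 1023) would
    // truncate to zero for every reading below full scale. The
    // product cannot overflow: 65535 * 3300 fits easily in 32 bits.
    return ((uint32_t)counts * VREF_MILLIVOLTS) / ADC_FULL_SCALE;
}
```

Without the comment, a later maintainer could "simplify" the expression into the divide-first form and break every reading except full scale.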

- Tony Gray

"In my opinion, it is better to have:

1. Clearly defined specifications....(notice, I did not say complete!)

2. Decent design document, (Word with some of those Visio pictures are great)."

In my experience, this approach usually ends in the specs and design documents becoming separated from the code. Some poor bastard gets handed the code without the documentation to make a change--or you receive 500 pages of docs, and after reading them you realize that the docs apparently described rev 1.0, but you are working on rev 3.6.

If you put the documentation in the program as comments, it's always there for the next programmer. And, while it isn't guaranteed that each programmer updated the comments as he made the changes to the program, at least it's practical to do what you should!

I've had even worse experiences. I once had the job of designing a new device to communicate with an old military computer system over a proprietary serial bus. I received detailed (written to mil spec, of course) documentation on the serial bus, including the schematics of the old interface card, but no source code. It "wasn't needed", as I wasn't changing the program at that end--was probably a strange form of assembly language anyhow. Working my way through the docs, I became a bit puzzled because a critical 54LS74 flipflop had to be triggered by the falling edge of a clock pulse, and I darn well know that the '74 triggers on the rising edge. It took a week with a very good logic analyzer to finally capture the totally undocumented series of pulses, emitted several seconds before a message came through, that set the preceding circuits up to where that flipflop would receive a rising edge when needed.

There's a small block of code somewhere in that source code I didn't have that generates those pulses, and it probably starts with comments explaining just what it does. But I didn't receive that because "everything I needed to know" was in the specification documents, except that the programmer most likely wasn't allowed to update them to explain how he worked around the hardware designer's mistake.