(Taken from the website of Andy Lowe some time in 1999. Where are you now,
Andy?)

One way that we have found to effectively organize thought in the
absence of a well-defined methodology is what we call "The Ten Bugs in
the Known Universe." Our statement that there are ten bugs in the
universe of programming has been called bold (the estimate may in fact
be high), but it is nonetheless useful.

Our list is similar in spirit to the style guidelines enumerated in
"The Elements of Programming Style" by Brian Kernighan and P.J.
Plauger. We are also indebted to the more rigorous "A Discipline of
Programming" by the great Edsger W. Dijkstra.
Dijkstra deprecates even the concept of "debugging" as being
necessitated by sloppy thinking. Perhaps. Yet as we are not perfect,
let us not allow defects in our thought processes to prohibit us from
accomplishing anything. So, debug we must.

Let's assume for a moment that there are in fact only ten categories
of bugs in all programming. It must follow that these ten bugs keep
reappearing in different forms over and over again. If only we could
itemize them and devise tools to distinguish them, we might be able to
get a handle on the problems we have been trying to solve so
frantically and reactively for so long. Maybe if we can name them, we
can tame them. We try to do just that here, with our enumeration of
the ten bugs.

This problem is especially characteristic of novices, and it
sometimes afflicts even experts working in very complex systems that
are hard to understand. It can take many different forms depending on
the context, from syntactic typographical errors ("typos") to more
fundamental "thinkos."

For example, consider use versus reference: mistaking a pointer for
the structure it points to, and vice versa.

Good compilers and syntax checkers can find many of these syntactic
problems. Sometimes the problem is really understanding what the
compiler is trying to tell you.

Sources of error here include failing to understand the implications
of a complex interface, or failing to take latency or thrashing into
account; examples in socket programming are easy to find.

The lack of a clear specification can be really vexing because in the
presence of a well-defined body of code, you might expect that there
is an interface, somewhere, even if it is just in the designer's head.

But if the interface isn't articulated, then there is no way to make
reference to it. In this case, the programmer may make assumptions
about the nature of his or her inputs or outputs. Those assumptions
may survive testing and even customer acceptance.

But then those assumptions are violated, and in the absence of a
specification, it's impossible to say which side of the interface the
bug is really on. Think about it. You have a component (the
interface) to which no individual or group is assigned. Oops.

So one way to combat this is to try to extract an interface
specification from the existing application and test it for validity
in the normally expected usage cases and compare that with how the
interface actually behaves in clearly out-of-bounds conditions.

The real tough part comes right at the limits, implicitly or explicitly
defined as the case may be.

In other words, if the interface is not defined, then the code's
behavior itself defines the interface.

Once you analyze the situation in this way, an inference can be made:
if two pieces of code don't agree on the nature of their inputs or
outputs, then what you have is a job of post-hoc specification on
your hands. This is covered elsewhere (cf. Ted J. Biggerstaff and
Alan J. Perlis, Software Reusability: Concepts and Models, ACM
Press/Addison-Wesley, 1989).

This passive problem is not nearly as nasty as its evil twin, the
stack trasher, and its sinister cousin, the memory trasher, can
sometimes be. It is
one of the disciplines of programming to
maintain consistency and completeness with respect to heap and stack
memory usage, and manipulation of pointers and collections.

The C language is both notorious and beloved for its loose typing.
Most static data structures behave deterministically. Errors in or
failure to initialize these structures will generally either always
fail or always succeed, even if coded improperly, and thus will be
picked up in the first rounds of testing. (You did run it at least
once before you checked it in, right?)

Normal C++ and Java memory management, through disciplined use of
closed allocators and deallocators (new and delete) along with user
hooks through constructors and destructors, is in our opinion much
safer and more usable than working with live wires like malloc() and
free(), whose easy abuse has been responsible for catastrophic
failures like rockets unintentionally crashing into Venus.

The memory trasher is a vicious bug, and together with a special
case, the stack trasher, it has probably cost more money in software
development over the years than any other single category of problem.
The omission of pointers from Java may be one of the greatest
advances in software reliability in decades.

Debugging facilities for stack and heap mismanagement are, in our
opinion, ripe for implementation in hardware, and you should use
hardware support whenever you have it. Some architectures already
support exception handling and the use of protected memory pages.
These will help you to recover from certain categories of error (such
as jumping through a null pointer, or accessing memory at an invalid
address), but they will not help you find or correct them.

We have two recommendations for helping to find and correct memory
trashers and leakers, especially the most vile, latent variety which
eludes conventional system testing. The first is to use a debugging
memory allocator. Such allocators work in a variety of ways, but the
simplest variety puts "guard rails" at the head and tail of each heap
allocation. Some also keep linked lists of the allocated segments for
later sanity checking, and record the module and line number of the
call which allocated each block.

Then, at intervals throughout execution, the list can be traversed,
checking the guard rails for the appropriate test patterns. By
progressively narrowing the intervals between checks, the point at
which the error occurs can be isolated.

Stack errors may be similarly trapped, but this requires support from the compiler or sophisticated debuggers to help tag the stack frames.

The best way to find and fix such problems is through disciplined and
thorough unit testing at the module level -- before integration with
other modules. If all modules undergo such testing, the chances for
unanticipated errors to crop up during integration are significantly
reduced.

During unit testing, it is also prudent to use a memory allocator which provides the ability to "squeeze" memory. That is, out of memory conditions can be simulated at progressively narrower intervals throughout program execution. This helps to flush out the latent problems hiding in your code.

Kernighan and Plauger advise you to program defensively, meaning to
write code that can handle erroneous input. We go a step further and
advise you to integrate defensively. That is, do not attempt to
integrate your code with other modules until you have not only
verified your own behavior, but have also made sure your colleagues
have been just as careful. Forcing them through such a gate will
surely be your most reliable path to a successful (and uneventful!)
integration party.

Off-by-one errors happen much more frequently than you might suspect,
and typically result from confusion in counting. The famous fencepost
error is helpful to think about: "If I want to put up a fence forty
feet long, and I have four ten-foot sections, how many fenceposts do
I need?" Other off-by-one errors are due to erroneous unit or phase
conversion, failing to agree on whether to begin counting at zero or
one, rounding, or other unanticipated math errors. Whenever you have
to count something, always ask yourself: "Am I counting fenceposts or
fence sections?" And also ask yourself, "Do I start counting at zero
or one?"

If you don't understand the problem, you can't provide the
solution. This is really kind of a fundamental point and needs
elaboration. Who owns the specification for your product? Who
really understands it, in its entirety at least at some level?

If you answered "no one" then bzzzz -- wrong! At the very least, your
customer does. If your pieces don't fit together well enough to solve
his or her problem, then you lose.

We use the term specification to denote the user-requirements
document. More commonly, "specification" refers to the design
specification, which is derived from the user requirements. What
we're trying to get at here is that these documents represent a
communication stream between the intended users of a product and its
designers. Since the product cycle has been reduced from years to a
year, to nine months, to next quarter, this communication must be
effective, or the project will not achieve complete success.

Technology-driven organizations have a natural tendency to try to
lead their audiences where they don't want, or don't yet know that
they want, to go. That implies a learning process on the part of the
customer. Market-driven organizations more often follow their
customers into novel problem spaces. This implies a learning process
on the part of the developers. The "What specification?" error,
therefore, represents two different but symmetrical problems.

This problem is similar to the "We don't agree ..." error in that it
represents a communication problem. The difference lies in which
entities are communicating. The latter is probably more severe, or
at least more embarrassing, in that components within the system
don't even agree on what is happening. The former, on the other hand,
can be fatal to the extent that your customer may refuse to accept
your product because of it.

An analogy might be politics, in which there are two models of
politician. One leads through vision, and persuades, cajoles, orders,
or otherwise compels people to follow. With enlightened
"generalship," if you will, this can be a successful strategy. It is
not often followed in a democracy, where the concept is turned on its
head: the people (voters, customers, workers, investors) are the
holders of the vision or wisdom, and a successful politician leads by
following. That is, taking into consideration what the people want,
he then decides what his positions are. So let's contrast these two
styles of statesmanship, with their pros and cons.

Should we design an analytical model for these categories of
statesman, we might be able to prove that all innovation stems from
the visionary, whereas the democratic or opportunistic strategy is
more successful at consolidation and maintenance.

What sort of statesman (or project manager) are you? Do you wish to
lead your customer, or take a poll before you decide? In any case,
clearly defined requirements have two characteristics: a well-defined
problem statement and well-defined satisfiability criteria.

That is, only when you know what you are trying to do, can you
unequivocally demonstrate when you have done it.

This would imply that satisfiability tests must be defined before
design is begun, and that this must ultimately be the responsibility
of the customer.

The differences between the "You're not doing ..." error and the
"It's not doing ..." error are subtle, but important. The latter
refers to a fundamental lack of comprehension of what is going on.
As noted, this is common and expected behavior of novices, and
sometimes even happens to experts. In contrast, the former never
goes away. This problem refers to the situation where the
diagnostician fully understands the difference between case A and
case B, and the evidence points to case A, but he stubbornly observes
only symptoms of case B. Consider a preacher who has, since the
Eisenhower administration, predicted the arrival of Armageddon
sometime in 1988. Imagine him giving his first sermon one bright
Sunday morning in January of 1989. Such is the chagrin of an
experienced programmer trapped in the vise of "You're not doing ..."!

We have observed the behavior of very talented programmers with a wide
variety of training and education levels, and the breakthrough
characteristic of the very best is, in our opinion, an intuitive
ability to discriminate previously unobserved but relevant facts
rather than continually review known but irrelevant ones. This
"knack" may perhaps be taught and refined, but it was certainly not in
our Computer Science curriculum.

If you suffer from lack of clarity, or are not concentrating, you
must first become aware of it. High performers in all fields must
develop the ability to concentrate, but computer programming can be
uniquely taxing to the intellect (especially at 3:00 on Monday
morning before the Comdex show opens). This is really a psychological
capability -- an awareness that, for example, "My jaw is tight; maybe
I need some sugar." You sit on your butt for so long that stimming
(that twitchy leg motion people sometimes get) turns out to help keep
your blood pressure up.

Maybe you suffer from gridlock of the mind, where you repeatedly
retread the same erroneous thought patterns: "It's broken and I didn't
change anything at all, except for that, but it can't be
that."

Such examples seem obvious once pointed out, but many a programmer
hour has been spent in "You're not ...". The trick is simply to be
aware of your situation -- your own state of mind as well as the
completeness and accuracy of your powers of observation. Not only do
you need to know which tools to apply to discriminate the component,
and which to apply to identify the module and line of code causing
your problem; you also need to know when the tool you are using is
inadequate -- when to drop it in favor of another one.

"a guard acts as a sentry to a body of code and does not permit
execution of that body unless a defined set of conditions is met."

Poorly-defined guards are unfortunately very common. In standard C
code, you normally think of guards in terms of error checking. For
example, all logic depending on a successful call to malloc() should
be guarded against it returning a null pointer. This is tedious and
sometimes obscures the readability of the code. Readability directly
relates to maintainability, and neither should inhibit reliability.

Structured exception handling provides a mechanism to define default
behavior for specific failure modes, so that reliability can be
enhanced through an assurance that we can guard against all
potential failures in some default way, without precluding the ability
to guard specific failures in specific ways.

A body of code without a guard appears naked to the trained eye.
More often, we see ill-defined guards. Such code masks potential
"bogons" -- that is, latent bugs. The behavior of such logic outside
its intended context cannot be predicted.

An exit condition is the test which decides if a computation is
complete. The most common instance of this sort of problem is an
infinite loop. There are several causes for infinite loops, but
"Poorly-Defined ..." is the most obvious one of them. For example, consider an
exit condition which tests against an invariant:

for (i=0, j=0; j<1; i++) ;

We're sure none of our readers will ever write a loop with an exit
condition like this. Other instances of this bug are premature loop
termination, premature function returns or block exits and so on.

This cliche more properly describes a subject of a marketing text than
a software text. But without properly defined requirements, how can
we label any behavior erroneous? On the other hand, we would be wise
to keep our minds open to the possibility of a happy accident. Rare
as they are, they still occur, and we never know when we might stumble
over a gem.

Simply remember that redefining your requirements is sometimes an
option. Relaxing your constraints expands the space of potentially
successful solutions, and may move your solution across the line from
failure to success.

When you find yourself with a defect or potential defect in your
software, try to categorize it according to this enumeration.
Remember that you may be looking at a compound problem. If you are
looking at something that does not easily fit into any of these
categories, try to break your problem down into two or more simpler
ones. Become familiar with all of your tools, and don't neglect them.

Finally, we leave you with our motto: "Where there's one bug, there's
two." Don't stop looking because you have found a problem.
The chances are excellent that there is another one nearby.