Get your size-12s off my Daffodils!

The problem is that BaRBSoftStuff.h, at the point of its inclusion,
introduces the namespace BaRBSoft and the function
BaRBSoft::TheFunc(). That's the correct thing.
Unfortunately, when AcmeThreadingStuff.h is subsequently included, it
defines TheFunc to be TheFuncST
(or TheFuncMT for multithreaded builds) for the
remainder of the compilation unit. So where you see
BaRBSoft::TheFunc() in the body of
main(), the compiler actually sees
BaRBSoft::TheFuncST(). Not happy, Bjarne! (You won't
have to study much of Bjarne's writings to discover his antipathy to
macros, as in [2, 3, 4]. Where the
master leads, so shall we happy grasshoppers follow ...)

You might wonder whether this can be fixed by reversing
the order of inclusion. Alas, that just shifts the
problem.

Now the compiler is perfectly happy, but the linker gets
the hump. The reason is that the declaration of
BaRBSoft::TheFunc() inside
BaRBSoftStuff.h is translated by the preprocessor to
BaRBSoft::TheFuncST(). The same thing
happens, as before, in the body of
main(), so the compiler sees both the
declaration and the use of the same symbol. However,
because BaRBSoft are jealous guarders of their
intellectual property, and supplied only a static
library, containing
BaRBSoft::TheFunc(), against which to
link, the linker fails to find
BaRBSoft::TheFuncST().

So, whichever way you cut it, the
#define of
TheFunc() in AcmeThreadingStuff.h has
trampled over our code, and broken it.

(For further reading on this issue, or on many other
important ones, we think it's worth pointing you to the
latest in Herb Sutter's excellent Exceptional
C++ series, Exceptional C++ Style [5]. Item 31 explains the
problem.)

War Story

Several years ago, Matthew worked for a software company writing
cross-platform software for network administration and statistical
gathering. The software used its own messaging system, and one of the
methods in the messaging API was called GetMessage().
It all worked tickety-boo. Then they had to port their nice working
system to Windows.

I'm sure you can guess the rest. Lots of compiler and
linker errors complaining that
SuperDuperNetworkMgr::GetMessageA()
could not be found. No doubt many of you are groaning in
recognition of the problem, and have experienced first
hand the Windows headers #definition of
GetMessage to either GetMessageA or
GetMessageW, among myriad similar. Needless to say, this
didn't endear the development team's Tandem/UNIX-heads to
Windows.

They weren't in a position to sit back and pontificate on
the abstract problem. A solution had to be found, and
fast. The choices in this case were all unpleasant:

1. Compile the entire Windows version of the system in the
presence of the Windows headers. Those of you who are
familiar with this notion can imagine the
deleterious effect on build times.

2. Put #defines in their root headers, for Windows builds
only, to emulate the perversions of the function names
done by the Windows headers in those compilation units
that include them.

3. Create a header to be included by all Windows-specific
compilation units, which #included windows.h and then
immediately added the requisite #undefs to render the
system "whole" again.

For reasons of both speed and "purity of soul", Option 1
was ruled out. Option 3 was the one selected, but the
team subsequently "evolved" to Option 2.

You might think that, ugly as it is, this problem is at
least discoverable at compile/link time. For the
networking product at that stage of its development, that
was so, and any of the three options above would yield
"correctness", once compile and link stages were complete
and error free. But consider what happens if you're using
dynamic libraries, and are loading functions explicitly
by name, via dlopen()/dlsym() (UNIX)
or LoadLibrary()/GetProcAddress()
(Windows). Just because the preprocessor will merrily
change your GetMessage() to
GetMessageA() does not mean it will
also examine your string literals and do the same thing.
Hence, you can have lurking problems in a code-base that
was thoroughly tested and working on another operating
environment, and such lurkers can be extremely hard to
find. That is the case for any of the three options. (The
only times such problems become easy to find are when
you're doing a demonstration for your boss the day before
he does your salary review, or when you've shipped the
product to a client that has placed exacting downtime
fines on your company. :-)

Can good C-itizens still get caught?

Clearly this problem is composed of two aspects, which
combine to give the killer effect. There's the need to
map one name to another, and also the potentially wider
(than intended) name correspondence on which the mapping
may act. In principle, if either of these can be
obviated, the problem goes away.

In C, the macro-preprocessor is all we have, and there's
no alternative for providing the name mapping, so good
authors of C libraries attempt to address the second
aspect, the name correspondence. This is usually
addressed by prefixing the names with an appropriately
unique symbol, to give "safe(r) macros". For example,
Matthew's recls library [6] (implemented in C++,
but presenting a C API) uses the prefix
Recls_, as in
Recls_CalcDirectorySize(). While not
being a theoretical guarantee, this technique usually
suffices in practice.

A Better Approach for C++

One of the basic tenets of C++, as espoused by Bjarne Stroustrup himself
[7], is that the preprocessor should
be, at worst, relegated to the bench, and only brought onto the pitch when
facing a particularly feisty opponent. Maybe we can follow that intent a
little in this case?

Many years ago, Matthew used his one-good-idea-per-year quota and
applied some common sense to the problem. As many of you will know, C++
compilers are required to define the preprocessor symbol
__cplusplus when processing a C++ compilation unit; in
other words, when compiling a C++ source file. We can leverage this just
as readily as we can the presence of UNICODE, or
ACMELIB_MULTI_THREADING, or any other symbol, in order
to know when we're in C or in C++. Remember, in C we must accept the
status quo and merrily trample away. However, in C++ we have a better
alternative to macros, however unique we've attempted to make them: namespaces
and inline functions.

(Note: C99 defines the inline keyword for C code,
and some pre-C99 compilers offer proprietary extensions that do the same thing, so
it's possible to take the C++ approach for C, as long as your compiler
supports it.)

Let's look at how this might work in practice, by
rewriting our AcmeThreadingStuff.h header:

Now, in C++ compilation, there is no TheFunc preprocessor
symbol definition, there is only the bona fide function
TheFunc(). This means that
TheFunc() no longer trespasses over
other namespaces. In our mixed (AcmeLib + BaRBSoft)
example, the symbol TheFunc from the
BaRBSoft namespace is now thoroughly
unaffected by the definition of the AcmeLib version in
the global namespace.

Indeed, a future evolution of the BaRBSoft library might
give its own TheFunc a similarly conditional definition,
perhaps according to the
ambient character encoding, as follows:

Because we've used inline functions, rather than macros,
the name mapping in the BaRBSoft namespace does not leak
out and pollute any other namespace, including the global
namespace within which AcmeLib's
TheFunc is defined. Now we can kiss
goodbye to compile errors and missing symbols.