Templates can be a complex and intimidating feature. I know I’m never
comfortable with a feature until I know how it compiles and works
down to the bottom level. Any complex process can be broken down into
simpler, lower level steps that can be mastered and then understanding
built up from there. Rather than get into a long discussion of the arcana
of templates, I’ll show the step—by—step way they’re compiled. Often,
much of the strangeness of the arcana becomes clearer once the compilation
process makes sense.

Given the simple template program that computes a factorial at compile time:

The source code is preprocessed and lexed (converted to a stream of
tokens). We’ll skip past that as it’s standard compiler technology and
not peculiar to templates.

The two factorial templates are parsed. Parsing means a data structure
is created that represents the templates. Our two factorial templates are
installed into the symbol table, and since they have the same name, they
are often stored in a list hanging off the ‘factorial’ name.

The bodies (the stuff between the { }’s) of the templates are
stored as ASTs
Abstract Syntax Trees.
Sometimes, as in the Digital Mars C++ compiler, they are stored
as lists of tokens.

The function foo() is semantically analyzed, meaning that the AST
is evaluated, typechecked, symbols defined, etc. The factorial<10> is
recognized as an instantiation of a template named factorial with an
argument that is a literal integer 10.

factorial is looked up in the symbol table, and two factorial template
declarations are found. The compiler must determine which of those templates to
instantiate.

The template instantiation arguments, in this case the literal 10,
is applied to each template declaration. The list of template declarations
that match are called the ‘candidate list’. In this case, the 10 matches
the int n parameter of the first factorial template, and does not match
the 1 parameter of the second. So there is only one candidate, and that
is selected.

If there are more than one candidate templates, the compiler must
determine the so—called best match. This is a lot like function overloading.
The technique used for finding the best match for both C++ and the D
programming language is called
partial ordering.

Now we have the specific template declaration that needs to be instantiated
and the actual arguments to instantiate it with. But there is a rule in C++
and D that a template with a specific set of arguments can only have one
instantiation for the whole program. This means that if the template has
already been instantiated with the same arguments, we can simply point to
that instantiation and we’re done.

Finding an existing instantiation is done by “mangling” the template name,
namespace, and arguments
into a string. For factorial<10>, this mangled name looks like (for g++)
9factorialILi10E. (This can be read as a “9 character name factorial
with a literal integer 10 argument followed by the end of the arguments.”)
None of the template body is needed for this. This string will be unique
in the program.

Hung off of the template declaration in the symbol table is another table
with those mangled template instantiation names as indices to the corresponding
template instantiations.

If the template instantiation isn’t found, then it needs to be instantiated.

A fresh symbol table scope is created. The template declaration parameters
are declared in it as symbols initialized to the actual argument values.
Type parameters are declared as typedefs (aliases in D), integer parameters
are declared as constants. The template declaration body is then parsed (if
not already an AST), and semantically analyzed by the regular other bits
of the compiler that deal with functions, classes, and declarations.

A reference to the new instantiation is also installed into the template
declaration’s table of instantiations.

The declarations created by the template instantiation are now treated
as regular types and symbols by the rest of the compiler, which doesn’t
know or care they came from templates.

The astute reader will wonder what happens when separate compilation
units generate the same template instantiations with the same arguments.
Wasn’t there a rule that says only one unique instantiation per program?
You’re quite right, and the solution is to be found in that the template
instantiations are identified in the object file as “COMDATs”, short for
common data blocks, a delightful
term which instructs the linker to remove the duplicates.
(Early C++ compilers which didn’t have the luxury of COMDAT—supporting
linkers generally had a kludgily awful time with this problem.)

I hope this has shed at least a little light on how templates work.
If you want to learn more about how real compilers work, I am hosting
a seminar in the fall on compiler construction.

Acknowledgements

Thanks to Eric Niebler, Andrei Alexandrescu and Jason House for review a draft of this.