Elsa: The Elkhound-based C/C++ Parser

Elsa is a C and C++ parser built on the Elkhound parser generator.
It lexes and parses input C/C++ code into an abstract syntax tree.
It does some type checking, in the interest of elaborating the
meaning of constructs, but it does not (yet?) reject all invalid
programs.

baselexer.h,
baselexer.cc:
Intermediate Lexer abstraction, built on top of yyFlexLexer and implementing
LexerInterface (thus fitting between flex and Elkhound), but not specific to
any set of tokens. Lexer (lexer.h) builds on top of this.

cc.ast:
C/C++ Abstract Syntax Tree. This is the most important
file in the parser, since it defines the interface between
the parser and everything else that comes after it. It is
documented separately in cc.ast.html.

cc.gr:
C/C++ parsing grammar. This is the second-most important file,
as it tells Elkhound how to parse the token stream. This grammar
is based on that in the C++ Standard document, but then modified
to remove unnecessary ambiguities and improve the grammar's ability
to extract structure.

cc_elaborate.ast,
cc_elaborate.h,
cc_elaborate.cc:
This module finds implicit function calls (like constructors) and creates
an explicit representation of them. An analysis can then ignore implicit
calls and just use the constructed explicit AST.
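
As a rough illustration of what "implicit function calls" means here, the
sketch below is ordinary C++ in which no constructor or destructor call is
written out, yet four such calls happen. The Tracked class and useTracked
function are hypothetical names invented for this example; they are not part
of Elsa.

```cpp
// Hypothetical class (not from Elsa) that counts its own constructor
// and destructor invocations so the implicit calls become observable.
struct Tracked {
  static int ctorCalls;
  static int dtorCalls;
  Tracked() { ++ctorCalls; }
  Tracked(const Tracked &) { ++ctorCalls; }
  ~Tracked() { ++dtorCalls; }
};
int Tracked::ctorCalls = 0;
int Tracked::dtorCalls = 0;

// No call syntax appears in this function, but the compiler inserts
// four calls; these are the kind of call an elaboration pass would
// have to represent explicitly.
void useTracked() {
  Tracked a;      // implicit call: Tracked::Tracked()
  Tracked b = a;  // implicit call: Tracked::Tracked(const Tracked &)
}                 // implicit calls: b.~Tracked(), then a.~Tracked()
```

After one call to useTracked(), both counters hold 2, even though the source
text of the function mentions neither a constructor nor a destructor.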

cc_err.h,
cc_err.cc:
ErrorMsg, an object for representing type checking errors. For now
it's just an error string plus some metadata (like source location),
but I plan to evolve it to include more structured data like pointers
to (instead of just string representations of) the types involved in
the error.

cc_flags.h,
cc_flags.cc:
This module defines a variety of enums relevant to parsing and
type checking C++, including enums for all the built-in types,
operators, etc.

cc_lang.h,
cc_lang.cc:
CCLang, a package of language dialect options. Setting flags in
this class tells the lexer, parser and type checker what language
options to support (e.g. C vs. C++).

cc_tcheck.ast,
cc_tcheck.cc:
This is the type checker. It consists of an AST extension to
add type checking entry points and annotations, and an implementation
of all of those type checking functions. It's the most complicated
part of the parser.

cc_type.h,
cc_type.cc:
This module defines the representation of types, which form the
core of the data manipulated by the type checker. They are
documented separately in cc_type.html.

ccparse.h,
ccparse.cc:
This module defines part of the parser context class, and assists
minimally with parsing.

cfg.ast,
cfg.h,
cfg.cc:
This is a type-checking extension that computes a statement-level
control flow graph for each function.

const_eval.h,
const_eval.cc:
Constant-expression evaluator. Tries to predict the effect of
coercing data among different representation sizes, among other things.
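
To make "coercing data among different representation sizes" concrete, here
is a minimal sketch of predicting what a value becomes when narrowed to a
smaller signed integer. It assumes two's-complement truncation, and the
function name coerceToSigned is hypothetical; this is not the const_eval
interface.

```cpp
#include <cstdint>

// Hypothetical helper (not from Elsa): predict the value obtained by
// storing `value` in a signed integer that is `bits` wide, assuming
// two's-complement truncation of the high-order bits.
std::int64_t coerceToSigned(std::int64_t value, unsigned bits) {
  if (bits == 0 || bits >= 64) {
    return value;  // nothing to truncate in this sketch
  }
  // Keep only the low `bits` bits.
  std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
  // Sign-extend from bit (bits - 1) using the xor/subtract idiom.
  std::uint64_t signBit = std::uint64_t{1} << (bits - 1);
  return static_cast<std::int64_t>(low ^ signBit) -
         static_cast<std::int64_t>(signBit);
}
```

Under that assumption, coerceToSigned(300, 8) yields 44 and
coerceToSigned(200, 8) yields -56.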

generic_amb.h:
This is the generic ambiguity resolution procedure. It typechecks
all of the alternatives, and selects the one that passes. Note that
there are other ambiguity resolution procedures in use, but this is
the one used in the absence of a specialized procedure.
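
The idea of "check every alternative, keep the one that passes" can be
sketched as follows. This is not the code in generic_amb.h: the name
selectPassing, the callback-based check, and the choice to return -1 when
zero or several alternatives pass are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch (not Elsa's API): run a check on every
// alternative and return the index of the only one that passes.
// Returns -1 if none passes or if more than one passes; how the real
// procedure reports those cases is not specified here.
template <class Node, class Check>
int selectPassing(const std::vector<Node> &alternatives, Check typechecks) {
  int chosen = -1;
  for (std::size_t i = 0; i < alternatives.size(); ++i) {
    if (typechecks(alternatives[i])) {
      if (chosen != -1) {
        return -1;  // a second alternative passed: still ambiguous
      }
      chosen = static_cast<int>(i);
    }
  }
  return chosen;
}
```

Every alternative is checked before one is chosen, rather than stopping at
the first success, so an input on which two readings both pass is detected
instead of being resolved silently.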

generic_aux.h:
Some routines for printing and modifying AST nodes that have
ambiguity pointers.

gnu.lex,
gnu_ext.tok,
gnu.gr,
gnu.ast,
gnu.cc:
These files make up the "gnu" extension module, though in truth it contains
extensions for both gcc and C99. See gnu.gr for a complete
list of the extensions implemented.

implconv.h,
implconv.cc:
This module represents and computes implicit conversions, as defined
in sections 13.3.3.1 and 13.3.3.2 of the C++ standard.

implint.h,
implint.cc:
Support routines, including ambiguity resolution, for the implicit-int
K&R extension.

main.cc:
This module contains the main() function of the parser. It's a simple
driver around the other modules. The nominal intent is that people who
want to use parts of Elsa in their own projects will copy and modify
this file as necessary.

mangle.h,
mangle.cc:
This is a very rudimentary name mangler. It is a somewhat arbitrary injective
map from Types to character strings, for use by the Oink linker imitator
(identifying declarations of the same entity from different translation units).
It does not implement any standard mangling scheme.
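
A toy version of such an injective map might look like the sketch below. The
Ty structure, the mangle function, and the one-letter codes are invented for
illustration; they are neither Elsa's Type classes nor the encoding that
mangle.cc produces.

```cpp
#include <memory>
#include <string>

// Hypothetical miniature type representation (not Elsa's cc_type).
struct Ty {
  enum Kind { Int, Char, Pointer } kind;
  std::shared_ptr<Ty> pointee;  // set only when kind == Pointer
};

// Map each type to a distinct string. Every kind gets its own leading
// letter and a pointer is followed by the encoding of its pointee, so
// two different types never produce the same string.
std::string mangle(const Ty &t) {
  switch (t.kind) {
    case Ty::Int:     return "i";
    case Ty::Char:    return "c";
    case Ty::Pointer: return "P" + mangle(*t.pointee);
  }
  return "?";  // unreachable for well-formed input
}
```

With this encoding, pointer-to-int becomes "Pi" and pointer-to-pointer-to-char
becomes "PPc". Two declarations from different translation units can then be
compared by comparing their strings.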

matchtype.h,
matchtype.cc:
Type matching in the presence of type variables corresponding to template
parameters; sort of a generalized Type::equals.
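
To show what "a generalized equals" could mean, here is a small sketch in
which a pattern type may contain type variables that get bound during
matching. TypeNode, Bindings, structurallyEqual and matchType are
hypothetical names for this example only; the real matchtype interface
differs.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical miniature type representation (not Elsa's cc_type).
struct TypeNode {
  enum Kind { Atomic, Pointer, TypeVar } kind;
  std::string name;                   // atomic or type-variable name
  std::shared_ptr<TypeNode> pointee;  // set only when kind == Pointer
};

// Type-variable name -> the concrete type it has been bound to.
using Bindings = std::map<std::string, const TypeNode *>;

// Plain structural equality, with no variable binding.
bool structurallyEqual(const TypeNode &a, const TypeNode &b) {
  if (a.kind != b.kind) return false;
  if (a.kind == TypeNode::Pointer) {
    return structurallyEqual(*a.pointee, *b.pointee);
  }
  return a.name == b.name;
}

// Like equality, except that a type variable in `pattern` matches any
// concrete type, provided all of its occurrences match the same one.
bool matchType(const TypeNode &concrete, const TypeNode &pattern,
               Bindings &bindings) {
  if (pattern.kind == TypeNode::TypeVar) {
    auto it = bindings.find(pattern.name);
    if (it == bindings.end()) {
      bindings[pattern.name] = &concrete;  // first occurrence: bind it
      return true;
    }
    return structurallyEqual(*it->second, concrete);
  }
  if (pattern.kind != concrete.kind) return false;
  if (pattern.kind == TypeNode::Pointer) {
    return matchType(*concrete.pointee, *pattern.pointee, bindings);
  }
  return pattern.name == concrete.name;
}
```

Matching the concrete type "pointer to int" against the pattern "pointer to
T" succeeds and leaves T bound to int; matching plain int against the same
pattern fails.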

serialno.h,
serialno.cc:
This is a simple module that can be used to attach object creation
serial numbers when an appropriate compile-time switch is used. This
is sometimes more convenient than working with virtual addresses,
while debugging.
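
One way to get creation serial numbers behind a compile-time switch is
sketched below. The macro name USE_SERIAL_NUMBERS and the classes SerialBase
and DemoNode are assumptions made for this example and may not match what
serialno.h actually declares.

```cpp
// Hypothetical switch; build with -DUSE_SERIAL_NUMBERS=0 to compile
// the numbering out and leave SerialBase empty.
#ifndef USE_SERIAL_NUMBERS
#define USE_SERIAL_NUMBERS 1
#endif

// Base class that stamps each object with the order of its creation.
class SerialBase {
#if USE_SERIAL_NUMBERS
public:
  const int serialNumber;
  SerialBase() : serialNumber(nextSerial()++) {}
  // A copy is a newly created object, so it gets a fresh number.
  SerialBase(const SerialBase &) : serialNumber(nextSerial()++) {}
  // Assignment keeps the number of the object being assigned to.
  SerialBase &operator=(const SerialBase &) { return *this; }

private:
  static int &nextSerial() {
    static int next = 0;  // shared counter, not thread-safe
    return next;
  }
#endif
};

// Hypothetical client class: deriving from SerialBase is all it takes.
struct DemoNode : SerialBase {};
```

Two DemoNode objects created one after the other get consecutive serial
numbers, which stay the same from run to run as long as the program creates
objects in the same order, unlike virtual addresses.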

typelistiter.h,
typelistiter.cc:
Generic interface, plus a couple of implementations, for iterating
over sequences and examining their stored types.

variable.h,
variable.cc:
Variable, a class for holding information about names in the
"variable" namespace. See variable.h for a list of the kinds
of things that get represented with Variables. This module
is closely related to cc_type.

include:
When preprocessing, add this directory to the preprocessor's
search path. It contains compiler-specific headers. Generally
I just use gcc's headers, but some of gcc's headers use syntax
that Elsa doesn't (yet?) understand, so this directory contains
my replacements.