The Spirit Parser Library: Inline Parsing in C++

By Joel de Guzman and Dan Nuffer, September 01, 2003

Powerful parsing made easy via modern template techniques.

Composites

On its own, the chlit class might not seem very useful. Composed into a
hierarchy, however, trivial parser classes can produce complex and elaborate
parsing behavior (see the alternative class in Listing Three for an example).

The alternative class is a composite parser with two sub-parsers, left
and right. Its parse member function succeeds if either of its subjects succeeds.
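
As a minimal sketch of that idea (not the article's Listing Three), the code below pairs a chlit-like primitive with an alternative composite. The parse signature, which takes a const char* cursor by reference and returns bool, is an assumption made for brevity; the names alternative_example and a_or_b are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the article's chlit: matches one specific character.
struct chlit {
    char ch;
    explicit chlit(char c) : ch(c) {}

    // On success, advances `first` past the matched character.
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == ch) { ++first; return true; }
        return false;
    }
};

// Composite with two subjects; succeeds if either one matches.
template <typename Left, typename Right>
struct alternative {
    Left left;
    Right right;
    alternative(const Left& l, const Right& r) : left(l), right(r) {}

    bool parse(const char*& first, const char* last) const {
        const char* save = first;            // remember the start position
        if (left.parse(first, last)) return true;
        first = save;                        // rewind before trying the right subject
        return right.parse(first, last);
    }
};

// Usage: match 'a' or 'b' at the front of "bx".
inline void alternative_example() {
    const char text[] = "bx";
    const char* it = text;
    const char* end = text + 2;
    alternative<chlit, chlit> a_or_b(chlit('a'), chlit('b'));
    assert(a_or_b.parse(it, end));    // 'b' matches through the right subject
    assert(it == text + 1);           // exactly one character was consumed
    assert(!a_or_b.parse(it, end));   // 'x' matches neither subject
}
```

The rewind before the second attempt matters once the left subject is itself a composite that may consume input before failing.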

Through object composition alone, we can build parsers of arbitrary complexity.
The library includes a set of predefined primitives (like chlit) as
well as composites (such as alternative). The primitives include parsers for
literal strings, ranges, and numerics. The composites include alternatives,
sequences, intersections, and differences. These parsers can be combined into
a hierarchy in a multitude of ways.
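
To give a feel for two more of these building blocks, here is a rough sketch of a range primitive and a difference composite. Both are illustrative assumptions rather than the library's code. In particular, this difference simply fails whenever its right subject matches at the current position, which is a simplification of what a full implementation would do.

```cpp
#include <cassert>

// Primitive: matches one specific character.
struct chlit {
    char ch;
    explicit chlit(char c) : ch(c) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == ch) { ++first; return true; }
        return false;
    }
};

// Primitive: matches any single character in the closed interval [lo, hi].
struct range {
    char lo, hi;
    range(char l, char h) : lo(l), hi(h) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && lo <= *first && *first <= hi) { ++first; return true; }
        return false;
    }
};

// Composite (simplified): matches what `left` matches, unless `right`
// also matches at the same position.
template <typename Left, typename Right>
struct difference {
    Left left;
    Right right;
    difference(const Left& l, const Right& r) : left(l), right(r) {}

    bool parse(const char*& first, const char* last) const {
        const char* probe = first;                   // test `right` on a scratch cursor
        if (right.parse(probe, last)) return false;  // excluded input; `first` is untouched
        return left.parse(first, last);
    }
};

// Usage: any lowercase letter except 'x'.
inline void difference_example() {
    const char text[] = "ax";
    const char* it = text;
    const char* end = text + 2;
    difference<range, chlit> not_x(range('a', 'z'), chlit('x'));
    assert(not_x.parse(it, end));     // 'a' is in range and is not 'x'
    assert(it == text + 1);
    assert(!not_x.parse(it, end));    // 'x' is excluded
    assert(it == text + 1);           // the cursor did not move on failure
}
```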

Operators

Primitives and composites can be composed to form more complex composites using
operator overloading, coded in a way that mimics EBNF. Composing an alternative,
for example, requires two operands and is accomplished by overloading the
| operator.
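
A minimal sketch of such an overload follows. It assumes a CRTP-style parser base class so that the operator only applies to parser types; the derived() helper, the simplified parse signature, and the example names are assumptions for illustration.

```cpp
#include <cassert>

// Base class (assumed here) that tags a type as a parser and recovers the derived type.
template <typename Derived>
struct parser {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

struct chlit : parser<chlit> {
    char ch;
    explicit chlit(char c) : ch(c) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == ch) { ++first; return true; }
        return false;
    }
};

template <typename Left, typename Right>
struct alternative : parser<alternative<Left, Right>> {
    Left left;
    Right right;
    alternative(const Left& l, const Right& r) : left(l), right(r) {}
    bool parse(const char*& first, const char* last) const {
        const char* save = first;
        if (left.parse(first, last)) return true;
        first = save;
        return right.parse(first, last);
    }
};

// a | b builds an alternative<A, B>; only types derived from parser<> take part.
template <typename A, typename B>
alternative<A, B> operator|(const parser<A>& a, const parser<B>& b) {
    return alternative<A, B>(a.derived(), b.derived());
}

inline void alternative_operator_example() {
    // The expression's type is alternative<alternative<chlit, chlit>, chlit>.
    auto vowel = chlit('a') | chlit('e') | chlit('i');
    const char text[] = "ix";
    const char* it = text;
    const char* end = text + 2;
    assert(vowel.parse(it, end));     // 'i' matches the last branch
    assert(it == text + 1);
    assert(!vowel.parse(it, end));    // 'x' matches no branch
}
```

Because the operator returns a new composite type, chained expressions nest at compile time and the whole grammar is encoded in the type of the resulting object.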

The sequence class is very similar to alternative, but it has
a slight twist in the implementation of its parse member function. sequence
matches both its subjects (left and right) in strict sequence.
Like alternative, sequence has a corresponding overloaded operator:
the >> operator.
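
The sketch below shows one way sequence and its operator could look under the same simplified assumptions (a bool-returning parse over a const char* cursor and a CRTP parser base). It is not the article's listing. The twist relative to alternative is that the right subject runs only after the left one succeeds, and a partial match is undone.

```cpp
#include <cassert>

template <typename Derived>
struct parser {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

struct chlit : parser<chlit> {
    char ch;
    explicit chlit(char c) : ch(c) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == ch) { ++first; return true; }
        return false;
    }
};

// Matches `left` and then `right`, starting where `left` stopped.
template <typename Left, typename Right>
struct sequence : parser<sequence<Left, Right>> {
    Left left;
    Right right;
    sequence(const Left& l, const Right& r) : left(l), right(r) {}
    bool parse(const char*& first, const char* last) const {
        const char* save = first;
        if (left.parse(first, last) && right.parse(first, last)) return true;
        first = save;    // undo any input consumed by a partial match
        return false;
    }
};

// a >> b builds a sequence<A, B>.
template <typename A, typename B>
sequence<A, B> operator>>(const parser<A>& a, const parser<B>& b) {
    return sequence<A, B>(a.derived(), b.derived());
}

inline void sequence_example() {
    auto ab = chlit('a') >> chlit('b');

    const char good[] = "ab";
    const char* it = good;
    assert(ab.parse(it, good + 2));
    assert(it == good + 2);           // both characters were consumed

    const char bad[] = "ac";
    it = bad;
    assert(!ab.parse(it, bad + 2));
    assert(it == bad);                // the cursor was restored after 'a' matched
}
```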

Unary and Binary Composites

So far, we have just tackled binary composites such as alternative and
sequence. Unary composites work similarly to binary composites but with
only a single subject. The Kleene star, used for zero or more iterations, and
the positive closure, used for one or more iterations, are examples of unary composites. Listing Five shows the Kleene star class implementation.
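
A Kleene star composite might be sketched as follows. This is an illustration under the same simplified parse signature, not the article's Listing Five. The guard against a subject that succeeds without consuming input is an addition of this sketch to avoid an endless loop.

```cpp
#include <cassert>

template <typename Derived>
struct parser {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

struct chlit : parser<chlit> {
    char ch;
    explicit chlit(char c) : ch(c) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == ch) { ++first; return true; }
        return false;
    }
};

// Unary composite: applies its single subject zero or more times.
template <typename Subject>
struct kleene_star : parser<kleene_star<Subject>> {
    Subject subject;
    explicit kleene_star(const Subject& s) : subject(s) {}
    bool parse(const char*& first, const char* last) const {
        for (;;) {
            const char* save = first;
            // Stop when the subject fails or makes no progress.
            if (!subject.parse(first, last) || first == save) {
                first = save;
                break;
            }
        }
        return true;    // zero iterations still count as a match
    }
};

// *a builds a kleene_star<A>.
template <typename A>
kleene_star<A> operator*(const parser<A>& a) {
    return kleene_star<A>(a.derived());
}

inline void kleene_star_example() {
    auto many_a = *chlit('a');

    const char text[] = "aaab";
    const char* it = text;
    assert(many_a.parse(it, text + 4));
    assert(it == text + 3);           // stopped in front of 'b'

    const char none[] = "b";
    it = none;
    assert(many_a.parse(it, none + 1));
    assert(it == none);               // matched zero times, consumed nothing
}
```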

As should be clear by now, binary composites inherit from the binary class,
which does the work of storing copies of their left and right subjects. In the
same vein, unary composites inherit from the unary class, which is similar to
the binary class but stores only one subject. Most fundamentally, all parsers,
whether primitive or composite, inherit from the base parser class.
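
The inheritance structure described above can be sketched roughly as follows. The exact way the library wires binary, unary, and parser together is not shown in this text, so the use of multiple inheritance here is an assumption, as is the simplified parse signature.

```cpp
#include <cassert>
#include <type_traits>

// Base of every parser, primitive or composite.
template <typename Derived>
struct parser {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

// Stores the single subject of a unary composite.
template <typename Subject>
struct unary {
    Subject subject;
    explicit unary(const Subject& s) : subject(s) {}
};

// Stores copies of the two subjects of a binary composite.
template <typename Left, typename Right>
struct binary {
    Left left;
    Right right;
    binary(const Left& l, const Right& r) : left(l), right(r) {}
};

struct chlit : parser<chlit> {
    char ch;
    explicit chlit(char c) : ch(c) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && *first == ch) { ++first; return true; }
        return false;
    }
};

template <typename Left, typename Right>
struct sequence : binary<Left, Right>, parser<sequence<Left, Right>> {
    sequence(const Left& l, const Right& r) : binary<Left, Right>(l, r) {}
    bool parse(const char*& first, const char* last) const {
        const char* save = first;
        // this-> is required because left and right live in a dependent base class.
        if (this->left.parse(first, last) && this->right.parse(first, last)) return true;
        first = save;
        return false;
    }
};

template <typename Subject>
struct kleene_star : unary<Subject>, parser<kleene_star<Subject>> {
    explicit kleene_star(const Subject& s) : unary<Subject>(s) {}
    bool parse(const char*& first, const char* last) const {
        for (;;) {
            const char* save = first;
            if (!this->subject.parse(first, last) || first == save) { first = save; break; }
        }
        return true;
    }
};

inline void hierarchy_example() {
    typedef sequence<chlit, chlit> ab_type;
    static_assert(std::is_base_of<binary<chlit, chlit>, ab_type>::value,
                  "a binary composite derives from binary");
    static_assert(std::is_base_of<unary<ab_type>, kleene_star<ab_type>>::value,
                  "a unary composite derives from unary");
    static_assert(std::is_base_of<parser<chlit>, chlit>::value,
                  "a primitive derives from parser");

    ab_type ab(chlit('a'), chlit('b'));
    kleene_star<ab_type> many_ab(ab);
    const char text[] = "ababa";
    const char* it = text;
    assert(many_ab.parse(it, text + 5));
    assert(it == text + 4);           // the trailing lone 'a' is left unconsumed
}
```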

All entities in the library are variations of the examples presented. There
are entities that deal with set-like operations, iteration (Kleene star
and its brethren), semantic actions (hooks to external C/C++ code), directives
(unary composites that modify the behavior of their subjects), parser assertions,
and error handlers.
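
As one illustration of these variations, a semantic action can be sketched as yet another composite that wraps a subject and calls external code with the matched span. The action class, the operator[] on the parser base, and the (begin, end) callback shape below are assumptions made for this sketch, not code taken from the library.

```cpp
#include <cassert>
#include <string>

template <typename Subject, typename Action> struct action;  // defined below

template <typename Derived>
struct parser {
    const Derived& derived() const { return static_cast<const Derived&>(*this); }

    // p[f] wraps p so that f(begin, end) runs on each successful match.
    template <typename Action>
    action<Derived, Action> operator[](const Action& actor) const {
        return action<Derived, Action>(derived(), actor);
    }
};

struct range : parser<range> {
    char lo, hi;
    range(char l, char h) : lo(l), hi(h) {}
    bool parse(const char*& first, const char* last) const {
        if (first != last && lo <= *first && *first <= hi) { ++first; return true; }
        return false;
    }
};

template <typename Subject>
struct kleene_star : parser<kleene_star<Subject>> {
    Subject subject;
    explicit kleene_star(const Subject& s) : subject(s) {}
    bool parse(const char*& first, const char* last) const {
        for (;;) {
            const char* save = first;
            if (!subject.parse(first, last) || first == save) { first = save; break; }
        }
        return true;
    }
};

template <typename Subject, typename Action>
struct action : parser<action<Subject, Action>> {
    Subject subject;
    Action actor;
    action(const Subject& s, const Action& a) : subject(s), actor(a) {}
    bool parse(const char*& first, const char* last) const {
        const char* begin = first;
        if (!subject.parse(first, last)) return false;
        actor(begin, first);    // hand the matched span to external code
        return true;
    }
};

inline void action_example() {
    std::string captured;
    auto store = [&captured](const char* b, const char* e) { captured.assign(b, e); };

    range digit('0', '9');
    auto digits = kleene_star<range>(digit)[store];

    const char text[] = "2003!";
    const char* it = text;
    assert(digits.parse(it, text + 5));
    assert(captured == "2003");       // the action saw exactly the matched digits
    assert(it == text + 4);
}
```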
