5.8.1 LR Table Construction

For historical reasons, Bison constructs LALR(1) parser tables by default.
However, LALR does not possess the full language-recognition power of LR.
As a result, the behavior of parsers employing LALR parser tables is often
mysterious. We presented a simple example of this effect in Mysterious Conflicts.

As we also demonstrated in that example, the traditional approach to
eliminating such mysterious behavior is to restructure the grammar.
Unfortunately, doing so correctly is often difficult. Moreover, merely
discovering that LALR causes mysterious behavior in your parser can be
difficult as well.

Fortunately, Bison provides an easy way to eliminate the possibility of such
mysterious behavior altogether. You simply need to activate a more powerful
parser table construction algorithm by using the %define lr.type
directive.

Directive: %define lr.type type

Specify the type of parser tables within the LR(1) family. The accepted
values for type are:

lalr (default)

ielr

canonical-lr

(This feature is experimental. More user feedback will help to stabilize
it.)

For example, to activate IELR, you might add the following directive to your
grammar file:

%define lr.type ielr

For the example in Mysterious Conflicts, the mysterious
conflict is then eliminated, so there is no need to invest time in
comprehending the conflict or restructuring the grammar to fix it. If,
during future development, the grammar evolves such that all mysterious
behavior would have disappeared using just LALR, you need not fear that
continuing to use IELR will result in unnecessarily large parser tables.
That is, IELR generates LALR tables when LALR (using a deterministic parsing
algorithm) is sufficient to support the full language-recognition power of
LR. Thus, by enabling IELR at the start of grammar development, you can
safely and completely eliminate the need to consider LALR’s shortcomings.
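
As a sketch, the grammar from Mysterious Conflicts might be combined with
the directive as follows. The token name ID and its "id" alias are
assumptions made for illustration, and the file contains no semantic actions
or scanner, so it is only meant to be run through Bison to inspect the
conflict report:

```yacc
/* Request IELR(1) tables instead of the default LALR(1) tables. */
%define lr.type ielr

/* Assumed token declaration: a single identifier token. */
%token ID "id"

%%

def: param_spec return_spec ',';

param_spec:
  type
| name_list ':' type
;

return_spec:
  type
| name ':' type
;

type: "id";

name: "id";

name_list:
  name
| name ',' name_list
;
```

Changing ielr to lalr in the first directive should bring back the
reduce/reduce conflict between type and name that Mysterious Conflicts
discusses.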

While IELR is almost always preferable, there are circumstances where LALR
or the canonical LR parser tables described by Knuth
(see Knuth 1965) can be useful. Here we summarize the
relative advantages of each parser table construction algorithm within
Bison:

LALR

There are at least two scenarios where LALR can be worthwhile:

GLR without static conflict resolution.

When employing GLR parsers (see GLR Parsers), if you do not resolve any
conflicts statically (for example, with %left or %precedence), then
the parser explores all potential parses of any given input. In this case,
the choice of parser table construction algorithm is guaranteed not to alter
the language accepted by the parser. LALR parser tables are the smallest
parser tables Bison can currently construct, so they may then be preferable.
Nevertheless, once you begin to resolve conflicts statically, GLR behaves
more like a deterministic parser in the syntactic contexts where those
conflicts appear, and so either IELR or canonical LR can then be helpful to
avoid LALR’s mysterious behavior.
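
A minimal sketch of the declarations for this scenario follows. It assumes a
grammar whose conflicts are all left for the GLR parser to handle at run
time; the grammar rules themselves are omitted, so this is a fragment rather
than a complete grammar file:

```yacc
/* Generate a GLR parser, which splits at each unresolved conflict. */
%glr-parser

/* With no static conflict resolution, the table type should not change
   the accepted language, so the smallest tables are requested here.
   (lalr is also the default, so this line only makes the choice explicit.) */
%define lr.type lalr

/* Deliberately no %left, %right, %nonassoc, or %precedence declarations. */
```

If precedence declarations are added later, switching the lr.type value to
ielr is the corresponding one-line change.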

Malformed grammars.

Occasionally during development, an especially malformed grammar with a
major recurring flaw may severely impede the IELR or canonical LR parser
table construction algorithm. LALR can be a quick way to construct parser
tables in order to investigate such problems while ignoring the more subtle
differences from IELR and canonical LR.

IELR

IELR (Inadequacy Elimination LR) is a minimal LR algorithm. That is, given
any grammar (LR or non-LR), parsers using IELR or canonical LR parser tables
always accept exactly the same set of sentences. However, like LALR, IELR
merges parser states during parser table construction so that the number of
parser states is often an order of magnitude less than for canonical LR.
More importantly, because canonical LR’s extra parser states may contain
duplicate conflicts in the case of non-LR grammars, the number of conflicts
for IELR is often an order of magnitude less as well. This effect can
significantly reduce the complexity of developing a grammar.

Canonical LR

While inefficient, canonical LR parser tables can be an interesting means to
explore a grammar because they possess a property that IELR and LALR tables
do not. That is, if %nonassoc is not used and default reductions are
left disabled (see Default Reductions), then, for every left context of
every canonical LR state, the set of tokens accepted by that state is
guaranteed to be the exact set of tokens that is syntactically acceptable in
that left context. It might then seem that an advantage of canonical LR
parsers in production is that, under the above constraints, they are
guaranteed to detect a syntax error as soon as possible without performing
any unnecessary reductions. However, IELR parsers that use LAC are also
able to achieve this behavior without sacrificing %nonassoc or
default reductions. For details and a few caveats of LAC, see LAC.
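
As a rough sketch, the IELR-with-LAC configuration mentioned above might be
requested with the following declarations. The variable name parse.lac and
its value full are assumed to match the Bison version in use; the grammar
rules are omitted, so this is a fragment rather than a complete grammar file:

```yacc
/* IELR tables: the same accepted language as canonical LR, usually with
   far fewer parser states. */
%define lr.type ielr

/* Enable LAC (lookahead correction) so that syntax errors are detected
   before any unnecessary reductions are performed. */
%define parse.lac full

/* Alternative for exploring a grammar: replace the lr.type value with
   canonical-lr and avoid %nonassoc. */
```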