Writing a small parser / interpreter (Part 2: Parser)

In the previous step, we looked at how to write a lexical scanner for a small subset of the LOGO programming language.

Our parser relies heavily on the fact that we’re parsing a context-free grammar (CFG) that is free from left recursion (no nonterminal appears as the first symbol on the right-hand side of one of its own productions) and from common prefixes (no two productions for the same nonterminal begin with the same symbols). These properties enable us to write a special kind of parser, called an LL(1) parser.

The first “L” means that the input is read from left to right. The second “L” signifies that we’re constructing a leftmost derivation (a derivation is the sequence of production rules applied to produce the input, and a parse tree records it), but this is unimportant in our case, since the LOGO subset grammar yields only one parse tree for any given input (you could try writing down derivations, if in doubt).

The “(1)” part tells us that it is sufficient to look one symbol ahead to be able to predict the next production rule.
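To make the one-symbol lookahead concrete, here is a minimal sketch in Python. The grammar fragment, the token names (FORWARD, RIGHT, REPEAT, NUMBER, LBRACKET, RBRACKET) and the `predict` helper are assumptions made for illustration; they are not necessarily the grammar or the names used in this series.

```python
# Hypothetical grammar fragment (an assumption for illustration):
#   statement -> FORWARD NUMBER
#              | RIGHT NUMBER
#              | REPEAT NUMBER LBRACKET program RBRACKET
#
# Every alternative starts with a different token, so the type of the
# next token alone tells us which production to apply.
PREDICT = {
    "FORWARD": ["FORWARD", "NUMBER"],
    "RIGHT": ["RIGHT", "NUMBER"],
    "REPEAT": ["REPEAT", "NUMBER", "LBRACKET", "program", "RBRACKET"],
}


def predict(lookahead):
    """Return the production body selected by a single lookahead token type."""
    try:
        return PREDICT[lookahead]
    except KeyError:
        raise SyntaxError(f"no 'statement' production starts with {lookahead}")
```

Because the keys of the table are distinct, the choice never requires backtracking. If two alternatives began with the same token, a single lookahead symbol would no longer be enough to decide between them.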

Code

The trick is to write one method per nonterminal symbol, where the method body covers each possible production for that nonterminal. A method either calls itself recursively or “descends” into the method of another nonterminal, and we match tokens as we go along. This technique is known as recursive descent.
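The idea can be sketched roughly as follows, in Python. Everything here is an assumption for illustration: the `Parser` class, the token format (pairs of a type and a value), the token names, and the small grammar in the comments may all differ from the scanner and grammar used in this series.

```python
class Parser:
    """Recursive-descent sketch for a hypothetical LOGO subset.

    Assumed grammar (not necessarily the one from this series):
      program   -> statement program | (empty)
      statement -> FORWARD NUMBER
                 | RIGHT NUMBER
                 | REPEAT NUMBER LBRACKET program RBRACKET
    """

    def __init__(self, tokens):
        # Tokens are assumed to be (type, value) pairs; an EOF marker is appended.
        self.tokens = list(tokens) + [("EOF", None)]
        self.pos = 0

    def peek(self):
        """Return the type of the lookahead token without consuming it."""
        return self.tokens[self.pos][0]

    def match(self, expected):
        """Consume the lookahead token if it has the expected type."""
        kind, value = self.tokens[self.pos]
        if kind != expected:
            raise SyntaxError(f"expected {expected}, got {kind}")
        self.pos += 1
        return value

    def program(self):
        # The right-recursive rule is written as a loop: keep parsing
        # statements while the lookahead can start one.
        statements = []
        while self.peek() in ("FORWARD", "RIGHT", "REPEAT"):
            statements.append(self.statement())
        return statements

    def statement(self):
        kind = self.peek()
        if kind in ("FORWARD", "RIGHT"):
            self.match(kind)
            return (kind, self.match("NUMBER"))
        if kind == "REPEAT":
            self.match("REPEAT")
            count = self.match("NUMBER")
            self.match("LBRACKET")
            body = self.program()  # descend into the nested program
            self.match("RBRACKET")
            return ("REPEAT", count, body)
        raise SyntaxError(f"unexpected token {kind}")

    def parse(self):
        tree = self.program()
        self.match("EOF")  # reject trailing input
        return tree


# Tokens for: REPEAT 4 [ FORWARD 50 RIGHT 90 ]
tokens = [
    ("REPEAT", None), ("NUMBER", 4), ("LBRACKET", None),
    ("FORWARD", None), ("NUMBER", 50),
    ("RIGHT", None), ("NUMBER", 90),
    ("RBRACKET", None),
]
tree = Parser(tokens).parse()
# tree == [("REPEAT", 4, [("FORWARD", 50), ("RIGHT", 90)])]
```

Each method mirrors one nonterminal, and `peek` supplies the single symbol of lookahead that picks the branch. The nested tuples are just one possible result shape; a real parser might build tree nodes or execute the commands directly.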