Parsing with Active Patterns in F# : Page 4

Discover how to use F#'s active patterns to build prototype parsers that you can test interactively.

by Adam Granicz

Jun 18, 2009

Page 4 of 4

Parsing with Active Patterns: Building Up Rules

Now you have all the primitives to start building your parsers. Consider a simple language consisting of arithmetic expressions such as those you saw in the first part of this article. In that language (for which you only defined the abstract syntax earlier) you had numbers and the four basic arithmetic operators. This time you want to build a parser for this language, but also extend it to include floats (you just make all numbers floats) and function calls. A sample input in this language might be the string 1+cos(2+3)/sin(4*5).

This expression language calls for a slightly updated AST, so you need to change the earlier definition a bit (note that you are defining this as part of a new Ast module):

The parser (continuing the Language module you defined earlier) needs active patterns to recognize numbers (floats), identifiers (function names that you are calling), and various tokens (parentheses and operators):

Using these you can express the essential part of your grammar: expressions. To get around operator precedence you can simply introduce a hierarchy of active patterns that parse parts of an expression in decreasing precedence as is commonly done in recursive descent (LL) parsers.

Note that in this language the arguments passed to a function call are not separated, so you can pass arbitrary number of arguments to functions by separating them simply with whitespace. If you want to add your own separator symbol, you can build another rule that parses expressions followed by that separator symbol, and have the Star active pattern refer to that rule.

Because the rule that you first match successfully preempts all others, you need to be careful to list those rules that are recursive first—which inherently implies that you cannot start a match case with the same active pattern that you are defining; otherwise, you wind up in an infinite recursion. To get around that issue, you can make all binary operators associate to the right, so 1+2+3 will actually reduce to 1+(2+3).

Finally, a useful active pattern is Eof, which recognizes the end of the input string. However, rather than simply checking for an empty remaining input string, you should also check whether the only remaining input is whitespace (recall that whitespace can contain comments also).

You can now patch up the evaluator you saw in the first part of this article to evaluate this new expression type by adding the evaluation of function calls (here you add only the sine and cosine functions, both of which work on a single argument):

You've seen how to define active patterns that convert and partition values, via an example that used partial active patterns to implement a parser for arithmetic expressions with function calls. Effectively, you've seen how to translate the boring and error-prone task of implementing grammars with parser generators into a typed discipline that closely matches your parsing rules expressed in BNF syntax. Using these techniques you are better equipped to quickly prototype and implement your own parsers, define your own domain-specific languages, and work with your own concrete syntax for your data.

Adam Granicz is the CEO of IntelliFactory, the primary place for F# development, training and consulting services, and the coauthor of Expert F#, the most comprehensive guide on F# coauthored with Don Syme, the designer of the language. He has done research in functional programming languages and holds a Masters degree from the California Institute of Technology.