We will define two evaluation mechanisms, one will reduce one redex at a time, with the index of the redex passed in to the function, the other will just reduce until a normal form is reached, if there is one.

It’s convenient to use option types to indicate if a subexpression has been reduced, and it’s even more convenient to define some monadic-style helper functions:

Here’s the single redex reducer: try to reduce each subexpression, keeping track of redex indexes. Return SOME(e') if a sub evaluation succeeds, else return NONE with some monadic trickery to handle the mechanics. We avoid expanding global identifiers unless they are in functional position.

That’s all very well for reducing individual redexes, let’s define something that will just let rip on an expression. We’d like to do a proper normal order evaluation, so we’ll use an evaluation stack: go down left branch of expression, reducing beta redexes as we go. When we have got to the end, there are no more top-level redexes, so recursively evaluate the items on the stack, and finally fold back up in into a single expression:

Next up, variable substitution, a little tedious, but has to be done. First, we need to know if a variable occurs free in an expression. If it does, we need to find a variable that doesn’t, which we do by decorating with primes. Having got ourselves a variable that isn’t free, we can use it to substitute for the one that is, and that’s all there is to it. The code is probably clearer:

Phew, glad that’s over. Next, we need to lex and parse expressions. Lexing is straightforward, variables are single letters, we allow either \ or λ for lambda; it seems that we get the UTF-8 bytes for λ one at a time, so that’s just about our only multi character token, apart from primed identifiers (since we use primes to avoid variable capture in substitutions, we want to be able to read them in as well). Lex input is just a list of characters from an exploded string.

Parsing is even more fun. This is a hand-built LR table-driven parser; table driven parsers are good for tricks like semi-intelligent error recovery or doing parse conflict resolution on the fly (useful eg. for languages with redefinable operation precedences like ML). We don’t do anything like that though, we don’t even explicitly detect errors, and instead rely on ML’s pattern matching – if the input is ungrammatical, we get an inexhaustive match error: