This article describes in some detail the interpreter we created during the course Essentials of interpretation. We summarize the intermediate results and the main parts of the evaluator, adding notes that were omitted from the code articles.

Those of you who actively followed the code articles on GitHub will find this summary a good review of the material covered so far, which should help to deepen your understanding. Others, who have just joined the topic, may read this article first and then analyze the code articles in detail (though the reverse order is also possible).

During this stage of the course I was glad to see several forks of the project’s repository and to analyze interesting implementations and solutions for the exercises. Besides, I received some questions and alternative solutions via e-mail.

OK, so let’s summarize our small language. It has already turned out to be quite powerful, since it supports first-class functions and closures, which makes it easy to use the closure pattern for OOP.

We all know the eval function from different languages, which evaluates arbitrary code passed to it. Its short name reflects exactly the process of evaluation (that is, obtaining the value of an expression). We use the full name of this core process and call our function evaluate.

The main task of evaluate is (not surprisingly) to evaluate a passed expression, that is, to obtain the value which corresponds to the result of the expression.

At the top level, the expression is the whole program itself. However, as we go recursively deeper into the parts of the program, we evaluate its sub-expressions: functions, variable declarations, conditional expressions, etc.

In essence, evaluate is a very simple and primitive function. It just determines the type of an expression and, according to this type, executes the corresponding handler procedure, which knows how to evaluate this particular expression. That’s it. Seriously, there is no other “magic” in the interpretation of a programming language.

In this part we’ll talk only about the first four expression types: self-evaluating expressions, variable lookup, variable definition and variable assignment. We leave functions and other expressions for the next part.
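The case analysis described above can be sketched roughly as follows. This is only an illustration and not the course's actual code: the predicate and handler names other than evaluate are assumptions, the predicates are heavily simplified, and only numbers and variable names are handled here.

```javascript
// Hypothetical, simplified predicates used only for this sketch.
function isSelfEvaluating(exp) {
  return typeof exp === "number";
}

function isVariable(exp) {
  return typeof exp === "string";
}

// Hypothetical handler: reads a variable from a plain-object environment.
function lookupVariableValue(name, env) {
  if (!(name in env)) {
    throw new ReferenceError('Variable "' + name + '" is not defined');
  }
  return env[name];
}

// evaluate only decides which handler is responsible for the expression.
function evaluate(exp, env) {
  if (isSelfEvaluating(exp)) {
    return exp;
  }
  if (isVariable(exp)) {
    return lookupVariableValue(exp, env);
  }
  throw new Error("Unknown expression type: " + JSON.stringify(exp));
}

console.log(evaluate(42, {}));         // 42
console.log(evaluate("x", { x: 10 })); // 10
```

Every new expression type then adds one more predicate check and one more handler, and evaluate itself stays this small.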

Our evaluator (or interpreter; these are synonyms in this case) works with the so-called AST (abstract syntax tree) format.

The AST format is usually simplified and convenient for the evaluator to traverse recursively. It also frees us from some of the syntactic complexities of a concrete syntax. The concrete syntax, in turn, is processed by parsers, whose result is exactly the AST. We considered an example of a parser in lessons 2 and 3.

As we said, parsing of a concrete syntax is in general not related to the process of interpretation (parsing is just the stage that precedes it; we may have any concrete syntax that can be transformed into our AST), so in this course we work with the interpretation of the AST and not with parsers.

We have chosen a very simple expression format and use JavaScript arrays for it. The first element of the array is the expression type, which is called the type-tag, and the other elements depend on this type:

[<type-tag>, ... other fields ...]

For example:

["define", "x", 10]

The expression above is a variable definition. Its type-tag (define) says so directly, and we also see the name of the variable, x, and its value, 10.

We have a common type-tag tester function which is used by all more abstract predicates. It’s called isTaggedList; it accepts a type-tag and an expression and checks whether the expression is of the passed type.
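A minimal sketch of such a tester might look as follows. The argument order (tag first, then expression) and the wrapper name isDefinition are assumptions made for illustration.

```javascript
// Checks whether `exp` is an array node whose first element is `tag`.
function isTaggedList(tag, exp) {
  return Array.isArray(exp) && exp[0] === tag;
}

// A more abstract predicate built on top of it
// (the name `isDefinition` is an assumption of this sketch).
function isDefinition(exp) {
  return isTaggedList("define", exp);
}

console.log(isDefinition(["define", "x", 10])); // true
console.log(isDefinition(["set!", "x", 20]));   // false
console.log(isDefinition(10));                  // false
```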

More abstract predicates are thin wrappers around it; one of them, for example, checks whether an expression is a variable definition. A logical question may follow: why do we need these wrappers at all? Why not simply compare the first element of the expression with the "define" string directly?

The answer is also simple: by abstracting such checks into separate functions, we avoid committing to the exact format of the AST. If we later want to change the exact form of some expression, we only have to change the procedures related to it. Moreover, abstracted predicates allow us to process some expressions individually, e.g. to support alternative type-tags for the same expression.
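To illustrate the last point, here is a hedged sketch in which a definition predicate accepts two type-tags. The alias "var" and the name isDefinition are hypothetical and chosen only for this example.

```javascript
function isTaggedList(tag, exp) {
  return Array.isArray(exp) && exp[0] === tag;
}

// Hypothetical: treat "var" as an alternative type-tag for "define".
// Callers of isDefinition do not need to know about the alias.
function isDefinition(exp) {
  return isTaggedList("define", exp) || isTaggedList("var", exp);
}

console.log(isDefinition(["define", "x", 10])); // true
console.log(isDefinition(["var", "x", 10]));    // true
```

A direct comparison of the first array element scattered through the evaluator would have to be updated in every place to support such an alias.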

Self-evaluating expressions are the simplest type of expressions. They do not require additional evaluation and directly return their own values as the result of evaluation.

In the language we have only two types of self-evaluating expressions: numbers and strings.

Strings, in contrast with other languages, are quoted with only one single quote (this is easy to do, since the parts of an expression are already separated as elements of the expression array and are quoted with double quotes at the JavaScript level). Numbers are the same as in other languages.
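A possible sketch of the corresponding predicates is shown below. It assumes that the single quote is placed at the beginning of the string, and the names isNumber, isString and isSelfEvaluating are illustrative.

```javascript
function isNumber(exp) {
  return typeof exp === "number";
}

// Assumption: a string literal of our language is a JavaScript string
// that starts with one single quote, e.g. "'hello".
function isString(exp) {
  return typeof exp === "string" && exp[0] === "'";
}

function isSelfEvaluating(exp) {
  return isNumber(exp) || isString(exp);
}

console.log(isSelfEvaluating(10));       // true
console.log(isSelfEvaluating("'hello")); // true
console.log(isSelfEvaluating("x"));      // false, this is a variable name
```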

Variables are recognized by a predicate that checks whether an expression is the name of a variable. Examples: x, null?, +, etc.

Any value can be assigned (bound) to a name and later reassigned (rebound). This is why the association of a variable name with its value is called a binding.

We represent environments as simple JavaScript objects, so lookup and assignment are done in a very simple manner: we check whether a property corresponding to the variable name exists in the object and, if so, simply get its value.

There is a special global environment, which is stored in the TheGlobalEnvironment object, exists as a single instance and is available before program execution. The global environment is prepopulated with some built-in bindings, e.g. math functions such as addition and multiplication, the print function, etc.
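One way such a check could be written is sketched below. The name isVariable and the exact set of allowed characters are assumptions; the only requirement taken from the text is that names such as x, null? and + are accepted.

```javascript
// Assumption: a variable name is a non-empty string made of letters, digits
// and a few operator-like characters, and it never contains a single quote
// (which marks string literals).
function isVariable(exp) {
  return typeof exp === "string" && /^[a-zA-Z0-9_$+*\/\-<>=!?]+$/.test(exp);
}

console.log(isVariable("x"));      // true
console.log(isVariable("null?"));  // true
console.log(isVariable("+"));      // true
console.log(isVariable("'hello")); // false
console.log(isVariable(10));       // false
```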
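The lookup described above might be sketched like this. The set of built-ins and the name lookupVariableValue are assumptions; only TheGlobalEnvironment comes from the text.

```javascript
// A global environment prepopulated with a few built-in bindings.
const TheGlobalEnvironment = {
  "+": (a, b) => a + b,
  "*": (a, b) => a * b,
  "print": (...args) => console.log(...args),
};

// Hypothetical lookup helper: the `in` operator also sees inherited
// properties, which matters once environments are chained.
// Caveat: with plain objects, names such as "toString" are found on
// Object.prototype as well; this sketch ignores that.
function lookupVariableValue(name, env) {
  if (!(name in env)) {
    throw new ReferenceError('Variable "' + name + '" is not defined');
  }
  return env[name];
}

const add = lookupVariableValue("+", TheGlobalEnvironment);
console.log(add(1, 2)); // 3
```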

The handler procedure for a definition, evalDefinition, must recursively call evaluate to compute the value to be associated with the variable. The environment must be modified to change (or create) the binding of the variable.

Since a definition is already a complex expression, we have selector procedures which get specific parts of an expression. In this particular case we have the getDefinitionName and getDefinitionValue getter functions.

The getDefinitionName getter is a little bit complicated, since we use define both for definitions of simple variables and for definitions of functions. The signature of a variable definition was already shown above: ["define", "x", 10].

The getDefinitionValue getter again takes both cases into account, variables and functions, and has separate logic for each. We don’t touch on getting the values of user-defined functions in this part, so we skip the makeLambda part for now and test the function only with simple variables.

Notice that in evalDefinition we pass the result of getDefinitionValue to the recursive evaluate again, to obtain the real value from the “raw” definition value. This can be seen in the example of assigning to a variable the value of another variable.
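A simplified sketch of these pieces, covering only the simple-variable form ["define", name, value], could look as follows. The function-definition form is deliberately left out, and the stand-in evaluate handles just numbers and definitions.

```javascript
// Selectors for the simple-variable form ["define", name, value].
// The function-definition form is not handled in this sketch.
function getDefinitionName(exp) {
  return exp[1];
}

function getDefinitionValue(exp) {
  return exp[2];
}

// Stand-in evaluator: numbers and definitions only.
function evaluate(exp, env) {
  if (typeof exp === "number") {
    return exp;
  }
  if (Array.isArray(exp) && exp[0] === "define") {
    return evalDefinition(exp, env);
  }
  throw new Error("Unsupported expression: " + JSON.stringify(exp));
}

function evalDefinition(exp, env) {
  const name = getDefinitionName(exp);
  const value = evaluate(getDefinitionValue(exp), env);
  env[name] = value; // creates or overwrites the binding in this environment
  return value;      // returning the value is a choice made for this sketch
}

const env = {};
evaluate(["define", "x", 10], env);
console.log(env.x); // 10
```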
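The difference between the “raw” value and the evaluated one can be seen in this small sketch. The stand-in evaluate below is an assumption and handles only numbers and variable names.

```javascript
// Stand-in evaluator: numbers evaluate to themselves, strings are variable names.
function evaluate(exp, env) {
  if (typeof exp === "number") {
    return exp;
  }
  if (typeof exp === "string") {
    return env[exp];
  }
  throw new Error("Unsupported expression");
}

const env = { x: 10 };
const definition = ["define", "y", "x"];

const rawValue = definition[2];               // the string "x", not the number 10
env[definition[1]] = evaluate(rawValue, env); // evaluate "x" first, then bind y

console.log(rawValue); // x
console.log(env.y);    // 10
```

Without the recursive call, y would be bound to the name "x" itself instead of the value 10.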

Notice the exclamation mark in the name of the set! type-tag. In general, we use names ending with ! for functions which modify the original data. Assignment to a variable, set!, does exactly this. We also use “question mark” names, such as null?, for predicates, that is, for functions which return a boolean value.

The set! node has a format similar to define, but it is simpler, since it doesn’t have to consider the complex case of a function definition: it just assigns the passed value to the variable.

First, we again check whether the variable is defined in the environment, and if it is not, we throw an exception.

Secondly, in contrast with definition, assignment may modify some parent environment (whereas a definition always creates a binding in its own environment). To see a simple concrete example, think of a (possibly inner) JavaScript function foo which assigns a new value to a variable x declared in the global scope.

To be able to resolve the variable x, the function foo should have access to the parent, global environment. And it is exactly the x from the global environment that should be updated; a new local x should not be created.

The same holds in our interpreter. Later we’ll describe functions in detail, but for now we just mention that every function, when called, creates a fresh environment (a new scope) to bind its arguments to parameter names and also to bind local variables.
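In plain JavaScript, such a situation might look like the following minimal sketch (the concrete values are chosen for illustration):

```javascript
let x = 10;

function foo() {
  // There is no local declaration of x here,
  // so the assignment updates the x of the outer scope.
  x = 20;
}

foo();
console.log(x); // 20
```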

Thus, this fresh activation environment extends the parent environment. If it’s a global function, then its activation environment extends the global environment. If it’s an inner function, then it extends the parent frame — that is, the activation environment of the parent function.

Therefore functions form something that we may call a scope chain. These are chained environment frames where every frame corresponds to its scope.

To extend a parent environment frame, we use native JavaScript inheritance. The Object.create function helps us with this.

So if a function updates a variable from some parent frame (such variables are called “free variables”), we should first find the parent frame where the variable is actually defined. This is why the setVariableValue function uses a do-while loop.
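As a rough sketch of this idea, a child frame can be created with the parent frame as its prototype. The helper name extendEnvironment and its parameters are hypothetical.

```javascript
const globalEnv = { x: 10 };

// Hypothetical helper: creates a new frame whose prototype is the parent
// frame and copies the given bindings into the new frame.
function extendEnvironment(parent, bindings) {
  return Object.assign(Object.create(parent), bindings);
}

const activationEnv = extendEnvironment(globalEnv, { y: 20 });

console.log(activationEnv.y); // 20, own binding of the new frame
console.log(activationEnv.x); // 10, found in the parent frame via the prototype chain
console.log(Object.getPrototypeOf(activationEnv) === globalEnv); // true
```

Reading a variable then needs no explicit loop: the JavaScript prototype chain performs the walk through the parent frames for us.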
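The search for the owning frame could be sketched as follows. The parameter order of setVariableValue is an assumption, and this version folds the "is the variable defined at all" check into the end of the loop instead of performing it first.

```javascript
function setVariableValue(name, value, env) {
  let frame = env;
  do {
    // Only the frame that owns the binding is updated.
    if (Object.prototype.hasOwnProperty.call(frame, name)) {
      frame[name] = value;
      return value;
    }
    frame = Object.getPrototypeOf(frame);
    // Stop before Object.prototype, which is not one of our frames.
  } while (frame !== null && frame !== Object.prototype);

  throw new ReferenceError('Variable "' + name + '" is not defined');
}

const globalEnv = { x: 10 };
const activationEnv = Object.create(globalEnv); // frame of some function call

setVariableValue("x", 20, activationEnv);

console.log(globalEnv.x); // 20
console.log(Object.prototype.hasOwnProperty.call(activationEnv, "x")); // false
```

A plain `activationEnv.x = 20` would instead create a new own property on the child frame and leave the global x untouched, which is why the explicit search is needed for assignment but not for lookup.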

At this step we have discussed the first part of the evaluate (or eval) function. We have seen that eval is a simple case-analysis function which dispatches to the appropriate expression handler depending on the expression type.

We considered several expression types: self-evaluating expressions (the simplest ones), variable lookup, and definition of and assignment to variables. The latter three are related to the concept of environments. Assignment, in addition, works with the concept of nested environments or, by another name, scope chains.

In the next part we’ll talk about functions and other expressions (such as the if expression) in our language.

Dmitry, thanks for this summary! I follow all your work and this series is very interesting.

On your format of AST nodes: I’ve seen that usually a separate class is used for each AST node, with its own eval method which knows how to evaluate the result. Do you think such an approach is easier for representation?

Great series, thanks! I just started to follow the sources on github and have some questions.

You show how to evaluate a single expression. But how do we evaluate a whole program? Could you show what a simple program with variable definitions and function calls might look like in your language?