Eval - The new Haxe macro interpreter

Eval

Eval is the name of the new interpreter built into Haxe. It replaces the previous implementation which served as the execution engine for macros and --interp since 2010. During this period, a lot of macro code has been written. Developing a new interpreter that supports all such code is a challenge, but we are confident in our ability to pull it off.

Why we need a new interpreter

The main problem of the current interpreter is performance. We can compare it to neko, which we generally consider to have a decent performance. All these tests are measured in seconds:

what

neko

interp

polygonal-ds

0.4

1.8

haxe-checkstyle 1

9.5

44.6

haxe-checkstyle 2

3.2

21.5

thx.core

5.1

25.9

haxeparser std

5.8

19.9

The interpreter is roughly 4-5 times slower than neko. This can be a significant difference in macros, which we want to not slow down compilation, and especially compiler services, too much.

Furthermore, the current architecture mandates that any change in the macro system recompiles the entire macro system. This mostly affects compilations through the compilation server, but can also arise within a single compilation in some circumstances. While macro compilation is usually quite fast in every-day situations, it can become a bottleneck. It would then be preferable to have a more incremental approach to macro system changes.

Why a new target?

Indeed, a question to be asked is why a new interpreter had to be realized as a new target, instead of building on top of the current one. There are, unfortunately, some key problems with the current approach:

The String and Array implementations wrap a native implementation. This is notoriously slow for such heavily used classes, in part because it puts additional strain on the garbage collection (GC).

Field access is implemented as a binary integer search. This has an average cost of O(log(N)) for any field access, while it could be O(1) in many statically known situations.

The code that results from a Haxe compilation is first sent through the neko generator, and the result of that is used in the interpreter. This adds an additional pass and causes complications with incremental approaches, because neko usually just generates one big piece of code.

Supporting the neko standard library would require quite a few workarounds.

Eval implementation

At the heart of eval lies its context. During any given compilation, there are at most two contexts active:

The macro context, which is used to run macros.

The interpreter context, which is used to run --interp.

The context's main job is to maintain a list of prototypes and constructor functions. Prototypes are either static or instance prototypes, and are looked up using an integer hash-code of their dot paths. Constructor functions are used to initialize new instances and correspond to the new fields being defined on a class.

To illustrate, this is what happens in case of a new pack.SomeClass() call:

Get the hash code for pack.SomeClass.

Look up the instance constructor function.

Look up the instance prototype.

Create an instance and assign the prototype to it.

Call the constructor function with the just created instance as this.

Return the instance.

Looking at this list, we can see that steps 1-3 give the same result for each new pack.SomeClass() call: The hash code is always the same and it always leads to the same constructor function and instance prototype. This means we can memoize this information and re-use it for future instance constructions.

In fact, there is a lot of information that we only have to calculate once:

If we assume that static class fields are stored in an fixed-length array, we can determine the field offset within that array. This also applies to enum constructors.

We can do the same for instance methods that are not overridden.

We know if a loop has to consider continue or break expressions.

We know what run-time type a safe-cast expects, and what run-time types a try catch expression might catch.

We know how many local variables we have in a given scope, and can determine their offsets within an array.

There's more, of course, but the point here is to show that this leads to a natural 2-pass implementation.

JIT + run

All the information we just mentioned is calculated at what we call JIT-time. This is the first pass that results in a tree of functions, which are then executed at the actual run-time. To illustrate, one may imagine each AST node of the program being replaced by a function like so:

a + b;// after JITopAdd(env->env.getLocal(0), env->env.getLocal(1))

The sneakily introduced env is the environment of a function. When a function is called, it creates a new environment. That environment mostly manages local variables in an array. In our example above, the JIT determined that variable a has slot 0 and variable b has slot 1. Getting their values is then a cheap array access operation.

The general theme here is to do as much work as possible during JIT-time, so we don't have to do that work at run-time. After JIT-time has created the run-time functions, we can discard most of the information used during JIT-time (unless we are in debugger-mode, but more on that later). What we end up with is "just some functions", and executing these is quite fast.

Eval does well with strings, allocation and structural access (of fields, enum values etc.). Anonymous objects can be faster than class instances because they don't require a constructor call to be evaluated.

What does this mean for me?

If you are a mere user of macros, all this really means is that the macros you use should run a bit faster. For macro developers, some eval-specific information might be interesting:

The target name is eval, so #if eval holds in both macros and --interp mode. As before, #if macro holds in macro mode.

Anonymous objects are implemented as instances, similar to classes. Field access on them for known fields is O(1). Fields that are added later are stored in an overflow map and are accessed at O(log(N)).

By default, the interpreter does not record a stack trace. This allows some optimizations which provide a significant speed-up in some cases. While developing a macro, one may want to define -D eval-stack in order to get reliable stack traces.

The interpreter supports per-function timers by defining -D eval-times. This, in combination with --times, gives an accurate measure of where most of the run-time is spent. Note that this flag makes compilation overall significantly slower.

There is debugger support by defining -D eval-debugger. More on that later!

The old interpreter did not care about constructs like untyped __js__ unless it would actually execute. Eval is going to fail on that during its JIT stage, which requires some additional conditional compilation in these cases.