A Tiny Lisp for AltOS

I took a bit of a diversion over the last week or so when I wondered
how small a lisp interpreter I could write, and whether I could fit
that into one of the processors that AltOS runs on. It turns out you
can write a tiny lisp interpreter that fits in about 25kB of flash
with a 3kB heap for dynamic data.

I decided to target our ChaosKey boards; they're tiny, and I've got a
lot of them. That processor offers 28kB of usable flash space (after
the 4kB boot loader) and 6kB of ram with the processor running at a
steaming 48MHz.

I'm not at all sure this is useful, but I always enjoy doing language
implementations, and this one presented some 'interesting' challenges:

Limited RAM. I don't have space for a classic stop-and-copy
collector.

Limited stack. A simple lisp implementation uses the C stack for
all recursion during evaluation and garbage collection; I don't have
enough RAM for that.

Iterative Compacting Allocator

I'm betting someone has built one of these before, but I couldn't find
one, so I wrote my own.

The basic strategy is to walk the heap to find a subset of the active
objects which are allocated sequentially in memory with only unused
storage between them. These objects are then compacted in-place, and
then the heap is walked again to update all references to the moved
objects. Then, the process is restarted to find another subset and
move them.

By looking for these subsets starting at the bottom of the
heap, and working upwards towards the top, the whole heap can be
compacted into a contiguous chunk at the bottom of memory.

Allocation involves moving a pointer along at the top of active
memory; when it gets to the top of the heap, collect and see if
there's space now.
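
As a sketch (with hypothetical names, and a no-op stand-in for the
real compacting collector), the allocation path might look like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Bump allocation: advance a pointer through the heap; when it
 * reaches the top, run a collection and retry once. */

#define HEAP_SIZE (3 * 1024)

static uint8_t heap[HEAP_SIZE];
static size_t  heap_top;        /* offset of the first free byte */

/* Stand-in for the compacting collector: the real one slides live
 * objects down and returns the new top of active memory. */
static size_t collect(void) {
    return heap_top;
}

static void *heap_alloc(size_t size) {
    size = (size + 3) & ~(size_t)3;     /* allocate in 32-bit units */
    if (heap_top + size > HEAP_SIZE) {
        heap_top = collect();
        if (heap_top + size > HEAP_SIZE)
            return NULL;                /* still no room */
    }
    void *p = &heap[heap_top];
    heap_top += size;
    return p;
}
```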

As always, the hardest part was to make sure all active memory was
tied down. The second hardest part was to make sure that all active
pointers were updated after any allocation, in case a collect moved
the underlying object. That was just bookkeeping, but did consume much
of the development time.

One additional trick was to terminate the recursion during heap
walking by flagging active cons cell locations in a global bitmap and
then walking that separately, iterating until that bitmap is
empty. Nested lambdas form another recursion which should probably get
the same approach, but I haven't done that yet.
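
The cons-cell trick can be illustrated with a toy marker (the real
cell layout differs): instead of visiting each cell recursively, its
index is flagged in a 'pending' bitmap, and the walker loops until no
flags remain.

```c
#include <stdint.h>

#define NCELL 64

struct cell {
    int car, cdr;               /* indices of other cells, or -1 for nil */
};

static struct cell cells[NCELL];
static uint64_t pending, marked;

/* Flag a cell as live; only newly marked cells become pending. */
static void note(int i) {
    if (i >= 0 && !(marked & (1ull << i))) {
        marked  |= 1ull << i;
        pending |= 1ull << i;
    }
}

static void mark_from(int root) {
    note(root);
    while (pending) {           /* iterate instead of recursing */
        for (int i = 0; i < NCELL; i++) {
            if (pending & (1ull << i)) {
                pending &= ~(1ull << i);
                note(cells[i].car);
                note(cells[i].cdr);
            }
        }
    }
}
```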

An unexpected "benefit" of the tiny heap is that the collector gets
called a lot, so any referencing bugs will have a good chance of being
uncovered in even a short program execution.

ROM-able Lisp

Instead of implementing all of the language in C, I wanted to be able
to implement various pieces in Lisp itself. Because of the complex
nature of the evaluation process, adding things like 'let' or even
'defun' turns out to be dramatically simpler in Lisp. However, I didn't
want to consume bunches of precious RAM to hold these basic functions.

What I did was to create two heaps, one in ROM and the other in
RAM. References are tagged as to which heap they're in.

16-bit Values

Lisp programs use a pile of references. Using a full 32 bits for each
one would mean having a lot less effective storage, so instead each
reference is a 16-bit offset from the base of its heap. The top bit of
the offset distinguishes between the ROM heap and the RAM heap.

I needed a place to store type information, so I settled on using the
bottom two bits of the references. This allows for four direct type
values. One of these values is used to indicate an indirect type,
where the type is stored in the first byte of the object. The direct
types are:

Value  Type
-----  ----------
0      Cons cell
1      14-bit int
2      String
3      Other

With 2 tag bits, the allocator needs to work in 32-bit units, since
references can't point to individual bytes. Finally, I wanted 0 to
be nil, so I add four to the offsets within the heaps.

The result is that the ROM and RAM heaps can each cover up to
32k - 4 bytes.

Note that ints are not stored in the heap; instead they are immediate
values stored in 14 bits, providing a range of -8192 to 8191. One can
imagine wanting more range in ints at some point.
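
A sketch of the resulting encoding for immediate ints (names are
illustrative, not the actual AltOS Lisp source; it assumes an
arithmetic right shift for signed values, as gcc provides):

```c
#include <stdint.h>

typedef uint16_t ref;

#define TYPE_CONS   0
#define TYPE_INT    1
#define TYPE_STRING 2
#define TYPE_OTHER  3

#define REF_TYPE(r) ((r) & 3)   /* bottom two bits hold the type tag */

/* Pack a 14-bit signed integer into an immediate reference. */
static inline ref int_to_ref(int16_t v) {
    return (ref)((uint16_t)v << 2) | TYPE_INT;
}

/* Unpack: the arithmetic right shift restores the sign bit
 * (implementation-defined in C, but what gcc and clang do). */
static inline int16_t ref_to_int(ref r) {
    return (int16_t)(r & ~3u) >> 2;
}
```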

Heap-based Evaluator

A simple lisp implementation relies on the fact that eval is
re-entrant, and performs the whole operation on the C stack.
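
A toy illustration of that shape (not the AltOS code): a recursive
evaluator over arithmetic trees, where every nested sub-expression
costs another C stack frame.

```c
struct expr {
    char op;                    /* '+' or '*', or 0 for a leaf value */
    int  value;
    const struct expr *left, *right;
};

static int eval(const struct expr *e) {
    if (e->op == 0)
        return e->value;
    int l = eval(e->left);      /* recursion rides the C stack ... */
    int r = eval(e->right);     /* ... once per nesting level */
    return e->op == '+' ? l + r : l * r;
}
```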

This makes things really simple and provides for a clean framework for
implementing various bits of lisp, including control flow and
macros. However, it rapidly consumes all of the available memory for a
stack, while also requiring separate bookkeeping for the in-use memory
in each frame.

I replaced this design with one which keeps the lisp stack on the
heap, and then performs eval with a state machine with all state
stored in global variables so that the memory manager can reference
them directly.

Each eval operation is performed in a separate 'stack' context, which
holds the entire eval state except for the current value, which lives
in a separate global variable and is used to pass values out of one
stack frame and into another. When the last stack context is finished,
the evaluation terminates and the value is returned to the caller.

There are nine states in the state machine, each of which is
implemented in a separate function, making the state machine itself a
simple matter of pulling the current state from the top of the stack
and invoking the associated function.
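
A toy sketch of that dispatch loop (hypothetical names): a state
machine that sums 1..n without any C recursion, keeping its stack on
the heap and its state in globals where a collector could find it.

```c
#include <stdlib.h>

enum state { STATE_SUM };

struct stack {
    enum state    state;
    int           n;
    struct stack *prev;
};

static struct stack *stack_top;
static int           value;     /* the global value register */

static void push(enum state s, int n) {
    struct stack *f = malloc(sizeof *f);
    f->state = s;
    f->n = n;
    f->prev = stack_top;
    stack_top = f;
}

static void pop(void) {
    struct stack *f = stack_top;
    stack_top = f->prev;
    free(f);
}

/* One state handler; the real interpreter has nine of these. */
static void state_sum(void) {
    value += stack_top->n;
    if (stack_top->n > 1)
        stack_top->n--;         /* more work in this frame */
    else
        pop();                  /* frame finished */
}

static void (*const state_funcs[])(void) = { state_sum };

static int sum_to(int n) {
    value = 0;
    push(STATE_SUM, n);
    while (stack_top)           /* the entire evaluator loop */
        state_funcs[stack_top->state]();
    return value;
}
```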

Because there's no C recursion involved, catching exceptions is a
simple matter of one test at this level.

Primitives like progn, while, cond and eval all take special magic in
the state machine to handle; getting all of that working took several
attempts before the dispatch loop became that simple.

Lexical Scoping

The last time I did a lisp interpreter, I implemented dynamic
scoping. Atoms were all global and had values associated directly with
them. Evaluating a lambda started by saving all of the existing global
values for the parameter atoms and then binding the new values. When
finished, the previous values would be restored. This is almost
correct, but provides surprising results in some cases.
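
A reconstructed example (the identifiers follow the surrounding text;
the exact original snippet is assumed to have been similar):

```lisp
(setq baz 1)

(defun foo () baz)

(defun bletch (baz)
  (foo))

(bletch 2)
```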

The value that foo gets for 'baz' is 2 instead of 1 under dynamic
scoping, which most people find surprising. This time, I was
determined to use lexical scoping, and it turned out to be
surprisingly easy.

The first trick was to separate the atoms from their 'value'; each
atom can have a different value in different lexical scopes. So each
lexical scope gets a 'frame' object, which contains the value for each
atom defined in that scope. There's a global scope which holds all of
the globally defined values (like baz, foo and bletch above). Each
frame points to its enclosing scope, so a lookup can search upwards to
find the right value.

The second trick was to realize that the lexical scope of a lambda is
the scope in which the lambda itself is evaluated, and that evaluating
a lambda expression results in a 'function' object, which pairs the
lambda with its enclosing scope.
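
Sketched as code (a reconstruction; the identifiers bar and baz follow
the prose below):

```lisp
(defun foo (bar)
  (let ((baz 0))
    (lambda ()
      (setq baz (+ baz bar)))))
```

Calling foo evaluates the inner lambda in a scope where both bar and
baz are bound, so the resulting function object carries that frame
along with it.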

With this in place, an inner lambda in foo can 'see' the value of bar
from the enclosing lambda. More subtly, even if the inner lambda is
executed multiple times, it sees the same baz, and can even change
it. This can be used to implement all kinds of craziness, including
generators.
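
For instance (an illustrative reconstruction; make-counter is a
hypothetical name, not from the original), a counter whose state
survives between calls:

```lisp
(defun make-counter ()
  (let ((n 0))
    (lambda ()
      (setq n (+ n 1)))))

(setq gen (make-counter))
```

Each call to (gen) increments and returns the captured n, with the
counter's state living in the frame rather than in a global.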

The current implementation of each frame is a simple array of
atom/value pairs, with a reference to the parent frame to form the
full scope. There are dramatically faster implementations of this same
concept, but the goal here was small and simple.
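
That structure can be sketched like so (an illustration, not the
actual AltOS Lisp layout):

```c
#include <stddef.h>

struct binding {
    int atom;                   /* stand-in for an atom reference */
    int value;
};

struct frame {
    struct frame   *parent;     /* enclosing scope, NULL at the top */
    int             num;
    struct binding *bindings;
};

/* Linear-search this frame, then each enclosing frame in turn. */
static int *frame_lookup(struct frame *f, int atom) {
    for (; f; f = f->parent)
        for (int i = 0; i < f->num; i++)
            if (f->bindings[i].atom == atom)
                return &f->bindings[i].value;
    return NULL;                /* unbound */
}
```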

A Tiny Allocator Optimization

With eval consuming heap space for stacks, frames and argument lists,
the interpreter was spending a lot of time in the collector. As a
simple optimization, I added some free lists for stack frames and cons
cells.

Stack frames are never referenced once evaluation leaves them, so they
can always go on the free list. Cons cells used to construct argument
lists for functions can usually be freed as well.

Builtin functions have a bit which indicates whether they might hold
on to a reference to the argument list: interpreted lambdas can never
capture the list, while nlambdas, lexprs and macros can.

Each lambda execution creates a new frame, and while it would be
possible to discover if that frame 'escapes' the lambda, I decided to
not attempt to cache free ones yet.
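
The cons free list can be sketched like this (hypothetical names;
malloc stands in for the heap allocator):

```c
#include <stdlib.h>

struct cons {
    struct cons *cdr;           /* doubles as the free-list link */
    int          car;
};

static struct cons *cons_free_list;

static struct cons *cons_alloc(void) {
    struct cons *c = cons_free_list;
    if (c) {
        cons_free_list = c->cdr;    /* reuse without collecting */
        return c;
    }
    return malloc(sizeof *c);       /* stand-in for the heap allocator */
}

static void cons_release(struct cons *c) {
    c->cdr = cons_free_list;        /* push for later reuse */
    cons_free_list = c;
}
```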

Save and Restore

To make the lisp interpreter more useful on tiny computers, I added
the ability to save and restore the entire heap to flash. This
requires reserving enough flash to hold the heap, further
constraining the amount of flash available for the application.