I have recently started an internship in the Gallium team, working on intermediate representations (IRs) for compilers. The aim is to get a better understanding of the relation that links "traditional" forms and their functional counterparts.

Many different representations exist; the ones I am focusing on are:

- static single assignment (SSA): each variable is assigned to only once (it appears at most once on the left-hand side of an assignment), which makes data-flow tracking easy,
- continuation passing style (CPS): a variant of the λ-calculus in which functions do not return but instead apply a continuation to their result, and
- administrative normal form (ANF): a variant of the λ-calculus with flat (non-nested) let bindings.

There are other IRs (such as the monadic intermediate language) which I will not explore.
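To make the three styles above concrete, here is a hypothetical Python sketch (not the tool's code): the same computation written in direct style, in CPS, and in an ANF-like flattening.

```python
# The same computation -- f(x) = (x + 1) * 2 -- in the three styles.

# Direct style: sub-expressions nest, functions return values.
def f_direct(x):
    return (x + 1) * 2

# CPS: every function takes a continuation `k` and applies it to its
# result instead of returning.
def add_cps(a, b, k):
    return k(a + b)

def mul_cps(a, b, k):
    return k(a * b)

def f_cps(x, k):
    return add_cps(x, 1, lambda t: mul_cps(t, 2, k))

# ANF: every intermediate result is bound by a flat (non-nested) let,
# emulated here with plain local variables.
def f_anf(x):
    t0 = x + 1
    t1 = t0 * 2
    return t1
```

Calling `f_cps` with the identity as initial continuation recovers the direct-style result.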

I started my internship by reading papers on the topic, and coding a tool to help in the exploration of the different IRs. The code is on my github (which is not bad for reasons of OSS code hosted on non OSS-powered servers but because it uses flash).

An example with SSA and CPS

Translation into SSA requires conversion to a control-flow graph (CFG; here flattened using labels, with jumps/branches for edges) and the placement of ϕ-functions. These ϕ-functions hack around the main SSA restriction, namely that variables cannot be assigned to twice: a ϕ-function merges variables coming from different places.

- branching is done via branch <variable> <label1> <label2>, which is equivalent to jump <label1> if the variable is true and jump <label2> otherwise, and
- variables that were introduced as a result of the single-assignment constraint keep their original variable names as prefixes; the others are prefixed with t.

The CPS version resembles the SSA one. The main difference is that the ϕ hack is replaced by a more ingenious idea. The following table sums up the correspondence:

SSA              CPS
---              ---
Procedure        λ
Block            λ
ϕ-node           Parameters for a block-λ
Jump/Branch/…    Application

Here are the translated examples. The λs are written with a \. The names of the labels have been reused to make the correspondence easier to see. (Note that we do not use a complete CPS translation and allow the use of operators in direct style. On the other hand, we keep CPS for if-then-else.)
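The table's correspondence can be emulated in any language with closures; here is a hypothetical Python sketch (names invented for illustration), where each SSA block becomes a local function (a block-λ), a jump becomes an application, and the ϕ-node becomes the target block's parameter.

```python
# An if-then-else in SSA:
#
#   entry:  branch c L_then L_else
#   L_then: x1 = 1               ; jump L_join(x1)
#   L_else: x2 = 2               ; jump L_join(x2)
#   L_join: x3 = phi(x1, x2)     ; return x3 * 10
#
# and its rendering with blocks as functions:

def procedure(c):
    def l_join(x3):            # the phi-node becomes this parameter
        return x3 * 10
    def l_then():
        x1 = 1
        return l_join(x1)      # jump ~ application
    def l_else():
        x2 = 2
        return l_join(x2)
    return l_then() if c else l_else()   # branch ~ conditional application
```

The merging that the ϕ-node performs is now just ordinary argument passing.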

SSA is Functional Programming

The article informally explains the links between SSA and functional programming in general. SSA has syntactic constraints that make it nested-scope-like and thus easy to translate into a functional language. Jumps are mapped to function applications and ϕ-nodes to function parameters.

Compiling Standard ML to Java Bytecodes

The article explains the internals of MLj, a compiler from SML97 to Java bytecode. It uses a typed functional intermediate representation called the monadic intermediate language (MIL). Features include SML-Java interoperability (foreign function calls are possible) and various optimisations (such as monomorphisation).

MIL is a typed functional language. The type system differentiates values from computations, the latter carrying effect information. Optimisations based on β- and η-reduction need a transformation called commuting conversions to avoid let-in nesting.
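A commuting conversion lifts a let out of another let's definiens: let x = (let y = e1 in e2) in e3 becomes let y = e1 in (let x = e2 in e3). A hypothetical Python sketch over a tiny tuple-encoded let-language (it assumes the bound variable is not free in the outer body, i.e. no capture check):

```python
# AST: ("let", var, defn, body), anything else is an atom.
# commute repeatedly flattens nested lets:
#   ("let", x, ("let", y, e1, e2), e3) -> ("let", y, e1, ("let", x, e2, e3))

def commute(term):
    if isinstance(term, tuple) and term[0] == "let":
        _, x, defn, body = term
        if isinstance(defn, tuple) and defn[0] == "let":
            _, y, e1, e2 = defn
            # lift the inner let above the outer one, then keep flattening
            return commute(("let", y, e1, ("let", x, e2, body)))
        return ("let", x, defn, commute(body))
    return term
```

Having to re-run such a pass after each β/η-based optimisation is exactly the normalisation cost that the CPS paper below argues against.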

Towards the end, the compiler transforms (commuting-conversion normalised) MIL into a variant of SSA. This final representation is then transformed into real bytecode.

Compiling with Continuations, Continued

The paper advocates the use of CPS and argues that it is better than either ANF or MIL for use as an intermediate language in compilers. Optimisations apply better to CPS than to the alternatives, in that they are well-known λ-calculus transformations and they preserve the language (thus avoiding the need for a normalisation/simplification pass).

The CPS language is formally defined (well-formedness and semantics are given by derivation rules). The translation from a reasonable fragment of ML into CPS is detailed. The CPS language is then extended to include exceptions and recursive functions.

It is made clear that the optimisations are all particular cases of β- and η-rewritings. This contrasts with MIL and ANF, which both require commuting-conversion normalisation to be performed between optimisation passes, leading to O(n²) complexity on some optimisation/normalisation interleavings.

Design and Implementation of Code Optimizations for a Type-Directed Compiler for Standard ML

This PhD thesis explores different aspects of type-directed compilation (using the Typed Intermediate Language (TIL) framework). It uses an IR called LMLI (for λiML, lambda intermediate ML). The optimisation phase is performed on a subset of LMLI similar to ANF with type annotations.

The following arguments are presented concerning this choice of IR rather than CPS:

1. the two are theoretically equivalent (there exists an efficient translation from one form to the other), and
2. direct style (DS), that is, non-CPS,
   a. makes it easier to reuse compilation techniques (e.g. from GHC), and
   b. is more suited to the set of implemented optimisations, even though CPS has a more powerful β inlining.

Concerning point (2a), it is interesting to note that some compilers (e.g. SML/NJ and some Scheme compilers) use CPS.

Optimizing Nested Loops Using Local CPS Conversion

Using a DS IR makes certain loop optimisations difficult: more specifically, nested loops can make some calls appear non-tail while they essentially are tail calls. Because this issue is specific to DS, it is proposed to transform the appropriate parts of the program into CPS, hence the local nature of the CPS conversion.

Detecting the parts that need conversion requires a traversal of the DS IR (and a fixed-point search to support mutually recursive functions) to build an environment associating program variables to an abstract lattice of continuations: Γ : var → (label ∪ {⊤, ⊥}).
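The mechanics of that lattice can be sketched as follows (hypothetical Python, much simpler than the paper's analysis: labels are strings, TOP/BOT are sentinels, and the only input is a map from each function variable to the continuation labels it is observed to flow to):

```python
# Join on the lattice  label ∪ {TOP, BOT}:
# BOT is below every label, TOP is above, and two distinct labels join to TOP.
TOP, BOT = "TOP", "BOT"

def join(a, b):
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def continuation_env(uses, variables):
    """Iterate joins until the environment stabilises (a fixed point)."""
    env = {v: BOT for v in variables}
    changed = True
    while changed:
        changed = False
        for v, labels in uses.items():
            new = env[v]
            for label in labels:
                new = join(new, label)
            if new != env[v]:
                env[v], changed = new, True
    return env
```

A variable mapped to a single label is always called with the same continuation, which is what makes it a candidate for LCPS conversion.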

The LCPS conversion then makes it possible to group functions into clusters. Clusters can be compiled as loops would be: with only one stack frame (however many functions there are) and jump operations. Thus nested loops, as well as "mutually tail recursive" functions (used for state-machine encoding), can be optimised.

As pointed out by Kennedy in "Compiling with Continuations, Continued", the defects of the DS IR are avoided by introducing continuations.

The algorithm uses:

- an environment Γ mapping the set Var of variables onto the lattice Abs = {⊤, ⊥} ∪ Const, where ⊤ is for variables about which there is no information, and ⊥ is for variables that do not receive concrete values¹, and
- a working set remembering the work that remains to be done,

and is twofold:

1. Analysis (computing the greatest fixed point):
   - assign the value ⊤ to every variable,
   - fold over the program, modifying the environment and the working set.
2. Specialisation (using the Analysis's results): fold over the program,
   - removing dead code (functions not in Γ),
   - replacing the constants by their values, and
   - replacing non-terminating computations by an infinite loop, with some tricks for keeping side effects.
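The two-fold scheme can be sketched on a deliberately tiny case (hypothetical Python, straight-line programs only, with "+" as the sole operator; the working set, the fixed-point iteration, and the infinite-loop replacement are omitted):

```python
# Statements are (var, literal) or (var, "+", arg1, arg2).
# Abstract values: an int constant, or the sentinel TOP (no information).
TOP = "TOP"

def analyse(prog):
    """Analysis: fold over the program, recording each variable's
    abstract value in the environment."""
    env = {}
    for stmt in prog:
        if len(stmt) == 2:                       # (var, literal)
            var, lit = stmt
            env[var] = lit
        else:                                    # (var, "+", a, b)
            var, _, a, b = stmt
            va, vb = env.get(a, TOP), env.get(b, TOP)
            env[var] = va + vb if isinstance(va, int) and isinstance(vb, int) else TOP
    return env

def specialise(prog, env, live):
    """Specialisation: drop dead assignments, replace known constants."""
    out = []
    for stmt in prog:
        var = stmt[0]
        if var not in live:
            continue                             # dead code: drop it
        if isinstance(env[var], int):
            out.append((var, env[var]))          # fold to its constant value
        else:
            out.append(stmt)
    return out
```

With branches and recursion the analysis needs the working set and the fixed-point search described above; this sketch only shows the environment and the two folds.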

Code

In order to understand the relationship between the different IRs, I will try to port algorithms from SSA to CPS (and, if time allows, to ANF). This translation is more interesting than its inverse because:

a. SSA algorithms are plentiful and easy to find, and
b. the λ-calculus has strong theoretical foundations.

Item (a) widens the scope of the study and item (b) makes results easier to interpret (or so I hope).

Because it is easy (once the dominator relation has been computed) to translate SSA into CPS, the problem of porting an optimisation Optim_ssa is to close the following diagram; that is, to find the Optim_cps that verifies, for any program P:

Optim_cps(Transl(P)) = Transl(Optim_ssa(P))

Of course the trivial solution Transl⁻¹(Optim_ssa(Transl(·))) is correct but useless, as our goal is to understand the relation between SSA and CPS; finding Optim_cps is a means of exploration. I do not expect every algorithm to be translatable in a pseudo-commuting manner: pseudo-commutation might only be verified on some particular subset of SSA/CPS.
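On a toy pair of IRs the pseudo-commutation requirement can even be checked mechanically. In this hypothetical Python sketch (all names invented), "SSA" programs are lists of (var, expr) assignments, the functional side is a chain of nested lets, and the ported optimisation is constant folding of "+" on literals:

```python
def fold(expr):
    """Constant-fold a single expression: ("+", 1, 2) -> 3."""
    if isinstance(expr, tuple) and expr[0] == "+" \
            and all(isinstance(a, int) for a in expr[1:]):
        return expr[1] + expr[2]
    return expr

def transl(ssa):
    """Transl: assignment list -> nested lets returning the last variable."""
    term = ssa[-1][0]
    for var, expr in reversed(ssa):
        term = ("let", var, expr, term)
    return term

def optim_ssa(ssa):
    """Optim_ssa: fold every right-hand side."""
    return [(v, fold(e)) for v, e in ssa]

def optim_cps(term):
    """Optim_cps: fold every let-bound expression."""
    if isinstance(term, tuple) and term[0] == "let":
        _, v, e, body = term
        return ("let", v, fold(e), optim_cps(body))
    return term
```

Here Optim_cps(Transl(P)) and Transl(Optim_ssa(P)) coincide by construction; the interesting cases are precisely the real algorithms where such an Optim_cps is not obvious.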

Additionally, I am writing a parser from LLVM human-readable assembly to my own SSA format, so as to be able to use the optimisation algorithms available in the LLVM project. My custom SSA form is close enough to LLVM's: target architecture information, calling-convention annotations, and other details are ignored, and some invariants are enforced in the type system.

1. Absence of a concrete value can be due either to the code being dead or to non-terminating computations.↩