Latest revision as of 22:00, 15 December 2008

Note (Aug 27, 2007): This page was started about a year ago. Over time, the focus was changed to integration with Yhc Core, and the work in progress may be observed here: Yhc/Javascript.

Disclaimer: Here are my working notes related to an experiment to execute Haskell programs in a web browser. You may find them bizarre, and even nonsensical. Don't hesitate to discuss them (please use the Talk:STG in Javascript page). Chances are, at some point a working implementation will be produced.

A wiki page, Hajax, has recently been created; it summarizes the achievements in related fields. With these experiments, I am trying to address the problem of generating Javascript from Haskell source.

To achieve this, an existing Haskell compiler, namely nhc98, is being patched to add a facility that generates Javascript from an STG tree; the original compiler generates bytecode from the same source.

After unsuccessfully trying several approaches (e.g. Javascript closures, see [3]), it was decided to implement an STG machine (as described in [4]) in Javascript.

The paper referenced above describes how to implement an STG machine in assembly language (or C). The Javascript implementation uses the same ideas, but takes advantage of the automatic memory management provided by the Javascript runtime, and also of its built-in handling of values more complex than just numbers and arrays of bytes.

To describe a thunk, a Javascript object of the following structure may be used:
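A minimal sketch of such an object, assuming a thunk for the sum of two arguments. The helper name makeAddThunk and the argument fields _1 and _2 are illustrative assumptions; only the c method is taken from the text.

```javascript
// Hypothetical helper: builds a thunk for the expression (_1 + _2).
function makeAddThunk(x, y) {
  return {
    _1: x,               // first argument
    _2: y,               // second argument
    c: function () {     // evaluation method
      var result = this._1 + this._2;
      // Self-update: replace this.c so that later calls return the
      // already computed result without repeating the work.
      this.c = function () { return result; };
      return result;
    }
  };
}

var t = makeAddThunk(2, 3);
console.log(t.c()); // 5, computed
console.log(t.c()); // 5, returned by the replaced method
```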

So, similarly to what is described in the STG paper, the c method is used to evaluate a thunk. This method may also self-update the thunk, replacing itself (i.e. this.c) with something else, and return the result once it becomes known (i.e. at the very end of thunk evaluation).

Some interesting things may be done by manipulating prototypes of Javascript built-in classes.

Thus, simple numeric values are given thunk behavior: calling the c method on them returns their value as if a thunk were evaluated, and at the same time they may be used in the regular way when passed to Javascript functions outside the Haskell runtime (e.g. DOM manipulation functions).

A similar trick can be applied to Strings and Arrays: for these, the c method will return the head value (i.e. String.charAt(0)) CONS'ed with the remainder of the String/Array.
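The prototype manipulation described above might be sketched as follows. The representation of a CONS cell as an object with head and tail fields, and of the empty list as an object with a nil flag, is an assumption made only for this illustration; the text does not say how constructors are represented.

```javascript
// Numbers evaluate to themselves.
Number.prototype.c = function () { return this.valueOf(); };

// Hypothetical constructor representation: { head, tail } or { nil: true }.
String.prototype.c = function () {
  var s = this.valueOf();
  if (s.length === 0) { return { nil: true }; }
  return { head: s.charAt(0), tail: s.substring(1) };
};

Array.prototype.c = function () {
  if (this.length === 0) { return { nil: true }; }
  return { head: this[0], tail: this.slice(1) };
};

console.log((42).c());        // 42
console.log("abc".c().head);  // a
console.log(Math.max(7, 3));  // 7, numbers still work as ordinary values
```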

The first thing to do is to learn how to call primitives. In Javascript, primitives mostly cover built-in arithmetic and the interface to the Math object. Primitives need all their arguments evaluated before they are called, and usually return strict values. So there is no need to build a thunk each time a primitive is called.
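As an illustration, a thunk whose evaluation method forces its arguments and then calls the primitives directly could look like this. The helper makeHypotThunk and the Haskell expression it stands for are assumptions chosen for the example.

```javascript
Number.prototype.c = function () { return this.valueOf(); };

// Hypothetical thunk for the Haskell expression: sqrt (x * x + y * y)
function makeHypotThunk(x, y) {
  return {
    _1: x,
    _2: y,
    c: function () {
      var a = this._1.c();  // force the arguments first
      var b = this._2.c();
      // Primitives (*, +, Math.sqrt) are applied to evaluated values
      // and return a strict result, so no intermediate thunk is built.
      return Math.sqrt(a * a + b * b);
    }
  };
}

console.log(makeHypotThunk(3, 4).c()); // 5
```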

So, for each Haskell function, two Javascript functions are created: one creates a thunk when called with arguments (so it is good for saturated calls), the other is the thunk's evaluation function. The latter will be passed around when dealing with partial applications (which will likely involve a special sort of thunk, but we haven't got to this yet).
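A sketch of this scheme, assuming the Haskell definitions f x = x * 2 and g x = f x in module HMain. The function bodies are assumptions; the naming with the _T (thunk builder) and _C (evaluation function) suffixes and the _c method follow the text.

```javascript
Number.prototype._c = function () { return this.valueOf(); };

var HMain = {};

// f x = x * 2
HMain.f_C = function () { return this._1._c() * 2; };
HMain.f_T = function (a) { return { _c: HMain.f_C, _1: a }; };

// g x = f x
// The evaluation function only builds the thunk for (f x);
// it does not evaluate it.
HMain.g_C = function () { return HMain.f_T(this._1); };
HMain.g_T = function (a) { return { _c: HMain.g_C, _1: a }; };

console.log(HMain.g_T(21)._c()._c()); // 42
```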

Note that the _c() method is applied twice to the output from HMain.g_T: the function calls f_T which returns an unevaluated thunk, but this result is not used, so we need to force the evaluation to get the final result.

NB: indeed, the thunk evaluation function for HMain.g should evaluate the thunk created by HMain.f_T. Laziness will not be lost because HMain.g_C will not be executed until needed.

To simplify the handling of partial function applications, the thunk format has been changed so that instead of _1, _2, etc. for function arguments, an array named _a is used. This array always has at least one element, which is undefined. Arguments start at array index 1, so to access argument n, the following needs to be used: this._a[n].
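In this format a thunk builder might collect its arguments as sketched here. The function HMain.add and the use of the arguments object to fill _a are assumptions for illustration.

```javascript
Number.prototype._c = function () { return this.valueOf(); };

var HMain = {};

// Hypothetical function: add x y = x + y
HMain.add_C = function () { return this._a[1]._c() + this._a[2]._c(); };
HMain.add_T = function () {
  // _a[0] stays undefined; the real arguments start at index 1.
  var args = [undefined].concat(Array.prototype.slice.call(arguments));
  return { _c: HMain.add_C, _a: args };
};

var t = HMain.add_T(2, 3);
console.log(t._a.length); // 3
console.log(t._c());      // 5
```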

For Haskell programs executing in a web browser environment, the analogue of FFI is calling external Javascript functions.
Imagine this Javascript function which prints its argument on the window status line:
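Such a function could be as simple as the one sketched here. The name showStatus is an assumption, and the code expects a browser-style global window object.

```javascript
// Hypothetical external function: writes its argument to the
// window status line and returns it unchanged.
function showStatus(text) {
  window.status = String(text);
  return text;
}
```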

Initially, functions compiled from Haskell to Javascript were represented as members of objects (one object per Haskell module). Anticipating some complications with multilevel module hierarchies, and also with functions whose names contain special characters, it has been decided to pass every function identifier through the fixStr function: in nhc98 it replaces non-alphanumeric characters with their numeric code prefixed with an underscore. So a typical function definition looks like:
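The renaming can be imitated in Javascript to see its effect. The mangle function here is only a re-creation of the behavior described above, not the nhc98 fixStr itself, and the body given to Test3_46p3 is a placeholder assumption.

```javascript
// Hypothetical re-creation of the described renaming: every
// non-alphanumeric character becomes "_" followed by its character code.
function mangle(name) {
  return name.replace(/[^A-Za-z0-9]/g, function (ch) {
    return "_" + ch.charCodeAt(0);
  });
}

console.log(mangle("Test3.p3") + "_T"); // Test3_46p3_T

// A definition under the mangled name (placeholder body: returns
// its only argument unevaluated).
function Test3_46p3_C() { return this._a[1]; }
function Test3_46p3_T(x) { return { _c: Test3_46p3_C, _a: [undefined, x] }; }
```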

Note the function name: Test3_46p3_T; in previous examples it would have been something like Test3.p3_T.

Partial function applications need a different thunk format. This kind of thunk holds the function to be applied to its arguments once the application becomes saturated (the number of arguments becomes equal to the function arity), the number of remaining arguments, and an array of the arguments supplied so far.

Such a thunk always evaluates to itself (its _c() method returns the thunk); it holds the function name in its _s member, the number of remaining arguments in its _x member, and the available arguments in its _a member, only in this case the array does not have undefined as its zeroth element.

So, when such an expression is being computed, a special Runtime support function is called, which obtains the partial application thunk via evaluation of its first argument, (Test3_46w_T())._c(), and adds the arguments provided ([2, 3]) to the list of arguments available so far. If the number of arguments becomes equal to the target function arity, a normal function application thunk is returned; otherwise another partial application thunk is returned. The Runtime support function looks like this:
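A minimal sketch of such a function, assuming Haskell definitions add3 a b c = a + b + c and w = add3 1. The names Test3_46add3 and returnSelf, and the function bodies, are assumptions; the thunk members _c, _s, _x and _a follow the text. Over-application (more arguments than remain) is not handled here.

```javascript
Number.prototype._c = function () { return this.valueOf(); };

function returnSelf() { return this; }

// Target function of arity 3: add3 a b c = a + b + c
function Test3_46add3_C() {
  return this._a[1]._c() + this._a[2]._c() + this._a[3]._c();
}
function Test3_46add3_T(a, b, c) {
  return { _c: Test3_46add3_C, _a: [undefined, a, b, c] };
}

// w = add3 1: a partial application thunk, two arguments remaining.
function Test3_46w_T() {
  return { _c: returnSelf, _s: Test3_46add3_T, _x: 2, _a: [1] };
}

function HSRuntime_46doApply(pap, args) {
  var all = pap._a.concat(args);          // arguments so far plus new ones
  var remaining = pap._x - args.length;
  if (remaining === 0) {
    // Saturated: build a normal application thunk.
    return pap._s.apply(null, all);
  }
  // Still partial: return another partial application thunk.
  return { _c: returnSelf, _s: pap._s, _x: remaining, _a: all };
}

var saturated = HSRuntime_46doApply((Test3_46w_T())._c(), [2, 3]);
console.log(saturated._c()); // 6
```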

Note the use of the apply method. It may be used also with functions that are not methods of some object. The first argument (this_arg) may be null or undefined as it will not be used by the function applied to the arguments.

NHC98 acts differently when a partial application is not defined as a separate function, but is part of another expression.

For each application of p and z, an internal function NHC_46Internal_46_95applyN_T is called, where N depends on the target function arity. In the Javascript implementation, all these functions are in fact a single function (because in Javascript it is possible to determine the number of arguments a function was called with, there is no need for separate functions for each arity). The internal function extracts its first argument and evaluates it (by calling the _c() method), getting a partial application thunk. Then, the Runtime support function HSRuntime_46doApply is called with the thunk and the arguments array:
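One way this could be arranged is sketched here. The doApply body is a minimal stand-in for the runtime function, and applyAny, pair_T and the way the arity-specific names are aliased are assumptions for illustration.

```javascript
function returnSelf() { return this; }

// Minimal stand-in for the runtime support function.
function HSRuntime_46doApply(pap, args) {
  var all = pap._a.concat(args);
  var remaining = pap._x - args.length;
  if (remaining === 0) { return pap._s.apply(null, all); }
  return { _c: returnSelf, _s: pap._s, _x: remaining, _a: all };
}

// A single function serves every arity: the arguments after the
// first are read from the arguments object.
function applyAny(f) {
  var pap = f._c();                                    // partial application thunk
  var args = Array.prototype.slice.call(arguments, 1); // everything after f
  return HSRuntime_46doApply(pap, args);
}
var NHC_46Internal_46_95apply1_T = applyAny;
var NHC_46Internal_46_95apply2_T = applyAny;

// Hypothetical two-argument function that returns its arguments as a pair.
function pair_T(a, b) {
  return {
    _c: function () { return [this._a[1], this._a[2]]; },
    _a: [undefined, a, b]
  };
}
var unapplied = { _c: returnSelf, _s: pair_T, _x: 2, _a: [] };

var one = NHC_46Internal_46_95apply1_T(unapplied, "x"); // still partial
var two = NHC_46Internal_46_95apply1_T(one, "y");       // saturated
console.log(two._c().join(",")); // x,y
```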

Here's my attempt. I'm going to implement a Haskell to JavaScript compiler, based on the STG machine. This turned out to be not so easy a task, so I'd be happy to get some feedback.

This is an example translation of some Haskell functions to JavaScript. I'm trying to be descriptive, but if I'm not, please ask me or write your suggestions. I'm not quite sure if this code is really correct.

I've got some feedback from Edward Kmett. Edward claims that simple trampolining as used below is less efficient than Appel's trampoline. Trampolining is a trick to simulate tail calls. Simon Peyton Jones used the same technique as I did in my code (and Dimitry did too). The technique is simple: instead of calling a continuation, return it so that the caller can call it. A mini-interpreter is used for trampolining on the stack:

while (f = f());
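A small self-contained illustration of this mini-interpreter, using two mutually recursive functions invented for the example (isEven and isOdd are assumptions, not part of the compiler output). Each step returns the next function to run instead of calling it, so the stack does not grow.

```javascript
var result;

// Each function returns a continuation (or null when finished).
function isEven(n) {
  return function () {
    if (n === 0) { result = true; return null; }
    return isOdd(n - 1);
  };
}
function isOdd(n) {
  return function () {
    if (n === 0) { result = false; return null; }
    return isEven(n - 1);
  };
}

// The mini-interpreter: keep calling whatever comes back.
function trampoline(f) {
  while ((f = f())) { }
}

trampoline(isEven(100000)); // deep "recursion" without stack growth
console.log(result); // true
```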

It is efficient in Simon's version because of GNU C compiler tweaks (and is therefore not portable). But these are not available in JavaScript.

Using the same interpreter in JavaScript, I have to return a function to simulate a tail call, and call the interpreter again to simulate a normal call. This seems very inefficient, and Edward claims it is.

The trick is to transform the program to Continuation Passing Style. With this transformation we need no stack at all, so every call is a tail call. We can get rid of the interpreter and just call functions as usual in JavaScript. We can use a counter to count stack frames (function entries), and when we reach the limit, we don't call the continuation directly but register it as a callback on a timer event (and thus flush the stack). So we make longer jumps on the stack, and some people claim this is more efficient. Moreover, we get a framework for introducing parallel threads: a stack jump becomes a quantum for the thread.
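A rough sketch of this idea, assuming a helper makeJump that counts nested calls and a CPS function sumTo that adds the numbers from 1 to n. Both names, the limit of 200 and the use of setTimeout as the timer are assumptions made for the illustration.

```javascript
// Hypothetical helper: returns a jump function that calls continuations
// directly until `limit` nested calls have accumulated, then hands the
// continuation to `schedule` so the stack can unwind first.
function makeJump(limit, schedule) {
  var depth = 0;
  return function jump(k, value) {
    depth += 1;
    if (depth < limit) {
      k(value);                             // ordinary call, stack grows
    } else {
      depth = 0;                            // flush the stack
      schedule(function () { k(value); });
    }
  };
}

// CPS sum of 1..n: every call is a tail call through jump.
function sumTo(n, acc, k, jump) {
  if (n === 0) {
    jump(k, acc);
  } else {
    jump(function (a) { sumTo(n - 1, a, k, jump); }, acc + n);
  }
}

var jump = makeJump(200, function (f) { setTimeout(f, 0); });
sumTo(10000, 0, function (total) { console.log(total); }, jump); // prints 50005000
```

Passing the scheduler as a parameter is only a convenience of this sketch: it lets the same code run with a timer in a browser or with a plain queue elsewhere.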