Just a quick update this time. scxml-js is moving right along, as I’ve been adding support for new features at a rate of about one feature per day. Today I reached an interesting milestone: scxml-js is now as featureful as the old SCCJS compiler which I had previously been using in my research. This means that I can now begin porting the demos and prototypes I constructed using SCCJS to scxml-js, as well as begin creating new ones.

New Demos

Here are two new, simple demos that illustrate how scxml-js may be used to describe and implement the behaviour of web User Interfaces (tested in recent Firefox and Chromium; they will definitely not work in IE due to their use of XHTML):

Both examples use state machines to describe and implement drag-and-drop behaviour of SVG elements. The first example is interesting, because it illustrates how HTML, SVG, and SCXML can be used together in a single compound document to declaratively describe UI structure and behaviour. The second example illustrates how one may create state machines and DOM elements dynamically and procedurally using JavaScript, as opposed to declaratively using XML markup. In this example, each dynamically-created element will have its own state machine, hence its own state.
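At its core, the drag-and-drop behaviour in both demos is a tiny state machine with idle and dragging states. As a rough illustration of the idea, here is a hand-rolled sketch with hypothetical state and event names, not the code scxml-js generates:

```javascript
// Minimal hand-rolled drag-and-drop state machine (illustrative only;
// scxml-js generates richer code from an SCXML document).
function makeDragMachine(onMove) {
    var state = "idle";
    return {
        getState: function () { return state; },
        dispatch: function (event) {
            switch (state) {
                case "idle":
                    if (event.name === "mousedown") state = "dragging";
                    break;
                case "dragging":
                    if (event.name === "mousemove") onMove(event.dx, event.dy);
                    else if (event.name === "mouseup") state = "idle";
                    break;
            }
        }
    };
}
```

In the second demo, each dynamically-created SVG element would own one such machine instance, with the onMove callback translating that element by (dx, dy).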

I think the code in these examples is fairly clean and instructive, and should give a good sense regarding how scxml-js may ultimately be used as a finished product.

Here’s another quick update on the status of my Google Summer of Code project.

Finished porting IR-compiler and Code Generation Components to XSLT

As described in the previous post, I finished porting the IR-compiler and Code Generation components from E4X to XSLT.

Once I had this working with the Java XML transformation APIs under Rhino, I followed up with the completion of two related subtasks:

Get the XSL transformations working in-browser, and across all major browsers (IE8, Firefox 3.5, Safari 5, Chrome 5 — Opera still to come).

Create a single consolidated compiler front-end, written in JavaScript, that works in both the browser and in Rhino.

Cross-Browser XSL Transformation

Getting all XSL transformations to work reliably across browsers was something I expressed serious concerns about in my previous post. Indeed, this task posed some interesting challenges, and motivated certain design decisions.

The main issue I encountered was that support for xsl:import, when XSL stylesheets are invoked from JavaScript, is poor in most browsers. xsl:import works well in Firefox, but is currently broken in WebKit and WebKit-based browsers (see here for the Chrome bug report, and here for the WebKit bug report). I also had only limited success with it in IE 8.

I considered several possible solutions to work around this bug.

First, I looked into a pure JavaScript solution. In my previous post, I linked to the Sarissa and AJAXSLT libraries. A common task of JavaScript libraries is to abstract away browser differences, so the fact that several libraries existed which appeared to do just that for XSLT gave me a degree of confidence when I initially chose XSLT as a primary technology with which to implement scxml-js. Unfortunately, on closer inspection in this development cycle, I found that Sarissa, AJAXSLT, and all the other libraries designed to abstract away cross-browser XSLT differences (including Javeline and the jQuery XSL transform plugin) are not actively maintained. As web browsers are rapidly moving targets, maintenance is a major concern when selecting a library dependency. A pure JavaScript solution therefore did not appear feasible, which left me to get the XSL transformations working using just the “bare metal” of the browser.
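For reference, that “bare metal” differs by browser: Gecko and WebKit expose the XSLTProcessor object, while IE goes through MSXML ActiveX objects. A sketch of the kind of feature detection this entails (the function and its return values are hypothetical, not scxml-js code; the global properties probed are real browser APIs):

```javascript
// Decide which native XSLT mechanism is available on a given global
// object. The global is taken as a parameter so the logic can be
// exercised outside a browser; in real code you would pass `window`.
function detectXsltEngine(global) {
    if (typeof global.XSLTProcessor !== "undefined") {
        return "xsltprocessor";   // Firefox, Safari, Chrome, Opera
    }
    if (typeof global.ActiveXObject !== "undefined") {
        return "msxml";           // IE, via e.g. MSXML2.XSLTemplate
    }
    return null;                  // no native XSLT support found
}
```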

My next attempt was to use some clever DOM manipulation to work around the WebKit bug. In that bug, xsl:import fails because frameless resources cannot load other resources; loading the SCXML document on its own in Chrome, with an xml-stylesheet processing instruction pointing to the code generation stylesheet, did generate code correctly. My idea, then, was to use the DOM to create an invisible iframe, load into it the SCXML document to transform (along with the requisite processing instruction), and read the transformed JavaScript back out. I actually had some success with this, but it proved to be a brittle solution: I was able to get it to work, but not reliably, and it was difficult to know when and how to read the transformed JavaScript out of the iframe. In any case, my attempts at this can be found in this branch here.

My final, and ultimately successful, attempt was to use XSL to preprocess the stylesheets that used xsl:import, combining the stylesheet contents while still respecting the semantics of xsl:import. This was not too difficult, and only took a bit of effort to debug. You can see the results here. Note that there may be some corner cases of XSLT that are not handled by this script, but it works well for the existing scxml-js code generation backends.

One thing that must still be done, given this solution, is to incorporate this stylesheet preprocessing into the build step. For the moment, I have done the quick and dirty thing, which was to check the preprocessed stylesheets into SVN.

It’s interesting to note that IE 8 was the easiest browser to work with in this cycle, as it provided useful and meaningful error messages when XSL transformations failed. By contrast, Firefox would return a cryptic error message without much useful information, and Safari/Chrome would provide no error message at all, instead failing silently in the XSLT processor and returning undefined.

Consolidated Compiler Front-end

As I described in my previous post, a thin front-end to the XSL stylesheets was needed. To run inside the browser, the front-end would need to be written in JavaScript. It would have been possible, however, to write a separate front-end in a different language (bash, Java, or anything else) for running outside of the browser. A design decision needed to be made, then, regarding how the front-end should be implemented:

Implement one unified front-end, written in JavaScript, which relies on modules that provide portable APIs, with implementations of those APIs that vary between environments.

Implement multiple front-ends, for browser and server environments.

I decided that, with respect to maintainability, it would be easier to maintain one front-end, written in one language, rather than two front-ends in different languages, and so I chose the first option. This worked well, but I’m not yet completely happy with the result, as I have code for Rhino and code for the browser mixed together in the same module. This means that code for Rhino is downloaded to the browser, even though it is never called (see Transformer.js for an example of this). The same is true for code that targets IE versus other browsers. I believe I’ve thought of a way to use RequireJS to selectively download platform-specific modules, and this is an optimization I’ll make in the near future.
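The selection logic behind such platform-specific modules might be sketched as follows (the module names and environment flags here are hypothetical stand-ins, not the actual scxml-js modules):

```javascript
// Pick a platform-specific implementation of a portable API based on
// a description of the environment. The probe results are passed in
// as a parameter so the selection logic itself stays portable.
function selectTransformerModule(env) {
    if (env.hasDom && env.hasXsltProcessor) {
        return "transformer-browser";   // Firefox, Safari, Chrome, Opera
    }
    if (env.hasDom) {
        return "transformer-ie";        // IE: DOM present, no XSLTProcessor
    }
    if (env.hasJavaPackages) {
        return "transformer-rhino";     // Rhino: Java XML transformation APIs
    }
    throw new Error("No XSLT transformer available for this environment");
}
```

With RequireJS, only the module named by this function would actually be downloaded, rather than shipping every platform’s code to every client.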

In-Browser Demo

The result of this work can be seen in this demo site I threw together:

This demo provides a very crude illustration of what a browser-based graphical user interface to the compiler might look like. It takes SCXML as input (top-most textarea), compiles it to JavaScript code (lower-left textarea, read-only), and then allows simulation from the console (bottom-right textarea and text input). For convenience, the demo populates the SCXML input textarea with the KitchenSink executable content example. I’ve tested it in IE8, Safari 5, Chrome 5, and Firefox 3.5; it works best in Chrome and Firefox. I haven’t been testing in Opera, but I’m going to start soon.

Future Work

The past three weeks were spent porting and refactoring, which was necessary to facilitate future progress, and now there’s lots to do going forward. My feeling is that it’s now time to get back to the main work, which is adding important features to the compiler, starting with functionality still missing from the current implementation of the core module:

I’m going to be presenting this work at the SVG Open 2010 conference at the end of August, so I’m also keen to prepare some new, compelling demos that will really illustrate the power of Statecharts on the web.

I’m two weeks into my Google Summer of Code project, and decided it was time to write the first update describing the work I’ve done, and the work I will do.

Project Overview

First, a quick overview of what my project is, what it does, and why one might care about it. The SCXML Code Generation Framework, JavaScript Edition project (SCXMLcgf/js) centers on the development of a particular tool, the purpose of which is to accelerate the development of rich Web-based User Interfaces. The idea behind it is that there is a modelling language, called Statecharts, which is very good at describing the dynamic behaviour of objects, and which can be used to describe rich UI behaviour as well. The tool I’m developing, then, is a Statechart-to-JavaScript compiler, which takes as input Statechart models in the form of SCXML documents, and compiles them to executable JavaScript code, which can then be used in the development of complex Web UIs.

I’m currently developing this tool under the auspices of the Apache Foundation during this year’s Google Summer of Code. For more information on it, you could read my GSoC project proposal here, or even check out the code here.

Week 1 Overview

As I said above, I’m now two weeks into the project. I had already done some work on this last semester, so I’ve been adding in support for additional modules described in the SCXML specification. In Week 1, I added basic support for the Script Module. I wrote some tests for this, and it seemed to work well, so I checked it in.

Difficulties with E4X

I had originally written SCXMLcgf/js entirely in JavaScript, targeting the Mozilla Rhino JavaScript implementation. One feature that Rhino offers is the E4X language extension to JavaScript. E4X was fantastic for rapidly developing my project. It was particularly useful over standard JavaScript in providing an elegant syntax for templating (multiline strings with embedded parameters, and regular JavaScript scoping rules), for queries against the XML document structure (very similar to XPath), and for easy manipulation of that structure.

These language features allowed me to write my compiler in a very declarative style: I would execute transformations on the input SCXML document, then query the resulting structure and pass it into templates which generated code in a top-down fashion. I leveraged E4X’s language features heavily throughout the project, and was very productive.
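For readers unfamiliar with E4X, it adds XML literals and XPath-like query operators directly to the language. A small illustrative snippet (this runs only under engines that implement ECMA-357, such as Rhino, and is not valid standard JavaScript):

```javascript
// E4X (Rhino only): an XML literal with an embedded expression, and an
// XPath-like query using the .. descendant and @ attribute operators.
var name = "s1";
var state = <state id={name}>
                <transition event="t" target="s2"/>
            </state>;
print(state..transition.@target);   // prints "s2"
```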

Unfortunately, during Week 1, I ran into some difficulties with E4X. There was some weirdness involving namespaces, and some involving scoping. This wasn’t entirely surprising, as the Rhino implementation of E4X has not always felt very robust to me. Right out of the box, there is a bug that prevents one from parsing XML files with XML declarations, and I have encountered other problems as well. In any case, I lost an afternoon to this problem, and decided that I needed to begin to remove SCXMLcgf/js’s E4X dependencies sooner rather than later.

I had known that it would eventually be necessary to move away from E4X for portability reasons, as it would be desirable to run SCXMLcgf/js in the browser environment, including in non-Mozilla browsers. There are a number of reasons for this, including the possibility of using the compiler as a JIT compiler, and the possibility of providing a browser-based environment for Statechart development. Given the problems I had had with E4X in Week 1, I decided to move this task up in my schedule and deal with it immediately.

So, for Week 2, I’ve been porting most of my code to XSLT.

Justification for Targeting XSLT

At the beginning of Week 2, I knew I needed to migrate away from E4X, but it wasn’t clear what the replacement technology should be. So, I spent a lot of time thinking about SCXMLcgf/js, its architecture, and the requirements that this imposes on the technology.

The architecture of SCXMLcgf/js can be broken into three main components:

Front End: Takes in arguments, possibly passed in from the command-line, and passes these in as options to the IR Compiler and the Code Generator.

IR Compiler: Analyzes the given SCXML document, and creates an Intermediate Representation (IR) that is easy to generate code from.

Code Generator: Generates code from a given SCXML IR. May have multiple backend modules that target different programming languages (it currently only targets JavaScript), and different Statechart implementation techniques (it currently targets three different techniques).
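Wiring the three components together amounts to a small pipeline. A sketch, with hypothetical function names standing in for the real modules (which are XSLT stylesheets driven by a JavaScript front-end):

```javascript
// Hypothetical wiring of the three components. The IR Compiler and
// Code Generator are passed in as functions so the front-end can swap
// in different backends via the options.
function compile(scxmlSource, options, irCompile, generateCode) {
    var ir = irCompile(scxmlSource, options);   // analyze and annotate
    return generateCode(ir, options);           // emit target-language code
}
```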

My goal for Week 2 was just to eliminate E4X dependencies in the Code Generator component. The idea behind this component is that its modules should only be used for templating. The primary goal of these template modules is that they should be easy to read, understand, and maintain. In my opinion, this means that templates should not contain procedural programming logic.

Moreover, I came up with other precise feature requirements for a templating system, based on my experience from the first implementation of SCXMLcgf/js:

must be able to run under Rhino or the browser

multiline text

variable substitution

iteration (loops)

if/else blocks

Mechanisms to facilitate Don’t Repeat Yourself (DRY)

Something like function modularity, where you separate templates into named regions.

Something like inheritance, where a template can import other templates, and override functionality in the parent template.

Because I’m very JavaScript-oriented, I first looked into templating systems implemented in JavaScript. JavaScript templating systems are more plentiful than I had expected. Unfortunately, I did not find any that fulfilled all of the above requirements. I won’t link to any, as I ultimately chose not to go down this route.

A quick survey of XSLT, however, indicated that it supports all of the above functionality. This left me to consider XSLT, the only other programming language that enjoys good cross-browser support.

I was pretty enthusiastic about this, as I had never used XSLT before, but had wanted to learn it for some time. Nevertheless, I had several serious concerns about targeting XSLT:

1. How good is the cross-browser support for XSLT?

2. I’m a complete XSLT novice. How much overhead will be required before I can begin to be productive with it?

3. Is XSLT going to be ridiculously verbose (do I have to wrap all non-XML text in an <xsl:text/> element)?

4. Is there good free tooling for XSLT?

5. A lower-priority concern: I wanted to keep down the number of languages the project depends on (it would be nice to focus on just one), and I wasn’t sure about XSLT’s expressive power. Would it be possible to port the IR-Compiler component to XSLT as well?

I did an initial review of XSLT. I found parts of it to be confusing (like how and when the context node changes; the difference between apply-templates with and without the select attribute; etc.), but decided the risk was low enough that I could dive in and begin experimenting with it. As it turned out, it didn’t take long before I was able to be productive with it.

On the verbosity question: text node children of an <xsl:template/> are simply echoed out. This is well-formed XML, though I’m not sure whether it’s strictly legal XSLT. In any case, it works well and looks good.

On tooling, the situation was pretty bad. The best graphical debugger I found was KXSLdbg for KDE 3. I also tried the XSLT debugger for Eclipse Web Tools, and found it to be really lacking. In the end, I mostly just used <xsl:message/> nodes as printfs during development, which was slow and awkward. This part of XSLT development could definitely use some improvement.

I’ll say more about the fifth concern, porting the IR-Compiler to XSLT, in a moment.

XSLT Port of Code Generator and IR-Compiler Components

I started to work on the XSLT port of the Code Generator component last Saturday, and had it completed by Tuesday or Wednesday. This actually turned out not to be very difficult, as I had already written my E4X templates in a very XSLT-like style: top-down, primarily using recursion and iteration. There was some procedural logic in there which needed to be broken out, so there was some refactoring to do, but it wasn’t too difficult.

When hooking everything up, though, I found another problem with E4X, which was that putting the Xalan XSLT library on the classpath caused E4X’s XML serialization to stop working correctly. Specifically, namespaced attributes would no longer be serialized correctly. This was something I used often when creating the IR, so it became evident that it would be necessary to port the IR Compiler component in this development cycle as well.

Again, I had to weigh my technology choices. This component involved some analysis, and transformation of the given SCXML document to include this extra information. For example, for every transition, the Least Common Ancestor state is computed, as well as the states exited and the states entered for that transition.
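The LCA computation itself is a small walk up the state hierarchy. A sketch in plain JavaScript, over a hypothetical parent-pointer representation rather than the actual IR:

```javascript
// Least Common Ancestor of two states, where each state object carries
// a `parent` pointer (null at the root). The LCA determines which
// states a transition exits and enters.
function leastCommonAncestor(a, b) {
    var ancestors = [];
    for (var s = a; s; s = s.parent) ancestors.push(s);   // path a -> root
    for (var t = b; t; t = t.parent) {                    // first shared node
        if (ancestors.indexOf(t) !== -1) return t;
    }
    return null;   // no common ancestor (cannot happen within one chart)
}
```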

I was doubtful that XSLT would be able to do this work, or that I would have sufficient skill to program it, so I initially began porting this component to use the DOM for transformation and XPath for querying. However, this quickly proved not to be a productive approach, and I decided to try XSLT instead. I don’t have too much to say about this, except to observe that, even though development was often painful due to the lack of a good graphical debugger, it was ultimately successful, and the resulting code doesn’t look too bad. In most cases, I think it’s quite readable and elegant, and I think it will not be difficult to maintain.

Updating the Front End

The last thing I needed to do, then, was update the Front End to match these changes. At this point, I was in the interesting situation of having all of my business logic implemented in XSLT. I really liked the idea of having a very thin front-end: ideally little more than a couple of command-line invocations of an XSLT processor.

There would be a bit more to it than that, as there would need to be some logic for command-line parsing, but this would also mostly eliminate the Rhino dependency in my project (“mostly” because the code still uses js_beautify as a JavaScript code beautifier, and the build and performance analysis systems are still written in JavaScript). This approach also makes it very clear where the main programming logic is located.

In the interest of saving time, however, I decided to continue to use Rhino for the front end, and to use the Java SAX APIs for processing the XSLT transformations. I’m not terribly happy with these APIs, and I think Rhino may be making the system perceptibly slower, so I’ll probably move to the thin front end at some point. But right now this approach works and passes all unit tests, so I’m fairly happy with it.

Future Work

I’m not planning to check this work into the Apache SVN repository until I finish porting the other backends, clean things up, and re-figure out the project structure. I’ve been using git and git-svn for version control, though, which has been useful and interesting (this may be the subject of another blog post). After that, I’ll be back onto the regular schedule of implementing modules described in the SCXML specification.