Getting a C++ game engine running in JavaScript, using WebGL for rendering, is a massive feat, made possible largely by the toolchain that Mozilla has developed.

Since the release of the Unreal Engine 3 port to Asm.js I’ve been watching the response on Twitter, blogs, and elsewhere. While some developers are grasping the interesting confluence of open technologies that’ve made this advancement happen, I’ve also seen a lot of confusion: Is Asm.js a plugin? Does Asm.js make my regular JavaScript fast? Does this work in all browsers? I feel that Asm.js, and related technologies, are incredibly important, and I want to explain the technology so that developers know what’s happened and how they will benefit. In addition to my brief exploration of this subject I’ve also asked David Herman (Senior Researcher at Mozilla Research) a number of questions regarding Asm.js and how all the pieces fit together.

What is Asm.js?

In order to understand Asm.js and where it fits into the browser you need to know where it came from and why it exists.

Asm.js comes from a new category of JavaScript application: C/C++ applications that’ve been compiled into JavaScript. It’s a whole new genre of JavaScript application, spawned by Mozilla’s Emscripten project.

Emscripten takes in C/C++ code, passes it through LLVM, and converts the LLVM-generated bitcode into JavaScript (specifically, Asm.js, a subset of JavaScript).

If the compiled Asm.js code is doing some rendering then it is most likely being handled by WebGL (and rendered using OpenGL). In this way the entire pipeline is technically making use of JavaScript and the browser but is almost entirely skirting the actual, normal, code execution and rendering path that JavaScript-in-a-webpage takes.

Asm.js is a subset of JavaScript that is heavily restricted in what it can do and how it can operate. These restrictions mean that an engine doesn’t have to make the guesses it normally would about a dynamic language; it can convert the Asm.js code directly into assembly and run it as fast as possible. It’s important to note that Asm.js is just JavaScript – there is no special browser plugin or feature needed in order to make it work (although a browser that is able to detect and optimize Asm.js code will certainly run it faster). It’s a specialized subset of JavaScript that’s optimized for performance, especially for this use case of applications compiled to JavaScript.

The best way to understand how Asm.js works, and its limitations, is to look at some Asm.js-compiled code. Let’s look at a function extracted from a real-world Asm.js-compiled module (from the BananaBread demo). I formatted this code so that it’d be a little bit saner to digest – it’s normally just a giant blob of heavily-minimized JavaScript:
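Since I can’t reproduce the whole module here, the following is a hand-written sketch in the same style as Emscripten’s output. The module wrapper and the identifiers (c, g, e, f, Vb) are my own, chosen to match the discussion below – this is illustrative, not the actual BananaBread code:

```javascript
// Illustrative only -- a hand-written module in the style of Emscripten
// output, not the actual BananaBread code.
function SketchModule(stdlib, foreign, heap) {
  "use asm";
  var c = new stdlib.Int32Array(heap);   // the heap viewed as 32-bit ints
  var g = new stdlib.Float32Array(heap); // the same heap viewed as 32-bit floats
  function Vb(d) {
    d = d | 0;               // coerce the argument to an integer
    var e = 0, f = 0;
    e = c[d >> 2] | 0;       // integer load: `| 0` marks the result as an int
    f = e | 0;
    g[(d + 4) >> 2] = +(f | 0) * 0.5; // float store: `+(...)` yields a double
    return f | 0;            // the return value is coerced to an integer too
  }
  return { Vb: Vb };
}
```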

Technically this is JavaScript code but we can already see that this looks nothing like most DOM-using JavaScript that we normally see. A few things we can notice just by looking at the code:

This particular code only deals with numbers. In fact this is the case for all Asm.js code: it’s only capable of handling a selection of different number types and no other data types (not strings, booleans, or objects).

All external data is stored and referenced from a single object, called the heap. Essentially this heap is a massive array (intended to be a typed array, which is highly optimized for performance). All data is stored within this array – effectively replacing global variables, data structures, closures, and any other forms of data storage.

When accessing and setting variables the results are consistently coerced into a specific type. For example f = e | 0; sets the variable f to equal the value of e but also ensures that the result will be an integer (| 0 does this, converting a value into an integer). We also see this happening with floats – note the use of 0.0 and g[...] = +(...);.

Looking at the values coming in and out of the data structures, it appears as if the data structure represented by the variable c is an Int32Array (storing 32-bit integers; the values are always converted from or to an integer using | 0) and g is a Float32Array (storing 32-bit floats; the values are always converted to a float by wrapping them with +(...)).
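This view mechanism is plain JavaScript, so it’s easy to demonstrate standalone (this snippet is illustrative, not from the BananaBread source):

```javascript
// Two typed-array views ("lenses") over a single ArrayBuffer: the same
// underlying bytes, interpreted as different number types.
var heap = new ArrayBuffer(8);  // 8 bytes of raw storage
var c = new Int32Array(heap);   // ...viewed as two 32-bit integers
var g = new Float32Array(heap); // ...viewed as two 32-bit floats

g[0] = 1.5;        // write a float through one view...
console.log(c[0]); // prints 1069547520 (0x3fc00000), the bit pattern of 1.5f
```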

By doing this the result is highly optimized and can be converted from this Asm.js syntax directly into assembly without having to interpret it, as one would normally have to do with JavaScript. It effectively shaves off a whole bunch of things that can make a dynamic language like JavaScript slow, like the need for garbage collection and dynamic types.

As an example of some more-explanatory Asm.js code let’s take a look at an example from the Asm.js specification:
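The example in question is the spec’s DiagModule, which computes the length of a vector; it looks roughly like this (reproduced from memory, lightly reformatted – check the specification for the canonical version):

```javascript
function DiagModule(stdlib, foreign, heap) {
  "use asm";

  // Variable Declarations
  var sqrt = stdlib.Math.sqrt;

  // Function Declarations
  function square(x) {
    x = +x;
    return +(x * x);
  }

  function diag(x, y) {
    x = +x;
    y = +y;
    return +sqrt(square(x) + square(y));
  }

  return { diag: diag };
}
```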

This module seems downright understandable! Looking at it we can better understand the structure of an Asm.js module. A module is contained within a function and starts with the "use asm"; directive at the top. This gives the interpreter the hint that everything inside the function should be handled as Asm.js and be compiled to assembly directly.

Note, at the top of the function, the three arguments: stdlib, foreign, and heap. The stdlib object contains references to a number of built-in math functions. foreign provides access to custom user-defined functionality (such as drawing a shape in WebGL). And finally heap gives you an ArrayBuffer which can be viewed through a number of different lenses, such as Int32Array and Float32Array.

The rest of the module is broken up into three parts: variable declarations, function declarations, and finally an object exporting the functions to expose to the user.

The export is an especially important point to understand as it allows all of the code within the module to be handled as Asm.js but still be made usable by other, normal, JavaScript code. Thus you could, theoretically, have some code that looks like the following, using the above DiagModule code:
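A hypothetical sketch of that calling code (the DiagModule body is inlined so the snippet stands alone):

```javascript
function DiagModule(stdlib) {
  "use asm";
  var sqrt = stdlib.Math.sqrt;
  function square(x) { x = +x; return +(x * x); }
  function diag(x, y) { x = +x; y = +y; return +sqrt(square(x) + square(y)); }
  return { diag: diag };
}

// Regular JavaScript code linking the module and calling its export --
// this could just as easily happen inside a click handler:
var diag = DiagModule({ Math: Math }).diag;
console.log(diag(3, 4)); // prints 5
```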

This would result in an Asm.js DiagModule that’s handled specially by the JavaScript interpreter but is still made available to other JavaScript code (thus we could still access it and use it within a click handler, for example).

What is the performance like?

Right now the only implementation that exists is in nightly versions of Firefox (and even then, for only a couple of platforms). That being said, early numbers show the performance being really, really good. For complex applications (such as the above games) performance is only around 2x slower than normally-compiled C++ (which is comparable to other languages like Java or C#). This is substantially faster than current browser runtimes: about 4-10x faster than the latest Firefox and Chrome builds.

This is a substantial improvement over the current best case. Considering how early Asm.js is in its development, it’s very likely that even greater performance improvements are coming.

It is interesting to see such a large performance chasm opening between Asm.js and the current engines in Firefox and Chrome. A 4-10x performance difference is substantial (in the realm of comparing those browsers to the performance of IE 6). Interestingly, even with this performance difference, many of the Asm.js demos are still usable on Chrome and Firefox, which is a good indicator of the current state of JavaScript engines. That being said, their performance is simply not as good as that offered by a browser capable of optimizing Asm.js code.

Use Cases

It should be noted that almost all of the applications targeting Asm.js right now are C/C++ applications compiled to Asm.js using Emscripten. With that in mind, the kinds of applications that will target Asm.js in the near future are those that benefit from the portability of running in a browser but have a level of complexity for which a direct port to JavaScript would be infeasible.

So far most of the use cases have centered around code bases where performance is of the utmost importance: games, graphics, programming language interpreters, and libraries. A quick look through the Emscripten project list shows many projects that will be of instant use to many developers.

A number of game engines have already been ported. A good demo of what is possible is the BananaBread FPS Game (Source Code) which is playable directly in the browser and features multiplayer and bots.

LaTeX has also been ported to JavaScript, as texlive.js, using Emscripten – allowing you to compile PDFs completely within your browser.

Asm.js Support

It’s important to emphasize that Asm.js-formatted JavaScript code is still just JavaScript code, albeit with an important set of restrictions. For this reason Asm.js-compiled code can still run in other browsers as normal JavaScript code, even if the browser doesn’t optimize it.

The critical puzzle piece is the performance of that code: if a browser doesn’t support typed arrays or doesn’t specially compile the Asm.js code then the performance is going to be much worse. Of course this isn’t unique to Asm.js; any browser lacking those features is likely suffering in other ways as well.

Asm.js and Web Development

As you can probably see from the code above, Asm.js isn’t designed to be written by hand. It’s going to require some sort of tooling to write, and it’s going to require some rather drastic changes from how one would normally write JavaScript. The most common use case for Asm.js right now is applications compiled from C/C++ to JavaScript. Almost none of these applications interact with the DOM in a meaningful way, beyond using WebGL and the like.

In order for it to be usable by regular developers there are going to have to be some intermediary, more user-accessible languages that can compile to Asm.js. The best candidate at the moment is LLJS, for which work is starting on compiling it to Asm.js. It should be noted that a language like LLJS is still going to be quite different from regular JavaScript and will likely confuse many JavaScript users. Even with a nicer, more accessible language like LLJS, it’s likely that it’ll only be used by hardcore developers who want to optimize extremely complex pieces of code.

Even with LLJS, or some other language, that could allow for more hand-written Asm.js code we still wouldn’t have an equally-optimized DOM to work with. The ideal environment would be one where we could compile LLJS code and the DOM together to create a single Asm.js blob which could be executed simultaneously. It’s not clear to me what the performance of that would look like but I would love to find out!

Q&A with David Herman

I sent some questions to David Herman (Senior Researcher at Mozilla Research) to try and get some clarification on how all the pieces of Asm.js fit together and how they expect users to benefit from it. He graciously took the time to answer the questions in-depth and provided some excellent responses. I hope you find them to be as illuminating as I did.

What is the goal of Asm.js? Who do you see as the target audience for the project?

Our goal is to make the open web a compelling virtual machine, a target for compiling other languages and platforms. In this first release, we’re focused on compiling low-level code like C and C++. In the longer run we hope to add support for higher-level constructs like structured objects and garbage collection. So eventually we’d like to support applications from platforms like the JVM and .NET.

Since asm.js is really about expanding the foundations of the web, there’s a wide range of potential audiences. One of the audiences we feel we can reach now is game programmers who want access to as much raw computational power as they can get. But web developers are inventive and they always find ways to use all the tools at their disposal in ways no one predicts, so I have high hopes that asm.js will become an enabling technology for all sorts of innovative applications I can’t even imagine.

Does it make sense to create a more user-accessible version of Asm.js, like an updated version of LLJS? What about expanding the scope of the project beyond just a compiler target?

In my opinion, you generally only want to write asm.js by hand in a very narrow set of instances, like any assembly language. More often, you want to use more expressive languages that compile efficiently to it. Of course, when languages get extremely expressive like JavaScript, you lose predictability of performance. (My friend Slava Egorov wrote a nice post describing the challenges of writing high-performance code in high-level languages.) LLJS aims for a middle ground — like a C to asm.js’s assembly — that’s easier to write than raw asm.js but has more predictable performance than regular JS. But unlike C, it still has smooth interoperability with regular JS. That way you can write most of your app in dynamic, flexible JS, and focus on only writing the hottest parts of your code in LLJS.

There is talk of a renewed performance divide between browsers that support Asm.js and browsers that don’t, similar to what happened during the last JavaScript performance race in 2008/2009. Even though technically Asm.js code can run everywhere in reality the performance difference will simply be too crippling for many cases. Given this divide, and the highly restricted subset of JavaScript, why did you choose JavaScript as a compilation target? Why JavaScript instead of a custom language or plugin?

First of all, I don’t think the divide is as stark as you’re characterizing it: we’ve built impressive demos that work well in existing browsers but will benefit from killer performance with asm.js.

It’s certainly true that you can create applications that will depend on the increased performance of asm.js to be usable. At the same time, just like any new web platform capability, applications can decide whether to degrade gracefully with some less compute-intensive fallback behavior. There’s a difference in kind between an application that works with degraded performance and an application that doesn’t work at all.

More broadly, keep in mind the browser performance race that started in the late 00’s was great for the web, and applications have evolved along with the browsers. I believe the same thing can and will happen with asm.js.

How would you compare Asm.js with Google’s Native Client? They appear to have similar goals while Asm.js has the advantage of “just working” everywhere that has JavaScript. Have there been any performance comparisons?

Well, Native Client is a bit different, since it involves shipping platform-specific assembly code; I don’t believe Google has advocated for that as a web content technology (as opposed to making it available to Chrome Web Store content or Chrome extensions), or at least not recently.

Portable Native Client (PNaCl) has a closer goal, using platform-independent LLVM bitcode instead of raw assembly. As you say, the first advantage of asm.js is compatibility with existing browsers. We also avoid having to create a system interface and repeat the full surface area of the web APIs as the Pepper API does, since asm.js gets access to the existing APIs by calling directly into JavaScript. Finally, there’s the benefit of ease of implementability: Luke Wagner got our first implementation of OdinMonkey implemented and landed in Firefox in just a few months, working primarily by himself. Because asm.js doesn’t have a big set of syscalls and APIs, and because it’s built off of the JavaScript syntax, you can reuse a whole bunch of the machinery of an existing JavaScript engine and web runtime.

We could do performance comparisons to PNaCl but it would take some work, and we’re more focused on closing the gap to raw native performance. We plan to set up some automated benchmarks so we can chart our progress compared with native C/C++ compilers.

Emscripten, another Mozilla project, appears to be the primary producer of Asm.js-compatible code. How much of Asm.js is being dictated by the needs of the Emscripten project? What benefits has Emscripten received now that improvements are being made at the engine level?

We used Emscripten as our first test case for asm.js as a way to ensure that it’s got the right facilities to accommodate the needs of real native applications. And of course benefiting Emscripten benefits everyone who has native applications they want to port — such as Epic Games, who we teamed up with to port the Unreal Engine 3 to the web in just a few days using Emscripten and asm.js.

But asm.js can benefit anyone who wants to target a low-level subset of JavaScript. For example, we’ve spoken with the folks who build the Mandreel compiler, which works similarly to Emscripten. We believe they could benefit from targeting asm.js just as Emscripten has started doing.

Alon Zakai has been compiling benchmarks that generally run around 2x slower than native, where we were previously seeing results anywhere from 5x to 10x or 20x of native. This is just in our initial release of OdinMonkey, the asm.js backend for Mozilla’s SpiderMonkey JavaScript engine. I expect to see more improvements in coming months.

How fluid is the Asm.js specification? Are you open to adding in additional features (such as more-advanced data structures) as more compiler authors begin to target it?

You bet. Luke Wagner has written up an asm.js and OdinMonkey roadmap on the Mozilla wiki, which discusses some of our future plans — I should note that none of these are set in stone but they give you a sense of what we’re working on. I’m really excited about adding support for ES6 structured objects. This would provide garbage-collected but well-typed data structures, which would help compilers like JSIL that compile managed languages like C# and Java to JavaScript. We’re also hoping to use something like the proposed ES7 value types to provide support for 32-bit floats, 64-bit integers, and hopefully even fixed-length vectors for SIMD support.

Is it possible, or even practical, to have a JavaScript-to-Asm.js transpiler?

Possible, yes, but practical? Unclear. Remember in Inception how every time you nested another dream-within-a-dream, time would slow down? The same will almost certainly happen every time you try to run a JS engine within itself. As a back-of-the-envelope calculation, if asm.js runs native code at half native speed, then running a JS engine in asm.js will execute JS code at half that engine’s normal speed.

Of course, you could always try running one JS engine in a different engine, and who knows? Performance in reality is never as clear-cut as it is in theory. I welcome some enterprising hacker to try it! In fact, Stanford student Alex Tatiyants has already compiled Mozilla’s SpiderMonkey engine to JS via Emscripten — all you’d have to do is use Emscripten’s compiler flags to generate asm.js. Someone with more time on their hands than me should give it a try…

At the moment all DOM/browser-specific code is handled outside of Asm.js-land. What about creating an Emscripten-to-Asm.js-compiled version of the DOM (akin to DOM.js)?

This is a neat idea. It may be a little tricky with the preliminary version of asm.js, which doesn’t have any support for objects at all. As we grow asm.js to include support for ES6 typed objects, something like this could become feasible and quite efficient!

A cool application of this would be to see how much of the web platform could be self-hosted with good performance. One of the motivations behind DOM.js was to see if a pure JS implementation of the DOM could beat the traditional, expensive marshaling/unmarshaling and cross-heap memory management between the JS heap and the reference-counted C++ DOM objects. With asm.js support, DOM.js might get those performance wins plus the benefits of highly optimized data structures. It’s worth investigating.

Given that it’s fairly difficult to write Asm.js, compared with normal JavaScript, what sorts of tools would you like to have to help both developers and compiler authors?

First and foremost we’ll need languages like LLJS, as you mentioned, to compile to asm.js. And we’ll have some of the usual challenges of compiling to the web, such as mapping generated code back to the original source in the browser developer tools, using technologies like source maps. I’d love to see source maps developed further to be able to incorporate richer debugging information, although there’s probably a cost/benefit balance to be struck between the pretty minimal source location information of source maps and super-complex debugging metadata formats like DWARF.

For asm.js, I think we’ll focus on LLJS in the near term, but I always welcome ideas from developers about how we can improve their experience.

I assume that you are open to working with other browser vendors, what has collaboration or discussion been like thus far?

Definitely. We’ve had a few informal discussions and they’ve been encouraging so far, and I’m sure we’ll have more. I’m optimistic that we can work with multiple vendors to get asm.js somewhere that we all feel we can realistically implement without too much effort or architectural changes. As I say, the fact that Luke was able to implement OdinMonkey in a matter of just a few months is very encouraging. And I’m happy to see a bug on file for asm.js support in V8.

More importantly, I hope that developers will check out asm.js and see what they think, and provide their feedback both to us and other browser vendors.

John, thank you for this article. I have been wondering about the usefulness of asm.js outside of the normal C/C++ -> LLVM -> asm.js path. I thought it would be great if somebody like John Resig could write an article about this. And here I have it :-)

@Owen: Not really, no. It’s more akin to Assembly (Asm.js) -> C (LLJS). So you still won’t have much of the power that comes from a higher level language like JavaScript, but it’d be a lot better than attempting to write Asm.js by hand.

@Ruben: I disagree with him slightly – I feel that this is largely a tangential effort. Many of the optimizations that we’ve gotten because of the big Emscripten push have been quite beneficial – for example typed arrays. I think Asm.js is going to really shine once people start to build more individual utilities/modules that will benefit from the performance and can be used in an existing JavaScript application (rather than complete C/C++ applications dumped into an Asm.js blob).

Thank you for the article. There is a question that bothers me. JS has a very large community, but programs for asm.js need to be written in C/C++. In theory an application can be translated from C to JS painlessly, but in the real world that’s not likely to happen. So programmers who write something targeting asm.js will need to know both C and JS, and debug both. That will be confusing, especially at first. There can be a number of solutions, but I think it will take time to get used to them. So my question is, how much time do you think it will take for the community to adapt?

@Bryan: Native browser UI elements are so different from desktop UIs, with many elements missing (like highly configurable message boxes), that wxWidgets would probably need to write an HTML/JS/CSS library of its own (or a wrapper around an existing JS UI library) to sensibly render existing wx apps in the browser.

@Egor: That’s a big advantage to having an intermediary language like LLJS – it’ll help to make it easier for devs to get started generating Asm.js without having to write C/C++ and compile it using Emscripten.

@Zizzle: That’s certainly a valid concern. I’d also argue that we’ve been in this situation for a long time now. For example, I doubt the highly-compressed source code that powers Gmail is any more understandable than the code that powers one of the 3D game Asm.js demos. Openness is a huge accelerant to development, but thankfully we have communities like GitHub where people feel willing to open up their code in a way that is understandable and usable (so while I can’t figure out how the BananaBread demo works by looking at its compiled source, I can download the project from GitHub and understand it much better).

As a GWT programmer I am wondering whether the existence of asm.js will increase the competition, or whether Google will just begin to support the Java -> LLVM magic -> asm.js compilation path too. Sure, the current method does a _lot_ of optimization while compiling from Java to JS, but those charts are really impressive and make me feel that there is even more unexploited potential.

I’ve been thinking also about the widget toolkits on the desktop and in the browser: why aren’t buttons, menus, and other elements drawn directly on canvas? Recently there are more possibilities to have those WIMP elements rendered to an OpenGL surface (Qt, for example). As the Epic-Mozilla example shows, it is possible to create well-performing applications in statically typed high-level languages running in a browser using OpenGL.

So I’m really looking forward to the first widget toolkit libraries with support of asm.js. Or of course, maybe also GWT will support it.

@Roy Tinker: Not when you consider that they render via canvas and SVG. The Qt desktop widget library has been compiled to JavaScript via Emscripten. I’m not necessarily advocating this as a common use case, but that’s how you do it.

I’m a bit curious about the number of features that asm.js plans to support. Are they aiming to support the full spec of the language (like ANSI C or C++11)? Do they plan on supporting any GUI libraries – and if not, how do I do any non-OpenGL graphics? Will it provide access to the DOM to the C/C++ application?

Also, how do they handle threaded code? As far as I remember, ANSI C/C++11 does not provide any hardware threading. Do they support (or plan to support) POSIX threads? If they do, do they compile down to Web Workers? If so, what is the performance for threaded applications going to be like?

Will this finally open up a path for fast matrix computations in the browser? Matrix computations form the basis of most scientific and engineering applications. I haven’t seen any of the mathematical C/C++ libraries being ported to JS via Emscripten. Is this due to lack of interest or difficulty?

For all the talk of HTML5 and the hard work of many library teams like jQuery UI, the web’s UI is relatively atrocious compared to that available to desktop application developers. A simple example: a truly simple and user-friendly text editor. Something as basic as a real-time character-entry counter often still bogs down the user experience, despite the oft-hyped advances from the JavaScript engine race.

If UI elements like those from Qt, GTK+, etc. can be used on the web via easily implemented compiled-to-asm.js modules, web developers might finally get decent UI elements to play with!

“For complex applications (such as the above games) performance is only around 2x slower than normally-compiled C++ (which is comparable to other languages like Java or C#).”

Java has actually been beating C++ in speed for scientific computation for quite a while and in general is at parity everywhere else. (In some specific cases it’s a bit slower, in some it’s a bit faster.)

Although I have less information on C#, given its closeness to C++ (including the elimination of some of the hardest-to-optimize things in Java, like Java’s ‘everything is overridable’ inheritance model), I’d be shocked if Mono wasn’t reaching at least 80% of C++ speed.

Half the speed of C++ is pretty good for JavaScript, but it still doesn’t come close to touching these other languages.

So in my understanding then, let’s say, for example, a good image manipulation program written in C++ could be compiled to asm.js; that file could then be used on a website, where the exported object could be interacted with by other “non-asm” JavaScript. On a platform that supported asm.js the exported image manipulation would be blazingly fast, and on non-supporting platforms it would work but be slower?

Is that a generally correct assumption?
If so this could really be something, even for JS programmers who don’t want to learn C/C++.

…ECMAScript in TraceMonkey, V8, and Nitro generates x86/ARM code. Maybe there could be an ECMAScript subset which generates fast machine code for algorithmically intensive code – i.e. one that ignores the global object and prototypes, and is maybe typed.

And: ‘performance modules’ which would be written in, and parsed as, a subset of ES which is *extremely sympathetic to the machine-code-generating infrastructure* of the ECMAScript engines and thus generates efficient machine code, without being too closely tied to (specific) implementations.

I would love to see whether it is possible to produce asm.js-ready JavaScript from TypeScript. It’s probably the best high-level language for now that can be both dynamic and static. I suppose in TypeScript you could give the compiler the most important information to produce well-optimized asm.js code in the places where it needs to be optimized, and still write quick, flexible JavaScript where micro-performance doesn’t matter.
Also, one could serve asm.js code or normal JavaScript depending on browser support for asm.js.