Google has been working on Native Client (NaCl) and Portable Native Client (PNaCl) for a while now. Firefox recently announced and released asm.js. Here are a few random personal thoughts with no particular conclusion.
First off, a bunch of disclaimers: (1) I work on the Chrome team. (2) I’m speaking for myself, not for Google or the Chrome team. In fact, I’m pretty sure some other people on the Chrome team will 100% disagree with some of these opinions. So, these opinions/thoughts are my own and in no way represent Google or the Chrome team.

Second, let’s spell out what NaCl, PNaCl, PPAPI, NPAPI and asm.js are:

NaCl is a PPAPI plugin that allows untrusted native code to run in the browser securely. It only supports x86 code. It provides access to PPAPI APIs, sometimes called Pepper APIs.

PNaCl is the portable version of NaCl. Instead of being x86-only, it translates LLVM bitcode into code for the host processor.

NPAPI is the Netscape Plugin API that was developed in the mid 90s to allow browser plugins. It is massively insecure and is the #1 vector for browser exploits. All it defines is how a plugin can talk to a browser. Otherwise the plugin is free to do whatever it wants with your entire machine. An NPAPI plugin can read any file, call any OS function, do any networking or nasty things it wants.

PPAPI is a new plugin standard originally co-developed by Mozilla and Google but apparently Mozilla stopped participating. It provides *cross-platform* out of process APIs for most common system functions. In other words, it lets a plugin do things like audio, graphics, networking and file i/o in a cross platform way. This means a plugin developed for PPAPI should run on every OS and Browser that supports PPAPI. It would have to be recompiled but since the APIs are the same on all platforms it should work with no code changes. This is in contrast to NPAPI plugins which are specific to each OS since they themselves call OS level services.

The combination of PPAPI and PNaCl provides a 100% portable and secure way to run native code in a browser.

asm.js is the specification of a subset of JavaScript that can optionally be compiled directly into assembly language. It works by allocating a single large TypedArray, which it treats as the memory of a virtual computer. It then uses a small subset of JavaScript as an assembly language to manipulate the contents of that TypedArray. There’s a C/C++ compiler called Emscripten that will generate asm.js from C and C++.
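To make that concrete, here is a minimal hand-written sketch of the asm.js pattern (a hypothetical example, not real Emscripten output): a module marked with "use asm" treats a TypedArray view over one big ArrayBuffer as its memory, and uses `|0` coercions as integer type annotations.

```javascript
// Hypothetical hand-written asm.js-style module: sums n 32-bit ints
// stored at the start of the heap. It is still plain JavaScript, so it
// runs (just without the fast path) in any engine with TypedArrays.
function SumModule(stdlib, foreign, heap) {
  "use asm";
  var HEAP32 = new stdlib.Int32Array(heap);
  function sum(n) {
    n = n | 0;                        // parameter type annotation: int
    var total = 0;
    var i = 0;
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      // i << 2 is the byte offset; >> 2 turns it back into an index
      total = (total + (HEAP32[(i << 2) >> 2] | 0)) | 0;
    }
    return total | 0;
  }
  return { sum: sum };
}

var heap = new ArrayBuffer(0x10000);  // the module's entire "memory"
var ints = new Int32Array(heap);
ints[0] = 1; ints[1] = 2; ints[2] = 3;
var mod = SumModule({ Int32Array: Int32Array }, null, heap);
console.log(mod.sum(3)); // 6
```

An engine that recognizes the subset can compile `sum` ahead of time to machine code; any other engine just runs it as ordinary JavaScript.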

asm.js is an amazing hack. It’s a hack because it’s absolutely the worst idea if you want performance. C++->JavaScript->Asm. It’s amazing because it’s actually a solution that works in all browsers that support TypedArrays.

Here’s what I find funny about asm.js vs PNaCl. As more and more people try to use asm.js, they’ll find the size of their executables explodes. Instead of a bytecode it’s outputting source code, so the size of executables is HUGE. This will put pressure to just ship the bytecode and skip the JavaScript step. Hmmm, sounds like PNaCl.

Then there will be a push for other features like, say, pthreads, but pthreads cannot be implemented in JavaScript given its design constraints inside the browser. Maybe there is some creative way to remove those constraints by executing all the asm.js in a special type of “asm.js worker”: no direct access to the DOM, but shared memory between “asm.js workers”. Hmm, sounds like PNaCl.

The problem now is you’ll need to add support for graphics, audio, file i/o, keyboard and mouse input, networking, etc from workers in ways that meet the needs of these special “asm.js workers” and legacy C++ code. Hmmm, sounds like PPAPI.

I don’t know if we’ll actually get there. It just seems like asm.js and PNaCl are closer than people are admitting.

Well, sort of, in the most general sense, because Emscripten/Mandreel/asm.js/etc. on the one side and PNaCl on the other are both aiming to let people run C/C++-type code on the web, at native speeds. That’s true. And there are even some technical similarities: PNaCl uses LLVM, and so do Emscripten and Mandreel.

But regarding those details. First, code size: It turns out that C++ compiled to JS is actually of quite reasonable size. When you gzip it, like all large web-delivered content should be, it is similar to gzipped native code (http://mozakai.blogspot.com/2011/11/code-size-when-compiling-to-javascript.html ). Yes, when it is temporarily unzipped before being executed, it is quite large, but that should not matter much in practice. For example, Epic Citadel is a huge codebase, over a million lines of C++, and it works ok right now in JS, even before any special optimizations for that issue.

So I don’t see pressure here to add any binary format.

Second, regarding all the graphics, audio, etc. APIs that would be needed both on the main thread and on worker threads: those APIs are coming to workers anyhow, without any connection to asm.js. People just want to be able to do WebGL, IndexedDB, etc. in web workers, in normal JS that they write. That makes sense, so this is already being talked about (and perhaps even implemented) for at least IndexedDB, WebGL and canvas 2D, that I can recall offhand.

So again, there is no pressure here to do anything special for asm.js: asm.js is just JavaScript, and it will use the APIs that we provide for JavaScript, there is certainly no need for any new APIs for asm.js.

Last note: NaCl supports ARM, not just x86.

Erwin Coumans

Those thoughts seem ignorant because they ignore browser compatibility. You can be pretty sure that (P)NaCl is not going to be supported by iOS Safari or Microsoft Internet Explorer (without the use of plugins of course; don’t even mention plugins in this context please). This means (P)NaCl is just a Google Chrome solution. asm.js on the other hand requires just TypedArrays, which are pretty much universally supported, see http://caniuse.com/typedarrays
Aside from that, Google pissed off many early NaCl adopters by adding a last-minute requirement that NaCl apps need to be distributed through their Chrome Web Store, as if Google doesn’t trust NaCl to be secure enough to be distributed openly.


A positive thing about (P)NaCl is that it allows someone to create a cross-platform browser plugin (say Unity 3D) with an embedded (P)NaCl interpreter (sel_ldr) so that code can be distributed together with data. That is great for maintaining backwards compatibility.

My point isn’t that people will implement PNaCl. My point is that the pressure put on asm.js to provide all the things PNaCl already supplies will end up making asm.js nearly indistinguishable from PNaCl. It might take 5 years to get there. When we do, we’ll look back and find it sad that it took such a roundabout way to arrive at the same solution.

> For example, Epic Citadel is a huge codebase, over a million lines of C++, and it works ok right now in JS, even before any special optimizations for that issue.

It requires a build of Firefox that can address 3GB of RAM just to run something that runs on the lowest-end smartphones. I wouldn’t call that no pressure.

As for bringing those APIs to web workers: for one, I actually write the specs for some of those APIs, and Mozilla has resisted bringing them to web workers. For two, my point is people want to port their code. Their multi-threaded C/C++ code will not port without shared memory, something that is forbidden in the current JavaScript model. To get there will require more than just web APIs in workers.

Andre Weissflog

Uh, so much misinformation…

In my “proof-of-concept” demos of porting a non-trivial existing 3D engine to JS+WebGL and NaCl (not PNaCl yet), the (gzipped) download size of the JS+WebGL demos is actually significantly smaller than a single NaCl nexe. The JS+WebGL dragons demo (here: http://www.flohofwoe.net/demos/dragons_asmjs.html ) is about 560kByte gzipped; the corresponding NaCl exe is about 1.2MByte gzipped (didn’t try PNaCl yet), and since NaCl is currently only enabled for CWS, you’ll have to download the whole bundle at 2.7MB gzipped, while the asm.js demo starts rendering after the 560kByte JS download. The size of gzipped asm.js code is basically identical to the native 64-bit OSX demo compiled from the same code.

Even the speed difference isn’t that important if your code is WebGL-heavy, since then the implementation of the GL layer (and GPU fillrate) takes over, and the actual performance of “CPU code” isn’t that important anymore. For instance in this demo: http://www.flohofwoe.net/demos/dsomapviewer_asmjs.html, it doesn’t really matter whether you’re running in FF Nightly or Chrome Canary; Chrome performance even used to be faster in the past because the GL implementation seemed to be more efficient than FF’s.

It is actually true that asm.js and NaCl aren’t that different from a dev’s point of view, but it’s the reach that’s important. asm.js runs great on all browsers, NaCl/PNaCl only on Chrome. And Emscripten is mainly a sort-of LLVM back-end and C-runtime emulation, which is a much less complex piece of engineering than all of NaCl (I assume).

You also missed the point of the article entirely. My point isn’t that people will implement PNaCl. My point is that the pressure put on asm.js to provide all the things PNaCl already supplies will end up making asm.js nearly indistinguishable from PNaCl. It might take 5 years to get there. When we do, we’ll look back and find it sad that it took such a roundabout way to arrive at the same solution.

Andre Weissflog

It’s not *that* bad really. The only things that suck a little bit when doing an Emscripten port are the sorry state of audio and networking across browsers, because of the messy HTML5 APIs and slow standardization in those areas. Not having pthreads isn’t actually such a big deal if you’re moving the multithreading abstraction to a higher level, and there’s a move in game engines as well to get rid of excessive shared state in multithreaded code. NaCl had the call-on-main-thread restriction for a very long time, which also required major rearchitecting in a multithreaded game engine. Not having SIMD is a problem, but I was surprised how fast math-heavy code is in modern JS engines.

The only remaining points on my emscripten/HTML5/asm.js-wishlist are:

– WebAudio across all browsers with ONE common compressed-audio format
– WebRTC across all browsers (at least the Data Channel stuff)
– WebGL in IE (doh)
– make SIMD C/C++ compiler intrinsics actually generate SIMD code in JS engines “somehow”
– “transfer-of-ownership” for WebWorker data across all browsers
– expose more extensions in WebGL across browsers which reduce the number of calls into WebGL: instanced rendering, vertex array objects, heck, even introduce new WebGL-specific extensions. I think the high call overhead in WebGL (also in NaCl’s GL) compared to native GL apps is much more important than having 90% or 50% of CPU performance available…

Well, I am on an old laptop right now, with a total of 3GB of physical RAM and no swap, and it runs ok. As I mentioned, this is before any real attempt to optimize that; this can improve a lot without any binary format. And this is the largest app I can think of that would be relevant to port in the foreseeable future.

I don’t follow those API talks super-carefully, so I don’t know details of what is objected to and what is supported. Overall though I keep hearing that these APIs are in the works. The web needs them in general.

Regarding shared memory, that is a valid point – if we want pthreads-style code to run in asm.js, that would actually need some new API, which would require standardization work. There is some pressure regarding that coming from game companies. But this is the only new thing that would be needed specifically here.

It took just a few months to make the first version of asm.js optimizations in Firefox, which got to half of native performance, with just 1 engineer. And the web worker APIs are coming anyhow in parallel. This stuff is happening pretty fast.

Also, PNaCl is not a full, perfect solution (which you sort of imply, that the web needs to catch up to). It lacks dynamic linking, C++ exceptions, SIMD, and needs to speed up compilation quite a bit. The JS approach already has some of those things.

Erwin Coumans

My point is that implementing a system such as (P)NaCl that just works on a single browser is easy. Making it work across multiple browsers, through JavaScript, will take a long time indeed. As both systems mature, it is likely that they will share similar characteristics. It will likely take more than 5 years (possibly never) until the (P)NaCl solution works as cross-vendor as asm.js does.

Calling another solution a “hack” or a “round about way” is just defamation.

BrendanEich

“PPAPI is a new plugin standard originally co-developed by Mozilla and Google but apparently Mozilla stopped participating.”

That’s a loaded-to-the-point-of-falsehood way of describing what happened. We created plugin-futures@mozilla.org in 2004 and restarted NPAPI incremental evolution (along with browser competition). The archives are public. I encourage everyone here to read the thread containing Robert O’Callahan’s reply giving his doubts about the wisdom of even starting something like Pepper:

Mozilla never “started participating” beyond the discussion phase, and we parted company quickly for reasons given quite clearly. Please get this right.

/be

DigDug2k

“Mozilla has resisted bringing them to web workers.”

I’m pretty shocked to read this. Are you sure you don’t just mean “Mozilla didn’t like the API we proposed”? There are bugs tracking moving Canvas, WebGL, and IndexedDB to workers (Audio and Video are already there thanks to MediaStreams, I think?). None of them point to any proposed specs. Are there some?

In the meetings with Mozilla working out the proposals they pointed out that to implement WebGL in workers they’re going to have to re-architect their entire graphics infrastructure (to be more like Chrome’s). While they have plans to do that it’s not a priority.

But, back to the original point: neither of those proposals is compatible with porting C/C++ code, which is the pressure that asm.js brings to the table. PNaCl/PPAPI, at least for OpenGL ES 2.0, bring support in such a way that C/C++ code does not have to be changed. Shared memory, multiple contexts, and threads just work. Companies porting their code using asm.js will want the same features, and so there will be pressure to provide them. They can’t be provided to current workers because they are incompatible with JavaScript and the DOM, but they could be provided in some new special asm.js-only worker. Again, my point is that because of the pressure, if they go that direction they’ll basically be recreating PPAPI/PNaCl.

Aside from threads with shared state – which as I said earlier is a valid point – I don’t follow you at all. Once we can run GL, audio, sockets, IndexedDB, etc. from workers, what else is needed?

We have ported lots of C++ code, including several full game engines, and not hit any serious problems except for threads with shared state. If we had those rendering/audio/etc APIs in workers, then we could run all those same games in workers too.

I’m not suggesting PNaCl will ever be cross-vendor. I’m suggesting only that asm.js will end up being very similar to PNaCl.

asm.js is a hack. Even some of its proponents call it a hack. It’s like saying “hey, I wrote Photoshop in bash script”. It would be an amazing hack. That wouldn’t mean it’s not a hack.

As one example, to manipulate a string in asm.js you poke bytes into a TypedArray. Then, to get them into the DOM, you need asm.js code that converts them to Unicode code points and appends them one at a time to a JavaScript string, even though the whole thing is actually in JavaScript in the first place. If that’s not a hack I don’t know what is.
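A sketch of that round trip (illustrative only, not actual Emscripten glue code): the “C string” lives as bytes in the heap TypedArray, and has to be rebuilt character by character into a JavaScript string before the DOM can use it.

```javascript
// The compiled code's "memory": a C-style NUL-terminated string is
// just bytes in this TypedArray.
var heap = new ArrayBuffer(0x1000);
var HEAPU8 = new Uint8Array(heap);

// Write a C-style string into that memory.
var msg = "hello";
for (var i = 0; i < msg.length; i++) HEAPU8[i] = msg.charCodeAt(i);
HEAPU8[msg.length] = 0; // NUL terminator

// To hand the string to the DOM, glue code walks the bytes and appends
// them one code unit at a time to a real JavaScript string, even
// though everything here was JavaScript to begin with.
function heapToString(ptr) {
  var s = "";
  while (HEAPU8[ptr] !== 0) {
    s += String.fromCharCode(HEAPU8[ptr]);
    ptr = (ptr + 1) | 0;
  }
  return s;
}

console.log(heapToString(0)); // "hello"
```

(This sketch assumes ASCII; real glue code also has to decode UTF-8 into code points, which adds even more per-character work.)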

Erwin Coumans

Some people call PNaCL a hack.

Are you claiming that PNaCL is not a hack? Or do we just have two hacks, PNaCL and asm.js?

Once you can run all those things from workers and you write all the glue to convert from the JS APIs to C/C++ APIs, you basically have PNaCl. PNaCl is just that: C++ in workers with access to all the HTML5 APIs as C/C++ APIs, plus shared state. PNaCl just avoids the assembly-language-disguised-as-JavaScript part for a more direct route. Again, the point being that asm.js is more like PNaCl than not.

asm.js can do similar things to PNaCl, sure, and that will increase over time. That’s agreed to by everyone, I think. There are some shared *goals*.

But the implementation is very different, the timelines are different, the standardization story is different, etc., everything but the goals and that they both use LLVM somehow 😉

benoit_jacob

> In the meetings with Mozilla working out the proposals they pointed out that to implement WebGL in workers they’re going to have to re-architect their entire graphics infrastructure

That is absolutely not true; I wonder what these meetings were?

We could implement WebGL-on-Web-Workers in about 2 weeks effort now, if we wanted to prioritize that above everything else.

The only nontrivial part is that we don’t yet have the ability to cycle-collect on worker threads, and our WebGL implementation has a minor dependency on cycle collection. However, this is a small enough dependency that it can be removed in an hour of work if need be, and anyway cycle-collection-on-worker-threads is being implemented as we speak, so soon (a matter of weeks) we won’t even have to worry about that at all anymore.

My understanding when talking to Jeff Gilbert was that Firefox needed to move all GL calls to a single thread in order to support multiple contexts with shared resources across contexts/workers. That’s certainly been Chrome’s experience. Many drivers just can’t handle multiple threads using GL. Some can’t even support multiple contexts, and we’ve had to virtualize contexts. I thought Jeff acknowledged both of those issues and said there were plans to do that, but maybe I misunderstood what he meant.

benoit_jacob

All that the worker thread doing WebGL needs to share with the compositor thread is a single texture (or N textures for multi-buffering). There are various mechanisms for sharing textures; it’s a much more specific and limited problem to solve than general OpenGL object sharing. In fact, that’s a problem we’ve already had to solve on various platforms where we already have off-main-thread compositing.

You’re talking about rendering from a worker. I’m talking about sharing resources with a worker so that devs can do things like asynchronous image decoding, texture uploads, and asynchronous shader compilation while another thread is still doing the rendering. Putting WebGL in a worker without sharing resources solves neither of those issues.

benoit_jacob

Indeed, I was talking about rendering from a worker. I think that WebGL-on-a-Web-Worker is already very useful even without cross-context resource sharing. I have posted to the public_webgl thread my thoughts about why I believe that WEBGL_shared_resources will not be efficient when used between two WebGL contexts on different threads, and why I believe we’re better off focusing on less general, more specialized solutions that share only one texture. In fact, now that I think about it, WEBGL_dynamic_texture is the right approach. Replying to the list. Let’s continue there.

Both have their uses. Making a jank-free app requires being able to upload asynchronously and compile asynchronously. Even if you put the entire thing in a worker, compiling blocks the driver for up to several seconds, meaning you need two threads to solve that issue. Some people have suggested making special APIs for those two cases. I believe in giving devs the primitives they need so they can make things we have not considered, rather than hobbling them with only a few specialized features. See console games with 3 to 8 cores as examples of apps that use multi-threading to feed GPUs.

Correct me if I’m wrong, but I was under the impression that interfacing the DOM/JS with (p)nacl is pretty much impossible. I.e.

* No calling (p)nacl from JS.
* No calling into JS from (p)nacl
* No calling of any browser API from (p)nacl
* No creating of DOM objects in (p)nacl
* No calling of DOM object methods in (p)nacl
* No access to the document, window or any other JS scoped object

If so (and I could be wrong), it sounds pretty useless. People aren’t gonna abandon their JS libs and JS-targeting favorite languages (like CoffeeScript). And they sure as hell ain’t gonna write a website that can’t access any standard APIs.

I think it depends on what you’re trying to do. Both PNaCl and Emscripten are about porting native code, not about interfacing with JS. Plenty of people would just like to take their C++ app and have it run in the browser with minimal changes. You can see examples in the Chrome Web Store; “From Dust” and “Bastion”, for example, couldn’t care less about the 6 things you listed. Otherwise, you can do those things from (P)NaCl; they’re just async.

Also, you might be aware that PNaCl now targets asm.js as an option. It just runs faster in a browser that supports real PNaCl. In browsers that don’t, it translates to asm.js but you get no threading.

Of course it depends on what you’re trying to do. But a browser isn’t just a way to run code; what makes web applications tick is the ability to rely on a (somewhat) cross-platform standardized API to accomplish things like GUIs, HW-accelerated graphics, audio, video, keyboard/mouse/touchpad input, camera capture, networking and so forth. And for a lot of uses of these APIs, speed matters relatively little, as they spend the majority of their time waiting for user/network input.

I’d argue that the majority of people who write web applications today don’t want to port their entire app to C++ or Java or whatever and compile it whole to some bretzel (short for asm.js/(p)nacl). What a lot of us want to do is optimize the inner loops that are compute-intensive and compile them to bretzel, analogous to the way you’d use, say, ctypes and Python, where you’d write a small C module and plug it in as the inner loop for your physics simulation, FFTs, convolutions or whatever.
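That inner-loop pattern can be sketched in plain JavaScript (a hypothetical hand-written example, not Emscripten output): only the hot kernel lives in an asm.js-style module working on a shared heap, while the surrounding application code stays ordinary JS.

```javascript
// Hypothetical asm.js-style "inner loop" kernel: scale n 32-bit floats
// in place. Only this hot loop is in the asm.js subset; the code below
// it is ordinary JavaScript reading and writing the same heap.
function Kernel(stdlib, foreign, heap) {
  "use asm";
  var HEAPF32 = new stdlib.Float32Array(heap);
  function scale(n, s) {
    n = n | 0;   // int parameter
    s = +s;      // double parameter
    var i = 0;
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      HEAPF32[(i << 2) >> 2] = +HEAPF32[(i << 2) >> 2] * s;
    }
  }
  return { scale: scale };
}

var heap = new ArrayBuffer(0x10000);
var f32 = new Float32Array(heap);
f32.set([1, 2, 3]);                    // normal JS fills in the data
var k = Kernel({ Float32Array: Float32Array }, null, heap);
k.scale(3, 2.0);                       // the "compiled" hot loop runs
console.log(f32[0], f32[1], f32[2]);   // 2 4 6
```

The app and the kernel communicate through the shared heap, which is roughly the ctypes-style split described above: high-level code in the host language, the compute-heavy loop in the compiled module.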

And regarding “async”, I don’t think that cuts it. Data processing (such as in a physics engine) invariably requires you to work with the results when they’re done, which involves waiting for them during the frame you’re trying to render and then updating your other representations from them. I just don’t see that ever working well if every call you make is async; you’re going to spend the majority of the application’s time copying buffers around and waiting for some kind of browser/OS lock to reschedule your callback code, which can run many frames after you actually needed the data. That kind of defeats the point of trying to do things efficiently in the first place.

It goes without saying that certain things are easier in NaCl, and others are easier in asm.js.

For our work (http://xlabsgaze.com/) NaCl has been brilliant. We have a quite intensive realtime image processing pipeline that reads images from the webcam. The code base is heavily multi-threaded C++ and makes extensive use of boost (including boost_threads), OpenCV, and other libs.

Porting the C++ code was quite painless. The ~25% loss in performance was easily recovered by adding workers in the pipeline. So for us, a shared-memory threading model is an absolute necessity, as we’re dealing with large images being passed around.