I have imagined how cool this would be, but it seems like the big stick in the mud would be the size of GHC-compiled binaries. Waiting for a 5+ MB Haskell binary to download on every visit seems like a big downer. Perhaps they allow liberal caching of these NaCl binaries, though?

Has anyone seriously evaluated how much work this would be? I'd imagine it'd all be in the assembly generator deep in GHC?

Mark Lentczner has looked into NaCl and PNaCl for GHC. I think the conclusion was that the biggest hurdle is getting the RTS (written in C) to compile under NaCl. The problem is that the RTS uses OS APIs such as epoll, which aren't available there.
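To make that concrete, here's a minimal sketch of the sort of Linux-specific call the RTS's IO machinery leans on (plain epoll, nothing GHC-specific); it's exactly this kind of API that has no counterpart inside the NaCl sandbox:

    /* Minimal epoll usage: watch stdin for readability. Calls like
       epoll_create1/epoll_ctl are Linux-only, which is the problem when
       trying to build the RTS against NaCl's restricted syscall surface. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/epoll.h>

    int main(void) {
        int epfd = epoll_create1(0);
        if (epfd < 0) { perror("epoll_create1"); return 1; }

        struct epoll_event ev = { .events = EPOLLIN, .data.fd = 0 };
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, 0, &ev) < 0) {
            perror("epoll_ctl");
            return 1;
        }

        close(epfd);
        return 0;
    }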

I've thought about it recently. The main backend work would be making sure GHC outputs assembly that the NaCl assembler is kosher with (it has a modified assembler that uses special pseudo-ops like nacljmp for the ABI, etc.) There was an OCaml patch that allowed the compiler to produce NaCl-compatible binaries, but that Google Code project page is gone now. I was actually quite surprised at how minimal and non-invasive the patches were, on that note.
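For anyone who hasn't looked at NaCl's ABI: nacljmp essentially expands to an address mask followed by the indirect jump, so control flow can only land on a 32-byte bundle boundary. Here's a rough C-level sketch of just the masking idea (the real thing is a pair of machine instructions emitted by the modified assembler, and the address below is made up):

    /* Conceptual illustration of NaCl's indirect-jump rule: the target is
       masked down to a 32-byte "bundle" boundary before jumping, so
       execution can never land in the middle of an instruction bundle. */
    #include <stdint.h>
    #include <stdio.h>

    static uintptr_t nacl_mask_target(uintptr_t target) {
        return target & ~(uintptr_t)0x1f;   /* clear low 5 bits -> 32-byte aligned */
    }

    int main(void) {
        uintptr_t raw = 0x0040201bu;        /* hypothetical jump target */
        printf("raw:    %#lx\n", (unsigned long)raw);
        printf("masked: %#lx\n", (unsigned long)nacl_mask_target(raw));
        return 0;
    }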

There are also a whole other set of things to keep in mind. For one, it would be incredibly difficult to work GHCi into this; the stage2 compiler needs to be able to run on the system it's built for, IIRC, so you'd somehow have to make the build system go through sel_ldr or something, which I'm not even sure is possible. That means you have to axe the stage2 compiler and all of its features (notably DPH and Template Haskell, which both need GHCi and its dynamic linker.)

Also, GHC isn't terrifically good at cross-compiling at the moment anyway, which is one of the real benefits of the NaCl toolchain: you always get amd64 and x86 binaries spat out, and Chrome picks the correct version. So you may need to make two totally separate binary distributions and somehow mash them together or something. You're at least going to need two copies of every library, one for amd64 and one for x86, to make builds work.

There are lots of other details. NaCl has just started to move to dynamic linking and a glibc implementation, but manifests require a bit of manual work right now. NaCl also has some peculiarities in its data model (IIRC, it's ILP32-ish, always.)

On the whole I think a registered stage1 compiler is quite plausible, however (if only for amd64 or x86, one of the two.) You could probably even get the threaded RTS to work - the NaCl API does support pthreads.
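(For reference, the threaded RTS mostly needs plain pthreads primitives underneath; something like the trivial sketch below is the kind of thing the NaCl toolchain's pthread library already provides, so that part doesn't look like the hard bit.)

    /* Trivial pthreads usage: create a worker thread and join it. The
       threaded RTS builds its OS-thread layer on primitives of roughly
       this shape. */
    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg) {
        (void)arg;
        printf("hello from a worker thread\n");
        return NULL;
    }

    int main(void) {
        pthread_t t;
        if (pthread_create(&t, NULL, worker, NULL) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            return 1;
        }
        pthread_join(t, NULL);
        return 0;
    }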

Would it be more plausible to skip ahead and target PNaCl (NaCl's eventual replacement), which is based on LLVM? It seems like GHC's LLVM backend might make this a more natural fit, to my intuitive and uneducated eye.

The ultimate plan is to create a new version of Native Client executables that can run on any processor. This new Native Client technology is called Portable Native Client (PNaCl, pronounced "pinnacle"), and it is already under development. Instead of generating x86 or ARM code, PNaCl transforms native code into bitcode using a compiler based on the open source LLVM (low-level virtual machine) project. When the browser downloads the bitcode, PNaCl then translates it to machine code and validates it in the same way Native Client validates machine code today.

I wouldn't hold my breath for PNaCl. LLVM isn't particularly fast (and load times matter for the NaCl people). LLVM bitcode also isn't actually portable at this point -- there are various platform assumptions baked in even at the higher level. For example, what should the portable bitcode look like if the input program contained the expression sizeof(void*)? They have to solve some pretty difficult problems.
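To spell out the sizeof(void*) example: the front end folds it to a constant while producing IR, so the supposedly portable bitcode already contains a 4 or an 8 before anything target-independent ever runs. A trivial C illustration:

    /* sizeof(void*) is a compile-time constant. By the time this reaches
       LLVM IR it's just "4" or "8", so the bitcode is already committed
       to one pointer width. */
    #include <stdio.h>

    int main(void) {
        printf("sizeof(void*) = %zu\n", sizeof(void *));
        printf("sizeof(long)  = %zu\n", sizeof(long));
        return 0;
    }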

Right. I too wouldn't hold my breath for PNaCl. There was a discussion about this kind of stuff on cfe-dev in the past few months, and the fact that LLVM has many built-in assumptions about a platform is a major showstopper. The IR and optimizers also have many built-in assumptions about the target, which is another major problem.

The sizeof(void*) thing isn't so much of an issue, as the ABI dictates that all pointers on all systems are 32-bit (even amd64, IIRC.) But there are other, bigger problems like this: if you want LLVM to generate correct code for some constructs on some platforms, you have to construct the IR specifically so the backend can lower it properly. As an example, on amd64, Clang has to emit very careful IR when it compiles code that passes structures to a function by value. It emits IR that the amd64 backend lowers into code that correctly abides by the AMD64 SysV ABI. If you don't do the IR construction correctly, it'll generate invalid assembly. On ARM the story is similar: Clang has to emit specific IR for constructs to lower properly in the ARM backend and abide by the ARM ABI (by-value structs are just one of the more obvious examples, and they're especially complicated on amd64.)
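The kind of C source being talked about is nothing exotic -- something like the snippet below is enough. The comments about how targets handle it are from memory and may be off in the details; the point is only that the ABI decision gets baked into the IR by the front end, not left to a portable backend.

    /* Identical C source everywhere, but Clang must emit per-target IR for
       the call: e.g. on x86-64 SysV a small struct like this is split
       across integer registers, while other ABIs may pass it indirectly
       (byval) or coerce it to a different type. */
    #include <stdio.h>

    struct pair { long a; long b; };

    static long sum(struct pair p) {      /* struct passed by value */
        return p.a + p.b;
    }

    int main(void) {
        struct pair p = { 20, 22 };
        printf("%ld\n", sum(p));
        return 0;
    }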

So now, you no longer have 1 piece of bitcode that can work properly on say, amd64 and ARM. The lowering passes for the two targets require different input bitcode to be crafted, even if the input source code was identical. Depending on the target you're looking at, you'll need different bitcode for each one.

Floating point is another example that's also horrific. Different processors are going to have vastly different FPUs, and you can't possibly take the IR for one target and lower it properly to another with this in mind. It might not even be expressible given the instruction set (maybe FP registers are much wider/smaller, for example.)
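A small, concrete instance of that: even the widths of the C floating-point types differ per target (x86's long double is typically 80-bit x87 extended precision, while on most ARM ABIs it's just a 64-bit double), so the IR for the same source already disagrees before you get anywhere near register widths:

    /* The same source commits the IR to different FP types per target. */
    #include <stdio.h>

    int main(void) {
        printf("sizeof(float)       = %zu\n", sizeof(float));
        printf("sizeof(double)      = %zu\n", sizeof(double));
        printf("sizeof(long double) = %zu\n", sizeof(long double));
        return 0;
    }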

There was a talk at the European LLVM developers' conference recently that addressed these concerns and gave an example of a 'higher-level' IR, built on top of LLVM, that really was platform-independent. It's work done by the South Korean government, IIRC, so there's a strong possibility that it'll never see the light of day, unfortunately.

More generally, LLVM isn't a virtual machine - it's a compiler IR. It's not suitable or meant to be used as a platform for target-independent optimizations or bitcode distribution: by design you have to bake some decisions about the target platform into the code you give it.

I've been wanting something like this for a long time. Something like Mathematica meets GHCi. Add an inline editor, so you can edit the code after typing it in (and the embedded objects update to match), and it'd be the perfect Haskell IDE.

If you keep touch interfaces in mind while developing it, it would also work remotely on phones and tablets.

DrScheme/DrRacket is awesome, though; why shouldn't there be some sort of Haskell equivalent? It would be much more pleasant and rational. But maybe apfelmus hasn't seen, e.g., http://docs.racket-lang.org/quick/

And if it does follow the Sage notebook approach (which can be done as a local app without even thinking about hosting/service architecture, or permissions and sharing, etc.), then you have a sort of editable, interactive record of what you've worked through that's much richer than a transcript of a GHCi session.

Social features are great too. Sharing, voting and community moderation, comments -- all great social ideas which can be included as extras as time allows! That said, I detect a hint in your link of going in the http://quid2.org direction... and this latter direction might make for a fascinating idea, but it's ill-defined at this point and just risks getting lost in tool complexity. So my advice is to stick to building a pretty vanilla, organized Haskell environment: the kind where code lives in packages in a package repository, plus maybe the modules of your current project... and NOT in a cloud of stuff collectively built by the users.

What about, for example, making each notebook live in its own cabal-dev environment, so you can play safely with different packages and test them? Maybe later it could be integrated with the social features of Hackage 2.0.

I'd encourage you to play with the Sage notebook if you haven't. The thing about following it as a model is that it's widely used, appreciated, and successful. So there may be improvements to be made to its model, but through a fair amount of iteration they've already found a sweet-ish spot in the design space. A worksheet is sort of like a "live" displayed module. Sharing can be done a number of ways (and completely ignored for the SoC, but added later if the project takes off), but conceptually it just means giving someone else access to read/run your module. I don't think it means running off to invent some fancy notion of cloud-pluggable code or the like.