A Look at Web Assembly and Molecular Analysis

On the BioNano team at Autodesk Research, running molecular analysis in the cloud is something we already enable scientists to do with ease. Our Molecular Design Toolkit gives users access to this power in a web interface with a Python backend. But what if we wanted to scrap the network lag and run some analysis directly in the browser?

Molecular Design Toolkit calculating the orbitals of butene in an IPython notebook

With the Web Assembly browser preview recently landing in V8, I got curious about what it would take to use it to run some of these computations. Web Assembly is a new browser feature that runs a binary code format directly in the browser. Right now it’s mainly being designed as a compilation target for C and C++, the languages in which many of the open source chemical analysis tools we use are written. If an entire 3D Unity game can be compiled to run in the browser via Web Assembly, chemical analysis should be a piece of cake, right? This is my journey into the early days of Web Assembly.

Why Web Assembly?

The ability to run native code in the browser opens up many new types of web apps that were previously confined to desktop applications or command line utilities. Many algorithms and utilities, particularly in our computational chemistry space, exist only in C++ (or even only in languages like FORTRAN). A port to JavaScript might be daunting or even impossible depending on the language features used.

Just the ability to execute these programs in the browser isn’t enough, however. In order to really enable new types of web apps, the compiled code must be fast and easily portable, and this is where Web Assembly comes in. It has already been possible to run C++ in the browser for a few years by compiling it to asm.js, but Web Assembly takes this a step further by representing the compiled code in a binary format. This means file size is much smaller, browsers can eliminate parsing delays, and functionality can be expanded beyond what is possible in the JavaScript environment. The result is programs that run at near-native speed, and a whole new set of possibilities for the browser.

Getting Started

In this article, we’re going to look at compiling a major C++ library into a format that can be built into a Web Assembly project, compiling that project itself to Web Assembly, and getting it all running in the browser. Here I’m using the OpenBabel library, which translates between various chemical data formats, but this should work just as well for most C++ libraries.

Installing Emscripten

Assuming your computer is already set up to compile C++ programs (make sure you’ve got the Xcode command line tools installed if you’re on a Mac), the first thing you need to do is install Emscripten.

See if you can get the provided Hello World example running as a test. You’ll need to make sure your browser supports Web Assembly, which you can do by enabling the WebAssembly flag in the latest version of Chrome Canary. Be sure to include the -s WASM=1 flag, which tells Emscripten to compile to Web Assembly instead of asm.js.

At the time I tried this, there seemed to be a bug with binaryen on the Mac. If you run into this, check the issue on GitHub and keep in mind that you might have to install binaryen yourself, which ended up being the solution in my case.

Compile the OpenBabel Library to LLVM Bitcode

OpenBabel provides a nice set of instructions for compiling the project on your own. However, in order to use OpenBabel with an Emscripten project, we’ll need to compile it to LLVM bitcode. The steps are very similar to those given in the instructions, but we’re going to wrap cmake in emcmake and make in emmake, following Emscripten’s instructions on compiling with libraries.

After you finish, check the directory you passed to -DCMAKE_INSTALL_PREFIX, and you should see all of the compiled OpenBabel code. Also be sure to take a look inside embuild/bin. There you will find that OpenBabel has also been compiled directly to JavaScript. You can even run node obabel.js, and it should behave just like the native OpenBabel CLI!

Compile and Run Your Project

Now that you have OpenBabel in LLVM bitcode, you can use Emscripten to compile a C++ project that depends on OpenBabel into Web Assembly. You might want to try one of the simple examples from OpenBabel’s C++ examples page, copied locally into a .cpp file.

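To compile, use em++ instead of g++, add the -s WASM=1 flag, and name the output with -o myproject.html so that Emscripten also generates an HTML page. Link your OpenBabel bitcode by pointing the -I and -L flags at the headers and libraries installed under the path that you passed to -DCMAKE_INSTALL_PREFIX when you compiled OpenBabel.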

That should emit several myproject.* files that, when served by a web server, run your project. Start up a server such as python -m SimpleHTTPServer 8080 (python3 -m http.server 8080 on Python 3), and then open localhost:8080/myproject.html in your Web Assembly-enabled browser. You will see a nice interface generated by Emscripten that shows the output of your program.

Emscripten’s generated HTML page running a Web Assembly program

When you want to use your compiled C++ program inside your own JavaScript project, you can tell Emscripten to expose a C++ function in the resulting JavaScript. Declare the function in C++ inside an extern "C" block so that its name is not mangled, and then tell the compiler to export it with the following flag, where the name carries a leading underscore: -s EXPORTED_FUNCTIONS="['_my_accessible_function']". Then, if you include the resulting .js file in your JavaScript project, you will have access to the function as Module._my_accessible_function.
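As a minimal sketch of what such an exported function might look like, the file below wraps the function in extern "C" so that the symbol keeps its plain name. Emscripten refers to exported C symbols with a leading underscore, which is why the flag entry would be '_my_accessible_function'. The pair-counting body is a hypothetical placeholder for real analysis code and has nothing to do with OpenBabel's API.

```cpp
// exported.cpp: a sketch of a function meant to be called from JavaScript.
// The body is a hypothetical stand-in for real analysis code.

extern "C" {

// Returns the number of unique atom pairs among atom_count atoms,
// i.e. n * (n - 1) / 2. Returns 0 for counts below 2.
int my_accessible_function(int atom_count) {
    if (atom_count < 2) {
        return 0;
    }
    return atom_count * (atom_count - 1) / 2;
}

}  // extern "C"
```

Compiled with em++ and -s EXPORTED_FUNCTIONS="['_my_accessible_function']", the function should be reachable from JavaScript as Module._my_accessible_function(4) once the module has finished loading.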

Troubleshooting

If you’re having trouble, try compiling your code with g++ as well as em++ to make sure your program works as a normal native binary. In addition to swapping the em++ command for g++ and dropping any Emscripten-specific flags, you’ll also have to point to a natively compiled version of OpenBabel, because the -L and -I flags that point to the LLVM bitcode version won’t work with g++. If you’re on a Mac, you can easily install a precompiled version of OpenBabel with Homebrew. Once that’s installed, pkg-config can tell you exactly what your new -L and -I flags should be (adjust the path to match the version you installed): pkg-config --libs --cflags /usr/local/Cellar/open-babel/2.4.1/lib/pkgconfig/openbabel-2.0.pc
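When you are switching between the two builds, it can help to have the program report which toolchain produced it. The sketch below relies on the __EMSCRIPTEN__ preprocessor macro, which Emscripten's compilers define and a plain g++ build does not. The function name build_target is a hypothetical helper, not part of Emscripten or OpenBabel.

```cpp
#include <string>

// Hypothetical helper: names the toolchain that compiled this file.
// em++ defines __EMSCRIPTEN__; a native g++ build leaves it undefined.
std::string build_target() {
#ifdef __EMSCRIPTEN__
    return "emscripten";
#else
    return "native";
#endif
}
```

Printing build_target() at the start of main makes it obvious from the console output whether you are looking at the g++ binary or the em++ build.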

From Here

At this point we have the ability to compile a simple program that uses OpenBabel into something that runs natively in the browser. From here you should be able to build a normal JavaScript app that uses your Web Assembly piece and takes advantage of the access to native code. While it would still be a long path to duplicate Molecular Design Toolkit all in the browser, this example is the first step in showing what’s possible with this new piece of browser technology.

The Web with Web Assembly

Web Assembly is a big step forward for the web. The simple ability to compile a C++ library and use it in a browser, as shown in this article, opens up a whole new realm for what’s possible in web apps. Algorithms that are not practical to run in JavaScript, like many in computational chemistry, are suddenly doable in real time without needing to wait for an AJAX request.