JavaScript and the JVM

Introduction

When it comes to server-side JavaScript programming, there are other choices besides V8-based solutions like NodeJS, TeaJS, SilkJS, and others. The Rhino JavaScript engine has been available for the JVM for a long time, and Java 8 was recently released with a brand new and improved JavaScript engine for the JVM called Nashorn. There is another project called DynJS that shows a lot of promise as well. In this post, I will investigate the benefits of JavaScript running on the JVM and demonstrate how easy it is to integrate with, or script, Java from JavaScript.

JavaScript in the JVM

A few years back, I read a blog post by a fellow named Steve Yegge, which talked about JavaScript on the JVM. The post is long, but well worth the read. At one point, he talks about the benefits of scripting on the JVM, and all of what he wrote and talked about back then is still valid today.

First, if there has ever been a computing problem, there is a solution for it in Java. Many times, the Java implementation of some library will be superior to what you might cobble together from other sources (see Apache Lucene). Why not leverage all this prior work? Beyond the sheer availability of this code, it ships in .jar format and is portable between operating systems and CPUs: it runs almost everywhere.

Second, the JVM itself has a considerable number of man hours of research and development applied to it and it is ongoing. When they figure out how to make something smaller/faster/better for the JVM, it benefits everything that uses the JVM – including JavaScript execution and the libraries we’d call from JavaScript. We also get the benefit of Java’s excellent garbage collection schemes.

Third, the JVM features native threads. This means multiple JVM threads can be executing in the same JavaScript context concurrently. If v8 supported threads in this manner, nobody would be talking about event loops, starving them, asynchronous programming, nested callbacks, etc. Threads trivially allow your application to scale to use all the CPU cores in your system and to share data between the threads.

I’ll add a fourth: you can compile your JavaScript programs into Java class files and distribute your code like you would any Java code.

So let’s have a look at JavaScript and the JVM.

Introducing Mozilla Rhino

Rhino is an open source JavaScript engine written in Java, and is readily available for most operating systems.

For OSX, I use HomeBrew to install it:

$ brew install rhino

For Ubuntu, the following command should work:

$ sudo apt-get install rhino

Once installed, we can run it from the command line and we get a REPL similar to what we’re used to with NodeJS:

Note the hello was printed on what looks like the command line. That was printed from the background thread and I had to hit return to see the next prompt. The Thread[Thread-1,5,main] was the return value of the spawn() method; it is a variable containing a Java Thread instance.

Spawning threads is that easy!

The JVM has first class synchronization built in. In Java, you use the synchronized keyword something like this:

This allows only one thread at a time to enter the foo() method. If a second thread attempts to call the function while a first has entered it (but not returned yet), the second thread will block until the first returns.

If we spawn() two threads that call bar.foo(), only one will be allowed to enter the function at a time.

Synchronization is vital for multithreaded applications to avoid race conditions where one thread might be modifying a variable/array/object while another thread is trying to examine it. The state of the variable/array/object is inconsistent until the modification is complete.
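To make the synchronized behavior described above concrete, here is a minimal Java sketch. The class names Bar and SynchronizedDemo and the counter field are assumptions made for illustration, not code from the original post. Two threads call bar.foo() concurrently, and because foo() is synchronized, no increment is lost.

```java
// Hypothetical classes for illustration only.
class Bar {
    private int count = 0;

    // Only one thread at a time may execute foo() on a given Bar instance.
    synchronized void foo() {
        count++; // a read-modify-write that is safe only because of the lock
    }

    synchronized int getCount() {
        return count;
    }
}

class SynchronizedDemo {
    // Starts two threads that each call bar.foo() callsPerThread times
    // and returns the final counter value.
    static int run(int callsPerThread) {
        Bar bar = new Bar();
        Runnable work = () -> {
            for (int i = 0; i < callsPerThread; i++) {
                bar.foo();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return bar.getCount();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 200000
    }
}
```

If the synchronized keyword were removed from foo(), the final count could come out lower than expected, which is the kind of race condition the paragraph above warns about.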

To recap so far, Rhino provides print(), load(), spawn(), and sync() functions, among others. In practice, I only see the load() and sync() methods being necessary because Rhino and other JVM JavaScript implementations allow us to “script Java” from JavaScript programs.

Scripting Java

Rhino makes scripting Java rather easy. It exposes a global variable Packages that is a namespace for every Java package, class, interface, etc., on the CLASSPATH.

On that page is the definition of the field, “out” and an example that reads something like:

public static final PrintStream out

The “standard” output stream. This stream is already open and ready to accept output data. Typically this stream corresponds to display output or another output destination specified by the host environment or user.

For simple stand-alone Java applications, a typical way to write a line of output data is:

What this is showing is that there are a number of implementations of println() in Java with different signatures. Rhino is smart enough to choose the right implementation based upon how we call it. Also note that the types in the println() signatures are Java native types.

For example:

js> Packages.java.lang.System.out.println('hello')
hello
js>

Rhino also exposes a global java variable which is identical to Packages.java – this is a handy way to access the builtin Java classes.
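For reference, the overloads mentioned above can be seen from the Java side. The sketch below is illustrative only (the class name PrintlnOverloads is an assumption). It writes to an in-memory PrintStream rather than System.out so the result can be inspected, and the Java compiler picks a different println() overload for each argument type.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class; shows which println() overload each call selects.
class PrintlnOverloads {
    static String demo() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true);
        out.println("hello"); // println(String)
        out.println(42);      // println(int)
        out.println(3.5);     // println(double)
        out.println('c');     // println(char)
        out.println(true);    // println(boolean)
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

When the call comes from a script instead, the engine has to make the equivalent choice at run time based on the JavaScript value it is given.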

A minimal console class

We can now use load() to load a primitive JavaScript console implementation:

Java types in JavaScript

When writing JavaScript, things work as expected. An object is an object, an array is an array, a string is a string, and so on. But when we script Java from JavaScript, our variables often are instances of Java objects. A trivial example:

Note that getBytes() is a method you can call on Java strings, but not on JavaScript strings. Also note that we can cast Java strings to JavaScript strings.

Fortunately, we rarely have to instantiate Java strings, but we will have to deal with binary data when scripting Java. JavaScript has no real native binary type, but we can have our variables refer to instances of Java binary types.

Java Byte Arrays

One thing we’re certainly going to do is deal with Java byte arrays. We can instantiate one (1024 bytes) like this:
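As a point of reference, here is a hedged Java sketch of what a 1024-byte array is on the JVM side. The class and method names are assumptions for illustration. The reflective form is included because it is an ordinary method call on java.lang.reflect.Array, which is the kind of thing a script can invoke; whether the original post used that exact call is an assumption.

```java
import java.lang.reflect.Array;

// Hypothetical helper class; both methods return a zero-filled byte array.
class ByteArrayDemo {
    // Plain Java syntax, only available when writing Java source.
    static byte[] viaSyntax(int size) {
        return new byte[size];
    }

    // Reflective form: Byte.TYPE is the Class object for the primitive byte.
    static byte[] viaReflection(int size) {
        return (byte[]) Array.newInstance(Byte.TYPE, size);
    }

    public static void main(String[] args) {
        System.out.println(viaReflection(1024).length); // prints 1024
    }
}
```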

Useful example

Let’s look at how to read in a text file by scripting Java, and it does look a lot like Java. All the Java classes we use are in the package java.io and you can read up on FileInputStream, BufferedInputStream, and ByteArrayOutputStream. There are certainly many examples of their use (in Java) on the web.

Maybe this is a bit ugly, but we can encapsulate all the bridging between JavaScript and Java in nice JavaScript classes. Then we only need to call our JavaScript from JavaScript and not care so much about how Java is being called or how the conversions between JavaScript native objects and Java ones are being done. One thing is for sure: this seems a lot cleaner and simpler than writing C++ modules to link with NodeJS or other V8 alternatives.

In other words, we only had to write the cat() function once. We can load() it in any or all of our applications from now on and not have to write the interface code to Java again.
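For readers who want to see the underlying java.io calls on their own, here is a Java sketch of a cat() helper built from the three classes named above. It is not the post's JavaScript listing: the class name Cat, the 1024-byte buffer size, the UTF-8 decoding, and the roundTrip() helper are all assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration.
class Cat {
    // Reads the whole file at path and returns its contents as a string.
    static String cat(String path) {
        byte[] buffer = new byte[1024];
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path))) {
            int n;
            // read() returns the number of bytes read, or -1 at end of file.
            while ((n = in.read(buffer)) != -1) {
                collected.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    // Writes text to a temporary file and reads it back with cat().
    static String roundTrip(String text) {
        try {
            File tmp = File.createTempFile("cat-demo", ".txt");
            tmp.deleteOnExit();
            try (OutputStream out = new FileOutputStream(tmp)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return cat(tmp.getPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(roundTrip("hello\nworld\n"));
    }
}
```

A script that wraps these same calls has the same shape: open the stream, loop on read(), accumulate the bytes, then convert them to a string.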

Threads without spawn()

This example is a bit longer, but it demonstrates how to implement a Runnable interface in JavaScript.

This version works, but it is not quite perfect. You see, the bumpX() function returned by sync() synchronizes on the this object, which isn’t harmful in this example. However, if we had another two threads bumping a y variable with a bumpY() method also synchronized on this, there’d be unnecessary contention among the 4 threads. When thread1() calls bumpX(), the remaining 3 threads will be blocked when they call bumpX() or bumpY().

The fix is:

var bumpX = sync(function() {
    return x++;
}, x);

Note the extra argument to sync(), the object we want to synchronize on. Now the callers that call bumpX() will block appropriately, not affecting callers of bumpY().
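The same idea, one lock per piece of shared state, can be sketched in Java. The class name LockPerCounter and the lock field names are assumptions for illustration. The Runnable instances are written as anonymous classes to show the interface being implemented explicitly.

```java
// Hypothetical class for illustration: x and y are guarded by separate locks,
// so threads bumping x never wait on threads bumping y.
class LockPerCounter {
    private final Object xLock = new Object(); // guards x only
    private final Object yLock = new Object(); // guards y only
    private int x = 0;
    private int y = 0;

    int bumpX() {
        synchronized (xLock) {
            return x++;
        }
    }

    int bumpY() {
        synchronized (yLock) {
            return y++;
        }
    }

    // Runs two threads per counter and returns {x, y} once all have finished.
    static int[] run(final int bumpsPerThread) {
        final LockPerCounter c = new LockPerCounter();
        Runnable xWork = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < bumpsPerThread; i++) {
                    c.bumpX();
                }
            }
        };
        Runnable yWork = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < bumpsPerThread; i++) {
                    c.bumpY();
                }
            }
        };
        Thread[] threads = {
            new Thread(xWork), new Thread(xWork),
            new Thread(yWork), new Thread(yWork)
        };
        for (Thread t : threads) {
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // join() also makes the workers' writes visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] { c.x, c.y };
    }
}
```

Calling LockPerCounter.run(20000) should return 40000 for both counters. Synchronizing both methods on this would give the same totals, only with all four threads queuing on a single lock.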

About synchronization

I wouldn’t count on any JavaScript operation to be atomic. That is, array.pop() could in theory get interrupted by a thread switch interrupt, so if you have two threads manipulating that array, you have a seriously bad race condition. So be aware of thread safety. If you ever expect to have two threads access the same memory, synchronize around the accesses, as I demonstrated.

Extending Rhino (3rd party Java)

We’re interested in calling 3rd party libraries, so here’s an example. I created a file, Example.java and compiled it into a .class file:

From this we can craft our own command lines, including some that add .jar files to the class path. To see a full description of the java command and all the command line options, enter this at your shell prompt:

$ man java

We cannot pass a CLASSPATH via -cp flags to the java command if we also specify -jar. So we are going to have to use a form of the java command that specifies CLASSPATH and the initial class/function to call. I dug into the rhino sources and found that the main function is org.mozilla.javascript.tools.shell.Main.

You should note that our x variable holds a reference to a Java String, not a JavaScript string. We can pretty much use it like a JavaScript string, and Rhino does the type conversions automagically as needed.

A brief note about the Java CLASSPATH

We can trivially create our own shell scripts to launch rhino with our own CLASSPATH.

It seems intuitive to me that if a directory is part of your CLASSPATH, the Java runtime should find .class files as well as .jar files in that directory. But it does not work that way! A CLASSPATH entry may specify a directory, in which case only .class files are considered, or it may specify a .jar file, which basically acts like a directory containing only .class files.

This means if you want to use classes in two separate .jar files, you have to include both .jar files in the CLASSPATH.
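One way to check what a given launch script actually put on the class path is to print the entries from inside the JVM. The sketch below is a rough illustration: the class name ClasspathEntries is an assumption, and kind() classifies entries by file name alone, which is a simplification. It reads the standard java.class.path system property and labels each entry as an archive or a directory.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper for illustration.
class ClasspathEntries {
    // Splits a class path string on the platform separator (":" or ";").
    static List<String> split(String classpath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    // Simplified classification based on the entry's name only.
    static String kind(String entry) {
        String lower = entry.toLowerCase();
        if (lower.endsWith(".jar") || lower.endsWith(".zip")) {
            return "archive";
        }
        return "directory";
    }

    public static void main(String[] args) {
        for (String entry : split(System.getProperty("java.class.path"))) {
            System.out.println(kind(entry) + "  " + entry);
        }
    }
}
```

Each .jar a script depends on should show up as its own archive line; a directory line only contributes the .class files beneath it.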

Introducing Nashorn

Nashorn is a completely new JavaScript engine that is officially part of the recently released Java 8.

In order to run it, I installed the Java 8 JDK on my Mac. I haven’t seen any ill effects yet, so I guess it is safe. There were some negative effects of installing Java 7 on a Mac, particularly that Java 7’s browser plugin is 64-bit only and Google Chrome is 32-bit only; you lose the ability to run Java from WWW sites in Chrome. I haven’t tested to see if this is true for Java 8, but I haven’t seen any similar warnings.

The installation process is not 100% right. There is a jjs program that we are supposed to be able to run to execute Nashorn scripts (jjs is roughly Nashorn’s version of the rhino command). After installing Java 8, jjs is not in /usr/bin as it should be. A little bit of digging turned up the file here:

There is also a /usr/bin/jrunscript and a manual page for that dated 2006. The jrunscript program appears to launch Nashorn as well. There is also a jrunscript in the same directory as jjs that is different from the one in /usr/bin. All of this causes a lot of confusion, but I will use jjs for the rest of this article.

The jjs program presents a REPL just like rhino does:

$ jjs
jjs> print('hello')
hello
jjs> x = 10
10
jjs> x
10
jjs>

There is quite a bit of useful information about the JavaScript environment provided by Nashorn here:

I had to load("nashorn:mozilla_compat.js") to provide the sync() function.

The new Thread calls no longer work with what looks like a Runnable interface, or an object like:

{
run: function() { ... }
}

Instead, Nashorn can figure out that Runnable has only one member (run) and Runnable is required for Thread constructor, so it does the right thing if you pass the constructor a JavaScript function.

One other change I had to make was to call join() on one of the threads started. Without this, jjs exited right away. This is a different behavior from rhino.
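The single-method idea has a direct counterpart in Java 8 itself, where a lambda can stand in for a Runnable. The sketch below is a Java-side illustration, not the Nashorn script from this post, and the class name FunctionAsRunnable is an assumption. It passes a lambda to the Thread constructor and then calls join() so the caller waits for the worker before reading its result.

```java
// Hypothetical class for illustration.
class FunctionAsRunnable {
    static String runAndJoin() {
        final StringBuilder log = new StringBuilder();
        // Runnable declares exactly one abstract method, run(),
        // so a lambda can be used wherever a Runnable is expected.
        Thread worker = new Thread(() -> log.append("hello from worker"));
        worker.start();
        try {
            worker.join(); // wait for the worker before reading its output
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(runAndJoin()); // prints hello from worker
    }
}
```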

Nashorn also features a scripting mode that adds some very non-standard features to the JavaScript language. The concept is a good one if you want to use Nashorn to write shell scripts. The only problem is anything you write using these extensions will not be portable to any other JavaScript environment. For this reason, I won’t go into more depth about this feature.

Nashorn Performance

I created 2 very simple and probably worthless programs to try to get a sense of how fast Nashorn is compared to Rhino (and NodeJS/v8).

The first program simply concatenates 1 million integers into a very long string:

Conclusion

Rhino is the gold standard of JavaScript for the JVM. It simply has been around for a very long time (since the 1990s) and it is feature rich and relatively bug free. Nashorn represents a new code base and new commitment by Oracle to JavaScript for the JVM. It’s brand new, and already appears to be a solid implementation in its own right. It’s only going to get better, too. Rhino is likely to run on any new release of Java for a long time to come, but it’s not as likely to get the attention to improvements as Nashorn.

The question is when is it time to ditch Rhino in favor of Nashorn? My guess is soon if Java 8 gains the adoption that I expect.

Mike Schwartz is an Architect at Modus Create and the designer of DecafJS and other JavaScript frameworks, such as SilkJS, a command shell built on top of Google's V8 JavaScript engine and optimized for server-side applications. Mike has a deep history in developing games and web applications. @ModusSchwartz

This is a bad idea. Node.js scales pretty sweet, it just does it without threads. It’s a best practice to start as many processes in Node.js as you would start threads.

Many times, the Java implementation of some library will be superior

This is simply not true anymore with the current state of npm. In addition, most of the available Node.js libraries will be more lean and support [FRP](http://en.wikipedia.org/wiki/Functional_reactive_programming) better than anything the Java community has to offer. This is less true for other stuff on the JVM though (think Scala/Clojure).

the JVM itself has a considerable number of man hours of research and development applied to it

So does V8, don’t underestimate that. V8 has got JIT as well. And V8 is always the first EcmaScript implementation to support ES6 features. That’s also important to a lot of people.

the JVM features native threads

Everything in Node.js is built for horizontal scaling, so to scale up, you’d just spawn another process; your Ops guy can do that alone. Since a Node.js app developer is forced very early in development to make it scale horizontally, it’s easy when you need it.

Threading is actually closer to vertical scaling: you’d need as many cores as possible. Yes, this gives you shared state and memory, but that’s not an advantage when eventually you need to scale. Because eventually you’re going to need to scale to another machine (or VM), and your app then requires modification.

Threading is not just better, it’s a different way of solving stuff. But if you’ve got a heavy I/O bound application threading will lose to a non-blocking event-loop.

Mike Schwartz

If I said “EVERY TIME” about the Java implementations, you’d have a point. I specifically mentioned Lucene, which is an outstanding document indexing library that people have tried to port from Java with inferior results.

C++ is out of reach for most JavaScript coders, so they’re not going to have a good time at creating modules that need to link with C/C++ libraries. Scripting Java in Rhino or Nashorn is so trivial, these guys won’t have any issue.

I agree about the ES6 features. I’m told by the guy in charge of Nashorn at Oracle that ES6 is front burner.

I think you are wrong about threading vs. event loop. Very wrong. There’s an old adage in programming that you can trade memory for speed. In this case, memory for thread stacks. What event loop gains you is the ability to handle a huge number of simultaneous connections.

I agree 100%. I mentioned in my previous comment that this was meant to stress a feature I know is slow in Rhino. That same benchmark crushed DynJS, so I didn’t cover it in my article (not ready for prime time, but great promise, IMO).

Philipp N.

But I hope Nashorn’s “cold performance” will improve in the future. I use RingoJS for a lot of small short scripts / helpers and Rhino is up and running quite fast.

Marcus Lagergren

Hi – you are basing your perf numbers on a single string concatenation (micro) benchmark? That is probably unwise to draw conclusions from.

The Nashorn that will be released in 8u40 will have significantly bridged the gap towards native performance and will be on par with v8 on a couple of benchmarks in e.g. the octane suite, and virtually orders of magnitude (no exaggeration) faster than rhino -opt 9 on all of them… Rhino is left way behind.

As Nashorn relies heavily on invoke dynamic, which tends to generate a lot of intermediate methods due to its implementation, warmup will most likely take longer as well as the JVM warms up. We are also working on addressing various warmup issues in Nashorn – adding persistent code stores, class caching and lazy JITting.

Mike Schwartz

I don’t suggest that anyone should use these benchmarks to decide on which JavaScript engine to use. I just know that rhino before Hannes’ 1.7R5 version had horrible string concatenation performance. To the point where I would use array.join() all the time.

Marcus Lagergren

I’m going to talk about Nashorn performance and the roads we have taken to get where we are at the JVM Language Summit this summer as well as at JavaOne in the fall, so feel free to check out the presentations when I’m done with them.

Regards
Marcus

Marcus Lagergren

Also – any benchmark that runs for less than 1 second will have zero impact on actual JVM performance. You are measuring JVM startup and interpreter speed. You probably have to do that orders of magnitude more times to get any valid optimisation metrics for either JVM based runtime. I tried wrapping your benchmark as a workload that couldn’t escape any local variables and that did the work 1000 times. Nashorn is on par with v8 and faster than Rhino in that very simple configuration. (without actually doing any analysis of performance factors). I used the latest JDK9 build of Nashorn, and JDK9 build 20, where the relevant parts will be back merged to 8u40.

Philipp N.

Rhino has a long tradition for server-side Javascript. I worked a lot with Helma (started around 1998!) in the past, which utilizes Rhino as JS engine. RingoJS, which is the successor of Helma, is still on Rhino. I wrote applications for Google App Engine with RingoJS before they provided support for custom runtimes and so for Node. This approach worked fine and since the engine was never a real performance bottleneck, it didn’t influence my applications.

But the future path for JS on the JVM is clear if Oracle continues to support Nashorn and the lack of attention for Rhino. For me RingoJS is a solid and good base for my applications, even if the underlying engine is not the hottest stuff around. But I’m sure this will change if Nashorn gets more mature and incorporates some features I’m missing at the moment.

Since RingoJS’ core has a very small footprint, it should be possible to bring it on top of Nashorn then, but it will require some effort. I think Rhino and Nashorn will coexist the next years, but new developments will switch to Nashorn and legacy stuff (e.g. Java 6/7 support) will stay as long as possible on Rhino.

Ben Evans

Hi Mike,

Thanks for writing the post. A couple of things, though:

1) That benchmark is very badly flawed. It really isn’t measuring what you think it is, and is pretty close to a textbook example of “what not to do”. I would actually consider removing that section as it is so misleading.

2) jrunscript isn’t specific to Nashorn. It’s a generic “run script from java” script that uses whatever default scripting engine Java has installed (with 7 that was Rhino, with 8 it’s Nashorn). If you install a scripting engine to use Groovy or JRuby instead, then jrunscript can run scripts using that engine instead (with the -l switch).

3) The lack of a symlink for jjs is a real pain, but it’s not really Oracle’s fault. The symlinks are created by some Apple packaging/installer code, and they didn’t update it yet to know about jjs as a new “binary that ships with Java”. That’s also why there appear to be two versions of jrunscript (in fact you should change the symlink for jrunscript to point at the one in the same dir as jjs).