Thursday, September 11, 2008

Over the past couple weeks I've had a few departures from typical JRuby development. I consider it a working vacation. I'm hoping to report on all of it soon, but for now we'll focus on one of the most exciting items: JSR-292, otherwise known as "InvokeDynamic".

I've reported on invokedynamic previously (InvokeDynamic: Actually Useful?), and of course the technical bits of John Rose's blog should be required reading for anyone interested in this stuff. What I'm going to try to do today is give you an inside picture of the pieces of InvokeDynamic and how they fit together. It will be technical, but everyone should be able to follow it. Ready?

The Problem

Any description of a solution must first describe the problem.

As you probably know, Java is a statically-typed language. That means the types of all variables, method arguments, method return values, and so on must be known before runtime. In Java's case, this also means all variable types must be declared explicitly, everywhere. A variable cannot be untyped, and a method cannot accept untyped parameters nor return an untyped value. Types are pervasive.

The problem, put simply, is this: Because Java is the primary language on the JVM, almost all language implementations on the JVM are written in Java. When implementing a statically-typed language, especially one with structure and rules similar to Java, this is not much of a problem. But when implementing a dynamic language that stubbornly refuses to yield type information until runtime, all this static-typing is a real pain in the neck. Of course this is pretty much the same situation when implementing a dynamic language on top of C or C++ or C#, since they're all generally statically-typed languages too. Or is it? An example is in order.

Here we see a short, reasonably simple snippit of Java code. An ArrayList is constructed, populated with five strings based on the incoming first command-line argument and a numeric iteration count, and then displayed as a string on the console. The type declarations (shown in bold) represent a lot of the visual noise, the "ceremony" that dynamic language fans decry. From a usability perspective, they're both a positive and negative influence; they noise up the code and require more typing, but they also make it trivial to determine the type of a variable (in most cases) or build tools that safely restructure your code (so-called "refactoring"). From a technical perspective, they give the "javac" compiler all the information it needs to produce very clean, optimized bytecode, and they give the JVM itself type information it uses to execute and optimize that bytecode at runtime. Ahh, but what about the bytecode?

If we peel the Java layer away, the situation changes a bit. At the JVM bytecode level, types are still visible, but they're not nearly as prevalent. Here's the same code in bytecode, with the type names again in boldface:

Since not everyone reads JVM bytecode like their native language, a description of these operations is in order.

Java provides what's called an "operand stack" for bytecode it executes. The stack is analogous to registers in a "real" CPU, acting as temporary storage for values against which operations (like math, method calls, and so on) are to be performed. So most JVM bytecode spends its time either manipulating that stack by pushing, popping, duping, and swapping values, or executing operations that produce or consume values. It's a pretty simple mechanism. So then, with a general understanding of the operand stack, lets look at the bytecode itself:

The "load" and "store" instructions are all local variable accesses. "load" retrieves a local variable and pushes it on the stack. "store" pops a value off the stack and stores it in a local variable. The prefix indicates whether the value is an object or "reference" type (denoted by "a") or one of the primitive types (denoted by "i" for integer, "f" for float, and so on). The standard load and store operations take an argument (embedded along with the operation into the bytecode) to indicate which indexed local variable to work with, but there are specialized bytecodes (denoted by a suffixed underscore and digit) for a "compressed" representation of heavily-used low-index variables.

The "invoke" bytecodes are what you might expect: method invocations. Method invocations consume zero or more arguments from the stack and in some cases a receiver object as well. "virtual" refers to a normal call to a non-interface method on an object receiver. "interface" refers to an interface invocation on an object receiver. "static" refers to a static invocation, or one that does not require an object to call against. The "strange quark" of the bunch is "invokespecial", which is used for calling constructors and superclass implementations of methods. You'll notice a couple invokespecials above paired with "new" operations; "new" instantiates the object and "invokespecial" initializes it.

The "const" instructions are what you might guess: they push a constant on the stack. Again, the prefix and suffix denote type and "compressed" opcodes for specific values, respectively.

"aaload" and all "*aload" operations are retrievals out of an array. As with local variables, the first letter indicates the type of the array. Here, the "aaload" is our retrieval of args[0].

"iinc" is an integer increment operation. The arguments are the index of the local variable and how much to increment it by (usually 1).

"if_icmpge" performs a conditional jump after testing whether the second-topmost int on the stack (indicated by the "i" in "icmpge") is greater than or equal to the topmost int on the stack (the >= relationship represented by the "ge" in "icmpge"). This is our "for" loop test i < 5 reversed to act as a loop exit condition rather than a loop continue condition. The looping itself is provided by the "goto" operation further down (yes, the JVM has goto...it's just Java that doesn't have goto).

Finally, we see the "return" instruction, which represents the void return from main. If it were a return of a specific value or object type, it would be preceded by the appropriate type character.

Now the astute reader may already have noticed that other than being specified as reference or primitive types, the opcodes themselves have no type information. Even beyond that, there are no actual variable declarations at the bytecode level whatsoever. The only types we see come in the form of opcode prefixes (as in aload, iinc, etc) and the method signatures against which we execute invoke* operations. The stack itself is also untyped; we push a reference type (aload) one minute and push a primitive type (iload) the next (though values on the stack do not "lose" their types). And when I tell you that the type signatures shown above for each method invocation or object construction are simply strings stuffed into the class's pool of constants...well...now you may start to realize that Java's sometimes touted, oft-maligned static-typing...is just a façade.

The Greatest Trick

Let's dispense with the formality once and for all. The biggest lie that's been spread about the JVM (ok, maybe the biggest after "it's slow") is that it's never going to be a good host for dynamic languages. "But look at Java," people cry, "it's so staticky and rigid; it's far too difficult to implement a dynamic language on top of that!" And in a very naive way, they're partially correct. Writing a language implementation in Java and following Java's rules can certainly make life difficult for a dynamic language implementer. We end up stripping types (making everything Object, since we don't know types until runtime), boxing types (stuffing primitives in carrier objects, to simplify passing them through our Object-only code), and boxing array arguments (since many dynamic languages also have flexible "arities" or numbers of arguments, and others allow optional, "rest", and other special argument types). With each sacrifice we make, we lose many of the benefits static typing provides us, not to mention confounding the JVM's efforts to optimize.

But it's not nearly as bad as it seems. Because much of the rigid, static nature of Java is in the language itself (and not the JVM) we can in many cases ignore the rules. We don't have to declare local variable types. We can juggle items on the stack at will. We can cheat in clever ways, allowing much of normal code execution to proceed with very little type information. In many cases we can get that code to run nearly as well as statically-typed code of twice the size, because the JVM is so dynamic already at its core. JVM bytecode is our assembly, and it's a powerful tool in the right hands.

Unfortunately, on current JVMs, there's one place we absolutely, positively must follow the rules: method invocation.

Know Thyself

Question: In the bytecode above, all invocations came with a formal "signature" representing the type to call against and the types of the method's arguments and return value. If we do not know those types until runtime, and they may be variant even then...how do we support invocation in a dynamic language?

Answer: Very carefully.

Because we are bound to following Java's method invocation rules, the once sunny and clear forecast turns rather cloudy. Every invocation has to be called against a known type. Its arguments must be known types. Its return value must be a known type. Making matters worse, we can't even provide signatures with similar types; the signatures must exactly match the method we intend to invoke. So we understand limitation #1: invocations are statically typed.

There's another way this affects dynamic languages, especially those that may not present normal Java types or that run in an interpreted mode for some part of execution: Invocations must be against real methods on real types. There's simply no way to tell the JVM that instead of calling method W on type X with param Y and return value Z, I want you to enter this interpreter loop; don't mind the types, we'll figure it out for you. Oh no, you have to be part of the Java club and present a normal Java type to get invocation privileges. That's limitation #2: invocations must be against Java methods on Java types.

Adding insult to injury, JVMs even run verification against the bytecode you feed them to make sure you're following the rules. One little mistake and zooop...off to the exception farm you go. It's downright unfair.

The traditional way to get around all this rigidity (a technique used heavily even by normal Java libraries, since everyone wants to bend the rules sometimes) is to abstract out the act of "invoking" itself, usually by creating "Method" objects that do the call for you. And oddly enough, the reflection capabilities of the JVM come into heavy play here. "Method" happens to be one of the types in the java.lang.reflect package, and it even has an "invoke" method on it. Even better, "invoke" returns Object, and accepts as parameters an Object receiver and an array of Object arguments. Can it truly be this easy? Well, yes and no.

Using reflection to invoke methods works great...except for a few problems. Method objects must be retrieved from a specific type, and can't be created in a general way. You can't ask the JVM to give you a Method that just represents a signature, or even a name and a signature; it must be retrieved from a specific type available at runtime. Oh, but that's at runtime, right? We're ok, because we do actually have types at runtime, right? Well, yes and no.

First off, you're ignoring the second inconvenience above. Language implementations like JRuby or Rhino, which have interpreters, often simply don't *have* normal Java types they can present for reflection. And if you don't have normal types, you don't have normal methods either; JRuby, for example, has a method object type that represents a parsed bit of Ruby code and logic for interpreting it.

Second, reflected invocation is a lot slower than direct invocation. Over the years, the JVM has gotten really good at making reflected invocation fast. Modern JVMs actually generate a bunch of code behind the scenes to avoid a much of the overhead old JVMs dealt with. But the simple truth is that reflected access through any number of layers will always be slower than a direct call, partially because the completely generified "invoke" method must check and re-check receiver type, argument types, visibility, and other details, but also because arguments must all be objects (so primitives get object-boxed) and must be provided as an array to cover all possible arities (so arguments get array-boxed).

The performance difference may not matter for a library doing a few reflected calls, especially if those calls are mostly to dynamically set up a static structure in memory against which it can make normal calls. But in a dynamic language, where every call must use these mechanisms, it's a severe performance hit.

Build a Better Mousetrap?

As a result of reflection's poor (relative) performance, language implementers have been forced to come up with new tricks. In JRuby's case, this means we generate our own little invoker classes at build time, one per core class method. So instead of calling through our DynamicMethod to a java.lang.reflect.Method object, boxing argument lists and performing type checks along the way, we're able to create a fast, specialized bit of bytecode that does the trick for us.

Here's an example of a generated invoker for RubyString.split, the implementation of String#split, taking one argument. We pass into the "call" method a ThreadContext (runtime information for JRuby), an IRubyObject receiver (the String itself), a RubyModule target Ruby type (to track the hierarchy during super calls), a String method name (to allow aliased methods to present an accurate backtrace), and the argument. Out of it we get an IRubyObject return value. And the bytecode is pretty straightforward; we prepare our arguments and the receiver and we make the call directly. What would normally be perhaps a dozen layers of reflected logic has been reduced to 10 bytes of bytecode, plus the size of the class/method metadata like type signatures, method names, and so on.

But there's still a problem here. Take a look at this other invoker for RubyString.slice_bang, the implementation of String#slice!:

Oddly familiar, isn't it? What we have here is called "wastefulness". In order to provide optimal invocation performance for all core methods, we must generate hundreds of these these tiny methods into tiny classes with everything neatly tied up in a bow so the JVM will pretty please perform that invocation for us as quickly as possible. And the largest side effect of all this is that we generate the same bytecode, over and over again, with only the tiniest of changes. In fact, this case only changes one thing: the string name of the method we eventually call on RubyString. There are dozens of these cases in JRuby's core classes, and if we attempted to extend this mechanism to all Java types we encountered (we don't, for memory-saving purposes), there would be hundreds of cases of nearly-complete duplication.

I smell an opportunity. Our first step is to trim all that fat.

Hitting the Wall

Let me tell you a little story.

Little Billy developer wanted to freely generate bytecode. He'd come to recognize the power of code generation, and knew his language implementation was dynamic enough that compiling once would not be optimal. He also knew his language needed to do dynamic invocation on top of a statically-typed language, and needed lots of little invokers.

So one day, Billy's happily playing in the sandbox, building invokers and making "vroom, vroom" sounds, when along comes mean old Polly Permgen.

"Get out of my sandbox, Billy," cried Polly, "you're taking up too much space, and this is *my* heap!"

"Oh, but Polly," said Billy, rising to his feet. "I'm having ever so much fun, and there's lots of room to play on that heap over there. It's oh so large, and there's plenty of open space," he desperately replied.

"But I told you...this is MY heap. I don't want to play over there, because I like playing *right here*." She threw her exceptions at Billy, smashing his invokers to dust. Satisfied by the look of horror on Billy's face, she plopped down right where he had been sitting, and smiled terribly up at him.

Dejected, Billy sulked away and became a Lisp programmer, living forever in a land where data is code and code is data and everyone eats butterscotches and rides unicorns. He was never seen nor heard from again.

This story will be very familiar to anyone who's tried to push the limits of code generation on the JVM. The JVM keeps in memory a large, pre-allocated chunk of reserved space called the "heap". The heap is maintained as a contiguous area of space to allow the JVM's garbage collector to move objects around at will. All objects allocated by the system come out of this heap, which is usually split up into "generations". The "young" generation sees the most activity. Objects that are created and immediately dereferenced (like, abandoned?), never make it out of this generation. Objects that persist longer stick around longer. Some objects live forever and get to the oldest generations, but most objects die an early death. And when they die, their bodies become the grass, and the antelope eat the grass. It's a beautiful circle of life. But why are there no butterscotches and unicorns?

The dirty secret of several JVM implementations, Hotspot included, is that there's a separate heap (or a separate generation of the heap) used for special types of data like class definitions, class metadata, and sometimes bytecode or JITted native code. And it couldn't have a scarier name: The Permanent Generation. Except in rare cases, objects loaded into the PermGen are never garbage collected (because they're supposed to be permanent, get it?) and if not used very, very carefully, it will fill up, resulting in the dreaded "java.lang.OutOfMemoryError: PermGen space" that ultimately caused little Billy to go live in the clouds and have tea parties with beautiful mermaids.

So it is with great reluctance that we are forced to abandon the idea of generating a lot of fat, wasteful, but speedy invokers. And it's with even greater reluctance we must abandon the idea of recompiling, since we can barely afford to generate all that code once. If only there were a way to share all that code and decrease the amount of PermGen we consume, or at least make it possible for generated code to be easily garbage collected. Hmmm.

Enter java.dyn.AnonymousClassLoader. AnonymousClassLoader is the first artifact introduced by the InvokeDynamic work, and it's designed to solve two problems:

Generating many classes with similar bytecode and only minor changes is very inefficient, wasting a lot of precious memory.

Generated bytecode must be contained in a class, which must be contained in a ClassLoader, which keeps a hard reference to the class; as a result, to make even one byte of bytecode garbage-collectable, it must be wrapped in its own class and its own classloader.

It solves these problems in a number of ways.

First, classes loaded by AnonymousClassLoader are not given full-fledged symbolic names in the global symbol tables; they're given rough numeric identifiers. They are effectively anonymized, allowing much more freedome to generate them at will, since naming conflicts essentially do not happen.

Second, the classes are loaded without a parent ClassLoader, so there's no overprotective mother keeping them on a short leash. When the last normal references to the class disappear, it's eligible for garbage collection like any other object.

Third, it provides a mechanism whereby an existing class can be loaded and slightly modified, producing a new class with those modifications but sharing the rest of its structure and data. Specifically, AnonymousClassLoader provides a way to alter the class's constant pool, changing method names, type signatures, and constant values.

Here's a very simple example of passing an existing class (Invoker) through AnonymousClassLoader, translating the method name "fake" in the constant pool into the name "real". The resulting class has exactly the same bytecode for its "doIt" method and the same metadata for its fields and methods, but instead of calling the "fake" method it will call the "real" method. If we needed to adjust the method signature as well, it's just another entry in the constPatchMap.

So if we put these three items together with our two invokers above, we see first that generating those invokers ends up being a much simpler affairs. Where before we had to be very cautious about how many invokers we created, and take care to stuff them into their own classloaders (in case they need to be garbage-collected later), now we can load them freely, and we will see neither symbolic collisions nor PermGen leaks. And where before we ended up generating mostly the same code for dozens of different classes, now we can simply create that code once (perhaps as normal Java code) and use that as a template for future classes, sharing the bulk of the class data in the process. Plus we're still getting the fastest invocation money can buy, because we don't have to use reflection.

Who could ask for more?

Parametric Explosion

I could. There's still a problem with our invokers: we have to create the templates.

Let's consider only Object-typed signatures for a moment. Even if we accept that everything's going to be an Object, we still want to avoid stuffing arguments into an Object[] every time we want to make a call. It's wasteful, because of all those transient Object[] we create and collect, and it's slow, because we need to populate those arrays and read from them on the other side. So you end up hand-generating many different methods to support signatures that don't box arguments into Object[]. For example, the many call signatures on JRuby's DynamicMethod type, which is the supertype of all Ruby method objects in a JRuby runtime:

And this doesn't even consider the fact that ideally we want to move toward calling methods with *specific types* since any good JVM dynlang will eventually have to call a normal Java method with a non-Object-based signature. Oh, we could certainly generate new versions of "call" into their own little interfaces at runtime, but we'd have to load them, manage them, make sure they can GC, make sure they don't collide with each other, and so on. We end up back where we started, because AnonymousClassLoader is only part of the solution. What we really need is a way to ask the JVM for a lightweight, non-reflected, statically-typed "handle" to a method that's primitive enough for the JVM to treat it like a function pointer.

MethodHandle is the next major piece of infrastructure added for InvokeDynamic. Instead of having to pass around java.lang.reflect.Method objects, which are slower to invoke and carry all that metadata and reflection bulk with them, we can now instead deal directly with MethodHandle, a very primitive reference type representing a specific method on a specific type with specific parameters.

But wait, didn't you say specifics get in the way?

Specifics can get in the way if we're concerned only about invoking dumb dynamic-typed methods that could accept any number of types, as is the case in dynamic languages. Being forced to specify a specific type means that specific type becomes Object, and so all paths must lead to the same generic code. And truly, if MethodHandle was no more than a "detachable method" it wouldn't be particularly useful. But in order to support the more complex call protocols dynamic languages introduce, with their implicit type conversions, dynamic lookup schemes, and "no such method" hooks, MethodHandles are also composable.

Say we have a target method on the Happy type that takes a single String argument.

public class Happy { public void happyTime(String arg){}}

We can capture a method handle for this class in one of two ways. We can either "unreflect" a java.lang.reflect.Method object, or we can ask the MethodHandles factory to produce one for us:

Our new happyTimeHandle is a direct reference to the "happyTime" method. It's statically typed, with a type signature of "(Happy, String)void" (meaning it accepts a Happy argument and a String argument and returns void, since we must include the receiver type). And the code looks very similar to retrieving a java.lang.reflect.Method instance. So if all we're concerned about is calling happyTime on a Happy instance with a String argument, this is basically all there is to it. But that's rarely enough for us dynamic types. No, we need all our "magic" too.

Luckily, MethodHandles also provides a way to adapt and compose handles. Perhaps the simplest adaptation is currying.

Currying a method (and really when we talk about methods here we're talking about functions with a leading receiver argument) means to grab that method reference, stuff a couple values into its argument list, and produce a new method reference that uses those values plus future values you provide at call time to make the target call. In this case, we'll insert a Happy instance we want this handle to always invoke against.

The resulting curried handle has a signature of only "(String)void", since we've curried or bound the handle to a specific instance of Happy.

There are also more complicated adaptations. We may need to have what John Rose calls a "flyby" adapter that examines and possibly coerces arguments in the arg list. So we grab a handle to the method representing that logic, attach it to our MethodHandle as a flyby argument adapter, and the resulting handle will perform that adaptation as calls pass through it. We may want to "splat" or "spread" arguments, accepting a variable argument count and automatically stuffing it into an array. MethodHandles.spreadArguments can return a handle that does what we're looking for. Perhaps we need pre and post-call logic, like artificial frame or variable scope allocation. We just represent the logic as simple functions, produce handles for each, and assemble a new MethodHandle that brackets the call. Bit by bit, piece by piece, the complex vagaries of our call protocols can be decomposed into functions, referenced by method handles, and composed into fast, efficient, direct calls. Are we having fun yet?

We haven't even gotten to the coolest part.

Brief History

JSR-292 started out life as a proposal for a new bytecode, "invokedynamic", to accompany the four other "invoke" bytecodes by allowing for dynamic invocation. When it was announced, the early concept provided only for invocation without a static-typed signature. It still required a call to eventually reach a real method on a real type, and it did not provide (or did not specify) a way to alter the JVM's normal logic for looking up what method it should actually invoke. For languages like JRuby and Groovy, which store method tables in their own structures, this meant the original concept was essentially useless: most dynamic languages have "open" types whose methods can be added, removed, and redefined later, so it was impossible to ever present a normal type invokedynamic could call.

It also included nothing to solve the larger problems of implementing a dynamic language on the JVM, problems like the restrictive, over-pedantic rules for loading new bytecode and the limitations and poor performance of reflected methods. It was, in essence, dead in the water. That was mid 2006.

Fast-forward to September of that year. Sun Microsystems, after years of promoting Java as the "one true language" on the JVM, has decided to hire on two open-source developers to work on the JRuby project, a JVM implementation of Ruby, a fairly complex dynamically-typed language. The pair had managed to run the most complicated application framework the Ruby world had to offer, and for the first time in a long time it started to look like directly supporting non-Java languages on the JVM might be a good idea.

Around this time or shortly after, John Rose became the new JSR-292 lead. John was a member of the Hotspot VM team, and among his many accomplishments he listed a fast Scheme VM and a bytecode-based regular expression engine. But perhaps most importantly, John knew Hotspot intimately, knew that the its core was simply *made* for dynamic languages, and had a pretty good idea how to expose that core. So it began.

The culmination of InvokeDynamic is, of course, the ability to make a dynamic call that the JVM not only recognizes, but also optimizes in the same way it optimizes plain old static-typed calls. AnonymousClassLoading provides a piece of that puzzle, making it easy to generate lightweight bits of code suitable for use as adapters and method handles. MethodHandle provides another piece of that puzzle, serving as a direct method reference, allowing fast invocation, argument list manipulation, and functional composability. The last piece of the puzzle, and probably the coolest one of all, is the bootstrapper. Now it's time to blow your mind.

There's two sides to a invocation. There's the call, presumably a chunk of bytecode doing an "invoke" operation, and there's the target, the actual method it invokes. Under normal circumstances, targets fall into three categories: static methods, virtual methods, and interface methods. Because two of these types--static and virtual--are explicitly bound to a specific method, they can be verified when the method's bytecode is loaded. If the type or method do not exist, the bytecode is considered invalid and an error is thrown. However the third type of target, an interface method, may have any number of targets at runtime, potentially targets that have not even been loaded into the system yet. So the JVM gives invokeinterface operations much more flexibility. Flexibility we can exploit.

Much of the JVM's optimizations come from it treating what looks like normal code as "special". Hotspot, for example, has a large list of "intrinsic" methods (like System.arraycopy or Object.getClass), methods that it always tries to inline directly into the caller, to ensure they have the maximum possible performance and locality. It turns out that adding bytecodes to the JVM isn't really even necessary, if you have the freedom to define special new behaviors based solely on the methods, types, or operations in play. And apparently, the Hotspot team has that freedom.

Because of the low probability of a new bytecode being approved, and because it really wasn't necessary, John introduced a "special" new interface type called java.dyn.Dynamic. Dynamic does not include any methods, nor is it intended as a marker interface. You can implement it if you like, but its real purpose comes when paired with the invokeinterface bytecode. For you see, under InvokeDynamic, an invokeinterface against Dynamic is not really an interface invocation at all.

Here's a simple example of code that won't compile. Because the incoming argument's type is Object, we can only call methods that exist on Object. "myDynamicMethod" is not one of them. The hypothetical bytecode for that call, if it did compile, would look roughly like this:

In its current state, this bytecode would not even load, because the verifier would see there's no myDynamicMethod on Object and kick it out. But we want to make a dynamic call, right? So let's transform that virtual invocation into a dynamic one:

We've made it an interface invocation, so the JVM won't kick it out and it loads happily. And we've provided our "special" marker, the java.dyn.Dynamic interface, so the JVM knows not to do a normal interface invocation. That wraps up the call side...myDynamicMethod is now recognized as an "invokedynamic". But what about the target? How do we route this call to the right place?

Now we finally get to the bootstrap process. In order to make dynamic languages truly first-class citizens on the JVM, they need to be able to actively participate in method dispatch decisions. If method lookup and dispatch is forever only in the hands of the JVM, it's a much more complicated process to do fast dynamic calls. Believe me, I've tried. So John came up with the idea of a "bootstrap" method.

The bootstrap method is simply a piece of code that the JVM can call when it encounters a dynamic invocation. The bootstrap receives all information about the call directly from the JVM itself, makes a decision about where that call needs to go, and provides that information to the JVM. As long as that decision remains valid, meaning future calls are against the same type and method tables don't change, no further calls to the bootstrap are needed. The JVM proceeds to link and optimize the dynamic call as if it were a normal static-typed invocation. Here's what this looks like in practice:

This is a simple bootstrap method for the "myDynamicMethod" call above. When "myDynamicMethod" is invoked, the JVM "upcalls" into this bootstrap method. It provides the original argument list (with the receiver first, since invokeinterface always takes a receiver), and a CallSite. CallSite is a representation of the "site" in the original code where the dynamic invocation came from, and it has a type just like a method handle. In this case, the CallSite.type() is "(Object)Object" since we always pass along the receiver (the one Object argument) and the method returns an Object.

In this case, we're just going to bind any dynamic call coming into this bootstrap to the same method, which might look like this:

Notice that now we actually have a formal argument for the receiver; because we have bound an instance invocation (invokeinterface) to a static method (invokestatic) the receiver becomes the first argument to the call. Back in bootstrap, we retrieve a handle to this method and set it into the CallSite. At this point the CallSite has everything it needs for the JVM to link future calls straight through. As a final step, we perform the invocation ourselves to provide a return value for the current call. And the bootstrap method will never be called for this particular call site again...because the JVM links it straight through.

As I alluded to earlier, we can also invalidate a CallSite by clearing its target. Clearing the target tells the JVM the originally linked method is no longer the right one, please bootstrap again. We're basically a direct participant in the JVM's method selection and linking process. So cool.

Oh, there's one more bit of magic I should show you: how to get from point A to point B, i.e. how to tell the JVM which bootstrap method to use. Remember our SimpleExample class above? The one we coaxed into doing dynamic invocation? Here's how we point SimpleExample's dynamic calls at our bootstrap method...we just this code add to SimpleExample itself:

Linkage is another class from InvokeDynamic, responsible primarily for wiring up dynamic-invoker classes to their bootstrap logic. Here we're registering a bootstrap method for SimpleExample by creating a handle to DynamicInvokerThingy.bootstrap. Linkage has a convenient BOOTSTRAP_METHOD_TYPE constant we can use for the type. And that's basically it. What could be easier?

Status

InvokeDynamic is a work in progress. It first successfully performed a dynamic invocation on August 26, 2008 - International InvokeDynamic Day. John had given me wind of the "imminent" event, so I had already started to look at wiring it into JRuby. Ultimately, it was into the first week of September before I got all the bits together and working, but after a day or two of back-and-forth emails, a bug report (I found a bug! I'm helping!), and a little JRuby refactoring, I managed to successfully wire InvokeDynamic directly into JRuby's dispatch process! Such excitement! The code is already in JRuby's trunk, and will ship with JRuby 1.1.5 (though it obviously will be disabled on JVMs without InvokeDynamic).

Now before you go off and get all excited, you should know that I wired it up in probably the most primitive way possible. A lot of the method-adapting logic isn't fully implemented yet, and what is there isn't wired into Hotspot's JIT, so it's still early days. But I'm absolutely giddy when I think about the possibilities of MethodHandles alone, much less the entire InvokeDynamic package all together. It gives me shivers just thinking about it.

(Before you think I'm some kind of crackpot, imagine how much work it's taken to get JRuby running as well as it is today and how much work each tiny incremental improvement requires. The idea that the next round of *major* improvements will be a simple matter of functionally decomposing JRuby's core--something we've wanted to do all along--is pure butterscotches and unicorns.)

And there's also the sobering fact that at best this would be a Java 7 feature; there's no possibility of backporting it other than as an emulation layer. So production users looking for InvokeDynamic-enabled JRuby are going to have to be ambitious or at least wait for Java 7...and that's assuming we're able to get the JSR approved and included (though I'm going to do whatever I can to make that happen).

But at the end of the day, make no mistake: The JVM is going to be the best VM for building dynamic languages, because it already is a dynamic language VM. And InvokeDynamic, by promoting dynamic languages to first-class JVM citizens, will prove it.

More Information

If you'd like to read more about InvokeDynamic, here's a few resources:

The JSR-292 JCP page has a link to the draft document about InvokeDynamic. It's starting to get a little aged now but the general concepts are all there. A good read.

The JRuby SVN repository already contains the early InvokeDynamic work I've done. Look for the classes InvokeDynamicInvocationCompiler and InvokeDynamicSupport, both referenced from StandardASMCompiler. And feel free to email or stop into #jruby on FreeNode IRC if you have questions.

The Multi-Language VM page can get you started with John Rose's InvokeDynamic patches, along with some other oddities like JVM continuations and something called "quid". And you'll need a good walkthrough on building OpenJDK, so try Volker Simonis's OpenJDK instructions for now. Unfortunately the MLVM bits only work on Linux and Solaris builds of OpenJDK at the moment; that will change in the future.

Update: I can't believe I forgot to do my final plug for the JVM Languages Summit, which is coming up at the end of this month. I believe there's still a few seats open. If you're in the SF bay area or feel like taking a trip, the slate of talks is going to be awesome. You will hear John Rose talk about InvokeDynamic, me talk about JRuby past and future, and lots, lots more. There's even a couple Microsofties coming down and a Parrot presentation. Great fun!

And as always, feel free to contact me, comment on this blog, or look me up on IM or IRC. I'm keen to see InvokeDynamic put through its paces all the way through its specification and development process, and I could use some help.

Friday, September 05, 2008

I was just having a conversation with a friend, a Rubyist whose opinion I respect, who clued me in that he really hates when JRuby users use Java libraries with little or no Ruby syntactic sugar. He hates that there's a better chance every day that Java-related technologies will enter his world. That he's going to have to fix someone's Java-like Ruby. He lamented the lack of decent wrapper libraries that hide "the Java insanity", that are just bare-metal shims over the Java classes they call. He expressed his frustration that JRuby being successful will mean he's going to have to deal with Java. He doesn't want to *ever* have to do that.

And he said it's our fault.

I've heard variations of this from other key Rubyists too. There's a lot of hate and angst in the Ruby community. Many of them are Java escapees, who long ago decided they couldn't tolerate Java as a language or were fed up dealing with some of the many failed libraries and development patterns it has spawned. Some of them are C escapees who've never quite been able to let go of C, be it for performance reasons or because of specific libraries they need. Some of them have been Rubyists longer than anything else (or maybe just longer than anyone else), and see themselves as the purists, the elite, the Ivory Tower, keepers of all that's good in the Ruby world and judge, jury, and executioner for all that's bad. In the end, however, there's one thing these folks share in common.

They think JRuby is a terrible idea.

Of course it's not everyone. I think the general Ruby populace still looks at JRuby as an interesting project...for Java developers. Or maybe just as a gateway to bring people into the community. A growing minority of folks, however, have managed to move beyond prejudices against Java to make new tools, applications, and libraries using JRuby that might not otherwise have been possible. And some folks are simply ecstatic about JRuby's potential.

Why is JRuby such a polarizing issue?

I don't see this in the Python community, for example, which might surprise some Rubyists. Pythonistas seem to have positively embraced both IronPython and Jython. There's no side-chatter at the conferences about the evils of anything with a J in it. There's no mocking slides, no jokes at Jython or IronPython developers' expense. No "Python elite" cliques actively working to shut Jython or IronPython out, or to discourage others from considering them. The community as a whole--Guido included--seems to be genuinely thankful for implementation diversity. Even if one of them does have a J in it.

What's different about these two communities? Why?

I work on JRuby. For the past 3-4 years, it has been my passion. There's been pain and there's been triumph: compatibility hassles; performance numbers steadily increasing; rewriting subsystems I swore I'd never touch like IO and Java integration. Over the past two years, I've put in four years' worth of work, writing compilers, rewriting JRuby's runtime, rewriting whole subsystems, speaking at conferences, staying up late nights (frequently ALL night) helping users on the JRuby IRC channel or mailing lists, and hacking, hacking, hacking almost all day, every day. For what? Because I want to infect the JRuby community with a new and more virulent strain of Java? Because I don't know any better?

I work on JRuby because I love Ruby and I honestly believe JRuby is one of the best things ever to happen to Ruby. JRuby takes a decade of Java dogma and turns it on its head. JRuby isn't about Java, it's about taking the best of the Java platform and using it to improve Ruby. It's about me and others working relentlessly, writing Java so you don't have to. It's about giving Ruby access to one of the best VMs around, to one of the largest collections of libraries in the world, to a pool of talented engineers who've written this stuff a dozen times over. Sure there's crap in the Java world. Sure the Java elite took power in the late 90s and started to jam a bunch of nonsense down our throats. Sure the language has aged a bit. That's all peripheral. JRuby makes it possible to filter out and take advantage of the good parts of the Java world without writing a single line of Java.

Tell me that's not a good idea.

I sympathize with my friend...I really do. I've not only seen a lot of really bad Ruby code come out of JRubyists, I've created some of it. Writing good code is hard in any language, but writing Ruby code that meets the Ivory Tower's standards is like trying to decipher J2EE specifications. If I have to listen to some speaker meditate on what "beautiful code" means one more time I think I'm going to kill someone. Yes, beauty is important. I have my idea of beautiful code and you have yours, and there may be a nexus where the two meet. But tearing into people who are trying to learn Ruby, trying to move away from Java, doing the best they can to meet the Ivory Tower's standards of "beauty"...well that's just mean. And it doesn't have to be that way. "Beauty" doesn't have to be Ruby's "Enterprise".

JRuby doesn't mean Java any more than MRI means C, Ironruby means C#, or Rubinius means C++ and LLVM. JRuby, like the other implementations, is a tool, an enabler, an alternative. JRuby does many things extremely well and others poorly, just like the other implementations. It's bringing new people into Ruby, and for that we should be thankful. It's pushing the boundaries of what you can do with Ruby, and for that we should be thankful. It's not about Java...it's about learning from the successes and mistakes of the past and using that knowlege to push Ruby forward.

So what do we do about JRuby users that start writing Java code in Ruby? We teach them. We help them. We don't slap a scarlet J on their chest and run them out of town. What do we do about shim layers over Java libraries? We build a layer on top of that shim that better exercises Ruby's potential, or we help build a new wrapper to replace the old. That's what Nick Sieger did with Warbler. That's what the Happy Campers are doing with Monkeybars and Jeremy Ashkenas did with Ruby-Processing. More and more people are recognizing that JRuby isn't a threat, doesn't represent the old world, doesn't mean Java...it means empowerment, it means standing on the shoulders of giants, and never having to leave Ruby.

I guess what it really comes down to is this:

The next time someone tries to cut down JRuby, tries to convince you it's a bad idea, to avoid it, to stay away from the evils of Java; the next time someone tears into a library author who hasn't learned the best way to utilize Ruby; the next time someone complains about a library that doesn't lend itself to reimplementation on the C-based implementations, doesn't hide the fact that it's wrapping Java code; the next time someone tries to convince you that JRuby is going to hurt the Ruby community...you tell them to remember this:

JRuby is not going away. More people try JRuby every day. As long as Rubyists who know "the way", who have learned how to create beautiful APIs and DSLs, who serve as the stars, the leaders of the Ruby community, setting standards for others to follow...as long as those people try to marginalize JRuby, treat it like a pariah, or convince others to do the same...

Tuesday, September 02, 2008

I'm just reading through the comic and thought I'd jot down a few things I notice as I go. Take them for what they're worth, high-level opinions.

Browsers single-threaded? Maybe 10 years ago. I routinely have a CPU-heavy JS running in Firefox and it doesn't stop me working in other tabs.

Threading is hard. Let's go shopping. Or use processes. So now when I have 50 tabs open, it won't just be 50 idle tabs, it will be 50 idle processes all holding on to their resources, not sharing, not pooling. Hmm.

The majority of sites I view don't have copious amounts of JS running constantly. In response to events, sure. On load, sure. But not sitting there churning away. So a process per tab for pages that are rendered once and executed once seems a little wasteful.

Browsers eat more memory because of memory fragmentation, eh? Memory management is hard. Let's go shopping and use processes. Or use a compacting memory manager, eh? What year is this?

Even without GC it's not difficult to use opaque handles instead of pointers so you can juggle memory around as needed. Perhaps this is the over-optimization of the C/C++ crowd still living large. "I can't afford to dereference ONE MORE POINTER to get at my data! I've got to be direct-fucking-to-the-metal!"

Not particularly looking forward to seeing 50 Chrome processes in my process lists. The ability to see individual pages as processes and kill them separately does sound intriguing; not sure it's worth every damn tab being a process though.

Fuzz testing for browsers is a great idea. Zed Shaw and I talked about doing something similar for Ruby implementations...a sort of "smart fuzzing" that sends them parsable but random input. I don't know why more testing setups don't fuzz.

It will be interesting to see V8 versus TraceMonkey. Hopefully my minders will see there's some value in revisiting Rhino and bringing it up to date (it hasn't had any major work done in a long time, and could be a *lot* faster).

Pattern-based code-generation behind the scenes is also good. Kresten Krab Thorup demonstrated something similar with Ruby on JVM where instance variables (and I think, method dispatch tables) could be promoted to temporary classes with real fields at runtime, making them a lot faster. I've prototyped something similar in JRuby, but until Java 7 it's expensive to generate and load lots of throwaway bytecode.

Ahh, now they start talking about using better garbage collection technology. Perhaps the folks who decided processes were the only way to efficiently manage cross-tab memory shoulda had a V8?

Dragging tabs between windows sounds pretty cool. Except that I actually find multiple browser windows to be a nuisance most of the time (and no, not because I can't drag tabs...because I generally browse everything full-screen and would have to search for the right window). Besides, I can drag tabs in Firefox already. I suppose the benefit in Chrome is they wouldn't have to reload. Do I care?

I just dragged the Chrome comic to another window to test it and then spent 30 seconds trying to figure out where I dragged the Chrome comic. Multiple browser windows = fail.

Finally someone makes local browser caching smart! I'd say probably 50% of the pages I visit never ever change, like online documentation. I never want to have to go to the net to view them if I don't have to, and if I'm not connected...dammit, just give me what's in the cache!

On the other hand, there's privacy concerns too. Hopefully I can have full control over what is and is not cached and an easy way to flush it. I can imagine someone borrowing my browser for a moment and stumbling onto a non-public page I've cached.

Opening an MRU page in a new tab is fine, but I'm almost always using the keyboard to open a tab. Hopefully this isn't going to get in the way of my immediately typing a place to go, which is usually why I open a tab.

Ahh, there we go, an "incognito tab" for private browsing. Seems like a good way to go.

Not sure I'm seeing the keyboard angle well-understood here. Hopefully that's not the case. As much as possible, I avoid the mouse, even when browsing. Attention to the keyboard crowd would go a long way, especially in geekier communities. Example: keyboard support in most Google apps is bafflingly arbitrary. Maybe that's a limitation of JS or the "old world" browsers. Maybe not. Still baffling.

I hope there's going to be support for Java in Chrome. Seems like it would be a big win, since it already can share some data across processes (like the the class libraries), misbehaving applets wouldn't impact the rest of the process (which is admittedly a good case for process-based isolation...maybe use the presence of heavy JS, plugins, applets as the indication to process-isolate?), and I'd wager Google could do a cleaner job of integrating it (without it being intrusive) than other browsers have so far. Plus since plugins are already being offloaded to a separate process, that could be a single JVM kept warm (or separate JVMs, for more memory use but better isolation and sandboxing). Java ought to be one of the best-behaved plugin citizens...done right.

I wonder how long it will be until Chrome gets Google API updates (like Gears) before the other browsers. I don't buy the "we just want to make the user experience all unicorns and lollipops" thing. There's a business motivation for Chrome. More ad exposure? A first-class deployment for Google APIs so more people write for them so more people use Chrome so ads are easier to channel?

Ahh yes...we want them to be open standards. Maybe they will, maybe they won't. If they don't...here's where it gets squirrely...it's open source! Anyone can take what they want and put it in their own browser, right? Yeah, you and the other 100 plugins that only work in one type of browser. How quickly do they get gobbled up by others? You do realize it takes some dedicated resources to "take what you want" and maintain it, not to mention keeping it up to date with the original. A commitment to always supporting all browsers seems to be much harder once you've made a monetary commitment to building your own browser. Left hand, meet right hand.

I won't ask the "why didn't you just help Firefox" question, since it's obvious and there's a million reasons why someone starts a new project. But I will ask why this project to "help all browsers become more powerful" is sprung on the world a day before the beta. There's a desire for exclusivity here or I'll eat my hat. Unicorns and lollipops would have required opening the project months ago, so "all browsers" could benefit from it *as it progressed* and contribute *as it progressed*. Open is not "open once we're ready to beta our product because we think it's nearing completion", it's "we're working on it now and want everyone to benefit from it as we move forward."

Overall I'm sure it will be a really excellent browser that I may or may not use. Partially because Google's client app support is basically nonexistent for OS X (how long has Google Talk been promised for OS X?), and partially because...I dunno...I feel better using a browser not designed and controlled by an ad-funded megapower. I'd rather not allow Google to control the vertical AND the horizontal. Not that I have anything against Google in general. This blog is hosted on Google-powered Blogger, I use Google for search pretty much exclusively, and I host services for my personal domain headius.com on Google as well. But it seems disingenuous to say Chrome is supposed to "help everyone" and yet nobody gets to see any of it until they're in beta. I guarantee Firefox folks would have started integrating portions as soon as the code were opened up, which of course would have taken some steam out of the eventual announcement. Try to look past the fireworks and bluster, folks.

And I reserve the right to completely flip any of these opinions after the beta is released this evening...though I probably won't, since I can't run it (Windows only).

Update: I borrowed a friend's Windows machine to give Chrome a 15-minute try. Here's my additional 15-minute thoughts, so take them at face value.

I hate installers that download additional stuff. When my friend and I first downloaded, we proceeded to walk away from the interwebs for some offline fiddling. Only then did we discover we didn't have the whole thing.

Love the interface. It's almost too clean. Unfortunately I can guarantee I'd immediately clutter it up with bookmarks I need (want) one-click access to. Such is life. But I like starting from a blank slate first, rather than starting from a cluttered mess.

Very fast, true to form. It also feels snappier than Firefox, but Firefox isn't known for it's blazing speed. Maybe feels faster than Safari. Of course, young products are always fast.

Not quite a process per tab. It seems like tabs manually opened and presumably tabs opened from bookmarks do get their own processes. Tabs opened via right-clicking on the link and choosing to open in a new tab stay in-process. That's a reasonable way to reduce process load, since a good portion of the tabs I open are from existing pages like Reader or News. Unfortunately, this also means that a good portion of the tabs I open are not subject to the sandboxing or isolation touted as a key feature of Chrome.

The developer pages list a Mozilla Java plugin wrapper among the included technologies. Yay! I did not get a chance to try it out (Windows rapidly started to piss me off again).

I picked a tab at random to forcibly kill and the entire browser disappeared. I guess I picked the right one.

This is a little worrisome:

All told it's about what I expected. Very clean, very polished, very young. I'm sure a lot of these issues will get shaken out during the beta. I do hope there's a way to turn off a tab-per-process, or I can't see myself ever wanting to run Chrome. I can see myself gathering several dozen Chrome processes in the course of a week. Process isolation for other aspects (like JS or plugins), no worries. I'm looking forward to an OS X version, and from looking at the Chrome developer pages it sounds like that isn't too far off. Perhaps marketroids pressured the team to get out a Windows version first, so they could make some headlines. Damn marketroids.

And as regular readers of my blog will tell you, I can be a bit salty about young up-and-coming technologies with a chip on their shoulders. Ignore that.