So at one point I related to him the extended version of my plans for speeding up dynamic dispatch. I relate it now to you to hopefully spur some discussion. There are no original ideas anymore, right? Or are there?

The initial experiment with Fixnum basically was the static version of my fuzzy vision. During parse time, a very trivial static table mapped method names to numbers. I only handled three method names, +, -, and The ultimate vision, however, is much grander.

The general idea is that during dynamic dispatch, we want to use the method name and the class as keys to locate a specific method instance. The typical way this is done is by keeping a hashtable on every class, and looking up methods by name. This works reasonably well, but each class must have its own table and we must pay some hashtable cost for every search. This is compounded when we consider class hierarchies, where the method we're looking for might be in a super class or a super class's super class, ad infinatum.

Retrieving the class is the easy part; every method we want to dispatch will have a receiver, and receivers have a reference to their class.

So then, how do we perform this magic mapping from M method names and N classes to a given method? Well hell, that's a matrix!

So then the great idea: build a giant matrix mapping method names (method IDs) and classes (class IDs) to the appropriate switchable value. Then when we dispatch to the class, we use that switchable value in a neatly dense switch statement.

Let me repeat that in another way to make it clear...the values in the matrix are simple "int" values, ideally as closely-numbered (dense) as possible for a given class's set of methods. In order to dispatch to a named method on a given object, we have the following steps:

Get the method ID from the method (likely calculated at parse or compile time)

Get the class ID from the class (calculated sequentially at runtime)

Index into the matrix using method ID and class ID to get the class-specific method switch value

Use the switch value to dispatch (via a switch statement, of course) to the appropriate method in the target class

Ignoring the size of this matrix for a moment, there are a number of benefits to this data structure:

We have no need for per-class or inline method caches

Repopulating the switch table for a given class's methods is just a matter of refilling its column in the matrix with appropriate values

Since the matrix represents all classes, we can avoid searching hierarchies once parents' methods are known; we just represent those methods in the matrix under the child's column and fast-dispatch to them as normal

We can save off and reconstitute the mapping to avoid having to regenerate it

...and there's also the obvious benefit: using the switch values stored in the matrix, we can do a fast dispatch on the target object with only the minimal cost of looking up a value in the matrix. An additional benefit for C code would be simply storing the address of the appropriate code in the matrix, avoiding any switch values completely.

Now naturally we're talking about a big matrix here, I'll admit that. A fellow by the name of "Defiler" piped up and said an incomplete app of his showed 1147 unique method names across 1258 classes. That works out to a matrix with 1.4M elements. In the Java world, where everything is 32 bits, that's around 6MB of memory consumed just for method mapping. Acceptable? Perhaps. But what about using a sparse matrix?

A sparse matrix attempts to be as efficient as possible when there are large numbers of "zeros" in the matrix elements. That is, interestingly enough, exactly what we have here. The vast majority of entries in our method-map would contain a zero, since most methods would not exist on most classes. Zero, as it turns out, would neatly map to our good friend "method_missing", and so dispatching to a method on a target class that doesn't implement it would continue to work as it does today. Even better, method_missing dispatch would be just as fast as a regular method dispatch, excluding the actual method_missing implementation code, of course.

So then, there it is...a potential plan for fast dynamic method invocation in JRuby (and perhaps in Rubinius, if Evan agrees the idea has merit). Any of you out there heard of such a thing? Is this an old idea? Are there problems with it? Share!

This week (what's left of it) I'm spending on performance again. It's officially a holiday for Sun, so I'm not technically on the clock. I could even sleep the entire week and pick things back up on Tuesday.

Yeah, right.

One of the oddities of working on JRuby while working at Sun stems from the fact that I was working on JRuby for fun for the past two years. Now that this is the full-time job, it's even harder to pull myself away from it. I thought perhaps making JRuby my job might take away some of the attraction. In reality, it's just made a good thing better, since I can spend long hours on the hard problems I would never have tackled before.

So instead of stopping work completely, I'm shifting gears a bit. Instead of the heavy Rails focus we've had over the past month, I'm hitting other "fun" stuff instead. Compilation, interpretation performance, and so on. We know there's still lots of fruit to be plucked from the performance tree, and of course any additional work/research on compilation will help that eventual goal.

Compiler Refactoring Underway

As far as compilation goes, I've started committing a refactored compiler; it separates AST-node-walking from code generation, so the backend could be swapped out with a YARV generator or future compiler revisions. I think that's needed to allow the compiler and the AST to evolve independently, since I believe there will be many possible compile targets and potentially many AST changes in the future. Nothing too fancy there, but it's a bit more readable so hopefully others can also contribute.

Interpreter Enhancements and Fixes

On the interpretation front, there are a few items I've been working on.

1. Speeding up method invocation and block management (committed)

Ruby's AST represents method calls that take a block by putting the block first in the AST. You encounter an "iter node" that points at the "call" with which it is associated. So the typical way you evaluate those nodes is to create the block object and then visit the call. Unfortunately, the call itself may lead to other calls with their own blocks, for evaluating the arguments or receiver.

The end result of this node ordering is that every time a method is called, the evaluation of its args and receiver has to juggle the current "block available" status around, so the block doesn't get consumed before it's needed. Because the block gets created early, we have to push it and the "block available" status onto a stack. This is the primary reason the block logic in the interpreter and ThreadContext is so complicated.

An example would help illustrate what's happening. For the following code:

foo("hello") { 1 }

The parser generates the following hierarchy of AST nodes:

IterNode[] <= this is the block NewlineNode[] FixnumNode[] 1 <= this is the Fixnum 1 inside the block FCallNode[] foo <= this is the call to foo ArrayNode[]: {StrNode[]} StrNode[]"hello"

The node associated with the block is encountered first, so we construct the block then. We move on to the "foo" call, and the block is consumed. No problem, right? However, here's a more complicated example that illustrates the trouble with this AST ordering:

foo { 1 }.bar { 2 }.baz(hello { 3 }) { 4 }

Here things are a bit more interesting. The parser produces the following output:

So what actually happens here? First the "4" block is encountered, instantiated, and pushed onto the block stack. Then we proceed to the "baz" call. Unfortunately, the "baz" call has both a receiver and arguments, so we have to hide the "4" block to prevent it being consumed. We proceed to evaluate the receiver for "baz", encountering the "2" block. The "2" block is instantiated and pushed down, and we move on to the "bar" call. The "bar" call has another receiver; we evaluate that, encountering the "1" block and the "foo" call it's associated with. The "foo" call consumes the "1" block and returns a receiver for "bar". The "bar" call consumes the "2" block and returns a receiver for "baz". Now the "baz" call has to evaluate its arguments, so the "3" block is created and consumed by the "hello" call. Finally, with a receiver and args, we can call "baz" and consume the "4" block.

Confused? Me too. Why would you order it this way? Perhaps it's to ease parsing, or perhaps there's some reason I don't know. However, I believe the following ordering is much simpler (and I know it makes interpretation easier):

The advantages here should be obvious. We encounter the blocks in order, so there's no stack juggling involved. Because there's no stack juggling, we don't have to "hide" blocks as we evaluate receivers and arguments. Finally, because we know we'll only encounter blocks for methods that require them, there's no additional overhead for methods that don't need blocks. It's a good change, and I would love to understand why Ruby uses the more complicated AST structure instead of this.

I have already committed a change to reorder the way these AST nodes are handled. The AST itself is unchanged, but the visit to a given block (IterNode) just sets that node into the associated call and proceeds. The calls themselves are now responsible for creating an associated block (if necessary)...*after* receiver and args have been dealt with. This means two things: calls that don't accept blocks don't pay any block-manipulation penalty; and calls with peripheral blocks (for finding args or receivers) don't pay any block-manipulation penalty either. Only the calls that need blocks have to deal with them.

Eventually this will become an AST change, but this short-term fix resolves 90% of the interpreter goofiness right now. This change will also eventually mean the iter stack (and potentially the block stack) disappear too. Huzzah!

2. LoadService fixes, enhancements, and optimizations (committed)

LoadService cleanup and improvements are well under way. LoadService is responsible for "load" and "require" calls and does all the searching for files and management of loaded extensions and libraries. Unfortunately the existing heuristic was both broken and terribly inefficient.

Ruby's 'load' behavior is easy enough...just look for the exact file and execute it in the current runtime. Ruby's 'require' however has a bit more magic to it.

If you specify a full filename to 'require' it will use that filename to load either a source file or an extension, depending on whether you specify ".rb" or ".[so|o|dll|etc..]". If you do not specify an extension, it will search for .rb, .so, etc in turn until it finds something, If it finds nothing, that's a load error. Here's the "ri" doc for MRI's "require":

--------------------------------------------------------- Kernel#require require(string) => true or false------------------------------------------------------------------------ Ruby tries to load the library named _string_, returning +true+ if successful. If the filename does not resolve to an absolute path, it will be searched for in the directories listed in +$:+. If the file has the extension ``.rb'', it is loaded as a source file; if the extension is ``.so'', ``.o'', or ``.dll'', or whatever the default shared library extension is on the current platform, Ruby loads the shared library as a Ruby extension. Otherwise, Ruby tries adding ``.rb'', ``.so'', and so on to the name. The name of the loaded feature is added to the array in +$"+. A feature will not be loaded if it's name already appears in +$"+. However, the file name is not converted to an absolute path, so that ``+require 'a';require './a'+'' will load +a.rb+ twice.

There's also the issue of what paths to search. Ruby searches the current directory first, followed by custom load paths, site_ruby dirs, and ruby/1.8 dirs. This allows a number of mechanisms for overriding more general locations with more specific ones for particular uses.

JRuby adds a new wrinkle here: classloader resources. JRuby supports JARing up source files and loading them through Java's classloader mechanisms. This is how the JRuby applet and the "complete" JRuby JAR work: they simply include all Ruby source into the archive. So then for us the classloader/classpath represents an additional path to be searched.

The primary problem with JRuby's load heuristic is that it ended up searching the classloader far too frequently; and in many cases it searched it multiple times for files that could not exist, such as for complete absolute paths (starting with '/', which *never* works for classloader resources). A second problem with the heuristic is that it would try all locations for all extensions, so for example it would search for xxx.rb everywhere possible, then xxx.rb.ast.ser (our serialized AST format) everywhere possible, then xxx.so (used internally for extensions...though that may change), and so on. The result of this is that any extensions or serialized scripts went through the full monty of searches before being found on a second or third pass.

These two issues were compounded when running JRuby with a very large classpath. Because classloader resources can be expensive to search, our loading became linearly slower in relation to the number of JARs (or perhaps the size of jars) included into the JVM. More JARs, slower classloader resource searching, slower startup.

The fix for these issues is twofold: search a given load location for all filename extensions before moving on, and only search the classloader as a last resort for filenames likely to be found there. It's fairly simple to explain, but the LoadService code had been endlessly hacked and rejiggered over the years. The cleanup was 90% of the battle; the fixes were considerably easier.

A final issue with LoadService, which is now fixed, was that it allowed require to include files with no extensions at all. This is not correct Ruby behavior, and so now those files can't be loaded. This actually caused a bug a long time ago where the extension-free "rake" startup script was being loaded before the "rake.rb" library file it tried to locate. The result was an eventual stack overflow as the file tried to continually load itself. Goofy behavior we should never see again.

3. Speeding up dynamic dispatch

I'm also doing some experimental work to speed up dynamic dispatch. Currently, methods are located by name, looking up a callable object out of a big hash on a per-class basis. This works reasonably well, and there are caches to speed the process, but with all the recent performance work this search has started to become the new bottleneck.

The eventual callable objects looked up have another flaw: they make Hotspot optimization more difficult. Because they're behind an ICallable interface, because they have multiple levels of logic as well as pre/post-call setup and teardown, and because there are many different implementations of ICallable, they end up slowing execution down significantly.

Hotspot is really an amazing piece of work. For the vast majority of Java code, it's able to unroll loops, inline invocations, dynamically optimize conditionals and switches, and generally improve the speed of code by a drastic amount. When running in "server" mode, where Hotspot lets code run longer in interpreted mode before optimizing it, even greater improvements can be seen. For example, A fib benchmark--recursive and iterative--under Java 6 client and server VMs:

(best times shown)

client recursive:8.551000

client iterative:17.995000

server recursive:5.008000

server iterative:13.191000

MRI recursive:1.670000

MRI iterative:16.964403

These numbers aren't bad, really. We even beat MRI for this trivial benchmark when we start hitting Bignums heavily (Java's BigInteger implementation is quite a bit faster than Ruby's). However, we can certainly do better.

There are two experimental changes I'be been working on. The first eliminates the pre/post method setup for core libraries when that setup is not necessary. I call it "fast invocation", and it's applicable to a large majority of core class methods. For example, with just fast invocation, the recursive fib numbers above drop to the sub-5s range. This change is perfectly safe, since it's just eliminating interpreter overhead that would otherwise be wasted cycles. You can expect to see it included in JRuby soon.

The second change is the holy grail of dynamic invocation: eliminate, to the greatest extent possible, the overhead of looking up and dispatching to a given method. In short, make it as close to a simple static dispatch as possible. This is where the real speed gains in JRuby will start to show up.

I have some experimental code right now, focused on the fib benchmark, that is both safe and drastically improves performance. It's sub-par code at the moment, but it does produce results like this:

recursive before:5.008000

recursive after:3.864000

Now of course, this is still interpreted. The same change when applied to my experimental Ruby compiler produces a much more drastic effect:

compiled recursive after:1.550100

Now we start to see the value of eliminating dynamic-dispatch overhead. This is actually *faster* than Ruby's recursive fib, a feat that hasn't been accomplished by JRuby at any time in the past.

The trick to this is fairly simple. For common core methods which are known to be simple Java code, such as for Fixnum's +, -, and < implementations, I provide integer IDs. Within the Fixnum implementation there's a new implementation of "callMethod", our dynamic-dispatcher, which switches on these IDs. For methods it knows, such as the aforementioned +, -, and <, it dispatches directly to op_plus, op_minus, or op_lt, the Java implementations. This skips the lookup phase, the ICallable implementation, the ThreadContext manipulation, and the pre/post-method setup code completely. It's also perfectly safe, again, because all those pieces only waste cycles for simple methods like this.

Now one problem with a simple approach like this is that if you redefine Fixnum#+, that change won't be picked up. The simple Fixnum callMethod won't ever try a method search for methods it knows can be fast-dispatched. I resolved this in my experimental code by adding a "clean" flag to the Fixnum class. If any any point after its initial definition the Fixnum class becomes "dirty", e.g. if you add or redefine a method, the old, slow dispatch will come back into play. My simple version is too coarse-grained, killing fast dispatching for all methods if any of them are changed, but the principal is sound. I'm going to be exploring this the rest of the week, trying to find a more complete solution...but some good things are around the corner.

--

All told, it's been a productive few days since JavaPolis. I'm hoping to write up a JavaPolis recap soon, but I'm keen to use this holiday time to get some cool stuff done. You'll hear more after the first of the year...hopefully with committed code and additional benchmarks against JRuby trunk :)

Sunday, December 10, 2006

The first is part of the "University" sessions tomorrow. We're the middle hour of a three-hour bit on scripting languages for the JVM. During that session we're going to be focusing on practicalities like building a simple app, using IRB, and getting a basic Rails app scaffolded and working. We'll talk a bit about JRuby futures, but we won't demo any of the newer, crazier things like NetBeans or GlassFish.

The second talk is on Wednesday, and will cover some of the same material as the first but hopefully have different demos. We'll show the Agile Web Development with Rails v2's "Depot" application running in JRuby under WEBrick (hopefully GlassFish if some issues are worked out, but likely not), we'll demo some NetBeans Ruby support (try to fit in as much as Tor can finish by Tuesday afternoon), and if there's time to implement some of it, we'll do Rails + JavaEE stuff as well (maybe, maybe not, given that we only have two days left).

Both talks should be useful, but there's no way we can avoid some duplication, especially since many folks will only attend one of the talks.

Join us! We haven't decided on a location, and we're taking suggestions. These little meetups have been great fun in the past, so don't be shy and feel free to grab us after one of our talks to help figure out a good place to go.

Monday, December 4, 2006

Damian Steer, regular JRuby contributor, has taken the IRB applet and run with it. He's gotten readline working (history, line editing, tab-completion), added some fonts and colors to differentiate things, and even put in an intellisense-like menu for tab completion of method names.

Very cool stuff. Keep in mind also that this could be embedded in any app (like an IDE) to provide a really nice looking interactive console. Thanks Damian!

Saturday, December 2, 2006

Ashish Sahni has posted instructions for packaging a Rails app as a WAR on his blog, based on the work of a number of JRuby community members. I ran through his instructions, and have only two modifications to make:

When installing Rails, you still probably want to pass --no-ri --no-rdoc since rdoc generation is still far too slow.

Because JRuby has a classloading bug (not loading from context classloader) you'll need to put the rails-integration JAR in glassfish/lib.

I was only able to get the "properties" page to display, at rails/info/properties. A separate page that used ActiveRecord didn't successfully execute...but it was close enough to taste.

So then, there's two important points you should get out of this:

Rails in a WAR file will happen...soon

This has always been our goal, with every improvement and fix we made to JRuby. The holy grail of Rails deployment would be to "zip it up and deploy it" like Java app developers can do with WAR files. And that goal is just around the corner, thanks to the JRuby community. That leads me to the second point:

The JRuby community is growing fast and doing great things

This is especially exciting to me. Tom and I are only two folks, even with full-time license to work on JRuby and related projects. The only way Ruby can be successful on the JVM is with community cooperation, and the work toward WAR deployment exemplifies what can happen.

I'd like to thank the folks on the JRuby-Extras project who have been working on this: Robert Egglestone, Chris Nelson, Fausto Lelli, and also Ashish Sahni at Sun who put together a great walkthrough.

Wednesday, November 29, 2006

Tor Norbye is a programming machine on par with the legendary Ola of Bini. He's the one-man force working on NetBeans Ruby support, and his progress has been epic. Here's his latest screenshot and a short blurb about it:

In just over two months' time, Tor seems to be (in my opinion) on the verge of eclipsing every other Ruby editor/IDE out there. I've been using development builds of his stuff and it's really superb. A few features I've missed before that are now rapidly maturing in NB+RB:

Highlight usages: not genius, but smarter than dumb-as-a-post right now; great for local and block vars, getting smarter for methods

Clickable methods and variables: limited to method-scope for variables or file-scope for methods, but coming along very quickly...and even those limited scopes are way better than nothing at all. I don't know how many times I've wanted to Ctrl-Click a method in some giant Ruby file and go straight to the method def. Awe-some.

Inline refactoring: I've demoed this a couple times, but for local and block vars you can do inline renaming...essentially renaming all instances of the variable at the same time. It's really nice, looks cool, and represents the tip of the iceberg for refactoring capabilities.

There's plenty of other stuff that's cool, like the Navigator view of a file (shows classes, modules, methods, of currently-selected file), AST view (great for us Ruby language hackers that like to see the actual structure of things), and more.

Sunday, November 19, 2006

Now this is really cool. Tony Hursh, commenter on the previous "Advanced Rails Deployment" post, put together an OS X Application Bundle template that allows you to use the JRuby "complete" JAR file as the base of a typical OS X app. What does that mean? That means you just toss the complete JAR into this template, code up some Ruby code, and have a nice dock icon and menu bar like any other app. You can ship the entire app as a bundle, with JRuby as the built-in Ruby interpreter. Awesome.

Some pics showing the menu working like you'd expect and the JRuby logo as a dock icon:

I'm trying to be 100% NetBeans these days, and I'll be documenting tips and tricks as I learn them. Hopefully others going through the same exercise will find these tips and they'll help make the transition smoother.

Why am I making this move, you ask? Well, of course there's the whole fact that I work for Sun, but that's not the primary motivation; if a tool doesn't accomplish what I want, I'm not going to use it. But NetBeans has made amazing strides this past year. It's now not only better-looking and faster than Eclipse, it also includes includes a much, much larger set of functionality in the base download. And that download? 30-60MB smaller than Eclipse 3.2. That's pretty amazing.

Anyway, one feature I sorely missed was the "Quick Outline" in Eclipse, where Command-O (or Ctrl-O) brings up a fast search for members of the current class. It's pretty darn useful, and I really, really missed it.

However, Sandip Chitale to the rescue. He's created a "Java File Structure" module, which exactly duplicates the Quick Outline functionality. Thank goodness!

Oh, and you can find information about the modules in Help after they're installed, but to save you some confusion: The shortcut for File Structure is initially Cmd-Shift-S (Ctrl-Shift-S) and for Hierarchy is Cmd-Shift-H (Ctrl-Shift-H).

Friday, November 17, 2006

There's a lot of work going on right now focusing on various mechanisms for deploying JRuby-based apps. This article will summarize some of the work happening and why it's really, really important for the Ruby world.

Ruby-in-a-JAR

First, a little sideline into general Ruby embeddability work.

Over the past few days I've made modifications to enable running a Ruby app completely out of a single JRuby JAR file (Java ARchive). The major changes required were:

Add jarjar to the project to combine all dependency jars into a single archive, including jline (readline support), asm (compiler, other stuff in the future), bsf (scripting API), jvyaml and plaincharset (Ola's JvYAML library and supporting charset lib).

Add an Ant task to build the "complete" jar and include all Ruby standard libraries in the same archive.

Somewhat unrelated, a small patch for irb/init.rb to allow it to fail gracefully if it can't load locale-specific files from the filesystem.

So by running "ant jar-complete" you get out a single JAR file that contains a complete, working Ruby interpreter plus stdlib. For example:

Of course, you don't have to take my word for it. I've uploaded a copy of this jar for you to try yourself. The full archive is about 2900kb. That's suitable for embedding in just about any application, and in fact I used a stripped-down 1600kb version for the JRuby Applet. Note: this is JRuby trunk code and mostly experimental...but that's what makes it so fun :)

Now the main course: several folks have been working on several exciting deployment scenarios for JRuby on Rails apps.

The current "best option" for deploying Rails apps into production generally involves the HTTP front-end Mongrel. Although Mongrel is largely written in Ruby, it is very fast, largely because of its native C component for HTTP request parsing. It's also considerably more secure than CGI-based options, largely because of creator Zed Shaw's attention to detail. The typical Rails app will be deployed as a "pack of Mongrels", where the number of desired concurrent requests is multiplied by the number of independent Rails apps to determine a total number of processes. These processes must be managed, monitored, and respawned as appropriate, but the result is a fairly stable and scalable deployment model.

The potential here should be obvious. GlassFish, like other Java EE application servers, is extremely good at scaling up many concurrent requests across many independent applications; so good that many organizations deploy only a single appserver-per-machine and stuff it full of applications to serve. That means a single server, a single process to manage. GlassFish also supports clustering, which means you'll be able to hit the deploy button once and have your n-server cluster instantly start serving up Rails. But there's one last area that trumps all the rest:

That single app server can handle as many concurrent requests across as many independent Rails apps as you desire, scaling them across all the CPU cores you can throw at it.

That's right...no more N * M process management, no more zombie processes, no more immature tooling to manage all those servers and all those deployments. One tool, one server process, no headaches.

That's an extremely bright future, and we're almost there.

What Next?

Naoto-san is not the only one working on JRuby on Rails deployment options. There are a number of folks in the JRuby community approaching the same goal from different directions, using innovative techniques like servlets implemented in Ruby and Spring-based service wiring. The JRuby community sees the potential here and things are moving very quickly.

My work on Ruby-in-a-JAR will also play directly into this. Currently most deployment scenarios require Rails app files to remain "loose" on the filesystem, as with the current standard deployment model. However it won't be long before you can zip up your Rails app into a WAR file (Web ARchive) and deploy it lock, stock, and barrel to as many servers as you want.

These efforts combined will create, in my opinion, the most manageable, scalable, powerful Rails deployment model yet available...and it's just around the corner.

We've also launched into Rails compatibility work in earnest. I've created a wiki page on JRuby support for Rails that details the results of running Rails' own test cases. Long story short: we're looking pretty damn good.

JRuby on Rails is in the home stretch. And we're covering ground very quickly.

Wednesday, November 15, 2006

I thought I'd make a diversion from my usual JRuby activities this weekend and ping the Jython dev list. I'd been lurking for a month or two, seeing almost no activity other than the occasional email, bug report, or request for help. There were perhaps 5 emails in the last month. Not good.

So I reached out to see if anyone was actually listening and to offer whatever support I can provide. As it turns out, there's still a dev team and a user community on the lists, though development has slowed almost to a standstill. There hasn't been a release in years, and Jython is currently about 2x slower than current CPython, while only supporting Python 2.1 semantics in the widely-available release.

But there's a light. The existing team and users are very much interested in getting Jython going again, and from my examination of the code it shouldn't be too difficult for new devs to get involved. Jython also has a pretty good story for compilation to Java bytecode--better than JRuby currently--so it has a strong base to start from.

So here's my request to folks reading this post: If you're a Python fan and a Java developer, now's your time to show devotion to both communities. The Jython guys could use some help, and the project could certainly use some new blood. It doesn't matter to me if you're not a Ruby fan, or heck, if Jython ends up reaching CPython 2.5 compatibility before we get a JRuby 1.0 release out. I'd just really like the JVM Python story to have a happy ending...and I think Jython's long slumber needs to come to an end.

Jython is available on SourceForge, and the existing dev team are friendly, enthusiastic folks. Post this entry to your blogs, send it to your friends, print up flyers and hand them out at your local Python or Java User Group meetings. Jython needs some love, and now is the age of dynlangs for the JVM. Stop by and lend them a hand...I have.

Well friends, it's time for another episode of "Impossible Or Not!" Today's contender is Ruby in the browser, long desired but never achieved. There are front-ends to Ruby services, delicious Ruby-JavaScript libraries, and of course the ever-popular Ruby on Rails web framework. However, developers have been clamoring for something more.

IRB in an Applet

A long, long time ago in a web far away, there was born a bright-eyed new child named Java. Java found its first public uses in flashing, twirling, annoying buttons called Applets. It was also a bit slow, having just been released into the world. So the big bad public said "Java is slow and only useful for annoying buttons!" And so Java was branded the "slow annoying button" language.

But Java has grown up. It hid away from the public eye, dwelling in dark, dank servers and enterprises. It learned the value of five nines uptime and horizontal scalability. All the while, it improved its public face, preparing for a return home to its birthplace on the web.

Now Java has grown up. It has learned how to appease the enterprise gods while presenting itself to the web in beautiful, performant glory. And it has a new friend: Ruby.

Yes, JRuby can run in an applet. No, it's not that hard. No, the archive doesn't have to be this big (the applet above is about a 1.6MB JAR file, but it includes stuff it doesn't need). Yes, this means you could start writing stuff for web pages in Ruby. No, I'm not kidding.

Yes, Virginia, there is a Santa Claus.

Updated: I fixed the issues under Windows, so it should work for those of you that reported errors. Thanks for the heads up!

Tuesday, November 14, 2006

Thijs van der Vossen posts an interesting perspective on recommending Sun and how his opinion of the company has been changed by recent events. It seems that the truth is really getting out: Sun "gets it" and is rising again as a great innovator.

Monday, November 13, 2006

Yes, it's official. News is already starting to pop up around the net about Java's open-sourcing and especially about the choice of license: GPLv2. I must admit my jaw dropped when I first heard about this a couple weeks ago, and it was a hard secret to sit on. Not only is it open source...it's open source using the most vigorously open license out there.

I think the GPL is a great choice for Java. Not only will it be fully compatible with the vast range of GPLed software, but any folks hoping to release their own versions will be compelled to make their changes available as source. Say what you like about the GPL and its "virulence" or its "tainting", but for an open development platform about to explode in the open-source world, it's hard to say what license would be a better choice.

I'm proud to be at Sun surrounded by thought-leaders smart enough to see this is the right thing to do. Sun is BACK, baby!

Now I haven't generally agreed with James McGovern in the past. His hilarious post about how "Ruby isn't ready for the enterprise" was pretty ill-informed, though perhaps well-meaning. My primary issue with that post was that there's nothing about Ruby--the language, libraries and apps--that would prevent it from being perfectly suited to enterprise development. There are issues with the implementation, certainly, but I don't believe that Ruby necessarily has to equate with the C version. Enter JRuby...

JRuby is Ruby. It looks like Ruby, it acts like Ruby, it walks like a Ruby and talks like a Ruby. We aim for it to run Ruby apps and libraries; we hope for it to be as close to 100% compatible as possible. The fact that some of it is written in Java or that it runs on the VM-formerly-known-as-Java is wholely irrelevant; JRuby is Ruby.

I think it's been pretty well proven that the Java VM is well-suited for enterprise development. The majority of enterprise apps out there today are written for or being written for the JVM, and Sun's had whole teams of folks making the JVM run as well as possible for exactly those scenarios. There's no doubt about the JVM's enterprise capability.

So it should follow that Ruby on the JVM, in the form of JRuby, would inherit much of that enterprise-readiness. Does that mean McGovern is right? Should the Ruby community abandon YARV and Ruby 2.0 and the C impl for greener pastures (or in the case of threading, less green pastures)?

No. To do such a thing would be absurd. And there's a simple reason for this: Not everyone wants to run a full-featured VM.

Ruby in its current form has served its users well. It's an outstanding administrative language, great for text processing, network tickling, application scripting. It's even proven itself for small to medium-sized web applications using numerous frameworks, from Camping to Rails. Even more, it has shown its capability for targeted "enterprisey" tasks, like tying together services or generating code and components to be consumed by other systems. Ruby has done its job admirably, and that job isn't going anywhere.

I will fully admit that JRuby in its current form is probably not ideal for heavy command-line use. The minimal runtime that the C implementation starts up is a better fit for quick hit scripts, there's no doubt about that. And for many web deployment scenarios, the C implementation works suitably well, fulfilling its responsibilities without issue. Where McGovern is right is that JRuby is better suited to much larger applications, where scaling across multiple CPUs or multiple machines is an absolute necessity; where resources are quickly consumed by thousands of independent processes; where monitoring, management, and deployment needs can't be addressed by current pure Ruby or C-based options. In short, JRuby fills the medium to large application realm where Ruby has trouble venturing.

Of course I'd love to see JRuby become the best Ruby implementation. What would be the point of working on JRuby if that weren't an ultimate goal? And of course I have a love for Java and the JVM; they've proven themselves in my eyes, and continue to amaze me. But I want JRuby to be part of a larger Ruby world, where programmers run through flower-covered pastures holding hands, objects sing and swirl through the heavens, classes condense, evaporate, and recombine like vapor. Where programming is "fun", like it was when I started BASIC on my Atari 400 25 years ago. Where our time spent writing software produces results, rather than more problems.

None of those things requires Java or the JVM...they just require cooperation within the community and a desire to see Ruby succeed on all fronts. The question that remains, I believe, is this:

A number of JRubyists have recently started actively looking at the problem of deploying Rails apps as a WAR file. Some have working prototypes as well. However the most interesting development is that a number of them have joined the jruby-extras project on RubyForge to combine their efforts. I'll be helping to oversee their progress, but this is a perfect example of a community-driven project. I don't doubt they'll make great progress.

RMagick will soon have a full-featured equivalent for JRuby. Tom Palmer has been working on a Java-based RMagick for some time, and now has a version of his RMagickJr that can render some basic Gruff Graphs. He's been in communication with the RMagick creator, and it's likely that we'll start to see gems available soon. Tom's work will help ensure that Rails apps using RMagick for image processing can work seamlessly under JRuby as well. RMagickJr is also hosted in the jruby-extras project.

Ola Bini has been continuing his quest to bring the openssl library to JRuby. He says he's getting very close to having a working library, and it's been a long, hard road. Full support for openssl will mean all Ruby libraries that depend on it will work without modification on JRuby. It's quite an effort, and Ola deserves a lot of credit for making it happen.

At RubyConf 2006, Matz and Koichi made the announcement that Thread.critical was very likely to go away in Ruby 1.9.1/2.0. The reasons for this are simple, and well-known to the JRuby project: Thread.critical is incompatible with native threading. My own implementations of Thread.critical have ranged from a very strict version which frequently deadlocked to the current version which only enforces critical sections in a very loose sense. I am very pleased to hear that critical will go away, but that doesn't help us now. However, there's hope. MenTaLguY has recently taken on the challenge of implementing a fast Mutex for both the C and Java versions of Ruby (MRI and JRuby, respectively). As I understand it, the current C implementation he's built exceeds even low-level Thread.critical performance, and we both agree that a Java version should be extremely easy to construct using Java's built-in synchronization capabilities. A fast Mutex is the first step toward moving people off Thread.critical...and saving me doing yet another doomed reimplementation of it in JRuby.

Other news:

Tom and I spoke with some folks from the HotSpot VM team last night, and it was an extremely helpful discussion. We talked about Ruby's language design and quirkier features, the future of dynlangs on the VM, compilation and optimization strategies for dynlangs, and the current roadmap for JRuby development. Bottom line: everything we're doing is right, and if we keep on this course we'll rapidly approach their notion of an optimal Ruby implementation. We also agreed there's very little about Ruby that couldn't be compiled straight down to Java code. It was great vindication to hear that our "best guess" strategies for slowly redesigning, refactoring, and compiling JRuby are all on the right track. It was also great to hear that our confidence in the JVM has not been misplaced: it IS an excellent VM for dynlangs, and JRuby should eventually perform extremely well. The future of Ruby on the JVM is looking great.

I will be presenting JRuby again today at the Twin Cities Code Camp, as one of the few Java-based presentations (the rest being primarily .NET-related). I guess that's all there is to say about that...it's going to be a condensed version of the Gateway JUG presentation with fewer walkthroughs and a much shorter overall time.

Tom and I are also scheduling our trip to Europe in December. We'll be in Prague from the 5th to the 9th to meet with the NetBeans development team; in Antwerp the following week for JavaPolis; and in Rotterdam on the 19th for my JRuby presentation at Finalist. It remains to be seen if I'll spend some of the holidays in Europe or if my wife will join me, but if you'd like to propose any speaking engagements that could keep me in the Old Country, certainly let me know :)

Thursday, November 9, 2006

It doesn't look like the poll has been hit too hard, but JRuby seems to score much better than I would have expected, given our relatively recent entry into the public eye. Groovy has garnered about twice as many votes, but of course it's been very public and available for several years.

Also interesting are the poor scores for Jython and Rhino. Jython is a great implementation, but it's pretty far behind at this point. I wish there were more resources to pour into it...perhaps soon. Rhino is also interesting...only four votes so far, even though it's really fast and being included in Java 6.

BTW, please don't go stuffing the ballots for JRuby or anything. I just figured I'd get this poll some exposure to see how things play out. It already looks really good for JRuby, and if you want to vote for Groovy or any of the others, be my guest.

Java 6 is really an incredible piece of software. Once again the Sun engineers have boosted performance--in some JRuby benchmarks by as much as 20%. Beyond that, they've include the first round of native platform scripting support in the form of JSR 223, a scripting API intended to replace and improve upon the Bean Scripting Framework. Java 6 also includes native support for Javascript in the form of Rhino, just about the fastest mainstream JavaScript engine available.

Honestly, give Java 6 a shot. I've been using it constantly because of the performance gains, and JRuby support for 223 is already available. This is the first Java release during my time at Sun, and I'm extremely proud to be a part of this team.

Last night I gave my 2+ hour talk on Ruby, JRuby, and JRuby on Rails to the Gateway JUG in St Louis, and it seems to have been a resounding success. We had extremely good attendance, pushing 50 folks from what I could tell, and everyone seemed to be very excited for the possibilities of Ruby on the Java platform. Many folks told me they were going to be looking into Ruby and Rails for future development, and others promised to contribute to the project however possible.

It was a great experiment for me to do this presentation, primarily because it was one long demo. There were a few slides to bridge things together, but ultimately I spent two hours typing into IRB, vi, and bash to demonstrate Ruby, JRuby, and building and running a simple JRuby on Rails app. There were a few glitches (I forgot a few metaprogramming methods, and my new migration initially failed because I ran it on an already-migrated database) but I managed to recover from everything and get all my demos across. I think the live walkthroughs coupled with a very enthusiastic and interactive crowd made a fun presentation for folks in attendance.

Afterwards I had a few beers with the locals and we shared our war stories about Java application development and excitement for a Ruby-filled future. Judging by their reactions to the talk and the stories they related, I think this JRuby thing is poised to really take off.

I've uploaded my Gateway JUG JRuby slides so you can see and share them, and Alex Miller from BEA took a transcript of the JRuby live walkthroughs.

Friday, November 3, 2006

My former employer, Ventera Corporation, is still looking to fill my old position. I was their lead/senior Java EE architect, in charge of development of EE applications for their USDA Food and Nutrition Service contracts. I worked at FNS's downtown Minneapolis office, though a large part of the development team resides in Virginia at Ventera HQ.

Update: BTW, I'll be interviewing you, if that's any motivation for or against you applying...

Wednesday, November 1, 2006

Well the arrangements have all been made for my trip down to St. Louis for the November Gateway JUG meeting. I'll be presenting the usual fare, JRuby and JRuby on Rails stuff, except this time there's a few twists.

Twist One: I'll be presenting for 2+ hours

Yes, this one has me just a bit nervous...not because I think I'll have a problem presenting for that long or because I don't think I'll have enough material, but because I want to keep people awake the whole time. I've been told I'm a pretty good presenter, but even the best presentations can drag on after a while. Which leads me to the second twist:

Twist Two: I'll do 80-90% of the presentation manually

Yes, that's right...I'm going to be using IRB, command-line tools, IDEs, browsers, etc for most of the presentation. I think I've got my typing speed up to a comfortable enough level to keep people interested, and it's quite a bit more fun to watch things happen live than to see pictures or read slides. Plus it will be a unique challenge to do what basically amounts to a two-hour demo. You know me and challenges.

And a rough overview of the talk (yes, this is the short, short list. I've got two pages for the long list):

Intro to Ruby and JRuby

Interactive demo of Ruby's major features

What JRuby adds

Interactive demo of JRuby

Break

Intro to Ruby on Rails

What JRuby adds

Building a simple Rails app

JRuby status and future

Conclusion

If you're going to be in the area on November 7, come watch me battle demo gremlins for two solid hours. It should be rife with action and adventure, love and laughs, guts and glory. Or at least you'll get to see both a JRuby and a JRuby on Rails presentation in one fun-filled night for free. That's worth it, no?Update: There's an event flyer now as well.

Tuesday, October 31, 2006

I spent today hacking on the Ruby to Java compiler and made some good progress. Here's the highlights:

It now parses a full script rather than just single method bodies.

The toplevel of the script is given its own method, and defined methods get theirs.

It uses MultiStub to implement the methods, so it will be faster than reflection.

It's more aware of incoming arguments, rather than assuming a single argument as in the previous revision.

It's generating a bit faster code, maybe 5-10% improvement.

MultiStub is our way of implementing many relocatable methods without generating a class-per-method or using reflection. We're using it today in our Enumerable implementation, and it works very well. Some numbers comparing our two reflection-based method binding techniques with MultiStub:

So simply switching to the MultiStub trims off around 20% for this benchmark. Removing the time to actually do 10M invocations of the block it comes closer to the 30% range. We're looking to start using MultiStub more in core classes. Anyway, back on topic...

What still needs to be done on the compiler:

I didn't implement any additional nodes, so it only handles perhaps 20% of them.

The toplevel method should define the contained methods as they're encountered. I'm wiring them all up manually in the test script right now.

It doesn't have any smarts for binding Ruby method names to the generated MultiStub methods yet.

It's a big leap in the right direction though, since you can pass it a script and it will try to compile the whole thing to Java code. Here's the results of the recursive fib benchmark with the new compiler (calculating fib(30)):

Time for bi-recursive, interpreted: 14.859Time for bi-recursive, compiled: 9.825

Ruby 1.8.5:

Time for bi-recursive, interpreted: 1.677

This was in the mid 10-second range previously, so this is the first time we've dropped below 10 seconds. This puts the compiled code around 6x as slow as Ruby for this benchmark, which is very method-call intensive. Still, it's a solid 33% improvement over the interpreted version...probably an even larger percentage improvement if we don't count method-call overhead. Now on to iterative results, which are very light on interpretation (calculating fib(500000)):

Time for iterative, interpreted: 58.681Time for iterative, compiled: 58.345

JRuby sans ObjectSpace support:

Time for iterative, interpreted: 47.638Time for iterative, compiled: 47.563

Ruby 1.8.5:

Time for iterative, interpreted: 50.770461

For the iterative benchmark we're still about on par with (or around 20% slower than) Ruby because there's no interpretation involved and Java's BigInteger is faster than Ruby's Bignum. When ObjectSpace is turned off (it's pure overhead for us), the iterative version runs faster in JRuby. Once we eliminate some method overhead, things should improve more.

Larry the Liquid has also posted a JRuby of Second Life Recap that gives readers a good idea what it was like to "be there" and has a few photos to show it off. I'm the one standing down by the screen, and though you can't really tell, I'm wearing a JRuby t-shirt.

If only it were as cheap and easy to print t-shirts in Real Life as it is in Second Life...

Thursday, October 26, 2006

As most of you know, I was recently at RubyConf to see the latest developments in the Ruby world and to meet up with others interested in Ruby's future. A lot's happened in the past year, and Ruby has become the big ticket language in many circles. Developers from all parts of the world are sitting up and taking notice of the little gem that could, and those of us who love the language couldn't be happier about it.

This was my third RubyConf. I have attended since 2004, when I first came to Ruby as a rank newb. Immediately after arriving at RubyConf 2004 I started poking around for a Ruby implementation on the JVM. That brought me to JRuby, where my old friend and coworker had taken the reins. The rest you know...and it's not hard to see how Ruby has completely changed my life.

RubyConf this year was an outstanding collection of presentations and personalities. The applicability of the presentations seemed considerably higher than in previous years. Where last year many presentations were demonstrations of "this cool app I wrote" or explorations of problem domains I'll never approach, this year's topics seemed to cover whole subdomains of the programming ecosystem. They ranged from alternative Ruby implementations (two total plus two bridges), to natural language processing and generation (at least two I can think of), to Unicode and i18n (Tim Bray's excellent presentation and many discussions that followed), to networking, testing, graphics, and more. A whirlwind tour of damn near everything you might want to do with a programming language.

Beyond the standard conference fare, there was RejectConf, an ad-hoc sub-conference for all those who had their presentations rejected (or who missed deadlines, like me). It was a multi-hour flurry of 5-15 minute presentations on all those OTHER topics not covered in the main track. I gave a five-minute demo of JRuby's growing capabilities and NetBeans' nascent Ruby refactoring features--to many oohs and ahhs and even one f-word.

And then there was the hallway track, by far the most interesting part of the conference for me this year. I joined with several other Ruby implementers and interested parties for a full-on "implementers summit" Friday night. I met up with Ruby dignitaries, JRuby enthusiasts, and late-night hackers. I shared my pains implementing Ruby with others and had a good cry over some of Ruby's trickier-to-implement features. It was eating, sleeping, and breathing Ruby for three solid days. And now I'm home again.

I will here call out the more interesting presentations, discussions, and developments from my RubyConf 2006 adventure. I hope you will find these events as interesting and exciting as I did.

The Presentations

I won't pretend I loved every single presentation. I was bored by several and fell asleep during one. But there were some real gems this year, and they show the face of Ruby to come.

TAKAHASHI Masayoshi -- The History of Ruby

Takahashi-san kicked off the conference perfectly with his presentation on the history of all things Ruby. From Ruby's Pre-Historic Age in the mid-90's to the pre-Rails Modern Age, all the important facts and dates were delivered in Takahashi's trademark style. Everyone in the room learned something about Ruby's past and the long road it has traveled, including the rational for why Ruby came into existence in the first place and the name-that-almost-was: Coral.

Evan Phoenix -- Sydney and Rubinius: Hardcore Ruby

Evan is the creator of the semi-controversial "Sydney Ruby", an ambitious attempt to bring full native threading to Ruby 1.8. Sydney's drastic code changes prevented it from ever being merged into MRI, but the experience led Evan to attempt a more ambitious project: Rubinius.

Rubinius is, simply put, a Smalltalk-like self-bootstrapping VM for Ruby. A small microkernel provides memory management, IO, and other native interfaces, and then actual VM features are all implemented in a customized subset of Ruby. The resulting VM can then load in full-featured Ruby code and execute it like the existing interpreters do today.

Evan and I had talked at length online about his plans for Rubinius, and we both discovered we weren't alone in understanding some of the darker corners of Ruby's C implementation. Rubinius is a project to watch.

This was my first time seeing Zed present, and I definitely agree he's very entertaining. Zed was at RubyConf to talk about RFuzz, a library for fuzzing HTTP services to test for security or stability issues. Fuzzing, for those of you who don't know, is generating a lot of random or pseudo-random noise (in this case, HTTP requests) and brutalizing some target service. Zed described how under fuzzed loads, many servers and libraries previously thought to be rock-solid crumble quickly and painfully. It seems to be in our nature to assume that all clients of our code will follow the rules we expect them to, even though we consciously know they won't. Fuzzing helps remind us.

It was also my first experience having Zed read my mind. Toward the end of his presentation, I raised my hand to ask about the possibility of using fuzzing against language implementations...to test parsers for robustness. Before I had a chance to ask, Zed talked about working on exactly that. He's planning to pull out the HTTP bits of RFuzz so it can be used as a general-purpose fuzzing tool, and being able to fuzz Ruby is already on his mind.

Saturday night Nick Sieger and I stayed up late hacking on stuff with Evan, Eric Hodel, Zed Shaw, and others. I had a chat with Zed about Mongrel-Java and we made some plans for how the future might play out. We agreed it would be best to just focus on getting 0.4 to work well and to explore the new Java support in Ragel. Zed also agreed it would be a good idea to provide a Mongrel gem that would work in JRuby. When Zed's ready to release 0.4, it's very likely there will be a new platform option: Java.

John Long - Radiant -- Content Managment Simplified

I mention Radiant not because it was a particularly stunning presentation or a particularly innovative application, but because it looked so polished...so businesslike. Other folks weren't terribly impressed, perhaps because it wasn't a lot of sound and fury. I was impressed for exactly the same reason: it's apps like this that are going to legitimize Ruby and Rails for prime-time business applications. We'll have to make an effort to get Radiant running in JRuby.

Tim Bray - I18n, M17n, Unicode, and all that

This was my second time watching Tim present. He does a great job of conveying a large amount of information in a very digestable way. His presentation helped everyone in the room understand a bit better how Unicode works, why it's important, and what Ruby and other languages can or should do to support it. I learned quite a bit myself.

Of particular interest were related events peripheral to the talk. Matz apparently was worried about what Tim might say, since so many folks are worried about Unicode support in Ruby and questions have been raised about Matz's m17n plan for Ruby 2.0. I think Tim was very equitable, however, laying out the facts and allowing Matz and everyone else to draw their own conclusions. Hopefully the community and the Ruby implementers will all take what they learned and run with it.

SASADA Koichi - YARV: on Rails?

I was pleasantly surprised by Koichi's progress on YARV. Getting Rails to run on an alternative implementation is no small feat, but that's exactly what Koichi demoed. Sure, it was just a simple scaffolded app, and much of Ruby's internals are still the same code under YARV, but it's excellent to see YARV starting to run real apps.

I was a bit disappointed, however, with the plan for native thread support in YARV. According to Koichi, the difficulty of native-threading Ruby will prevent full parallel thread support any time soon. Even when Ruby 1.9.1 with YARV merged is released in December 2007 (yes, that's right, over a year from now) it will only support one native thread running at a time. I understand Koichi has been given an almost impossible task, but I'd hoped that a year from now native threading would be real and robust.

Koichi did announce, however, that he would now officially have a fulltime job where one of his tasks is to work on YARV. He's got an office in Akihabara, and he sounded pretty excited about the future of his work. I talked with him a few times during the conference, and I think we're going to try to implement a YARV machine in JRuby. It seems to make sense, since Koichi has done much of the hard work determining an appropriate set of bytecodes and has a compiler already mostly working. Koichi is also excited about that possibility.

John Lam - You got your Ruby in my CLR!

John Lam is the creator of the RubyCLR project, a bridge from the Ruby runtime to the .NET CLR. He demonstrated how you can call CLR-hosted code from Ruby and implement interfaces that call back our of the CLR into Ruby code, much like JRuby supports today. He also demonstrated a very pretty Avalon-based Ruby console that works like IRB but with on-screen documentation and Intelli-Sense popup method completion.

John announced that he's been hired by Microsoft to work on their dynamic language initiatives--especially Ruby--so it seems he will be my counterpart there come January. john promised not to do evil, and he and I agreed we should not allow our differing employers to come in the way of providing world-class support for Ruby. I hope politics won't get in the way of innovation and creativity, as it so often does...and I believe that John agrees.

--

I'll check back in my next post with a summary of the implementer's summit and other excitement from the conference. This post is a bit long already :)

Wednesday, October 18, 2006

I should have tossed this in a blog entry earlier, but perhaps it's not too late.

I will be presenting JRuby to the Rubyists of Second Life this evening at 6PDT. Here's the announcement sent to Ruby-Talk, which has all the relevant info:JRuby at Rubyists of Second Life

I hope you'll stop by and join in the conversation. I'll have a set of slides, but more importantly I'll be available for discussions on JRuby, Ruby, Java, and their future together. I'm "Headius Exodus" in SL. See you there!

Although the numbers he shows for WEBrick seem *awfully* slow (4-6 seconds per request...we haven't been that slow since JavaOne running Rails in "development" mode) what's most interesting is the speed gains he gets from AsyncWeb: something like a five times improvement. If we assume there would be an improvement running a more recent JRuby (0.9.1 should be considerably faster than all previous versions) and running Rails in production mode...well, this thing starts to look real-world ready.

He has a link to a snapshot...I'm investigating that now and will update this post when I know more.

See also Takai's post about about JRuby on Rails running under AsyncWeb:

These sorts of things show real promise...AsyncWeb scales extremely well and is built upon Java's NIO library. A new contender enters the Rails front-ending competition!

Update: Ok, I've spent five minutes looking at the code, and it's cooler than I thought. He's got AsyncWeb and JRuby on Rails wired together using Spring, and it's a trivial amount of code to do it...like less code than WEBrick. I hope he's able to get a release out for this soon.

Friday, October 6, 2006

It's been approximately a year since I launched into my first redesign of the JRuby interpreter engine, and a lot has happened since then. We've gotten Rails working as well as improving compatibility to the point where most other pure Ruby apps just work. We've filled out the base set of extensions and have active projects to complete what's left. We've built up a very active and intelligent community, all of whom are helping by contributing bug reports and patches. And perhaps most importantly for the project, Tom and I have been given the privilege of working on JRuby full time.

But none of that was because of the interpreter.

JRuby Interpreter History 101

The original JRuby interpreter, written by the original authors (Jan Arne Petersen and others), followed a pretty standard Visitor pattern. The AST nodes were visited in turn, and on each callback appropriate actions were performed. It was a perfect demonstration of Visitor in action, but the additional calls required to visit each node in turn badly impacted the Java call stack. At its lowest, before I began my work, JRuby on Windows under Java 1.4.2 could recurse no deeper than 300 levels in a simple single-recursive fib algorithm. And this while Ruby itself could go into the thousands.

However stack depth alone wasn't a good enough reason to work on a new interpreter. There were various ways we could clean up the stack size without a complete rework. Unfortunately, there were two Ruby features that at the time meant a new interpreter would be necessary: green threads and continuations.

Green Rubies

Ruby currently supports only green threading. The threading model is very simple; one native thread, the main process, pumps the entire interpreter. Once threads come into play, a signal or timer is set up using OS-specific calls. A new thread means a new set of global variables; variables that point into the thread's call stack, variable scopes, and current execution point. The signal/timer, once initiated, triggers a potential thread switch every 10ms by setting a global switch. On certain boundaries during execution, that switch is checked, and if it is set a call is made into the thread scheduler to set up the next thread for execution.

In short, a 10ms timeslicing green thread scheduler. And I believe that thread context switching is *voluntary*...it would be possible in a C extension or through specific Ruby scripts to prevent context switching at all.

But I digress. Green threading does carry with it some benefits. Because you are not dependent on the operating system for initialization and cleanup of thread data, you can manipulate threads much more easily. Ruby supports operations that native threading libraries forbid, like killing existing threads or stopping them indefinitely. Now it would not be such a problem if these operations were used sparingly, but during our testing we found that many libraries depend on them. So then we had a dilemma: how to support unsafe threading operations with safe native threads?

The initial solution, and the one we currently have in place, works reasonably well but could certainly be improved. On the same boundaries that the C implementation checks for a context switch, we check a set of runtime-global variables and locks. Various "unsafe" thread events trigger those locks and variables to change state. Kill sets a kill flag; upon encountering it the native JRuby thread will throw a ThreadKill exception. Entering a critical section sets a critical flag on all threads; when they reach it, they'll freeze in their tracks until the critical thread has completed. For the most part, the Ruby code behaves as it does in MRI, though backed by native threads. It's not foolproof, but it's sneaky and good enough for now. Emulation and simulation. Subterfuge and legerdemain.

A New Hope

When I launched into my redesign a year ago, I hoped to enable a different solution. I wanted to build a purely stackless Ruby interpreter that could support continuations and green threads. Because it would still run on top of a native-threaded VM, I planned to also support m:n threading, so JRuby could scale through a full range of threading options. And even though it was a very large task, the basic building blocks were already necessary: continuations would not be possible while deepening the Java stack (without really nasty tricks), and so green threading came along for the ride.

The first step was to make the interpreter stackless. This I mostly achieved by hacking the existing visitor into individual pieces. Each method that had previously been called during visitation was encapsulated in an Instruction implementation. The state associated with the interpreter was put into an EvaluationState object. Each instruction, when called, would receive an instance of EvaluationState and an InstructionContext that was usually just the AST node being traversed. The visitor now became a factory for Instructions; as it visited the AST nodes, instructions were pushed down into an instruction stack in the EvaluationState object; the previous deepening of the Java stack being emulated via a soft stack. As traversal went deeper, both the visitor and the instructions themselves would push more instructions down onto the stack. Execution meant popping the top instruction, running it, and returning back to the interpreter loop. An instruction could affect the state of the runtime or change the flow of execution by conditionally pushing further instructions on the stack. It was a fairly clever way to directly translate the old code into a stackless design.

As of this past spring, perhaps 80% of the AST nodes were fully represented with what I called "collapsed" instructions. This basically meant that for much of the AST, execution could proceed without the Java stack deepening. The tradeoff was in the additional overhead of maintaining a soft stack in the EvaluationState and a considerably more complicated interpreter design. But it worked, and after several months of tweaking it was faster than the original. Even better, it greatly improved our maximum stack depth.

The next step, if had we taken it, was to begin the painful task of making the rest of the interpreter stackless. The design I had in mind would have called for all Java-implemented methods to trampoline; in other words, calling a method would mean that whether it made an additional call or returned, control came back to the caller. The caller, in that design, would have been the overarching ThreadContext. Execution of a thread would begin by pointing a new ThreadContext at a starting node in the AST. As the thread executed, the stackless interpreter would maintain the Java stack depth at the ThreadContext level. Calls into Java-based methods would dispatch back to ThreadContext for deeper calls, so that any deepening of the Ruby stack did not actually deepen the Java stack. At any given time, ThreadContext could be frozen; the native thread pumping it could then be passed off to another ThreadContext, or used for other purposes. And of course, continuations would have been trivial to implement with this model, since the current running state of a ThreadContext could be saved at any time.

Whew! It was an ambitious goal, but with the mostly-stackless interpreter in place we were well on the way. I even had designs for how to compile Ruby code into stackless Java methods...so stackless that you could even pull off a continuation in the middle of a call. Like magic!

A Fork in the Road

So what happened? Sometime around JavaOne we heard about the Ruby KaiGi in Japan, a Ruby conference or get-together of some sort. If RubyConf is the big conference for us Westerners, this at least provided a mid-year update for English-speaking Rubyists. Matz was there, Koichi was there, and I believe other Ruby dignitaries made the trip as well.

And then Matz and Koichi dropped the bomb: Ruby 2.0 would support neither continuations nor green threads.

As many of you know, Koichi's next-generation Ruby interpreter engine, YARV, has been coming along in fits and starts. Because of the desire to keep existing extensions working and because of a lack of resources to work on Ruby internals, YARV has had many difficult problems to overcome. How do you write a next-generation interpreter without actually changing how the runtime works? Is it possible to do it without a new threading implementation, a new memory manager, a new garbage collector? Can you keep all your internal APIs exposed to C extensions and successfully migrate to a new interpreter design? I think the answer is that yes, you can, but it's really, really hard. And along with the announcement about continuations and green threads came the YARV/Rite (Ruby 2.0 + YARV) beta timeline: no earlier than Christmas 2007.

So we were faced with a decision. Do we continue along the much more complicated path toward green threads and continuations, knowing that they will eventually be unsupported by the language? Is it worth the effort to support them now if they'll go away?

We stewed over this decision for a week or two. We had been making all the right moves toward a stackless design, and were on the cusp of launching into the next set of battles. But it was a painful process.

Pragmatism

Eventually we decided that under the circumstances, the Ruby and JRuby communities would be better served by having a solid, native-threaded, continuation-free implementation on the JVM. We were unable to find any use of continuations in the most popular applications, and it seemed that very few people had a good reason to use them. In addition, there was growing concern over Ruby's lack of support for native threads; so there too was a opportunity for JRuby to shine.

Very little of the interpreter changed over the summer. We worked furiously in our spare hours getting Rails running, and were able to present it in a primitive form at JavaOne. I started work on a number of traditional compiler designs for JRuby that showed potential gains of 50-75% over the existing code. And most importantly, our community, compatibility, and test library all began to grow rapidly.

Toward the end of the summer, it became likely that we would join Sun Microsystems as full-time JRuby developers. Sun had wisely taken an interest in Ruby along with other scripting languages, and because we had shown some of the potential for Ruby on the JVM, they asked us to join their team. And just four weeks ago, I became a Sun employee.

JRuby Now and Into the Future

So where has all this led? JRuby has been getting more and more attention from folks within Sun, Rubyists around the world, and especially from Java developers anxious to escape from their Java-only prisons. Our compatibility is increasing faster than before; we've had over a hundred new bugs reported in the past few weeks...almost all of them with community-contributed patches. We have added our first non-Sun team member Ola Bini, a star of the JRuby community who has proven his dedication to making Ruby on the JVM succeed. And we have started to solidify our short-term goals for the project.

The primary goal remains the same: JRuby should be as close to 100% compatible with Ruby 1.8 as possible. Today, we are doing extremely well in this department. Rails generally works without issues, and most pure Ruby applications run without modification. There's still plenty of edge cases to iron out, but we're moving very rapidly now. Getting Rails to work as well as it does is already a major achievement.

We have also started to iron out what a JRuby 1.0 release should look like. A few major points come up again and again:

Compatibility should be such that we can safely claim "Rails is supported"

Java integration should look like we want it to look for the future, and should be performant, lightweight, and seamless

All major codebase refactorings should be complete; this includes a solid design for wiring up Java-based method implementations, external extensions, and IO channels

Unicode should be supported out-of-the-box, giving Ruby code access to everything Java is capable of

Threading should work perfectly, both for JRuby-launched threads and for "adopted" threads from outside the runtime

Performance should be markedly improved

This last bullet may sound a little less ambitious than the others. Performance has been a major concern of mine, but it's also the most difficult goal to quantify. If JRuby runs Rails almost as fast as Ruby, is that enough? Well, perhaps, but Rails is a very IO-intensive application, so it's not a particularly good measure of core JRuby performance. If JRuby is many times slower than Ruby for certain command-line tools, should a release be delayed? Perhaps, but it's not so cut-and-dried; a JRuby that's slow on the command-line may perform exceedingly well once hosted on a long-running server.

There is one piece of the puzzle, however, that performance neatly excludes: vastly complicated stackless green-threaded continuable interpreter engines.

Late-Night Hacking

We had basically made the decision not to continue down the green threading path this summer, but we had made no effort to reverse the work I had already done. We always suspected that a more traditional engine would perform better, but attempting another redesign was a very large task nobody wanted to tackle.

Last night I got bored of hunting around for juicy performance tidbits. I wanted to tackle something bigger.

About 9:00PM on Thursday, I started rewriting the interpreter engine to be a more straightforward switch-based affair. Instead of Instruction stacks and EvaluationState, or even Visitors, it would be a fast, concise switch statement, recursing as the AST deepened and affecting state changes along the way. I felt a certain sadness ripping up the old interpreter; I had spent weeks on that code last year, and was proud of what I'd accomplished. It's no small feat to turn a recursive, visitor-based interpreter into a stackless, instruction-based machine. Unfortunately, I knew that design would never serve JRuby the way we needed it to. It was designed for another purpose, and its complexity and cost gained us very little in the long term. And as of 9:00PM Friday, after working furiously for 24 hours, it was gone.

The new interpreter is, as I stated, a large switch statement. Each AST node is assigned an int, so the main "eval" call can quickly determine the correct code to execute for each. The new design does result in faster deepening of the Java stack (about a 15-20% hit to fib max-depth), but it still performs far better than the original visitor-based implementation. And although we'd only theorized up to this point, it does perform a good bit better than the Instruction-based engine.

Rake Installation Performance

Pretty much the first thing anyone does with Ruby is install some application or tool they need. And nine times out of ten, they install it with RubyGems.

One of our favorite benchmarks is a local RubyGems install of Rake, Ruby's answer to make. For reasons which are not yet clear, the documentation-generator RDoc--which is called as part of a RubyGem installation--performs extremely poorly under JRuby. RDoc is basically implemented as a series of source-code parsers that generate documentation based on special comment tags embedded in the code. And it's written in pure Ruby, so it's a good benchmark for JRuby.

We've been steadily improving performance. The following sets of numbers show our progress, all the way up through teh new interpreter:

The numbers above spell it out pretty clearly; we've managed to double performance since the 0.9.0 release at the beginning of the summer. We've also managed to do that without writing a compiler and with a vast number of optimizations still on the table. Things are looking very good. Among future performance-enhancing changes:

Reducing the cost of method calls. Currently a given method call generates an absurd amount of transient objects, like arrays copied into ArrayLists unwrapped into arrays, again and again. We must clean up the call path to minimize object churn.

Building more pre-optimization into static structures in the system. Many areas of the system are cloned or regenerated again and again without ever changing. Blocks, for example, have only 5 mutable fields...but we clone the entire block for every invocation. By saving off the static bits of the system once, we lower the cost of all operations.

ThreadContext is still alive and well; it has always represented the Ruby thread within the JRuby runtime, holding stacks for framing calls, scoping variables, and managing blocks. Unfortunately, the only way to access it is through a moderately expensive threadlocal call, and believe me we hit that call hard. Part of the new interpreter design helps limit those calls; further work in the rest of the runtime will help eliminate them.

The bytecode compiler *will* happen. It's been on hold primarily because the runtime is still evolving. Because even compiled Ruby code will likely still have ties to the JRuby runtime, we need to first iron out how the runtime should look and act long-term.

I've had a great time this past year working on JRuby, and things are really starting to pick up speed. I'm sure the next year's going to be even crazier...especially once JRuby 1.0 comes around the corner.

Tuesday, October 3, 2006

We've added a cleaner, simpler syntax to JRuby for including Java classes into a given namespace. It stemmed from a response of mine to a user on the JRuby mailing lists, and has been committed to trunk already. Almost everyone on the list agreed this new syntax is better.