Thursday, September 27, 2007

Tom and I have been traveling in Europe the past two weeks, first for RailsConf EU in Berlin and currently in Århus, Denmark for JAOO (which was an excellent conference, I highly recommend it). And usually, that would mean getting very little work done...but time is short, and I've been putting in full days for almost the entire trip.

Let's recap my compiler-related commits made on the road:

r4330, Sep 16: ClassVarDeclNode and RescueNode compiling; all tests pass...we're in the home stretch now!

r4339, Sep 17: Fix for return in rescue not bubbling all the way out of the synthetic method generated to wrap it.

r4341, Sep 17: Adding super compilation, disabled until the whole findImplementers thing is tidied up in the generated code.

r4342, Sep 17: Enable super compilation! I had to disable an assertion to do it, but it doesn't seem to hurt things and I have a big fixme on it.

r4355, Sep 19: zsuper is getting closer, but still not enabled.

r4362, Sep 20: Enabled a number of additional flow-control syntax within ensure bodies, and def within blocks.

r4363, Sep 20: Re-disabling break in ensure; it caused problems for Rails and needs to be investigated in more depth.

r4367, Sep 21: Removing the overhead of constructing ISourcePosition objects for every line in the compiler; I moved construction to be a one-time cost and perf numbers went back to where they were before.

r4368, Sep 21: Small optz for literal string perf and memory use: cache a single bytelist per literal in the compiled class and share it across all literal strings constructed.

r4392, Sep 25: Compilation of break within ensurified sections; basically just do a normal breakjump instead of Java jumps

r4400, Sep 25: Fix for JRUBY-1388, plus an additional fix where it wasn't scoping constants in the right module.

r4401, Sep 25: Retry compilation!

r4402, Sep 26: Multiple additional cleanups, fixes, to the compiler; expand stack-based methods to include those with opt/rest/block args, fix a problem with redo/next in ensure/rescue; fix an issue in the ASTInspector not inspecting opt arg values; shrink the generated bytecode by offloading to CompilerHelpers in a few places. Ruby stdlib now compiles completely. Yay!

r4405, Sep 26: A few additional fixes for rescue method names and reduced size for the pre-allocated calladapters, strings, and positions.

r4410, Sep 27: A number of additional fixes for the compiler to remedy inconsistent stack issues, and a whole slew of work to make apps run correctly with AOT-compiled stdlib. Very close to "complete" in my eyes.

r4412, Sep 27: Fixes to top-level scoping for AOT-compiled methods, loading sequence, and some minor compiler tweaks to make rubygems start up and run correctly with AOT-compiled stdlib.

r4413, Sep 27: Fixed the last known bug in the compiler. It is now complete.

r4414, Sep 27: Ok, now the compiler is REALLY complete. I forgot about BEGIN and END nodes. The only remaining node that doesn't compile is OptN, which we won't put in the compiled output (we'll just wrap execution of scripts with the appropriate logic). It's a good day to be alive!

I think I've done a decent job proving you can get some serious work done on the road, even while preparing two talks and hob-nobbing with fellow geeks. But of course this is an enormous milestone for JRuby in general.

For the first time ever, there is a complete, fully-functional Ruby 1.8 compiler. There have been other compilers announced that were able to handle all Ruby syntax, and perhaps even compile the entire standard library. But they have never gotten to what in my eyes is really "complete": being able to dump the stdlib .rb files and continue running nontrivial applications like IRB or RubyGems. I think I'm allowed to be a little proud of that accomplishment. JRuby has the first complete and functional 1.8-semantics compiler. That's pretty cool.

What's even more cool is that this has all been accomplished while keeping a fully-functional interpreter working in concert. We've even made great strides in speeding up interpreted mode to almost as fast as the C implementation of Ruby 1.8, and we still have more ideas. So for the first time, there's a mixed-mode Ruby runtime that can run interpreted, compiled, or both at the same time. Doubly cool. This also means that we don't have to pay a massive compilation cost for 'eval' and friends, and that we can be deployed in a security-restricted environment where runtime code-generation is forbidden.

I will try to prepare a document soon about the design of the compiler, the decisions made, and what the future holds. But for now, I have at least one teaser for you to chew on: there is a second compiler in the works, this time for creating real Java classes you can construct and invoke directly from Java-land. Yes, you heard me.

Compiler #2

Compiler #2 will basically take a Ruby class in a given file (or multiple Ruby classes, if you so choose) and generate a normal Java type. This type will look and feel like any other Java class:

You can instantiate it with a normal new MyClass(arg1, arg2) from Java code

You can invoke all its methods with normal Java invocations

You can extend it with your own Java classes

The basic idea behind this compiler is to take all the visible signatures in a Ruby class definition, as seen during a quick walk through the code, and turn them into Java signatures on a normal class. Behind the scenes, those signatures will just dynamically invoke the named method, passing arguments through as normal. So for example, a piece of Ruby code like this:

It's a pretty trivial amount of code to generate, but it completes that "last mile" of Java integration, being directly callable from Java and directly integrated into Java type hierarchies. Triply cool?
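As a rough illustration of that signature walk, here is a minimal Ruby sketch that reflects over a class and prints Object-typed Java stubs for its public methods. It is only a guess at the shape of the output, not the compiler itself: `MyClass`, `java_stubs`, and the `callMethod` forwarding call are all names invented for this example.

```ruby
# Hypothetical Ruby class standing in for the kind of input Compiler #2 would see.
class MyClass
  def initialize(name, count)
    @name = name
    @count = count
  end

  def greet(other)
    "#{@name} greets #{other}"
  end

  def total
    @count
  end
end

# Build one Java method stub (as a string) per public instance method defined
# directly in the class. Every parameter and return value is typed as Object,
# and the body only forwards to a dynamic call. `callMethod` is a placeholder
# name for that dynamic dispatch, not a confirmed JRuby API.
def java_stubs(klass)
  klass.public_instance_methods(false).sort.map do |name|
    params = klass.instance_method(name).parameters.map { |_kind, param| param }
    decl = params.map { |param| "Object #{param}" }.join(", ")
    args = (["\"#{name}\""] + params).join(", ")
    "public Object #{name}(#{decl}) { return callMethod(#{args}); }"
  end
end

puts java_stubs(MyClass)
```

The sketch uses runtime reflection for brevity; the approach described above would read the same information from the parsed source instead of loading the class.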

Of course the use of Object everywhere is somewhat less than ideal, so I've been thinking through implementation-independent ways to specify signatures for Ruby methods. The requirement in my mind is that the same code can run in JRuby and any other Ruby without modification, but in JRuby it will gain additional static type signatures for calls from Java. The syntax I'm kicking around right now looks something like this:

If you're unfamiliar with it, this is basically just a literal hash syntax. The return type, String, is associated with the types of the two method arguments, Integer and Array. In any normal Ruby implementation, this line would be executed, a hash constructed, and execution would proceed with the hash likely getting quickly garbage collected. However Compiler #2 would encounter these lines in the class body and use them to create method signatures like this:

public String mymethod(int num, List ary) { ... }
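To make the mapping concrete, here is a small Ruby sketch under stated assumptions: the annotation is taken to be a hash of the form return type => [argument types], and the `JAVA_TYPES` table and `java_signature` helper are hypothetical names made up for this example. In any Ruby, the bare hash in the class body is simply evaluated and thrown away.

```ruby
# Hypothetical mapping from Ruby classes to Java type names.
JAVA_TYPES = { String => "String", Integer => "int", Array => "List" }

class Example
  # Assumed annotation form: return type => [argument types].
  # Plain Ruby evaluates this Hash literal and discards it.
  { String => [Integer, Array] }
  def mymethod(num, ary)
    "#{num}: #{ary.join(',')}"
  end
end

# Combine an annotation hash with a method's parameter names to produce
# the text of a Java method signature.
def java_signature(method_name, param_names, signature)
  return_type, arg_types = signature.first
  params = arg_types.zip(param_names).map do |type, name|
    "#{JAVA_TYPES.fetch(type)} #{name}"
  end
  "public #{JAVA_TYPES.fetch(return_type)} #{method_name}(#{params.join(', ')})"
end

names = Example.instance_method(:mymethod).parameters.map { |_kind, name| name }
puts java_signature(:mymethod, names, { String => [Integer, Array] })
```

The class still behaves identically with or without the annotation line, which is the property the requirement above asks for.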

The final syntax is of course open for debate, but I can assure you this compiler will be far easier to write than the general compiler. It may not be complete before JRuby 1.1 in November, but it won't take long.

So there you have it, friends. Our work on JRuby has shown that it is possible to fully compile Ruby code for a general-purpose VM, and even that Ruby can be made to integrate as a first-class citizen on the Java platform, fitting in wherever Java code may be used today.

Wednesday, September 19, 2007

Recently, the JRuby team has gone through the motions of getting a definitive JRuby book underway. We've talked through outlines and some chapter assignments, and discussed the overall feel of a book and how it should progress. I believe one or two of us may have started writing. However the entire exercise has made one thing abundantly clear to me:

Good authors do not have time to be good developers.

Think of your favorite technical author, perhaps one of the more insightful, or the one who takes the most care in their authoring craft. Now tell me one serious, nontrivial contribution they've made in the form of real code. It's hard, isn't it?

Of course I don't intend to paint with too wide a brush. There are, without a doubt, good authors that manage to keep a balance between words and code. But I'm increasingly of the opinion that it's not practical or perhaps even possible to maintain a serious dedication to both writing and coding.

What brings me to this conclusion is the growing realization that working on a JRuby book would--for me--mean a good bit less time spent working on JRuby and related projects. I fully intend to make a large contribution to the eventual JRuby book, but JRuby as a passion has meant most of my waking hours are spent working on real, difficult problems (and in some cases, really difficult problems). It physically pains me to consider taking time away from that work.

And I do not believe it's from a lack of interest in writing. I have long wanted to write a book, and as most of my blog posts should attest, I love putting my thoughts and ideas into a written form. I enjoy crafting English prose almost as much as I enjoy crafting excellent code. But at the end of the day, I am still a coder, and that is where my heart lies. I suspect I am not alone.

When I decided to write this post, I tried to think of concrete examples. Though many came to mind, I could not think of a way to name names without seeming malicious or disrespectful. So I leave it as an exercise for the reader. Am I totally off base? Or is there a direct correlation between the quality and breadth of an author's work and a suspicious (or obvious) lack of real, concrete development?

Friday, September 14, 2007

At conferences and online, Tom and I have long been talking about a mystery deployment option coming soon from the GlassFish team. It would combine the agile command-line-friendly model of Mongrel with the power and simplicity of deploying to a Java application server. We have shown a few quick demos, but they have only been very simple and very limited. Plus there was no public release we could give people.

So what is this tasty little morsel? Well, it's a 2.9MB Ruby gem containing the GlassFish server and the Grizzly connector for JRuby on Rails. It installs a "glassfish_rails" script in JRuby's bin directory, and you're done.

That's all there is to it...you've got a production-ready server. Oh, did I mention you only have to run one instance? No more managing a dozen mongrel processes, ensuring they stay running, starting and stopping them all. One command, one process.

Of course this is a preview...we expect to see bug reports and find issues with it. For example, it currently deploys under a context rather than at the root of the server, so my app above would be available at http://localhost:8080/testapp instead of http://localhost:8080/. That's going to be fixed soon (and configurable) but for now you'll want to set the following in environment.rb:

And of course, you're going to be running JRuby, so you'll need to take that into consideration. JRuby's general Rails performance still needs more tweaking and work to surpass Mongrel + Ruby, but out of the box you already get stellar static-file performance with the GlassFish gem...something like 2500 req/s for the testapp index page on my system. The rest of JRuby's performance is continuing to improve as well...we'll get there soon.

Best Open Source IDE - InfoWorld heaps praise on NetBeans largely because it has *not* gone the way of an amorphous, all-encompassing "platform" and has continued to take risks to improve the overall experience of the IDE. Before the work on NetBeans 6 (don't download M10; use a daily build) I was a nay-sayer myself; having used NetBeans 6 for many months now, I can honestly say it's far better than NetBeans 5.5 and has caught up with or passed Eclipse in many ways. There's more to do, but I'm extremely impressed with the progress.

I try not to be too much of a Sun marketing shill, but both Bossies are pretty close to my heart. JRuby has been my passion for three years now, and NetBeans has done a serious about-face with greatly improved Java support and best-in-class Ruby support. It's good to see some recognition for hard work.

Tuesday, September 11, 2007

Hello again friends! It's time to update you on the status of the JRuby compiler.

Compiler Status

I've been working feverishly for the past several weeks to get the rest of the compiler complete. Currently, it's able to handle the majority of Ruby syntax. Here's a list of the remaining language features that do not compile:

"rescue" blocks; exception handling in Ruby is rather complicated, and there are some particularly odd uses of rescue that will be a bit tricky to support with normal Java exception-handling.

"class var declaration" is not yet supported. This is when you declare a class variable (@@foo) from within the body of a class or module. This primarily affects compiling class bodies, so although it prevents AOT compilation of some scripts, it doesn't usually affect individual methods.

"opt n" execution. This is specifying "-n" to the Ruby runtime, and it loops the provided script as though it were surrounded by "while gets(); ... end". It's useful for line-by-line processing of stdin.

"post execution" blocks. Post exe blocks are when you specify an END { ... } block somewhere in your script. These blocks are saved up and executed at the end of the script execution, regardless of where they appear in the script. They're a bit like Kernel#at_exit blocks.

"retry". Tell me friends, do you know what "retry" actually does? Retry is used within a block/closure, and it causes the method containing the closure to be re-called anew. And as an interesting quirk, the original arguments to the method are re-evaluated, so if you call foo(bar()) and a retry is triggered within foo(), bar() will get invoked again for the retried call to foo(). Weird, eh? Update: I didn't explain this well. Here's another attempt: if you have the following code:

def foo(x = bar()); 1.times {retry}; end

And you call foo with no arguments, allowing the default argument logic to fire, retry will cause that logic to fire again and again. It's essentially re-entering the method anew with the original arguments, but causing *argument processing* to be revisited. I'm not sure why you'd want this behavior, since it could frequently result in default arguments re-calling methods that might only be valid the first time.

Some non-local flow control is not yet complete. Non-local flow control happens any time you return, break, or next from within a block (when not immediately inside a normal loop construct). Much of non-local flow control is working, but I need to flush out any remaining cases that aren't running correctly.
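For readers less familiar with the last item, this short sketch shows the plain Ruby behavior the compiler has to reproduce for return, break, and next inside blocks. It says nothing about how JRuby implements them; the method names are made up for the example.

```ruby
# return inside a block exits the enclosing method, not just the block.
def first_even(numbers)
  numbers.each { |n| return n if n.even? }
  nil
end

# break inside a block ends the call the block was passed to (each here),
# and the break value becomes the result of that call.
def first_over(numbers, limit)
  numbers.each { |n| break n if n > limit }
end

# next inside a block ends only the current block invocation; the value
# given to next becomes the block's result for that element.
def halve_evens(numbers)
  numbers.map do |n|
    next n unless n.even?
    n / 2
  end
end

p first_even([1, 3, 4, 5])
p first_over([1, 5, 9], 4)
p halve_evens([1, 2, 3, 4])
```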

It's a pretty short list, eh? Obviously "rescue" is the biggest and trickiest item here. Without exception handling, it's hard to say the compiler is near completion. The complications I mentioned involve the ability to embed rescue processing into arbitrary expressions. Here's a good example:

a = [1, 2, (begin; raise; rescue; 3; end)]

When this code is compiled, it turns into a local variable assignment. The value assigned is a literal array construction with three elements: a Fixnum 1, a Fixnum 2, and a rescued block of code. The typical way to construct the array then is to follow these steps:

1. Construct an array of the appropriate size

2. Dup the array reference

3. Push a constant integer zero

4. Push Fixnum 1

5. Insert Fixnum 1 at index zero in the array. This consumes the dup'ed array, the index, and the Fixnum 1.

6. Dup the array reference again

7. Push a constant integer one

8. Push Fixnum 2

9. Insert Fixnum 2

10. Dup the array reference again

11. Push a constant integer two

12. Now it gets complicated; we must recurse in the compiler to handle the rescue block

13. The rescue block is compiled and a "raise" is triggered in the code

14. The exception raised is handled, resulting in the whole rescue leaving a Fixnum 3 on the stack

15. Insert the Fixnum 3

16. Construct a RubyArray object with the remaining object array

Now that seems simple enough. However there's a sneaky complication at steps 13 and 14: catching an exception clears the operand stack, and the original created array, its duplicated reference, and the integer two disappear as a result. The value "returned" from the rescue section therefore has nowhere to go.

We will likely have to solve this complication in one of two ways:

We could save off the stack when entering code that might trigger exception handling

We could put exception-handling logic in a separate method and invoke it in-place, thereby protecting our executing stack from clearage.

It remains to be seen which mechanism will work out to be simplest to compile and most performant.
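The second option can be pictured at the Ruby level as hoisting the rescue into its own method. This is only a hand-written analogy for what generated code might do, and `__rescue_1` is a made-up name for the synthetic method.

```ruby
# Original expression from above: the rescue sits in the middle of an
# array literal that is only partly built when the exception is raised.
inline = [1, 2, (begin; raise; rescue; 3; end)]

# Hand-written equivalent of the second strategy: the rescue body lives in
# its own method, so the exception is raised and caught in a different
# frame from the one that is building the array.
def __rescue_1
  raise
rescue
  3
end

hoisted = [1, 2, __rescue_1]

p inline
p hoisted
```

Because the catch happens inside the callee, the caller only ever sees an ordinary method call that returns a value, which is why its partly built array would survive.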

A Nice Performance Milestone

And on the topic of performance, the recent compiler work has allowed us to reach a new milestone: we now exceed Ruby 1.8.6's performance on M. Edward (Ed) Borasky's MatrixBenchmark.

Some months back, after the Mountain West RubyConf in Salt Lake City, Ed posted an interesting blog entry where he professed a lot of confidence in JRuby's future. We emailed a bit offline, and he pointed me to this matrix benchmark he'd been using to measure the relative performance of Ruby 1.8.6 and Ruby 1.9 (YARV). I told him I'd give it a try.

Originally, we were perhaps 50% to 100% slower than Ruby 1.8.6. This was back when hardly anything was compiling, and there had been few serious efforts to optimize the JRuby runtime. Performance slowly crept up as time went on. But as recently as a week ago, JRuby performance was still roughly 20-25% slower than 1.8.6.

So last week, I dug into it a bit more. I turned on JRuby's JIT logging (-J-Djruby.jit.logging=true) and verbose logging (-J-Djruby.jit.logging.verbose=true) to log compiling and non-compiling methods, respectively. As it turned out, the "inverse_from" method in matrix.rb was not yet compiling...and it was where the bulk of MatrixBenchmark's work was happening.

The final sticking point in the compiler for this method was "operator element assignment" syntax, or basically anything that looks like a[0] += 5. It's a little involved to compile; you have to retrieve the element, calculate the value, call the operator method, and reassign all in one operation. For the ||= or &&= versions, you have to perform a boolean check against the element to see if you should proceed to the assignment. A good bit of compiler code, but it had to be done.
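To show what those steps amount to, here is a small sketch in plain Ruby that spells out the fetch, operator call, and store by hand. It illustrates the language semantics only, not the bytecode JRuby emits; the tracing object is invented for the example.

```ruby
a = [10, 20]
a[0] += 5                  # shorthand form

b = [10, 20]
# Roughly what has to happen: fetch the element, call the operator, store back.
b.[]=(0, b.[](0).+(5))

# The ||= form only assigns when the current element is nil or false.
h = { seen: 1 }
h[:seen] ||= 99            # element is truthy, so nothing is stored
h[:new]  ||= 99            # element is nil, so 99 is stored

# A tracing object records the calls made by the shorthand form:
# one fetch followed by one store of the computed value.
calls = []
tracer = Object.new
tracer.define_singleton_method(:[]) { |i| calls << [:get, i]; 1 }
tracer.define_singleton_method(:[]=) { |i, v| calls << [:set, i, v]; v }
tracer[0] += 5

p a, b, h, calls
```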

So then, with "OpElementAsgn" compiling, it was time to re-run the numbers. And finally, finally, we were comfortably exceeding Ruby 1.8.6 performance:

Or should I say vastly exceeding? By my calculation this is an easy 2x performance increase, and perhaps a 70% improvement just by getting this one extra method to compile.

On Beyond Zebra

I believe we're pretty well on-target to have the compiler completed by RubyConf in November. I'm about to embark on a refactoring adventure to prepare for the stack-juggling I'll have to do to support rescue blocks. That will mean minimal progress on adding to the compiler until the end of the month, but ideally the refactoring will make it easy to get rescue compilation complete. The others are just a matter of spending some time.

Once the JRuby compiler is complete, we will start testing in earnest against a fully pre-compiled Ruby stdlib. Along with that, we'll wire in support for pre-compiling RubyGems as they install and pre-compiling Ruby scripts as they are executed and loaded. Much of this works already in prototype form, but it waits for the completion of the compiler to go into general use.

I also have plans for a "static" compiler for JRuby that would enable compiling Ruby classes into normal, instantiable, callable, static Java classes. This would bring us on par with other compiled languages on the JVM, and allow you to directly instantiate and invoke JRuby/Ruby objects from within your Java code.

Beyond all this work, Tom and I have been discussing a whole raft of performance improvements we could make to the underlying JRuby runtime. There's a lot more performance to be had, and it's just around the corner.

Sunday, September 2, 2007

I know, I know. It's got native code in it, and that's bad. It's using JNI, and that's bad. But damn, for the life of me I couldn't see enough negatives to using Java Native Access that would outweigh the incredible positives. For example, the new POSIX interface I just committed:

So the idea behind JNA (jna.dev.java.net) is to use a foreign function interface library (libffi in this case) to wire up C libraries programmatically. This allows loading a library, inspecting its contents, and providing interfaces for those functions at runtime. They get bound to the appropriate places in the Java interface implementation, and JNA does some magic under the covers to handle converting types around. And so far, it appears to do an outstanding job.

With this code in JRuby, we now have fully complete chmod and chown functionality. Before, we had to either shell out for chmod or use Java 6 APIs that wouldn't allow setting group bits, and we always had to shell out for chown because there's no Java API for it. Now, both work *flawlessly*.
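From the Ruby side, the behavior in question is just the standard File.chmod call. The sketch below is ordinary Ruby with nothing JNA-specific in it, and it assumes a POSIX filesystem; it sets a mode that includes group bits and reads it back.

```ruby
require "tempfile"

file = Tempfile.new("perms")

# Owner read/write, group read, nothing for others.
File.chmod(0640, file.path)

# Mask off the file-type bits and keep only the permission bits.
mode = File.stat(file.path).mode & 0777
puts format("%o", mode)

file.close
file.unlink
```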

The potential of this is enormous. Any C function we haven't been able to emulate, we can just use directly. Symlinks, fcntl, file stats and tests, process spawning and control, on and on and on. And how's this for a wild idea: with a bit of trickery, we could actually wire up Ruby C extensions, just by providing JNI/JRuby-aware implementations of the Ruby C API functions.

So there it is. I went ahead and committed a trunk build of JNA, and I'm working with the JNA guys to get a release out. We'll have to figure out the platform-specific bits, probably by just shipping a bunch of pre-built libraries (not a big deal; the JNI portion is about 26k), but the result will be true POSIX functionality in JRuby...the "last mile" of compatibility will largely be solved.

Many thanks to Timothy Wall, who's been nursemaiding me through getting JNA working well, and who appears to be the primary driver of the JNA project right now. If you're interested in this stuff, check out JNA, start using it, make it popular. As far as I'm concerned this is how native access is supposed to be.

Update: My mistake, Wayne Meissner has also been pouring a metric buttload of effort into JNA. Credit where credit's due!