Friday, October 31, 2008

One of the largest problems plaguing Ruby implementations (and plaguing some other language implementations, so I hear from my Pythonista friends) is the ever-painful story of "extensions". In general, these take the form of a dynamic library, usually written in C, that plugs into and calls Ruby's native API as exposed through ruby.h and libruby. Ignoring for the moment the fact that this API exposes way more of Ruby's internals than it should, extensions present a very difficult problem for other implementations:

Do we support them or not?

In many cases, this question is answered for us; most extensions require access to object internals we can't expose, or can't expose without extremely expensive copying back and forth. But there's also a silver lining: the vast majority of C-based extensions exist solely to wrap another library.

Isn't it obvious what's needed here?

This problem has been tackled by a number of libraries on a number of platforms. On the JVM, there's Java Native Access (JNA). On Python, there's ctypes. And even on Ruby, there's the "dl" stdlib, wrapping libdl for programmatic access to dynamic libraries. But dl is not widely used, because of real or perceived bugs and a rather arcane API. Something better is needed.

Enter FFI.

FFI stands for Foreign Function Interface. FFI has been implemented in various libraries; one of them, libffi, actually serves as the core of JNA, allowing Java code to load and call arbitrary C libraries. libffi allows code to load a library by name, retrieve a pointer to a function within that library, and invoke it, all without static bindings, header files, or any compile phase.

In order to address a need early in Rubinius's dev cycle, Evan Phoenix came up with an FFI library for Rubinius, wrapping the functionality of libffi in a friendly Ruby DSL-like API.

A simple FFI script calling the C "getpid" function:

require 'ffi'

module GetPid extend FFI::Library

attach_function :getpid, [], :uintend

puts GetPid.getpid

Because JRuby already ships with JNA, and because FFI could fulfill the C-extension needs of almost all Ruby users, we endeavored to create a compatible implementation. And by we I mean Wayne Meissner.

Wayne is one of the primary maintainers of JNA, and has recently spent time on a new higher-performance version of it called JFFI. Wayne also became a JRuby committer this spring, and perhaps his most impressive contribution to date is a full FFI library for JRuby, based on JNA (eventually JFFI, once we migrate fully) and implementing the full set of what we and Evan agreed would be "FFI API 1.0". We shipped the completed FFI support in JRuby 1.1.4.

But what good is having such a library if it doesn't run everywhere? Up until recently, only Rubinius and JRuby supported FFI, which made our case for cross-implementation use pretty weak. Even though we were getting good use out of FFI, there was no motivation for anyone to use it in general, since the standard Ruby implementation had no support.

That is, until Wayne pulled another rabbit out of his hat and implemented FFI for C Ruby as well. The JRuby team is proud to announce a wholly non-JRuby library: FFI is now available on Ruby 1.9 and Ruby 1.8.6/7, in addition to JRuby 1.1.4+ and Rubinius (though Rubinius does not yet support callbacks).

Our hope with JRuby's support of FFI and our release of FFI for C Ruby is that we may finally escape the hell of C extensions. Next time you need to call out to a C library, don't write a wrapper shim in C! Write it using FFI, and it will work across implementations without recompile.

Here's some links to docs on FFI. As with most open-source projects, documentation is a little light right now, but hopefully that will change.

As more docs come to my attention, I'll update this post and add links to the JRuby wiki page. For those of you interested in the Ruby FFI project itself, check out the Ruby FFI project on Kenai. And feel free to hunt down any of the JRuby team, including Wayne Meissner, on the JRuby mailing lists or in #jruby on FreeNode.

Wednesday, October 29, 2008

After a long day fixing bugs, ya gotta hack on something frivolous once in a while.

I've started to play with a proof-of-concept branch that uses Rubinius's kernel (pure Ruby core class implementations) in replacement for our own (pure Java). Initially I've been playing with Hash, which is one of the few Rubinius core classes that does not have any "primitives" (native methods implemented in C++).

The challenge to make it work was mostly in adding a few utilities it needed (using Array for "Tuple", setting the top "self" to "MAIN", loading a Type module for coercions, etc) and getting the intialization paths to use the pure-Ruby Hash impl instead of our concrete Java impl. I was able to overcome all these and get Rubinius's hash to function in both compiled and interpreted modes, and run a few simple tests and benchmarks.

It is, as you would expect, dozens of times slower than our Java impl, but the performance is better than I expected once the various JIT layers kick in. I doubt it will be possible to get its performance to the same level as hand-written Java code. However, I do believe it's possible to improve performance substantially by enabling some normally unsafe optimizations only for these pure-Ruby kernel classes.

I don't expect too many additional challenges loading the other core classes. Obviously we'll need to wire up primitives, but Rubinius's new C++ VM defines them the same way we do in JRuby. Methods in C++ are annotated with information that allows the VM to bind them to the right class and name, just like in JRuby. The only major difference is that Rubinius has a fairly rigid boot sequence I'll mostly be able to fake, since JRuby is already booted by the time the Rubinius kernel loads. My bootloading code to get Hash running was about 10 lines of code, plus some modifications wherever we instantiated our native RubyHash type directly. Basically, JRuby starts up with its own core class impls, then loads the Rubinius kernel to replace them.

At any rate, this will probably become a fun side project, since the goal of running the same pure Ruby kernel across many implementations is certainly attractive...even if our kernel is pretty much done already. I'd love to see how fast we can reasonably make Rubinius's kernel run atop the JVM, and the exercise will certainly put JRuby through its paces.

I'll push it to an SVN branch later today. Comments and suggestions are welcome :)