Sunday, June 03, 2007

There can be only one, a tale about Ruby, IronRuby, MS and whatnot

(Updated: added a quote from John Lam about not being able to look at the MRI source code)

After RailsConf in Portland, a discussion has flared up around IronRuby and Microsoft. We discussed many of these points in depth at the conference, and I'll elaborate on my views on the issues in a bit.

But first I would like to talk a bit about the multitude of Ruby implementations springing up. I firmly believe that a language evolves in phases. The first phase, germination, is the period where a language needs one consistent implementation (or a spec). It's during this phase that most "alpha geek adoption" happens. Many important libraries are written, but most applications are not in the main economic center. Ruby has been in this phase for a long time, but the fact that new implementations are springing up left and right is a sure sign that Ruby is entering phase 2: implementation. For adoption to happen, there need to be several competing implementations, all of them good. This is the evolutionary stage, where it's decided what kind of features an implementation should provide. Should we have green or native threads? Are all the features of the original implementation really that necessary (continuations, ObjectSpace)? Is there cruft in the standard library that needs to be weeded out (timeout.rb)? All of these questions get answered when other people implement the language. The last phase, which I guess could be called adoption, is when the language has several working implementations, all good enough to deliver high-end applications on, when many applications are written in the language, and there exists a plethora of libraries, systems and support for the language.

What this means is that for a language to be successful, there need to be competing implementations. They need to implement their features in different ways and make different choices during development. Otherwise, the language will die. (This is obviously not enough, since Smalltalk fulfilled this admirably and still never got widespread adoption.) But I still believe it's incredibly important for a language to evolve with many implementations, which is why I find Rubinius, JRuby, YARV and IronRuby to be extremely important projects for the welfare of Ruby. I want Ruby to be successful. I want Ruby to be the next major language for several reasons. But most importantly: I want Ruby to be a better language tomorrow than it is today. The only way that's going to happen is by having lots of people implement the language.

So, that's enough of the introductory flame bait. This describes one half of why IronRuby is an important project, and why we can't let it fail. The other side of the coin is the same reason JRuby is important. .NET as a platform has some wildly useful features. There are many developers who swear by .NET for good reason. And what's more important, there are lots of large enterprises with such a vested interest in .NET that they will never choose anything else. Now, for the welfare of all programmers in the world, I personally believe the world would be a better place if those .NET environments also used Ruby. So that's the other side of the coin of why IronRuby is important.

The most widely read piece about the current Microsoft/Ruby controversy is Martin Fowler's article RubyMicrosoft. Go read it now, and then I'll just highlight the points I find most important.

First: John Lam is committed to creating a "compliant" Ruby implementation. I have no doubts that he can do it. But there are a few problems lurking.

For example, what is a compliant Ruby implementation? Since there exists no spec, and no comprehensive test suite, the only way to measure compliance is to check how closely the behavior matches MRI. But this has some problems too. How do you check behavior? Well, you need to run applications. But how do you get far enough that you can run applications? What we did with JRuby was to look at the MRI source code.

John Lam cannot look at MRI source code. He cannot look at JRuby source code. He cannot look at Rubinius source code. If he does, he will be terminated.

So, the next best alternative: accepting patches from the community, which can look at Ruby source? Nope, no cigar. Microsoft is not really about Open Source yet. Their license allows us to look at their source code, and also to fork it and do what we want with it. But that's only half of what open source is about. The more important part is that you should be able to contribute back code without having to fork. You can't do that with IronRuby, since Microsoft is too scared about being sued for copyright infringement.

There was some doubt about Lam actually being banned from looking at MRI source code. This is the first quote that confirms it. It's from the discussion "Virtual classes and 'real' classes -- why?" on Ruby-core, posted on 29/03/07:

Is this how things are actually implemented? (BTW I'm not lazy here - we cannot look for ourselves).

I am going to make a bold statement here. Under the current circumstances, I don't believe it's possible for John Lam and his team to create a Ruby implementation that runs Rails in less than 18 months. And frankly, that's not soon enough.

As I said above, I have all confidence that John can do great stuff if he has the right resources. But creating a Ruby implementation is hard enough while having all the benefits of the open source community.

There are two points I want to make with all this. The first: the Ruby community must damned well get serious about creating a good, complete specification and test suite. It's time to do it right now, and we need it. It's not a one-man job. The community needs to do it. (And yes, the two SoC projects are a very good start. But you still need to be able to run RSpec to take full advantage of them; and let's face it, the RSpec implementation uses many nice Ruby tricks.)
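
To make the RSpec point a bit more concrete, here is a minimal sketch of what a spec runner could look like if it stuck to blocks and plain equality, so that even a young implementation might be able to run it. The TinySpec class and its `it` method are hypothetical names made up for illustration; this is not RSpec's API, and the three expectations are just sample MRI behaviors, not part of any real suite.

```ruby
# Hypothetical minimal spec helper (not RSpec, not any existing suite).
# It deliberately relies only on blocks, arrays, strings and ==.
class TinySpec
  attr_reader :passed, :failures

  def initialize
    @passed = 0
    @failures = []
  end

  # Record one expectation: the block's result must equal `expected`.
  def it(description, expected)
    actual = yield
    if actual == expected
      @passed += 1
    else
      @failures << "#{description}: expected #{expected.inspect}, got #{actual.inspect}"
    end
  rescue StandardError => e
    # An exception counts as a failure instead of aborting the whole run.
    @failures << "#{description}: raised #{e.class}"
  end
end

spec = TinySpec.new
spec.it("Array#reverse returns the elements in reverse order", [3, 2, 1]) { [1, 2, 3].reverse }
spec.it("String#upcase upcases ASCII letters", "RUBY") { "ruby".upcase }
spec.it("Integer division rounds toward negative infinity", -4) { -7 / 2 }

puts "#{spec.passed} passed, #{spec.failures.size} failed"
spec.failures.each { |failure| puts failure }
```

The less of the language the runner itself needs, the earlier a new implementation can start checking itself against the shared expectations.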

The second point is simpler: Microsoft needs to completely change how they handle Open Source. Their current strategy of trying to grow it into the organization will not work (at least not well enough). They need to turn around completely, reinvent themselves and make some really bold moves to be able to meet the new world. If they can't do this, they are as dead as Paul Graham claims.

12 comments:

It is true and verified by, among others, Chris Sells, Scott Hanselman (who has good insight into MS) and John Lam. It's not FUD. MS employees have been terminated before, and will be again, if they look at source code with "viral" licenses.

Fair enough - although the IronPython team test IronPython by running the standard Python test suite, so they certainly *use* Python (and a lot of Python is written in Python - so it is a 'grey area').

Concerning RSpec: Rubinius uses something called "minirspec" that implements a subset of the DSL. It's really simple (2 files or so) and doesn't use most of the RSpec magic. The minimal implementation needed to run it hasn't been defined yet. All in all, I think that it would be a good idea to use that to build the BFTS :-)

Btw, let's look at this in a positive light; maybe the IronRuby team will spend some of their time completing the specs and the test suites. Most of the programmers I know prefer to do more exciting stuff in their free time anyway.

@ola bini: how is the Ruby license viral? I thought it was an MIT derivative

(if this message appears more than one time, it's blogger's fault :-P)

I think that your initial thesis, that there must be multiple implementations of a language for it to succeed, is provably false. Perl, for one, is an excellent counter-example. It may be the case that most successful languages have been through several implementations, but it's certainly not a requirement. To date, Perl has been very widely ported to various environments and embedded within many other applications, but all from a common principal implementation.

It is true that Perl 6 is working its way toward two implementations (Perl 6 on Parrot and Pugs), and, similarly, there are plans to make a Perl 5 implementation on Parrot... but these are all in the future, and Perl has managed to become a widely accepted successful language without parallel implementations, to date.

Well, the team over at Queensland University are making rapid progress at creating Ruby.Net. And they certainly do not have any restrictions on looking at open source code. Last time I checked what John Lam was up to, it looked like the Aussies were a little bit ahead.

I think this 'process of mistrust' is exactly what was experienced by the IronPython team from the Python community. A lot were very distrustful of Microsoft providing a Python implementation.

A few in the Python community are still distrustful - but it has proved to be very good for Python.

Also, Python is a good counterexample to the idea that a language needs multiple implementations *before* it can become mature.

It is only now, long after Python became mature and widely used as a language, that we have multiple implementations. I agree that having multiple implementations is very good for a language though.

IronPython in particular has highlighted inadequacies in the specification of Python, for example, and also examples of bad practice in the standard library (relying on semi-private and undocumented features which simply don't port to another implementation).

John Lam discussed the difficulties of looking at Open Source code in a podcast with me. It's a total CYA move by Microsoft lawyers, but it is the reality of working at Microsoft. But today John mentioned that they are licensing the Gardens Point Ruby.NET code, so hopefully that shields them a bit from litigation.

Another problem is accepting community contributions, which they are not doing now. Microsoft's sponsorship and licensing of GP's Ruby.NET may also explain why that project is not overtly accepting community contributions either.