March 25, 2003

(Perl|python|Ruby) on (.NET|JVM)

InfoWorld has an article on scripting languages, and Jon Udell has an entry in his blog about it. The main bit in there is the postulation that the big reason we're going with Parrot rather than using the JVM or .NET is a cultural choice, rather than a technical one. The rather flip answer in the Parrot FAQ (which I wrote--it's a good bet that any of the flip answers in there are mine) doesn't really explain things, so it's time I sat down and did so. Then I should go update the FAQ. (There's a sidebar about it in the April Linux Magazine, but it's not on their website yet, and doesn't really go into details anyway)

The easy answer for why we're not using .NET is that it wasn't out when we started the design, at least not such that we knew anything about it. IIRC (and I may not) it hit public beta in summer 2000. Regardless, I didn't know about it until sometime in 2001. .NET has major portability issues as far as we're concerned, since we have to run on any of a half-zillion platforms, and .NET is Windows-only. Mono makes that somewhat better, but still... got Mono for a Cray system, a VMS system, or a Palm? Probably not. I certainly don't.

Portability aside, .NET has the same fundamental problems as the JVM for our purposes, which is running Perl (both perl 5 and perl 6). That's what I want to address.

First things first--both the JVM and .NET are perfectly capable of being target machines. They're fully Turing-complete, so it's not an issue of capability. But, like the Infocom Z-machine, which is also Turing-complete, the issue is one of speed.

Perl 5 has two big features that make using the JVM or .NET problematic--closures and polymorphic scalars. Perl 6 adds a third (which Ruby shares) in continuations, and a fourth (which Ruby doesn't) in co-routines. (Though arguably once you've got continuations, everything else is just a special case) Python has similar issues, though I'm not the guy to be making statements about Python, generally.
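For the flavor of what a co-routine demands of a VM, here's a Python generator (purely an illustration, not how Parrot implements them): a routine that suspends mid-execution and later resumes with all of its local state intact.

```python
# A co-routine suspends partway through and resumes later, keeping its
# local state alive across suspensions -- which means its state can't
# simply live in a stack frame that dies on return.
def counter():
    n = 0
    while True:
        n += 1
        yield n   # suspend here; 'n' survives until the next resume

c = counter()
first = next(c)    # 1
second = next(c)   # 2 -- picked up exactly where it left off
```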

To do closures means capturing and maintaining persistent lexical state. Neither .NET nor the JVM has support for this, as they use a simpler stack-based allocation of lexical variables. To handle lexicals the way perl needs them, we'd have to basically ignore the VM's variable allocation scheme and do it ourselves.
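What "persistent lexical state" means here can be seen in a small Python sketch (Python rather than Perl, purely for illustration):

```python
def make_counter():
    # 'count' is a lexical that has to outlive this call frame, so it
    # can't live in a stack slot -- it has to be heap-allocated
    count = 0
    def bump():
        nonlocal count
        count += 1
        return count
    return bump

c1 = make_counter()
c2 = make_counter()
# each closure carries its own captured copy of the lexical state
a, b = c1(), c1()   # 1, 2
c = c2()            # 1 -- independent of c1's state
```

A VM that allocates lexicals strictly on the call stack has no natural place for `count` to live once `make_counter` returns, which is exactly the problem described above.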

The same goes for the polymorphic "It's a string! No, an integer! No, an object reference! No, wait, a filehandle!" scalar that perl has. They're really useful, and a mostly-typeless (Perl is strongly typed, it just has very few types) language makes some things quite easy. To do that with .NET or the JVM would require a custom type capable of doing what perl needs.
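As a rough sketch of what such a custom type has to do (hypothetical names, in Python purely for illustration--this is not Parrot's actual interface), the box has to answer "what am I holding right now?" at every operation:

```python
class Scalar:
    # A toy polymorphic scalar: one box whose contents can be a string,
    # an integer, None... and can change type at any time.
    def __init__(self, value=None):
        self.value = value

    def set(self, value):
        self.value = value          # retype on the fly

    def as_int(self):
        try:
            return int(self.value)  # numeric context coerces
        except (TypeError, ValueError):
            return 0                # non-numeric contents act as 0

    def as_str(self):
        return "" if self.value is None else str(self.value)

s = Scalar("42")
n = s.as_int()        # the string behaves as a number: 42
s.set(n + 1)          # now the same box holds an integer
text = s.as_str()     # string context: "43"
```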

So, to make perl work means completely rewriting the system allocation scheme, and using our own custom polymorphic object type. In JVM/.NET bytecode. Doable? Sure. Fast? No way in hell.

And continuations. Yow. Doing continuations is non-trivial, and I don't think it's possible in the JVM or .NET without treating them as glorified CPUs and using none of their control and stack features. We'd essentially write all the functionality of the interpreter and target the JVM the way we now target hardware CPUs with C, including complete stack management, completely ignoring any features of the VMs. I don't want to think about how slowly that would go. It's bad enough doing it all in twisted, insane C targeting real hardware. Another layer of indirection would kill us dead.
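To make the continuation problem concrete, here's a toy in continuation-passing style (Python, purely for illustration): the "rest of the computation" is reified as an ordinary heap value, so it can be stashed and re-entered later--the very thing a VM welded to a native call stack can't express cheaply.

```python
captured = {}

def grab(k):
    # 'k' is the rest of the computation, reified as a plain function;
    # because it lives on the heap, we can keep it and re-enter it later
    captured['k'] = k
    return k(1)

# the continuation of grab() here is "add 10 to whatever comes back"
result = grab(lambda v: 10 + v)   # 11

# re-enter the same continuation again, with a different value
again = captured['k'](32)         # 42
```

Supporting this on a stack-based VM means keeping activation records alive indefinitely, which is why the overhead is unavoidable.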

With a custom interpreter, we can write the code to support perl's features, and have them run as fast as we can manage. Will it necessarily be as fast as, say, C# code targeting .NET? Probably not. (Though that'd be really cool... :) The required functionality forces a certain amount of overhead on us, and there's just no way around it. Give me a budget of $30M a year, three or four years, and plenty of office space, and maybe we could change that, but until then...

The other question is "could .NET or the JVM change to support the features perl needs?" In this case mostly closures and continuations, which are the biggies. (Weak, runtime-typed variables are less of a problem, though still a problem.) The answer is yes, but they'd be stupid to do so. As I said, those features have unavoidable overhead. Running perl faster at the cost of running C# slower is not, at least in my estimation, a good tradeoff.

All features have costs associated with them, and nothing is free. You design your feature set, then the software to run it, and it's all a huge mass of tradeoffs. This feature lets you do something, but has that cost. Wanting something to be fast means something else is very slow, or effectively impossible, and sometimes two features are mostly incompatible. You make your list, make your choices, and do what you can.

That's engineering, and there ain't no such thing as a free lunch. Neither is there any such thing as a language-neutral VM. Or a language-neutral real machine, for that matter. Anyone who tells you different is either lying, a fool, or ignorant. Real hardware doesn't like closures and continuations--VMs that don't do closures and continuations running on top of hardware that doesn't like them is not a recipe for speed.

Does the existence of Jython [Python implemented in Java, but I assume this is known to all] show that Python is happily implementable in .NET/Java [ignoring platforms], or were the same issues met in creating Jython?

Some of the folks at ActiveState think otherwise--their port of Python to .NET was, by their report, amazingly slow. I'm not particularly familiar with the speed of Python on the JVM, but by all reports it isn't at all snappy either. (Though closures, at least, aren't quite the same issue for Python given how it handles variables)

Any language that generates bytecode that follows the calling conventions (and object conventions, if it's an OO language) will be interoperable. That means you'll be able to use Perl and Python libraries from within Scheme and Befunge code. It's similar to the way .NET (and the JVM) work.

I should point out that we're not lifting the idea from .NET here. Rather, we're lifting it from VMS, a system that has expected compiled languages to interoperate for decades now. If we're going to steal ideas, I'd rather they be time-tested ones. (Hence continuations... :)

I don't understand why you people didn't start off with a Lisp VM instead. They've had continuations and closures for, erm, ages? AND they're freaking fast. But, well, I suppose it must have something to do with the PERL background (there's more than one way to do it :)

Once you accept that .NET and the JVM are unsuitable for one reason or another, the next obvious question is, "why not start with a Lisp/Scheme VM?" Especially when dynamic typing, closures and continuations enter the equation.

It's such an obvious question that it's been asked dozens, if not hundreds, of times. Literally. I'm not exaggerating here.

The real reason (and it has nothing to do with a PERL, or even a Perl, background) is empirical, actually. If Scheme were such a strong foundation upon which to build software, why isn't it used more often? There are plenty of theories about this. The one critical reason why Parrot isn't a Scheme VM (or an extension of a Scheme VM) is that the people who care about Perl5, Perl6, Python, Ruby, etc. who are interested in Parrot don't have a strong Scheme background. Using a Scheme VM inhibits cooperative development and scares away potential developers. C, on the other hand, is a really nasty language for VM development, except that anyone who is interested in Parrot already knows C or can learn it quickly.

An undergraduate project 'Optimising Java Threads' at <http://www.doc.ic.ac.uk/~ajf/Teaching/Projects/Distinguished02/AndreasMartens.ps> explains how to implement coroutines in Java bytecode. The aim was to transform multithreaded code to coroutines so it would run faster (of course only one 'thread' can run at once, but in some applications such as process-oriented simulations this is all you need).

It's said to be 10-30 times faster than Java threads, but I don't know how it compares with implementing coroutines on a different virtual machine such as Parrot.

10 to 30 times? Yow! That number's... difficult to believe. I remember when I went through and nearly fully mutexed the perl core and only cut its speed in half. It's hard to believe Java has so much mutex overhead that removing it speeds things up by an order of magnitude. Ouch.

What kind of hardware is needed to run Parrot well? The x86 instruction set is just bytecode interpreted by microcode. Is Parrot well designed to be the machine at the microcode level? How fast could we go if Parrot was interpreted by microcode on the right hardware design? You can shred me if this is a bogus concept. I just want to go faster.

Well, for right now anything with a lot of registers will stand you in good stead. I think in general we'll find the RISC systems with their large register sets are, with equal engineering effort on our part, better suited for parrot than the x86.

All things are rarely equal, of course, and as such the x86 has the advantage of more people with domain-specific knowledge, so we'll probably see that get fast quickest. AMD's 64-bit x86 extensions show some promise as well, as they throw in a sane number of registers, so we can go faster.

Hardware implementations of parrot would be somewhat problematic, mostly because we allow lexically scoped loading and overriding of opcode functions. But that's a problem for another time. :)