A little while ago, I submitted a Hague Grant application to do a significant meta-model overhaul for Rakudo. I’m hopeful it’ll be approved soon, but lack of approval so far has, of course, not stopped me spending time considering the problem space, and even engaging in a little prototyping. I’ve got a lot of posts planned for the coming weeks and months about this work. In this one, I want to take some time to discuss some of the aims of this work and give you an idea of the roadmap.

I’ve been hacking on Rakudo for the best part of two and a half years now. It goes without saying that I – and others – have learned an enormous amount about Perl 6’s needs in that time. Perl 6 is a very feature-rich language, but it’s built upon a rather smaller set of primitives. As I set about re-designing and re-implementing a couple of those, I constantly have to ask myself, “OK, can this design handle everything we want to put on top of it?” Of course, experiences from Perl 6 implementations – Rakudo and others – haven’t been and won’t be my only sources of ideas. There are plenty of other object model implementations to look at, including Moose, CLOS and Smalltalk. It’s also worth looking at what languages like Java and C# do in this area – let’s face it, they’re OO languages that are known to run pretty fast atop their respective VMs, so it’s good to understand how they achieve that. And then there are all the various papers that academia has churned out, which can also provide a great deal of inspiration, or at least exploration of problem spaces.

So, what do I actually want out of this process?

A really big ticket item is that I want a meta-model that supports gradual typing. It may seem odd to put this at the top of the list, but it’s actually one of the things that I think will play out as being really important, and I’m not sure it’s received enough attention yet in Perl 6 – at least, not in the meta-model design space. We often talk about Perl 6 as being a dynamic language. Well, it is – but it makes more stuff statically knowable than you might first imagine. The possibility for a programmer to write type annotations means we may be given a lot of information to work with, but even knowing that everything descends from Mu gives us some optimization opportunities. Most parameters are actually of type Any, which gives us more information to work with.

Today in Rakudo, writing type annotations may make your program slower rather than faster, even though you give more information. I want to fix that. Putting it another way, this work is about producing a meta-model that deeply supports both static and dynamic typing, with the ability to give us gradually greater amounts of optimization and safety as we move gradually towards the static end of the scale. (Some people at this point may ask about type inference and wonder if we can do some optimizations to dynamic programs using that; the answer is yes, we may well be able to do that and it’ll be some interesting future work, but having a way to put the calculated types to use is a prerequisite.)
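To make the idea a little more concrete, here is a minimal sketch of how statically known types could let a compiler drop, keep, or reject a type check ahead of time. It is written in Python rather than Perl 6 or NQP purely as a neutral illustration; `TypeObj` and `plan_check` are hypothetical names, the hierarchy is heavily simplified (no roles, subset types or coercions), and none of this is meant to describe how Rakudo does or will do it.

```python
class TypeObj:
    """A toy type object with single inheritance (hypothetical, not Rakudo's)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def isa(self, other):
        """True if self is `other` or descends from it."""
        t = self
        while t is not None:
            if t is other:
                return True
            t = t.parent
        return False


# Simplified slice of the Perl 6 hierarchy: everything descends from Mu.
Mu = TypeObj("Mu")
Any = TypeObj("Any", Mu)
Int = TypeObj("Int", Any)
Str = TypeObj("Str", Any)


def plan_check(known_type, param_type):
    """Decide at compile time how to treat one argument.

    known_type is what the compiler can prove about the argument;
    param_type is the declared type of the parameter it is passed to.
    """
    if known_type.isa(param_type):
        return "skip"      # always passes: no runtime check needed
    if param_type.isa(known_type):
        return "runtime"   # may pass: the value could be a narrower type
    return "error"         # can never pass in this toy model


print(plan_check(Int, Any))   # argument declared Int, parameter wants Any
print(plan_check(Any, Int))   # untyped argument, parameter wants Int
print(plan_check(Str, Int))   # argument declared Str, parameter wants Int
```

The more the programmer (or the internals) annotate, the more calls land in the first or third case, which is where both the speed and the safety come from.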

This directly ties into performance, which is one of Rakudo’s biggest issues at the moment. We need improvements in terms of compile time performance and runtime performance. Various things need to get faster, notably:

Method dispatch

Attribute access

Type checks

Object allocation

Perl 6 is built out of method dispatches. While the programmer may not give us that much information to work with so far as types go, that need not be the case in the internals. In fact, if we want to write a lot of Perl 6 in Perl 6 and have it performant, we’re going to have to do things that give the compiler and runtime enough information to generate fast code. Of course, when we are in a dynamic situation, we need to be able to do runtime type checks much more quickly than we can today too.

Then there’s memory consumption. Consider a Rat (rational number type) today. It contains two attributes holding Ints, which in turn constitute two PMCs each (a Parrot Integer PMC that holds the value and the Perl 6 Int subclass of it). Worse, the Object PMC uses a ResizablePMCArray for attribute storage. I’ve probably lost many of you in Parrot guts terminology by now, but the bottom line is that one Rat today can carry around 8 garbage-collectable objects! Well, we made it work – but this clearly has gotta change if we want to make it fast. Three objects would be far more reasonable – a Rat object with two Int objects within it. (It goes without saying that the delegation we end up doing here, as well as the extra GC overhead, is also a source of speed woes.)

This leads me nicely to native types, and thus representations. The SMOP project has done a lot of work looking into the area of representation polymorphism – something that Rakudo has largely ignored up until now. However, now that we want to do native types, it’s time to dig in and take this issue seriously. This especially comes into play in the area of boxing and unboxing – a task we’ve mostly punted on and let Parrot worry about. Today we often avoid boxing things into Rakudo-level objects when we don’t have to, because it’s so much slower than boxing to Parrot PMCs. This leads to other bits of cheating and – now and then – the cheating leaks through to user space and causes trouble. We should be able to box natives up fast enough that we feel no need to cheat and lie. These cheats and lies also carry a runtime cost.

One other area that Rakudo does badly at today is startup time. We don’t need an incremental improvement here, we need an order of magnitude one. At startup today, we essentially build up the entire runtime library ecosystem from scratch (apart from some things we can lazily defer until later). Even if we can get faster at building that, it’d be faster still to be able to just build it once, freeze (aka serialize) it, and then just have to quickly thaw (aka deserialize) it at startup and be ready to run. Parrot today does have some freeze/thaw support, but it’s not enough. Notably, it doesn’t handle the issue of linkage – that is, having instances of a type defined in library A serialized in library B and matching them up. So I’ll be looking at how we can extend that.
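The linkage problem can be sketched roughly as follows. This is an illustration in Python, not Parrot’s freeze/thaw API: `REGISTRY`, `register`, `freeze` and `thaw` are hypothetical names, and JSON stands in for whatever serialization format is really used. The point is only that a frozen instance records a symbolic handle to its type (which library, which name) rather than the type itself, and thawing matches that handle up against the types already loaded.

```python
import json

# Hypothetical linkage table: (library name, type name) -> type.
REGISTRY = {}


def register(library):
    """Class decorator recording which library defines a type."""
    def deco(cls):
        cls._handle = (library, cls.__name__)
        REGISTRY[cls._handle] = cls
        return cls
    return deco


@register("libA")
class Point:
    """A type defined in 'library A'."""

    def __init__(self, x, y):
        self.x, self.y = x, y


def freeze(objs):
    """Serialize instances, storing only a symbolic reference to each type."""
    return json.dumps(
        [{"type": list(o._handle), "attrs": vars(o)} for o in objs]
    )


def thaw(blob):
    """Rebuild instances, resolving each type handle through the registry."""
    out = []
    for rec in json.loads(blob):
        cls = REGISTRY[tuple(rec["type"])]   # the linkage step
        obj = cls.__new__(cls)               # allocate without re-running setup
        obj.__dict__.update(rec["attrs"])
        out.append(obj)
    return out


# 'Library B' freezes an instance of a type it did not define...
blob = freeze([Point(1, 2)])
# ...and at startup it is thawed and linked back to libA's Point.
restored = thaw(blob)[0]
print(type(restored).__name__, restored.x, restored.y)
```

Thawing skips the constructor entirely, which is the whole attraction: the work of building the objects was done once, at freeze time.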

Going even further, I want us to actually build the meta-objects at *compile time*. Not just at compile time, but actually in action methods (so that by the time we’re done with the parse, we have them, and just have to fudge the compiled Parrot Sub objects into the various slots). Why? Because that way we have the meta-objects around to provide us with the information we need to do gradual typing and a host of other optimizations and error checking. We could, of course, maintain a parallel set of compile-time and run-time versions of the information. We partially do that today, but carrying around two copies of the same information is just asking for them to get out of sync. Not to mention that we want to serialize them anyway, so we need to build them at some point during the compile. Put another way, I want to unify our compile time and runtime meta-models. I’ll write in a lot more detail on this later on – it represents something of a sea-change to our compilation model for packages.
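As a rough illustration of that flow, the sketch below builds a meta-object while “parsing”, consults it before any code exists, and only later installs the compiled code into its slots. It is Python standing in for NQP, and `ClassMeta`, `Actions` and `package_def` are hypothetical names invented for the example, not the real action methods or meta-object API.

```python
class ClassMeta:
    """Toy meta-object describing one class (hypothetical design)."""

    def __init__(self, name):
        self.name = name
        self.methods = {}   # method name -> compiled code, None until compiled

    def add_method(self, name):
        """Declare a method; its slot stays empty until code is compiled."""
        self.methods[name] = None

    def install_code(self, name, code):
        """Fill a previously declared slot with the compiled code."""
        if name not in self.methods:
            raise KeyError(name)
        self.methods[name] = code

    def can(self, name):
        """Answerable at compile time, before any code is installed."""
        return name in self.methods


class Actions:
    """Stand-in for parser action methods that build meta-objects."""

    def __init__(self):
        self.packages = {}

    def package_def(self, name, method_names):
        meta = ClassMeta(name)
        for m in method_names:
            meta.add_method(m)
        self.packages[name] = meta
        return meta


# During the parse: the meta-object exists as soon as the class is seen.
actions = Actions()
dog = actions.package_def("Dog", ["bark"])
print(dog.can("bark"), dog.can("meow"))

# After code generation: the compiled routine goes into the waiting slot.
dog.install_code("bark", lambda self: "Woof")
print(dog.methods["bark"](None))
```

Because there is only one `ClassMeta` per class, the compiler and the runtime are necessarily looking at the same information, and that single object is also the thing that would get serialized.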

Portability is another concern. Today Rakudo only runs on Parrot, but for some time now the Rakudo team have been talking about running on other backends too. So far, getting something out that works – in the form of Rakudo * – has been more important than that, and it goes without saying that I’ll be delivering an implementation of the meta-model and all the bits around it that runs on Parrot. However, there’s quite a bit of interest in having Rakudo working on additional backends, and I’m very keen that whatever meta-model design Rakudo ends up with, it is cleanly implementable on Parrot, the JVM, the .NET CLR, LLVM and [your runtime here].

There are a lot of things to juggle here, and there are going to be a lot of changes landing in the coming months. In these early days, I’ll be mostly prototyping. Once I’m happy with the overall design, I’ll dig in to implementing it on top of Parrot, and modifying nqp-rx. That will happen in a branch. Once that is done, I’ll branch Rakudo and do the same there. At some point, as we did when implementing the new grammar engine, those branches will become the mainline.

I *do* expect this to be a lot less painful on the Rakudo side than what we went through with the infamous “ng” branch. On the nqp-rx side that’s going to be a bit harder to promise; I will of course not break anything for the sake of breaking things, and for people writing compilers using PCT I really hope this will be close to seamless. People writing NQP code in expectation that it’s going to produce a certain bit of PIR, on the other hand, may be in for more of a surprise. As an overview (subject to change):

The notion that “nqp-rx is a compiler that generates code for Parrot and has no runtime library” is simply not going to survive this work. In some senses, it’s already not completely true; nqp-rx depends on P6object if you start writing classes, for example. So it’s just going to become less true.

I *may* abandon using various of the Parrot built-in types in favor of ones defined in an “NQP setting”, so that everything really can behave as a proper object. This depends if I can get the performance needed to do so, and if the other wins seem worth it.

nqp-rx is likely to end up with its own multi-dispatcher at least for multi-methods, just because various aspects of the meta-model’s functioning depend on this (of note, we need to be able to dispatch based on whether something is a type object or not rather than just by type). nqp-rx *may* end up using this for dispatching its built-in operators rather than leaving Parrot opcodes to go through Parrot-level multi-dispatch. This goes hand in hand with what I said above with regard to the built-in types. (Also, from an NQP user perspective, having a single coherent multi-dispatcher at work may just be a winner. Then there’s factoring in the desire for NQP to be portable, which argues against tying it to one VM’s built-in semantics. We’ll see.)

nqp-rx (and Rakudo) will get an nqp::foo pseudo-namespace for “portable operations” that can be mapped per-VM, a bit like the pir::foo one. Of course, NQP on Parrot will – possibly with the requirement of a pragma – keep pir::foo and Q:PIR { … }, and I’m also pondering an “is vtable(‘…’)” trait for specifying how an object maps Parrot v-table operations to methods.
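The multi-dispatch point in the overview above, dispatching on whether something is a type object and not just on its type, can be sketched like this. It is a toy in Python, where a Python class stands in for a Perl 6 type object and an instance for a defined value; `MultiSub` and `wants_instance` are hypothetical names, and a real dispatcher would also have to deal with multiple parameters, ambiguity and caching.

```python
import inspect


class MultiSub:
    """Toy single-argument multi-dispatcher (illustrative only)."""

    def __init__(self):
        self.candidates = []   # list of (type, wants_instance, function)

    def add(self, type_, wants_instance, fn):
        """Register a candidate for instances or for the type object itself."""
        self.candidates.append((type_, wants_instance, fn))

    def __call__(self, arg):
        is_type_object = inspect.isclass(arg)
        cls = arg if is_type_object else type(arg)
        matches = [
            (t, fn)
            for t, wants_instance, fn in self.candidates
            if issubclass(cls, t) and wants_instance != is_type_object
        ]
        if not matches:
            raise TypeError("no applicable candidate")
        # Narrowest type wins: the one earliest in the method resolution order.
        _, best = min(matches, key=lambda m: cls.__mro__.index(m[0]))
        return best(arg)


describe = MultiSub()
describe.add(int, False, lambda x: "the Int type object")
describe.add(int, True, lambda x: f"the Int instance {x}")
describe.add(object, True, lambda x: "some other instance")

print(describe(int))    # a type object
print(describe(42))     # an instance of that type
print(describe("hi"))   # falls back to the wider candidate
```

Both `int` candidates have exactly the same nominal type; only the type-object-or-not distinction separates them, which is why dispatching by type alone is not enough here.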

It was just under a year ago when pmichaud++ dug into the nqp-rx effort, which was a prerequisite for bringing Rakudo up to the level where we could reach the Rakudo Star release series. We made it – Rakudo may not be fast today, and for sure there’s bugs, but we successfully put out a release that delivered on many of the feature promises of Perl 6. This batch of work is a prerequisite for taking us to the next level: towards one where we run faster, have a path to implement features and optimizations that we can’t today, where the meta-programmer can work productively, and where more stuff Just Works the way it should.

5 Responses to Rakudo’s meta-model: The Road Ahead

An interesting read as always, and I for one hope your project will be selected for more than one reason. Performance yes, no one can argue about that. Also knowledge sharing. You are very expressive in your blog posts about your work when you are doing grant work, and I enjoy your discussions and learn from your insight and challenges.