Wed, 24 Aug 2011

Why Rakudo needs NQP

Rakudo, a popular Perl 6 compiler, is built on top of a smaller
compiler called "NQP", short for Not Quite Perl.

Reading through a recent ramble by chromatic, I got the impression that he
was saying "Rakudo needs NQP to be able to ditch Parrot, once NQP runs on a
different platform". (NQP is the "another layer" between Rakudo and Parrot
that he mentions in the next-to-last paragraph.)

I'm sure chromatic knows that VM independence is the least important reason
for having NQP at all, but the casual reader might not, so let me explain the
real importance of NQP for Rakudo here.

In short, large parts of Rakudo are written in Perl 6 itself (or a subset
thereof), and something is needed to break the circularity.

In particular, the base of the compiler is written in a subset of Perl 6.
NQP compiles those parts to bytecode, and the resulting bytecode can then
compile the rest of the compiler.

We do this not just because we are fond of Perl 6 and want to write as much
of the code as possible in it. There are solid technical reasons for writing
the compiler in Perl 6.

In Perl 6, the boundary between run time and compile time is blurred, as is
the boundary between the compiler, the run time library and user-space code.
For example, you can alter the grammar with which your source code is parsed
by injecting your own grammar rules.

"Your own grammar rules" above refers to user-space code, while the grammar
that is being altered is part of the compiler. If we had written the compiler
in something other than Perl 6 (for example Java), it would be horribly
difficult to inject user-space Perl 6 code into compiled code from a different
language.

And it is not only that the code needs to be injected: the data passed back
and forth between the compiler and user space needs to consist of Perl 6
objects, so all important data structures in the compiler need to be Perl 6
based anyway.
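To make the idea of injecting grammar rules a bit more concrete, here is a
minimal sketch in Python. It is only an analogy, not how Rakudo or NQP
actually work: the names Grammar, add_term_rule, decimal_literal and
hex_literal are made up for this illustration. The point is that the rule
supplied by "user space" is an ordinary function in the same language as the
parser, so the parser can simply call it.

```python
import re


def decimal_literal(text):
    """Built-in rule: return (value, characters consumed) or None."""
    m = re.match(r"\d+", text)
    return (int(m.group()), m.end()) if m else None


class Grammar:
    """Toy stand-in for a compiler grammar: an ordered list of term rules."""

    def __init__(self):
        self.term_rules = [decimal_literal]

    def add_term_rule(self, rule):
        # Newer rules are tried first, so user-space rules can override
        # the built-in ones.
        self.term_rules.insert(0, rule)

    def parse_term(self, text):
        for rule in self.term_rules:
            result = rule(text)
            if result is not None:
                return result
        raise SyntaxError(f"cannot parse term: {text!r}")


def hex_literal(text):
    """Hypothetical user-space rule that teaches the parser hex literals."""
    m = re.match(r"0x[0-9a-fA-F]+", text)
    return (int(m.group(), 16), m.end()) if m else None


grammar = Grammar()
before = grammar.parse_term("0x2A")   # built-in rule only sees the leading "0"
grammar.add_term_rule(hex_literal)    # user space alters the grammar
after = grammar.parse_term("0x2A")    # the injected rule now matches first
```

Had the parser been written in a different language than the injected rule,
the call in parse_term would need a bridge between the two object models,
which is exactly the pain described above.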

And it's not just about grammar modifications: at its heart, Perl 6 is an
object oriented language. When the compiler sees a class definition, it
translates it into a series of method calls on the metaobject, which in turn
needs to be a Perl 6 object; otherwise it wouldn't be easily usable and
extensible from user space.
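As a rough illustration of "a class definition becomes method calls on a
metaobject", here is a small Python sketch. The names ClassHOW,
compile_class and TracingHOW are hypothetical and chosen for this example;
this is not Rakudo's actual metamodel. It shows that when the metaobject is
an ordinary object of the language, user-space code can subclass it and the
"compiler" does not need to care.

```python
class ClassHOW:
    """Hypothetical metaobject: collects methods, then builds the class."""

    def __init__(self, name):
        self.name = name
        self.methods = {}

    def add_method(self, name, code):
        self.methods[name] = code

    def compose(self):
        # Finish the class; here we simply hand the pieces to Python's type().
        return type(self.name, (), dict(self.methods))


def compile_class(name, methods, how=ClassHOW):
    """What a compiler might do when it sees a class definition."""
    meta = how(name)
    for method_name, code in methods.items():
        meta.add_method(method_name, code)
    return meta.compose()


declared = []


class TracingHOW(ClassHOW):
    """User-space extension: records every method that gets declared."""

    def add_method(self, name, code):
        declared.append(f"{self.name}.{name}")
        super().add_method(name, code)


Point = compile_class(
    "Point",
    {"describe": lambda self: "a point"},
    how=TracingHOW,
)
```

Here compile_class works unchanged with the user-supplied TracingHOW,
because both sides speak the same object model.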

Now you might think that grammar modifications and changes to the
metaobject are pretty obscure features, and that you could get along just
fine with an incomplete Perl 6 compiler that neglected those two areas. But
even then you'd have lots of interactions between run time and compile time.
For example, consider a numeric literal like 42. Obviously that needs to be
constructed as an object of type Int. What's less obvious is that it needs
to be an Int at compile time already, because Perl 6 code can run
interleaved with the compilation. So the compiler needs to be able to handle
Perl 6 objects in all their generality, which is a huge pain if the compiler
is not written in Perl 6.
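The following Python sketch shows why the literal has to be a real object
before compilation is finished. It is a toy with made-up names (Int here is
just a stand-in class, and compile_program with its "literal" and "begin"
statements is invented for this example), under the assumption that user
code may run in the middle of compilation and may look at what the compiler
has built so far.

```python
class Int:
    """Stand-in for the language-level integer type."""

    def __init__(self, value):
        self.value = value

    def succ(self):
        return Int(self.value + 1)


def compile_program(statements):
    """Toy compiler loop: literals become objects while compiling."""
    constants = []   # objects created at compile time
    results = []     # values returned by code that ran during compilation
    for kind, payload in statements:
        if kind == "literal":
            # The literal becomes a full Int object right away ...
            constants.append(Int(int(payload)))
        elif kind == "begin":
            # ... because user code runs now, before compilation is done,
            # and expects to call ordinary methods on it.
            results.append(payload(constants[-1]))
    return constants, results


program = [
    ("literal", "42"),
    ("begin", lambda n: n.succ().value),
]
constants, results = compile_program(program)
```

If the compiler kept 42 in some private, non-Int representation, the
callback's call to succ would fail or would need a conversion layer at every
such boundary.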

Rakudo has cheated on that front in the past, and consequently has had lots
of bugs and limitations due to non-Perl 6 objects leaking out in unexpected
places. If you ever got a "Null PMC Access" from Rakudo, you know what I
mean.

The lesson we learned was that you need a Perl 6 compiler to
implement a Perl 6 compiler, even if that first Perl 6 compiler can
handle only a rather limited subset of Perl 6.

This approach also has quite a few benefits. For example, NQP's new regex
engine is implemented as a role in NQP. It is mixed into an NQP class, which
allows us to build Rakudo, but it is also mixed into a Perl 6 class, which
allows the generation of Perl 6-level Match objects without any need to
create NQP-level match objects first and then wrap them in Perl 6 Match
objects.
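The shape of that design can be sketched in Python with a mixin class
standing in for the role. All names here (RegexEngine, NQPCursor,
Perl6Cursor, NQPMatch, Perl6Match, match_class) are invented for this
example, and Python's re module stands in for the real engine's internals;
the only thing the sketch is meant to show is that the shared code creates
whatever match type its host class asks for, so no wrapping step is needed.

```python
import re


class RegexEngine:
    """Stand-in for the shared role; re does the actual matching here."""

    match_class = None  # supplied by the class the engine is mixed into

    def match(self, pattern, text):
        m = re.search(pattern, text)
        if m is None:
            return None
        # Build the host's own match type directly, with no intermediate object.
        return self.match_class(m.group(), m.start())


class NQPMatch:
    def __init__(self, text, pos):
        self.text = text
        self.pos = pos


class Perl6Match:
    def __init__(self, text, pos):
        self.text = text
        self.pos = pos


class NQPCursor(RegexEngine):
    match_class = NQPMatch


class Perl6Cursor(RegexEngine):
    match_class = Perl6Match


low = NQPCursor().match(r"\d+", "answer: 42")
high = Perl6Cursor().match(r"\d+", "answer: 42")
```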

That's what NQP does for us. It allows us to actually write a Perl 6
compiler.