Sun, 19 Oct 2008

Perl 6 Tidings from September and October 2008

Specification

Perl 6 has got two new operators.

Series, again and again

The series operator
... (yes, literally three dots) lazily constructs infinite series
of objects. It takes a list on the left and a closure on the right, and calls
the closure to determine the next values. It is best introduced by a few
examples:
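The semantics described above can be approximated outside Perl 6. What follows is a minimal Python sketch, not Perl 6 syntax: the name `series` is hypothetical, and it assumes that the closure is handed as many of the most recent values as it has parameters and returns one new value per call.

```python
from inspect import signature
from itertools import islice


def series(seed, step):
    """Lazily yield the seed values, then values computed by `step`.

    Assumption: `step` receives the most recent values, as many as it
    has parameters, and returns the single next value.
    """
    window = list(seed)
    arity = len(signature(step).parameters)
    if arity < 1 or arity > len(window):
        raise ValueError("step needs between 1 and len(seed) parameters")
    # The seed values come first, unchanged.
    yield from window
    window = window[-arity:]
    while True:
        nxt = step(*window)
        yield nxt
        # Slide the window forward by one value.
        window = (window + [nxt])[-arity:]


# Nothing is computed until values are requested.
fibonacci = series([1, 1], lambda a, b: a + b)
print(list(islice(fibonacci, 8)))  # [1, 1, 2, 3, 5, 8, 13, 21]
```

Because the generator is infinite, the caller decides how much of it to consume, here with `islice`.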

The ~ meta character is a parser combinator inspired by
Haskell's parsec. It basically sets up a parsing goal (in this
example the closing paren) and then executes the following subrule. If the
goal can't be found, a nice error message is given.

To produce even nicer error messages on parse failures, the new
:dba (stands for "do business as") adverb can give rules
a human-understandable name (human = not a compiler implementer, but a mere
mortal Perl 6 hacker).

If the example rule above failed to match, it would say Error while
parsing postcircumfix:sym<( )>: can't find closing ')'. Adding a
:dba<argument list> would replace the unfriendly
postcircumfix blah with... well, you get the picture.
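The idea of a parsing goal with a human-readable rule name can be sketched in a few lines of Python. This is only an analogy, not how STD.pm implements it: `goal`, `digits` and `ParseError` are hypothetical names, and the `name` parameter merely plays the role that :dba plays in a Perl 6 rule.

```python
class ParseError(Exception):
    pass


def digits(text, pos):
    """Subrule: match one or more digits, return (value, new_pos) or None."""
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (text[pos:end], end) if end > pos else None


def goal(opener, closer, subrule, name):
    """Match opener, then subrule, then insist on closer (the goal)."""
    def parse(text, pos=0):
        if not text.startswith(opener, pos):
            return None  # the rule does not apply here at all
        result = subrule(text, pos + len(opener))
        if result is None:
            return None
        value, pos = result
        if not text.startswith(closer, pos):
            # Past the opener, a missing closer is an error, not a mere
            # failed match; `name` stands in for the :dba name.
            raise ParseError(
                f"Error while parsing {name}: can't find closing {closer!r}"
            )
        return value, pos + len(closer)
    return parse


arguments = goal("(", ")", digits, name="argument list")
print(arguments("(42)"))  # ('42', 4)
try:
    arguments("(42")
except ParseError as err:
    print(err)  # Error while parsing argument list: can't find closing ')'
```

The point of the design is that the goal is declared before the subrule runs, so the parser knows what it was looking for when it has to complain.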

Tests part of the language?

There's also (sic) ongoing talk about moving the testing
capabilities that are now in Test.pm into the language
core. A likely scenario is that there will be an :ok adverb on
comparison operators that has roughly the same semantics as the
ok() sub, but enables better diagnostic messages in case of
failure.

To achieve this, all comparison operators would need to get a new multi
that accepts the named (mandatory) argument :ok. That seems like
a big change, but most (or even all) could be generated automatically, and the
candidate lists for the multis could be pre-computed mostly at compile time,
so no (runtime) performance hit is expected.
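Why an adverb on the operator allows better diagnostics than a separate ok() call can be shown with a rough Python analogy. Everything here is an assumption for illustration: the function `eq`, its `ok` keyword and the output format are hypothetical and say nothing about what the final Perl 6 design will look like.

```python
def eq(got, expected, *, ok=None):
    """Plain comparison, or a self-reporting test when `ok` is given."""
    passed = got == expected
    if ok is None:
        return passed  # ordinary candidate: just the boolean
    # Testing candidate: the comparison still sees both operands, so it
    # can report them, which a function that only gets a boolean cannot.
    lines = [f"{'ok' if passed else 'not ok'} - {ok}"]
    if not passed:
        lines.append(f"#      got: {got!r}")
        lines.append(f"# expected: {expected!r}")
    print("\n".join(lines))
    return passed


eq(1 + 1, 2, ok="addition works")
eq("3" * 2, "6", ok="string repetition")
```

The second call fails and reports both '33' and '6', information that is lost once the comparison has been reduced to a truth value.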

Enhance the Match object by token list

Currently the Match object (the thing that's returned from a regex match)
gives you no easy access to the sub-matches in the order in which they occurred
in the string, which means it's quite hard to extract information from it that
the writer of the original regex didn't think about making available.

In particular I tried to (ab)use the match tree from STD.pm to write a
syntax highlighter, and found it to be a rather daunting task. So I suggested
adding such sequential information to the Match object. Larry liked the idea,
because he knows what pain the Perl 5 parser causes by forgetting
information too quickly (it never builds a full parse tree but
generates the optree on the fly, which is efficient but makes introspection
very hard).

However it's not entirely clear if it will be added, and if yes, in which
form. Unclear are (a) performance impact, (b) access syntax and (c)
symmetry.

(b) and (c) need more explanation: The match object already has list
semantics for accessing positional captures, so you can't make the sequential
chunks available through an array index. The simplest solution is
composition, i.e. a method such as $/.splits or $/.tokens
that returns such a list.

But that breaks a fundamental symmetry that now exists between match
objects and captures (i.e. argument lists). Both can have a scalar component
(the object that make returned/the method invocant), list
components (positional captures/positional arguments) and hash components
(named captures/named arguments). Introducing a way of accessing the
components of a match object in a completely different order breaks that
symmetry. We're not yet entirely clear on what that means for the language.
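The composition approach discussed above can be illustrated with a small Python model. It is a sketch under stated assumptions: the classes `Capture` and `Match` are hypothetical stand-ins rather than the Perl 6 Match object, and the method name `tokens` is simply one of the two names floated above.

```python
from dataclasses import dataclass, field


@dataclass
class Capture:
    start: int  # offset of the sub-match in the matched string
    text: str


@dataclass
class Match:
    positional: list = field(default_factory=list)
    named: dict = field(default_factory=dict)

    def __getitem__(self, key):
        # Indexing is already taken: integers reach positional captures,
        # strings reach named captures.
        if isinstance(key, int):
            return self.positional[key]
        return self.named[key]

    def tokens(self):
        """All captures, regardless of kind, in string order."""
        captures = self.positional + list(self.named.values())
        return sorted(captures, key=lambda c: c.start)


# Hand-built result of matching "x = 42"; no regex engine involved.
m = Match(
    positional=[Capture(2, "=")],
    named={"name": Capture(0, "x"), "value": Capture(4, "42")},
)
print([c.text for c in m.tokens()])  # ['x', '=', '42']
```

The list and hash views mirror positional and named arguments, while `tokens` is a third view with no counterpart on the argument-list side, which is exactly the asymmetry in question.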

Implementations

Rakudo

Jerry "particle" Gay implemented the is export trait on subroutines,
taking a large leap towards making modules usable in Rakudo. To facilitate
testing of this new feature he also implemented the =:= infix
operator (tests whether two variables are aliases).

Allison Randal merged a branch which reworked Parrot's multi sub and multi
method dispatch system. That broke some complex math in Rakudo, leaving us
with "only" about 4380 passing spec tests. Otherwise we might have hit the
4500 mark by now. Still it's good work, and is expected to solve some
fundamental problems in the mid or long term.

Elf

Mitchell Charity worked furiously on the new Lisp backend for elf, bringing it
(almost?) to bootstrap. That demonstrated the flexibility of elf, and makes it
possible to get rid of some quirks that can creep into a compiler if it has
only one backend.

It also opens up an opportunity for hackers that want to help with a Perl 6
compiler by writing code in Perl 6. Or Lisp.

STD.pm and viv

STD.pm is the Perl 6 grammar written in Perl 6. It now parses all Perl 6
code that we know of, so it's time to find out if it actually parses it in a
useful way, and to check if it loses information while parsing.

Finding that out is one of the goals of viv (read that as VI → V and
think of Roman numerals). The other goal is to replace gimme5,
which currently does the ugly, hacky job of translating STD.pm to Perl 5
code.

It's a script that uses reduce actions at the end of each grammar rule to
build some kind of parse tree or abstract syntax tree, and it's planned to
produce either Perl 6 or Perl 5 output. We'll see what the future (and $larry)
brings.

Pugs and the test suite

Pugs is still hibernating, and waiting for the release of GHC 6.10.1.

If pugs hibernates, the test suite has a light slumber, and is occasionally
disturbed in its peace by a few more tests now and then.

SMOP

Daniel Ruoso and Paweł Murias are both hacking actively on smop. Currently
on the agenda is multi dispatch, which is more complicated than it sounds at
first. Remember that slurpy argument lists are lazy, which makes things more
complicated.

For me it's rather hard to judge how much progress they are making, or how
close they are to running basic, real-world code.

Update: Paweł contributed another small explanation:

SMOP now has a new compiler named mildew which uses viv/STD as
its parser. Right now it supports only a handful of operations, the most
advanced of which is creating an object with a simplified meta model.

It lives in the Pugs repository under v6/mildew/. Anyone who
wants to hack on it (in Perl 5) is welcome; instructions can be found on
#perl6