Each time I have downloaded a new copy of Rakudo Perl 6, I have run the following expression just to get an idea of its current performance:

say [+] 1 .. 100000;

The speed has been improving, but each time there is still a noticeable delay (several seconds) for the calculation. As a comparison, something like this in Perl 5 (or other interpreted languages) returns almost instantly:

use List::Util 'sum';
print sum(1 .. 100000), "\n";

or in Ruby (also nearly instant):

(1 .. 100000).inject(0) {|sum,x| sum+x}

Rewriting the expression as a Perl6 loop ends up being about twice as fast as reducing the range, but it is still a very noticeable delay (more than a second) for the simple calculation:

my $sum;
loop (my $x = 1; $x <= 100000; $x++) {$sum += $x}

So my question is, what aspects of the Perl6 implementation are causing these performance issues? And should this improve with time, or is this overhead an unfortunate side effect of the "everything is an object" model that Perl6 is using?

And lastly, what is it about the loop construct that makes it faster than the [+] reduction operator? I would think that the loop would result in more total ops than the reduction.

EDIT:

I'd accept both moritz's and hobbs's answers if I could. That everything is being handled as a method call more directly answers why [+] is slow, so that one gets it.

"[A] loop would result in more total ops than the reduction"? I would've imagined that folding the list results in more computation because it's transforming the list to a single value. The loop doesn't appear to do that.
– Zaid, Jun 28 '10 at 20:32

@Zaid => my reasoning (which might be wrong) was that the loop would have 3 high level ops per iteration (the bounds check, the loop increment, and the $sum increment), whereas the reduction operator would be 2 ops (fetch next item from iterator, add to internal accumulator).
– Eric Strom, Jun 28 '10 at 20:46

One note here is that comparing to List::Util::sum is more than a bit unfair, since that's a special-purpose function. You should probably be comparing with List::Util::reduce { $a + $b } (1 .. 100000). Mind you, that still runs 50 times per second on my box.
– darch, Jun 28 '10 at 21:39
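
To illustrate darch's point, here is a minimal sketch in Ruby (used only because the question already has a Ruby example) that puts a special-purpose sum next to a general-purpose fold with a user-supplied block. The variable names are made up for the illustration, and no timings are quoted because they depend on the machine and Ruby version.

```ruby
require 'benchmark'

range = (1..100_000)

# Special-purpose summation, roughly analogous to List::Util::sum.
builtin_total = range.sum

# General-purpose fold with a user-supplied block, roughly analogous to
# List::Util::reduce { $a + $b }.
folded_total = range.inject(0) { |acc, x| acc + x }

# Time 50 runs of each; the numbers will differ from machine to machine.
builtin_time = Benchmark.realtime { 50.times { range.sum } }
folded_time  = Benchmark.realtime { 50.times { range.inject(0) { |acc, x| acc + x } } }

puts "sum:    #{builtin_total} (#{builtin_time.round(4)}s for 50 runs)"
puts "inject: #{folded_total} (#{folded_time.round(4)}s for 50 runs)"
```

Both approaches produce the same total; only the generic fold calls user code on every element, which makes it the fairer comparison for a reduction operator.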

4 Answers

Another thing you have to understand about the lack of optimization is that it's compounded. A large portion of Rakudo is written in Perl 6. So for example the [+] operator is implemented by the method Any.reduce (called with $expression set to &infix:<+>), which has as its inner loop

in other words, a pure-Perl implementation of reduce, which itself is being run by Rakudo. So not only is the code you can see not getting optimized, the code that you don't see that's making your code run is also not getting optimized. Even instances of the + operator are actually method calls, since although the + operator on Num is implemented by Parrot, there's nothing yet in Rakudo to recognize that you've got two Nums and optimize away the method call, so there's a full dynamic dispatch before Rakudo finds multi sub infix:<+>(Num $a, Num $b) and realizes that all it's really doing is an 'add' opcode. It's a reasonable excuse for being 100-1000x slower than Perl 5 :)
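
As a rough illustration of that compounding, here is a sketch in Ruby of a reduce that is itself written in the high-level language and that performs every addition through ordinary dynamic method dispatch. This is not Rakudo's Any.reduce; naive_reduce and plus are hypothetical names chosen for the example.

```ruby
# Hypothetical stand-in for a reduce written in the language itself.
# It is NOT Rakudo's actual implementation, only the same general shape.
def naive_reduce(items, op)
  acc = nil
  first = true
  items.each do |x|
    if first
      acc = x            # seed the accumulator with the first element
      first = false
    else
      acc = op.call(acc, x)   # one call into user-level code per element
    end
  end
  acc                    # nil for an empty collection
end

# The operator is passed in as a callable, and the addition inside it is
# an explicit dynamically dispatched method call on the left operand.
plus = ->(a, b) { a.public_send(:+, b) }

total = naive_reduce(1..100_000, plus)
puts total
```

Each element costs a block invocation, a lambda call, and a method dispatch before any actual addition happens, which is the kind of layering described above.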

Update 8/23/2010

More information from Jonathan Worthington on the kinds of changes that need to happen with the Perl 6 object model (or at least Rakudo's conception of it) to make things fast while retaining Perl 6's "everything is method calls" nature.

+1. Everything being a runtime-dispatched method call definitely explains why we are seeing this level of performance. Any idea when Rakudo is going to support optimization of Num and Str to be handled by Parrot directly?
– Eric Strom, Jun 29 '10 at 19:31

The first and maybe most important reason is that Rakudo doesn't do any optimizations yet. The current goals are more to explore new features and to become more robust. You know, they say "first make it run, then make it right, then make it fast".

The second reason is that Parrot doesn't offer any JIT compilation yet, and the garbage collector isn't the fastest. There are plans for a JIT compiler, and people are working on it (the previous one was ripped out because it was i386-only and a maintenance nightmare). There are also thoughts of porting Rakudo to other VMs, but that will surely wait until after the end of July.

In the end, nobody can really tell how fast a complete, well-optimized Perl 6 implementation will be until we have one, but I do expect it to be much better than now.

BTW the case you cited [+] 1..$big_number could be made to run in O(1), because 1..$big_number returns a Range, which is introspectable. So you can use a sum formula for the [+] Range case. Again it's something that could be done, but that hasn't been done yet.
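
A minimal sketch of that sum formula, written in Ruby for illustration: the sum of the integers from first to last is (first + last) * count / 2, which needs only the endpoints of the range. The helper name range_sum is hypothetical, and it assumes an integer range with both endpoints present.

```ruby
# Hypothetical helper: sum an integer range in O(1) from its endpoints,
# using the arithmetic-series formula instead of iterating.
def range_sum(range)
  first = range.begin
  last  = range.exclude_end? ? range.end - 1 : range.end
  return 0 if last < first          # empty range
  count = last - first + 1
  # (first + last) * count is always even, so the division is exact.
  (first + last) * count / 2
end

puts range_sum(1..100_000)   # prints 5000050000
```

The cost no longer depends on the size of the range, which is what makes the [+] Range case a candidate for special handling.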

+1, these are the kind of details I'm looking for (and from a reputable source no less). Detecting [+] Range and using the sum formula would be a great optimization. As it stands right now, does [+] treat Range as an iterator, or does it first convert it to a list and then reduce it? From memory, it seems that increasing $big_number resulted in a nonlinear performance regression, which would imply listification. Do you know?
– Eric Strom, Jun 28 '10 at 21:30

Since the June release of Rakudo, converting to a list is lazy, and thus really the same as iterating.
– moritz, Jun 29 '10 at 9:46

It certainly isn't because everything is an object, because that's true in a number of other languages too (like Ruby). There's no reason why Perl 6 would have to be orders of magnitude slower than other languages like Perl 5 or Ruby, but the fact is that Rakudo is not as mature as perl or CRuby. There hasn't been much speed optimization yet.

Yep, that's exactly why I added in the Ruby example, because comparing perl5 to perl6 performance just didn't seem fair at all :). Do you know if these issues are on the Parrot side or the Rakudo side?
– Eric Strom, Jun 28 '10 at 20:41

I submitted these to Fefe's language competition in December 2008. wp.pugs.pl is a literal translation of the Perl 5 example; wp.rakudo.pl is far sixier. I have two programs because the two implementations support different subsets of the spec. The build information is outdated by now. The sources:

What's going on with the segfault in the last test, are those times complete?
– Eric Strom, Jun 29 '10 at 15:14

I suspect "$*IN.lines.split(/\s+/).map: { %words{$_}++ };" should be replaced with "%words{$_}++ for $*IN.lines.split(/\s+/);" since map in sink (void) context is currently broken
– Pat, Jun 29 '10 at 16:24

Replacing this with $*IN.slurp.words instead (.slurp reads the entire file into a single string, and .words extracts all the words from a string, basically the opposite way of stating split on whitespace) shaved 80% off the script's execution time for my 100-line test file.
– Sol, Jun 29 '10 at 21:36
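
The two word-counting strategies discussed in these comments can be sketched in Ruby as follows. This only shows that reading line by line and slurping the whole input give the same counts; it makes no claim about their relative speed in Rakudo. The sample text and variable names are invented for the example.

```ruby
require 'stringio'

# Made-up sample input standing in for the test file.
text = "the quick brown fox\njumps over the lazy dog\nthe end\n"

# Strategy 1: read line by line and split each line on whitespace.
by_line = Hash.new(0)
StringIO.new(text).each_line do |line|
  line.split.each { |word| by_line[word] += 1 }
end

# Strategy 2: treat the whole input as one string and extract all words
# at once (the rough equivalent of .slurp.words).
slurped = Hash.new(0)
text.split.each { |word| slurped[word] += 1 }

puts by_line == slurped   # prints true
puts slurped["the"]       # prints 3
```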