Day 4 – Having Fun with Rakudo and Project Euler

Rakudo, the leading Perl6 implementation, is not perfect, and performance is a particularly sore subject. However, the pioneer does not ask ‘Is it fast?’, but rather ‘Is it fast enough?’, or perhaps even ‘How can I help to make it faster?’.

To convince you that Rakudo can indeed be fast enough, we’ll take a shot at a bunch of Project Euler problems. Many of those involve brute-force numerics, and that’s something Rakudo isn’t particularly good at right now. However, that’s not necessarily a show stopper: The less performant the language, the more ingenious the programmer needs to be, and that’s where the fun comes in.

All code has been tested with Rakudo 2012.11.

We’ll start with something simple:

Problem 2

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

The solution is beautifully straight-forward:

say [+] grep * %% 2, (1, 2, *+* ...^ * > 4_000_000);

Runtime: 0.4s

Note how using operators can lead to code that’s both compact and readable (opinions may vary, of course). We used

whatever stars * to create lambda functions

the sequence operator (in its variant that excludes the right endpoint) ...^ to build up the Fibonacci sequence

the divisible-by operator %% to grep the even terms

reduction by plus [+] to sum them.

However, no one forces you to go crazy with operators – there’s nothing wrong with vanilla imperative code:

Note the declaration of sigilless variables \N, \a, …, how $(…) is used to return the triplet as a single item and .list – a shorthand for $_.list – to restore listy-ness.

The sub &triplets acts as a generator and uses &take to yield the results. The corresponding &gather is used to delimit the (dynamic) scope of the generator, and it could as well be put into &triplets, which would end up returning a lazy list.

We can also rewrite the algorithm into dataflow-driven style using feed operators:

Note how we use destructuring signature binding -> […] to unpack the arrays that get passed around.

There’s no practical benefit to use this particular style right now: In fact, it can easily hurt performance, and we’ll see an example for that later.

It is a great way to write down purely functional algorithms, though, which in principle would allow a sufficiently advanced optimizer to go wild (think of auto-vectorization and -threading). However, Rakudo has not yet reached that level of sophistication.

But what to do if we’re not smart enough to find a clever solution?

Problem 47

Find the first four consecutive integers to have four distinct prime factors. What is the first of these numbers?

This is a problem where I failed to come up with anything better than brute force:

Note the use of BEGIN to initialize the cache first, regardless of the placement of the statement within the source file, and multi to enable multiple dispatch for &factors. The where clause allows dynamic dispatch based on argument value.

Even with caching, we’re still unable to answer the original question in a reasonable amount of time. So what do we do now? We cheat and use Zavolaj – Rakudo’s version of NativeCall – to implement the factorization in C.

It turns out that’s still not good enough, so we refactor the remaining Perl code and add some native type annotations:

You can contribute by sending pull requests, or better yet, join #perl6 on the Freenode IRC network and ask for a commit bit.

Have the appropriate amount of fun!

Like this:

LikeLoading...

Related

This entry was posted on December 4, 2012 at 12:01 am and is filed under 2012. You can follow any responses to this entry through the RSS 2.0 feed.
You can leave a response, or trackback from your own site.

3 Responses to “Day 4 – Having Fun with Rakudo and Project Euler”

You can probably get a fast enough problem 47 using pure perl6 (though you’d need a priority queue to do it).

Like anything, you flip the problem onto it’s head. Instead of factoring the number, you work from the fact you are going through these numbers in sequence and essentially write a sieve of Eratosthenes that keeps track of how many primes cross out a number.

(The essential difference with the priority queue is that you pop off all the items with key == $_, then push them back on with the updated number).

To reduce churn in the data structure, you’d probably also want to just keep 2 and 3 as separate counters because they come up so often.

I have a very old version of perl6 at work, so I can’t really give you speed numbers. But the c version of the same algorithm is in the less than a second category for N = 4, even without a priority queue.

Very nice. I actually thought about looking into a sieve-of-Eratosthenes-like solution when writing the article, but decided against it as having an example of what to do when all else fails seemed like a good idea.

Anyway, I ran your algorithm on my machine and it clocks in at 1.7s for N=3. For N=4, I stopped the run after ~40m without result.

If you have a github account, feel free to add your code to the repository – as-is or after making the suggested improvements.

I agree completely that your solution was the right one to put in the post, native call is important to show off, it’s a cool feature.

Not surprised on the lack of result for N=4, performing a linear extrapolation of the comparisons on the 1.7 seconds leads to about 10 hours. (41,314 for N=3 vs. 880,972,037 comparisons for N=4).

I created a priority queue based version last night, but I tickled Rakudo the wrong way with my implementation and it was very slow (heaps are pretty simple, they pretty much just need push, pop, and swap to be fairly fast operations on the list type). It may still end up faster for N=4 because 10 hours is not a hard time to beat, but N=3 was about 5-10 times slower using it. I’ll see if I can get it running faster, perl 6 should have a priority queue module implemented for it anyways, so it will be good to have either way.