1.4 Cleaner impure version

Here's a translation of the fast C version. It's unoptimised so far, but
already runs much faster than our best `pure' version. Compile it with
the GHC flags -O2 -optc-O3. It's not pretty (or easy to reason about --
how do the C programmers do it?), but it works :)

1.6 Sebastian Sylvan

I contributed what I think is a "neat and elegant" solution which emphasizes clarity over speed (but is still pretty fast).
The inlining here really helped a lot (something like a 2x improvement).
It has already been submitted to (and accepted by) the shootout as an example of an idiomatic "elegant" approach, and it is currently the fastest Haskell entry (note that they have changed the benchmark to use N=10).
I think that if we want anything that can compete with the imperative languages, we need to use imperative-style code (in-place reversals etc.). It's probably a good idea to have an "idiomatic" version and a "fast" version.
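
To make "imperative style" concrete, here is a minimal sketch of flip counting with in-place reversals on an unboxed mutable array in the ST monad. It is not the submitted entry or the translated C version; the names reverseRange and flipsST are made up for illustration, and the input is assumed to be a permutation of 1..n (anything else can index out of bounds).

```haskell
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, newListArray, readArray, writeArray)

-- Hypothetical helper: reverse the elements at indices lo..hi in place.
reverseRange :: STUArray s Int Int -> Int -> Int -> ST s ()
reverseRange arr lo hi
  | lo >= hi  = return ()
  | otherwise = do
      a <- readArray arr lo
      b <- readArray arr hi
      writeArray arr lo b
      writeArray arr hi a
      reverseRange arr (lo + 1) (hi - 1)

-- Count the flips needed to bring 1 to the front.
-- Assumes the list is a permutation of 1..n.
flipsST :: [Int] -> Int
flipsST [] = 0
flipsST xs = runST $ do
  arr <- newListArray (0, length xs - 1) xs
  let loop n = do
        k <- readArray arr 0
        if k == 1
          then return n
          else do
            -- Reverse the first k elements without allocating a new list.
            reverseRange arr 0 (k - 1)
            loop (n + 1)
  loop 0
```

For example, flipsST [3,1,2] goes [3,1,2] -> [2,1,3] -> [1,2,3] and so returns 2. The mutation never escapes runST, so the function stays pure from the caller's point of view.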

Note the permutations generator (a rewritten version of Bertram's), which on my system performed slightly better than Bertram's and is also a lot clearer (IMHO). It basically does the same thing but with less "magic" syntax :-)
I should clarify that Bertram's version is certainly faster altogether (my version is all about elegance and clarity), but I didn't experience any downside to rewriting the permutation generator in a clearer way (in fact, I got a slight speedup).
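
For readers who want a feel for the "clear" style, here is a minimal sketch of a list-based permutation generator and flip counter. It is neither the generator described above nor Bertram's. The names selections, perms, flips and maxFlips are made up for illustration, and the permutations come out in lexicographic order, which is probably not the order the benchmark expects for its printed output.

```haskell
-- Hypothetical helper: every way to pick one element, paired with the rest.
selections :: [a] -> [(a, [a])]
selections []     = []
selections (x:xs) = (x, xs) : [ (y, x : ys) | (y, ys) <- selections xs ]

-- All permutations: pick a head, then permute what is left.
perms :: [a] -> [[a]]
perms [] = [[]]
perms xs = [ y : p | (y, ys) <- selections xs, p <- perms ys ]

-- Count flips: reverse the first k elements (k = head) until the head is 1.
-- Assumes the list is a permutation of 1..n.
flips :: [Int] -> Int
flips = go 0
  where
    go n []       = n
    go n (1:_)    = n
    go n xs@(k:_) = go (n + 1) (reverse (take k xs) ++ drop k xs)

-- Maximum flip count over all permutations of 1..n.
maxFlips :: Int -> Int
maxFlips n = maximum (map flips (perms [1 .. n]))
```

In GHCi, maxFlips 7 evaluates to 16. This version allocates a fresh list on every flip, which is exactly the cost the imperative-style entries avoid.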