I've written an algorithm to count the number of inversions in a list of integers (pairs of integers that are out of order), but it runs very slowly. It takes ~400 seconds to run on a list of 100,000 integers. In contrast, the same algorithm written in Java runs in less than a second.

The algorithm itself is basically merge-sort, a binary recursive call, but counting the number of merges that would need to be done without actually doing them.

Why does this take so long? Is there any way to get the running time comparable to Java's? I have read that properly optimized CL can be competitive with C, so I would hope that this algorithm can be made better than a hundred times slower than Java.

Could you try it with something like this? I'm not 100% certain I understood exactly what the function has to do. It looks like reduce'ing instead of looping might do a better job (although insignificantly, I guess). The major problems with your algorithm are that it:

- recalculates the length of the lists many, many times. That is O(n) for a list, while in Java you probably used ArrayList (which, contrary to its name, is a dynamic array and has nothing to do with linked lists), or just an array of int (even faster), where getting the length is O(1).
- creates a lot of needless conses. subseq creates new conses, but this is what you don't want to do! You've no use for the old conses after the new ones are created, and, although the runtime may try to be smart and reduce cons creation by reusing some old ones you threw away, it will still be taxing on memory. Splitting an array in two by index, by contrast, is a much simpler task than splitting a list (again O(1) vs O(n); you could say it's O(n/2), but there's no such thing). And you had to call subseq twice!

Alright, I'm hoping that the above will improve your code somewhat, but, ultimately, if you want to get near Java's speed on this task you need to use arrays, not lists, because arrays are better for this algorithm.
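To make the array suggestion concrete, here is a minimal sketch of merge-sort inversion counting on a fixnum array. It is not the code from the original post: the name count-inversions, the half-open index ranges, and the single scratch buffer are all assumptions made for illustration. The point is that the halves are described by indices, so nothing is consed inside the recursion and length is never recomputed.

```lisp
;; Sketch only: COUNT-INVERSIONS is a hypothetical name, not the original code.
(defun count-inversions (seq)
  "Return the number of pairs i < j with element i greater than element j.
SEQ is a list or vector of fixnums; it is copied, so the input is not modified."
  (let* ((n (length seq))
         (a (make-array n :element-type 'fixnum :initial-contents seq))
         (tmp (make-array n :element-type 'fixnum :initial-element 0)))
    (declare (type (simple-array fixnum (*)) a tmp))
    (labels ((sort-count (lo hi)        ; sorts the half-open range [lo, hi)
               (declare (type fixnum lo hi))
               (if (< (- hi lo) 2)
                   0
                   (let* ((mid (floor (+ lo hi) 2))
                          (count (+ (sort-count lo mid) (sort-count mid hi)))
                          (i lo) (j mid) (k lo))
                     (declare (type fixnum mid i j k)
                              (type unsigned-byte count))
                     ;; Merge the two sorted halves into TMP.
                     (loop while (and (< i mid) (< j hi))
                           do (cond ((<= (aref a i) (aref a j))
                                     (setf (aref tmp k) (aref a i))
                                     (incf i))
                                    (t
                                     ;; a[j] is smaller than every element still
                                     ;; waiting in the left half: (- mid i) inversions.
                                     (setf (aref tmp k) (aref a j))
                                     (incf count (- mid i))
                                     (incf j)))
                              (incf k))
                     ;; Copy whatever is left over from either half.
                     (loop while (< i mid)
                           do (setf (aref tmp k) (aref a i)) (incf i) (incf k))
                     (loop while (< j hi)
                           do (setf (aref tmp k) (aref a j)) (incf j) (incf k))
                     (replace a tmp :start1 lo :end1 hi :start2 lo :end2 hi)
                     count))))
      (sort-count 0 n))))
```

For example, (count-inversions #(2 4 1 3 5)) counts the pairs (2 1), (4 1) and (4 3). How much faster this is than the list version depends on the implementation and optimization settings, so it is worth measuring with time rather than assuming.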

Thanks wvxvw. I think that you're right. I hadn't appreciated the difference between a lisp list and an array. I'll modify my code to work on an integer array (for starters) and see what sort of improvement that yields. I'm guessing that it will be significant.

Is there any simple method to convert a list to an array? Something like (vector *input-list*), I'm guessing.
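For reference, a short sketch of the conversion using only standard functions. The sample data and the variable names *input-vector* and *input-fixnums* are made up for illustration. Note that (vector *input-list*) would not do the conversion: it builds a one-element vector containing the list itself.

```lisp
;; Sample data, invented for this example.
(defparameter *input-list* (list 5 3 8 1))

;; VECTOR makes a vector of its arguments, so this has length 1:
;; its single element is the whole list.
(defparameter *wrapped* (vector *input-list*))

;; COERCE converts the list element by element into a general vector.
(defparameter *input-vector* (coerce *input-list* 'vector))

;; For a specialized array the element type has to be spelled out.
(defparameter *input-fixnums*
  (make-array (length *input-list*)
              :element-type 'fixnum
              :initial-contents *input-list*))
```

Whether the fixnum-specialized array pays off depends on the implementation and on the type declarations in the code that uses it.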

tensorproduct wrote:Thanks wvxvw. I think that you're right. I hadn't appreciated the difference between a lisp list and an array. I'll modify my code to work on an integer array (for starters) and see what sort of improvement that yields. I'm guessing that it will be significant.

Is there any simple method to convert a list to an array? Something like (vector *input-list*), I'm guessing.

I tested your program with Clozure CL v1.6 under Windows, and without the funcall to the predicate it took 0.205 seconds for a list of 10,000 elements. I also rewrote your program in C with Visual C++ 6 using a static array, and it took 0.175 seconds. I doubt Java would handle 100,000 elements in less than a second.

Actually, a lot of the problem here is not with FUNCALL itself but with the fact that FUNCALL screens optimizations. If you pass as a predicate not #'<, which is a generic multi-argument function, but (lambda (a b) (declare (fixnum a b)) (< a b)), that alone will reduce the time by more than half.
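A small sketch of the two ways of passing the predicate. The function count-descents is a hypothetical stand-in for the inner loop (it just counts adjacent out-of-order pairs), not the code under discussion; only the two predicate forms come from the post above.

```lisp
;; Hypothetical helper: counts positions where PREDICATE holds between an
;; element and its predecessor, calling PREDICATE through FUNCALL each time.
(defun count-descents (vec predicate)
  (declare (type simple-vector vec) (type function predicate))
  (let ((count 0))
    (declare (type fixnum count))
    (loop for i of-type fixnum from 1 below (length vec)
          when (funcall predicate (svref vec i) (svref vec (1- i)))
            do (incf count))
    count))

;; General predicate: #'< takes any number of arguments of any real type,
;; so each call goes through the full numeric comparison.
(defparameter *general-result*
  (count-descents (vector 3 1 2 0) #'<))

;; Fixnum-only predicate: the declaration lets the compiler emit a plain
;; two-argument fixnum comparison inside the lambda.
(defparameter *fixnum-result*
  (count-descents (vector 3 1 2 0)
                  (lambda (a b) (declare (fixnum a b)) (< a b))))
```

Both calls return the same count; the difference is only in how much work each FUNCALL does. Wrapping each call in time on a large vector is the way to see how big the gap is on a given implementation.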

Of course, for a very small operation like a numeric comparison, the cost of indirection will always be significant. A first-class function shouldn't be the innermost operation in a long loop. There are ways around this (for example, use a compiler macro to catch a compile-time predicate, or recompile at runtime when the predicate is available), but it should be done only when you know that it would actually be useful, and if it is impractical to pass a larger behaviour.
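A rough sketch of the compiler-macro idea, under some loud assumptions: the function names are invented, the loop is a toy (counting adjacent out-of-order pairs), and the specialized version assumes the vector holds only fixnums, which the general version does not require. When the call site literally says #'<, the compiler macro rewrites the call to the specialized function; any other predicate falls through to the general one.

```lisp
;; Specialized version (hypothetical name): fixnum < is written inline,
;; so there is no FUNCALL in the loop. Assumes VEC holds only fixnums.
(defun count-descents-fixnum-< (vec)
  (declare (type simple-vector vec))
  (loop for i of-type fixnum from 1 below (length vec)
        count (< (the fixnum (svref vec i))
                 (the fixnum (svref vec (1- i))))))

;; General version (hypothetical name): the predicate is known only at run time.
(defun count-descents-with (predicate vec)
  (declare (type simple-vector vec) (type function predicate))
  (loop for i of-type fixnum from 1 below (length vec)
        count (funcall predicate (svref vec i) (svref vec (1- i)))))

;; If the predicate argument is the literal form #'<, i.e. (function <),
;; rewrite the call; otherwise return FORM unchanged to decline.
(define-compiler-macro count-descents-with (&whole form predicate vec)
  (if (equal predicate '(function <))
      `(count-descents-fixnum-< ,vec)
      form))
```

A compiler is allowed to ignore compiler macros, and they never apply when the function is reached through funcall or apply with a run-time value, so the general definition has to stay correct on its own.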

tensorproduct wrote:So, does this mean that writing fast Lisp code is mutually exclusive to passing first class functions? Or are there other ways of doing this than "funcall"-ing things?

It's all about the implementation. The Lisp implementations I know (SBCL and Clozure CL for Windows) generate abysmal code for funcalls of functions that they cannot determine at compile time. I don't know if the commercial implementations (Allegro/LispWorks) are better at this.

tensorproduct wrote:Moving from a list to an array didn't quite have the impact I was expecting, maybe 10% off the running time.

This isn't a surprise to me. The purpose of arrays is to speed up code but ironically untyped arrays are horribly slow in Common Lisp. Typed arrays should be fast but SBCL and Clozure CL don't seem to be able to make proper use of that type information. Arrays seem to be pretty much useless in the free implementations. Lists are amazingly fast, though.

tensorproduct wrote:I admit that I quoted that figure without doing any proper timing. 3 to 4 seconds is a more accurate quotation.

PS: Maybe this post sounds a bit more negative about Lisp than I intended. In fact, your original code (without a funcalled predicate) runs only ~15% slower (in Clozure CL v1.6) than the C version on my PC. I think that's pretty impressive for a dynamic language, especially since your code uses lists and does a lot of consing due to subseq, while the C version uses a static array.

That's why I like Lisp. It's powerful and still pretty fast. Even without arrays.

Last edited by Konfusius on Thu Jun 21, 2012 5:48 am, edited 1 time in total.

Here are the results. It may be possible to optimize it a bit further by manipulating pointers to the segments of the array being compared instead of dividing the array, but it appears that memory allocation is very fast, so that's not important unless memory use itself matters.

And... by thee wee, the results for my last example (without funcall) are: