While trying to make a little algorithm more efficient, I was wondering whether it was possible to use the low-level integer arithmetic of the architecture I am using directly. I like the many numeric types of Common Lisp, which are flexible and well-optimized for their general purposes, but for my special purpose it would be enough to have integers ranging from -10000 to 10000 to add, subtract, multiply (and divide rounded). For any value out of this range, I wouldn't care what happens.

As far as I know, most processors calculate sums and products directly modulo 2^n for some n, e.g. x86 modulo 2^32. This would be sufficient for my purposes (and it would certainly be faster).

Specifying (integer -10000 10000) or fixnum does not seem to be enough. I read http://www.sbcl.org/manual/Modular-arithmetic.html - this doesn't seem to work for multiplication, and it only holds for SBCL - but on the other hand, it is about a very old version of SBCL; maybe it has gotten better in the meantime.

On the other hand, SBCL has some functions in sb-vm, namely sb-vm::*-smod30 and sb-vm::+-smod30. These look like they were optimized to some degree. I don't like to write special code for different Lisp implementations, but if necessary (and if it is useful - my code is meanwhile fast enough, but I am still interested in this issue) I would do that; then, though, I would at least like to write optimized code for many other implementations, too. Maybe there is already some library for finding such more low-level types across implementations; maybe there is even a standard for such things.

Unfortunately, there is no portable/standard way to embed assembly code in Lisp, nor do I know of any finished libraries to do so. The most portable approach is to compile the assembly into a C library and use http://common-lisp.net/project/cffi/ to call it from Lisp. However, FFI calls tend to be expensive, so they're a poor fit for numerics unless you can call into larger algorithms. Less portably, ECL lets you embed C code directly in Lisp, and CMUCL and SBCL allow the use of VOPs...
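For illustration, the CFFI route might look roughly like this. This is only a sketch: the library name libmymath.so and the C function mul16 are hypothetical, standing in for whatever you compile your assembly/C into:

```lisp
;; Sketch only: assumes a hypothetical shared library libmymath.so
;; exporting   int32_t mul16(int32_t a, int32_t b);
(cffi:load-foreign-library "libmymath.so")

;; Define a Lisp wrapper C-MUL16 around the foreign symbol "mul16".
(cffi:defcfun ("mul16" c-mul16) :int32
  (a :int32)
  (b :int32))

;; (c-mul16 100 200) would then call straight into the C code --
;; but note that each such call pays the FFI overhead mentioned above.
```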

Extracting the assembly from compiled Lisp is more portable; try something like (disassemble #'disassemble). But the printed output is highly implementation-specific.
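For example, you can check whether a small arithmetic function compiled down to an open-coded machine add (the function fast-add here is my own toy example):

```lisp
;; A toy function to inspect; the declarations let the compiler open-code
;; the addition instead of calling a generic arithmetic routine.
(defun fast-add (x y)
  (declare (type (signed-byte 32) x y)
           (optimize (speed 3) (safety 0)))
  (the (signed-byte 32) (+ x y)))

;; Prints implementation-specific assembly to standard output.
(disassemble #'fast-add)
```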

"Common" sizes such as (signed-byte 16) or (signed-byte 32) may yield better results than (integer -10000 10000), since these map directly to machine primitives. Sometimes a few forms like (the type expr) are needed to coax the best speed out of SBCL, and a few special cases benefit from (sb-ext:truly-the type expr). But generally the best results come from simply declaring a function's argument types and optimization settings.
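As a sketch of that approach (the function name fast-mul is mine): the product of two 16-bit values always fits in 32 bits, so the THE form is a promise the compiler can verify and exploit, with no bignum fallback needed.

```lisp
;; Machine-sized declarations: (signed-byte 16) inputs, (signed-byte 32) result.
(defun fast-mul (x y)
  (declare (type (signed-byte 16) x y)
           (optimize (speed 3)))
  (the (signed-byte 32) (* x y)))
```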

schoppenhauer wrote: As far as I know, most processors calculate sums and products directly modulo 2^n for some n, e.g. x86 modulo 2^32. This would be sufficient for my purposes (and it would certainly be faster).

Specifying (integer -10000 10000) or fixnum does not seem to be enough. I read http://www.sbcl.org/manual/Modular-arithmetic.html - this doesn't seem to work for multiplication, and it only holds for SBCL - but on the other hand, it is about a very old version of SBCL; maybe it has gotten better in the meantime.

As far as I know, declaring the types of function arguments, local variables (with declare) and intermediate results (with the), and setting a high optimization level for speed together with a low level for safety, should make the compiler use the CPU's native arithmetic (by the way, on x86 it is not modulo 2^32, but modulo 2^28). The trick is that unless you declare that safety should not be optimized for, SBCL treats type declarations as assertions (and it will insert additional code into functions to check those assertions). But if you declare (optimize (safety 0)), type assertions are trusted by SBCL and are not checked (though you lose safety: your program behaves like a C program and can segfault, for example). And if you set a high optimization level for speed, during compilation SBCL will emit a lot of compiler notes that help you write optimized code.
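A minimal sketch of that combination, using the question's value range (the function range-avg is my own example):

```lisp
;; With (safety 0) the declarations are trusted, not checked: passing values
;; outside the declared range is undefined behavior, much as in C.
(defun range-avg (a b)
  (declare (type (integer -10000 10000) a b)
           (optimize (speed 3) (safety 0)))
  (let ((s (+ a b)))                          ; intermediate result...
    (declare (type (integer -20000 20000) s)) ; ...with its own declaration
    (the (integer -10000 10000) (truncate s 2))))
```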

dmitry_vk wrote: As far as I know, declaring the types of function arguments, local variables (with declare) and intermediate results (with the), and setting a high optimization level for speed together with a low level for safety, should make the compiler use the CPU's native arithmetic (by the way, on x86 it is not modulo 2^32, but modulo 2^28).

Everything dmitry_vk said is true, but for the sake of completeness I should probably point out that while this works for SBCL and many other compilers, it's not guaranteed. The type declarations simply tell the compiler that certain optimizations are possible because the programmer is guaranteeing certain things about the data values, but they don't force the compiler to use that knowledge in any standard, portable way. When you want to go fast, it's fairly common to declare integers as fixnums and, using THE, to declare the result of any arithmetic as a fixnum too. When you do so, the compiler has the opportunity to use modular arithmetic and the native machine's underlying word size.
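That common fixnum idiom looks like this (fix-mul is my own example name):

```lisp
;; THE promises the compiler the product stays a fixnum; with that promise
;; it may use a single native multiply instead of generic (possibly bignum)
;; arithmetic -- but no standard forces it to.
(defun fix-mul (x y)
  (declare (fixnum x y)
           (optimize (speed 3)))
  (the fixnum (* x y)))
```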

I should also note that SBCL is fairly smart with its type propagation and inference and will do a lot with the declarations you give it. Its code generation is also pretty good, and it has many optimizations to handle this sort of thing, particularly on x86. Other compilers may not do as well. For instance, CLISP compiles to bytecode rather than native code, so even though it has a good compiler, everything is still interpreted.

Typically, you'd resort to this only after you have profiled your code and determined which functions contain your inner loops; then you'd work on those. Code written with lots of declarations gets ugly pretty quickly, so it's best to confine them to the inner loops.
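A quick way to compare before and after adding declarations is the standard TIME macro (real profilers are implementation-specific, e.g. sb-profile in SBCL). The inner loop below, sum-squares, is a hypothetical example with the declarations confined to it:

```lisp
;; Hypothetical hot spot: sum of squares below n, declarations kept local.
(defun sum-squares (n)
  (declare (type (signed-byte 32) n)
           (optimize (speed 3)))
  (let ((acc 0))
    (declare (type (signed-byte 62) acc)) ; wide enough for the running sum
    (dotimes (i n acc)
      (incf acc (* i i)))))

;; TIME reports run time and consing; compare with and without declarations.
(time (sum-squares 1000000))
```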