Monday, May 17, 2010

Upgraded array element types and pinned arrays (updated)

I am planning to do a major rewrite of my LLA library in the near
future. LLA, which stands for Lisp Linear Algebra, uses BLAS and
LAPACK to perform operations on matrices (and vectors), ranging from
matrix multiplication to singular value decompositions. LLA is
experimental, and its aim is to provide semantics which recognize
special matrix types both as inputs and outputs. Maybe I will blog
about it sometime, but that's not the focus of this post.

Currently, LLA uses Lisp arrays, but wrapped in a class which has a
slot signifying the element type of the arrays. This is because it is
meant to be portable, and not all implementations have upgraded array
element types for every type LAPACK recognizes (see below). Also, LLA
matrix representation is currently column major, and Lisp arrays are
row-major.

However, I realized that I can use Lisp arrays in more places - LAPACK
handles row-major arrays just fine (as transposed column-major
arrays). Besides, having a native Lisp array is nice, eg one can just
use functions like REDUCE or REPLACE, without dealing with a
special type. However, there is a price to pay: when LAPACK routines
are called, the array element type has to be detectable. For
implementations which upgrade element types use by LLA to themselves
or something specific, this is a cheap operation. For implementations
that don't, it requires an extra pass.

Another issue is whether your implementation has what SBCL calls
"pinned" arrays. Pinned arrays can make their contents available to
foreign routines directly with negligible overhead for a limited
duration (or maybe indefinitely, but I prefer the first).

I asked for implementation-specific details on c.l.l, and got the
following information:

Implementation

Machine

S

D

CS

CD

I32

I64

pinned?

SBCL (1.0.38)

64-bit

*

*

*

*

*

*

yes

Lispworks (6.0.1)

32-bit

*

*

T

T

*

*

?

Lispworks Personal Edition (5.1.1)

32-bit

*

*

T

T

*

T

?

Clozure CL (1.4, 1.6)

32-bit

*

*

T

T

*

T

yes

Clozure CL (1.6)

64-bit

*

*

T

T

*

*

yes

ECL (10.3.1)

32-bit

*

*

T

T

T

T

yes

ABCL (0.13.0)

64-bit

T

T

T

T

T

T

?

Allegco CL Enterprise Edition (8.1 & 8.2, Linux and Windows)

32-bit

*

*

*

*

*

T

?

Allegco CL Enterprise Edition (8.1 & 8.2, Linux and Windows)

64-bit

*

*

*

*

*

*

?

CLISP (2.48)

both

T

T

T

T

T

T

?

CMUCL (19e)

32-bit

*

*

*

*

*

T

yes

Corman Common Lisp (3.01, Windows)

?

*

*

T

T

T

T

?

The interpretation is this: S, D, CS and CD correspond to (complex)
single and double float. I32 and I64 are signed-byte types of 32 and
64 bits. * means that the element type is upgraded to itself,
otherwise the table shows the upgraded type.

The following table gives some details on the pinning mechanism when
the implementation has it:

Implementation

Pinning construct

Notes

SBCL

sb-sys:with-pinned-objects with sb-sys:vector-sap

pins object only where GC granularity allows

Clozure CL

ccl:with-pointer-to-ivector

disables GC

ECL

?

conservative GC, data can be used as long as the object is alive

So LLA is likely to have the following optimization model: it will be
blazingly fast (basically the speed of LAPACK, with a tiny bit of
overhead) on implementations which support all upgraded types and
pinning, and a bit slower on other ones. If your implementation
supports upgrading some of the above element types to themselves,
those arrays will require no element type detection so they will be
faster. I64 is only needed on 64-bit machines.

The motivation behind this decision is that if you are doing serious
numerical work, your implementation should support all relevant array
element types and also pinning. SBCL does, and since I am using that,
the new version of LLA will take advantage of all its nice facilities
— it already does, but currently it only runs on SBCL so that was
natural. LLA will be fully functional on other implementations, but
may be slower.

Please keep sending in information for the implementations you don't
see in the table above. I am especially interested in which
implementations support pinning. You can use this code snipped to
generate output on array element type upgrading:

12 comments:

SINGLE-FLOAT is upgraded to TDOUBLE-FLOAT is upgraded to T(COMPLEX SINGLE-FLOAT) is upgraded to T(COMPLEX DOUBLE-FLOAT) is upgraded to T(SIGNED-BYTE 32) is upgraded to T(SIGNED-BYTE 64) is upgraded to T

Two things. The first one is that all arrays in ECL are pinned: our garbage collector is conservative and thus the pointer to its data can be passed to foreign routines. You just have to make sure that a reference to the array is kept live somewhere.

The other one is that the I32 and I64 integers are upgraded not to "internal" types, but just to integers. The fact we use some names for them should not prevent you from supporting them efficiently -- if you don't, I would like to read why and help you solve whatever problems you find.

Hey, I'd love something like this to be available on Windows with Corman CL, so here's:

Corman Common Lisp (3.01) on NIL (NIL)

SINGLE-FLOAT is upgraded to itselfDOUBLE-FLOAT is upgraded to itself(COMPLEX SINGLE-FLOAT) is upgraded to T(COMPLEX DOUBLE-FLOAT) is upgraded to T(SIGNED-BYTE 32) is upgraded to T(SIGNED-BYTE 64) is upgraded to T

SINGLE-FLOAT is upgraded to itselfDOUBLE-FLOAT is upgraded to itself(COMPLEX SINGLE-FLOAT) is upgraded to T(COMPLEX DOUBLE-FLOAT) is upgraded to T(SIGNED-BYTE 32) is upgraded to itself(SIGNED-BYTE 64) is upgraded to itself

SINGLE-FLOAT is upgraded to itselfDOUBLE-FLOAT is upgraded to itself(COMPLEX SINGLE-FLOAT) is upgraded to itself(COMPLEX DOUBLE-FLOAT) is upgraded to itself(SIGNED-BYTE 32) is upgraded to itself(SIGNED-BYTE 64) is upgraded to itselfNIL

SINGLE-FLOAT is upgraded to itselfDOUBLE-FLOAT is upgraded to itself(COMPLEX SINGLE-FLOAT) is upgraded to T(COMPLEX DOUBLE-FLOAT) is upgraded to T(SIGNED-BYTE 32) is upgraded to itself(SIGNED-BYTE 64) is upgraded to TNIL