18.4. Using AltiVec in Clozure CL LAP functions

18.4.1. Overview

It's now possible to use AltiVec instructions in PPC LAP
(assembler) functions.

The lisp kernel detects the presence or absence of
AltiVec and preserves AltiVec state on lisp thread switch and
in response to exceptions, but the implementation doesn't
otherwise use vector operations.

This document doesn't cover PPC LAP programming in
general; ideally, some other document would.

It does explain the AltiVec register-usage conventions
in Clozure CL and the use of some LAP macros that help to
enforce those conventions.

All of the global symbols described below are exported
from the CCL package. Because LAP macro names, PPC
instruction names, and (in most cases) register names are
treated as strings, this applies only to function names and
global variable names.

Much of the Clozure CL support for AltiVec LAP programming
is based on work contributed to MCL by Shannon Spires.

The EABI (Embedded Application Binary Interface) used in
LinuxPPC doesn't ascribe particular significance to the vrsave
special-purpose register; on other platforms (notably MacOS),
it's used as a bitmap which indicates to system-level code
which vector registers contain meaningful values.
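
As a rough illustration of the bitmap idea, the following Common Lisp sketch computes a VRSAVE-style mask from a list of vector register numbers. The function name VRSAVE-MASK is hypothetical and is not part of Clozure CL; the sketch also assumes the usual PowerPC bit numbering, in which vr0 corresponds to the most significant bit of the 32-bit register.

```lisp
;; Hypothetical helper, not part of Clozure CL.
;; Assumption: vr0 maps to the most significant bit of the 32-bit
;; VRSAVE register, vr31 to the least significant bit.
(defun vrsave-mask (register-numbers)
  "Return a 32-bit mask with one bit set per vector register number."
  (let ((mask 0))
    (dolist (n register-numbers mask)
      (check-type n (integer 0 31))
      ;; Shift the top bit right by N places and merge it in.
      (setf mask (logior mask (ash #x80000000 (- n)))))))

(vrsave-mask '(2 3))   ; => 805306368 (#x30000000)
```

Under that assumption, a function that keeps live values in vr2 and vr3 would be described by the mask #x30000000.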

The WITH-ALTIVEC-REGISTERS LAP macro generates code that
saves, updates, and restores VRSAVE on platforms where this is
required (as indicated by the value of the special variable
that controls this behavior) and ignores VRSAVE on platforms
that don't require it to be maintained.

On all PPC platforms, it's necessary to save any non-volatile
vector registers (vr20 .. vr31) before assigning to them and to restore
such registers before returning to the caller.

On platforms that require that VRSAVE be maintained, it's
not necessary to mention the "use" of vector registers that
are used as incoming parameters. It's not incorrect to mention
their use in a WITH-ALTIVEC-REGISTERS form, but it may be
unnecessary in many interesting cases. One can likewise assume
that the caller of any function that returns a vector value in
vr2 has already set the appropriate bit in VRSAVE to indicate
that this register is live. One could therefore write a leaf
function that adds the bytes in vr3 and vr2 (with unsigned
saturation) and returns the result in vr2 as:

(defppclapfunction vaddubs ((y vr3) (z vr2))
  (vaddubs z y z)
  (blr))

When vector registers that aren't incoming parameters are used
in a LAP function, WITH-ALTIVEC-REGISTERS takes care of maintaining VRSAVE
and of saving/restoring any non-volatile vector registers.
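
A minimal sketch of such a function might look like the following. It assumes that WITH-ALTIVEC-REGISTERS takes a list of vector register names followed by a body of LAP forms; the function name VADDUBS-CHAIN and the choice of vr4 and vr20 are hypothetical and serve only to show one volatile and one non-volatile register being declared.

```lisp
;; Sketch only. Assumed form: (with-altivec-registers (reg ...) body...)
;; VADDUBS-CHAIN is a hypothetical name used for illustration.
(defppclapfunction vaddubs-chain ((y vr3) (z vr2))
  (with-altivec-registers (vr4 vr20)  ; vr4 is volatile, vr20 is non-volatile
    (vaddubs vr4 y z)                 ; vr4  <- y + z (unsigned bytes, saturating)
    (vaddubs vr20 vr4 y)              ; vr20 <- vr4 + y
    (vaddubs z vr20 vr4))             ; z    <- vr20 + vr4, result left in vr2
  (blr))
```

The incoming parameters vr3 and vr2 are not listed, in keeping with the convention described above; only the registers the function introduces itself are mentioned.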

AltiVec registers are not preserved by CATCH and UNWIND-PROTECT.
Since AltiVec is only accessible from LAP in Clozure CL and since LAP
functions rarely use high-level control structures, this should rarely be
a problem in practice.

LAP functions that use non-volatile vector registers and
that call code (presumably Lisp code) which may use CATCH or
UNWIND-PROTECT should save those vector registers before such
a call and restore them on return. This is one of the intended
uses of the WITH-VECTOR-BUFFER LAP macro.