
Notes on an IA-32 port

A port of OpenMCL to IA-32 (Intel's name for the 32 bit x86 architecture) would be
a win for several reasons.

The most obvious reason would be to support Apple hardware that isn't 64-bit
capable. This includes the first-generation Intel-based iMac, MacBook, MacBook Pro, and
all Intel-based Mac Minis. Of course, support for non-Apple IA-32
hardware would also be nice.

Another reason would be to support access to Cocoa and Carbon (and
other frameworks) on Intel-based Macintoshes running Mac OS X Tiger.

Although Apple has announced that Cocoa will be 64 bit in Leopard, they
have publicly confirmed that Carbon won't be.
Therefore, a 32 bit lisp would still be needed to use Carbon, even on Leopard running on
64 bit hardware.

It would be interesting to support the AMD Geode LX (as used in the
OLPC laptop) as the minimum processor. This processor supports the
P6 family instructions, including MMX instructions. We can therefore
use the conditional move instructions, and maybe some MMX
instructions to help out with bignums.

On the other hand, the AMD Geode LX processor doesn't support any of
the SSE/SSE2/SSE3 instructions; this means that we'd have to use the
x87 FPU (which is sort of funky: its registers are organized as a
stack). This would require modifications to the compiler, which
believes that every floating point register can be accessed
independently.

It might be reasonable to target the Core Solo/Duo processor, at least
to begin with. This would cover all Intel-based Macintosh systems
ever shipped, and would allow us to avoid adding x87 FPU support for now.

We will augment this with a dynamic scheme: we will set or clear a
bit in thread-private memory whenever a register transitions from
one class to another. The GC will then look at these flag bits to
decide how to treat the registers. (This may make the lispy
register names confusing, since at times imm0 might actually
contain a node, or arg_y an immediate.)
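As a rough illustration of the flag-bit idea, here is a minimal sketch in C. The structure name, the register numbering, and the helper functions are all hypothetical; the notes only say that a bit in thread-private memory is set or cleared when a register changes class, and that the GC consults those bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register indices. The names echo the lispy register names
   used in these notes, but the numbering is an assumption. */
enum { REG_IMM0 = 0, REG_ARG_Y = 1, REG_COUNT = 8 };

/* Hypothetical per-thread record. Bit n set means register n currently
   holds a node (a tagged lisp object that the GC must trace). */
typedef struct {
    uint8_t node_regs_mask;
} thread_context;

/* Record that a register now holds a node. */
static void mark_as_node(thread_context *tc, int reg) {
    assert(reg >= 0 && reg < REG_COUNT);
    tc->node_regs_mask |= (uint8_t)(1u << reg);
}

/* Record that a register now holds an immediate (raw bits). */
static void mark_as_immediate(thread_context *tc, int reg) {
    assert(reg >= 0 && reg < REG_COUNT);
    tc->node_regs_mask &= (uint8_t)~(1u << reg);
}

/* What the GC would ask: should this register be treated as a root? */
static int gc_should_trace(const thread_context *tc, int reg) {
    assert(reg >= 0 && reg < REG_COUNT);
    return (tc->node_regs_mask >> reg) & 1;
}
```

In real compiled code the set/clear step would presumably be a single instruction on a thread-local byte rather than a function call; the sketch only shows the bookkeeping.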

Callee-saved "non-volatile" registers are probably a non-starter.

The tagging scheme can basically follow the PPC32 port. An important
difference is that the three-bit tag #b101, which is used for NIL on PPC32,
would be used for tagged return addresses on IA-32.
(More on this later.)
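To make the three-bit tag concrete, here is a small sketch in C of how such a tag could be tested. It assumes the tag lives in the low three bits of a 32 bit word; the macro and function names are made up for illustration and are not taken from the lisp kernel sources.

```c
#include <stdint.h>

/* Assumption: the tag occupies the low three bits of a 32 bit word. */
#define TAG_BITS 3
#define TAG_MASK ((1u << TAG_BITS) - 1u)

/* #b101, proposed above for tagged return addresses on IA-32. */
#define TAG_TRA 0x5u

/* Extract the three-bit tag from a word. */
static unsigned tag_of(uint32_t word) {
    return word & TAG_MASK;
}

/* True if the word carries the tagged-return-address tag. */
static int is_tagged_return_address(uint32_t word) {
    return tag_of(word) == TAG_TRA;
}
```

Under this assumption a word such as 0x1005 would test as a tagged return address, while 0x1008 (low bits #b000) would not.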

Comment by gb on Wed Aug 1 22:16:05 2007

It's probably sanest to think of the dynamic register partitioning as being a set of (local, temporary) changes relative to a baseline scheme, where the baseline scheme is in effect any time a function is entered (and therefore at the time of a function call). At that time, we probably need more node regs and fewer imm regs than the scheme suggested above provides, and we can probably overload nargs and imm0.

If we pass two arguments in registers, then we probably need a node register to address the callee on a function call (something like:

The CLOS implementation will sometimes funcall a method-function with an invisible argument (not counted against nargs) in a node register. (That's context information for CALL-NEXT-METHOD and it's done in a way that's not MOP-compliant.)

I think that in general if we err on the side of "too many node regs" in the baseline partitioning, we can always (cheaply) save the values in those node regs if we need to temporarily make the register immediate for consing, shifts, multiply/divide, memory assignment, whatever else needs more than one imm reg.

I think that having imm0=nargs would work fairly well, since we're usually either validating/defaulting based on nargs or doing tag/bounds checking, but rarely if ever need to do both at the same time.

Threads

We don't have the __thread storage class on Darwin, so we will
need to use i386_set_ldt to install a segment
descriptor for each thread into the LDT. When a thread's segment
descriptor is loaded into the %fs segment register, %fs can be used
to refer to thread-local storage.

(This implies an 8K limit to the number of threads, by the way, since an
LDT can hold at most 8192 descriptors. Probably not a big deal for a
32 bit lisp.)
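The 8K figure follows from the x86 segment selector format: a 16 bit selector has a 13 bit descriptor index, a table-indicator bit (set for the LDT), and a 2 bit requested privilege level. The sketch below, in C, builds the selector value that would be loaded into %fs for a given LDT slot. The function and macro names are hypothetical, and the call to i386_set_ldt that installs the descriptor itself is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* A selector's index field is 13 bits wide, so an LDT has 8192 slots. */
#define LDT_MAX_ENTRIES 8192u

#define SEL_TI_LDT   0x4u  /* table indicator: 1 selects the LDT */
#define SEL_RPL_USER 0x3u  /* requested privilege level 3 (user mode) */

/* Build the selector for the thread whose descriptor sits in LDT slot
   `index`. */
static uint16_t ldt_selector(unsigned index) {
    assert(index < LDT_MAX_ENTRIES);
    return (uint16_t)((index << 3) | SEL_TI_LDT | SEL_RPL_USER);
}
```

For example, slot 1 yields selector 0xF and the last slot, 8191, yields 0xFFFF. The usable thread count may be somewhat lower than 8192 if the operating system reserves some LDT entries for itself.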