From the outset it was decided that we should make certain kinds of
generic arithmetic as fast as we could without having any type
information. It was further decided that the remaining kinds of generic
arithmetic should be as fast as it was convenient to make them. The
optimized cases were to be same-representation arithmetic on fixnums,
flonums, and compnums. (All operations that are not unary are binary; the
compiler translates, for example, (+ a b c) into (+ (+ a b) c).)
On these combinations of arguments, the operations to be optimized were
primarily the common arithmetic operations. In addition, certain
efficiency concessions were to be made when operating on one flonum
and one compnum whose imaginary part is 0.0.
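
The n-ary to binary translation mentioned above can be sketched in Scheme
as follows. This is only an illustration of the rewrite, not Larceny's
compiler pass; the name nary->binary is hypothetical, and the sketch
assumes the expression is a plain list whose head is the operator.

```scheme
;; Hypothetical helper (not Larceny's compiler pass): rewrite an n-ary
;; application of a binary operator into left-nested binary applications.
(define (nary->binary expr)
  (let ((op (car expr))
        (args (cdr expr)))
    (if (or (null? args) (null? (cdr args)))
        expr                          ; zero or one operand: leave unchanged
        (let loop ((acc (list op (car args) (cadr args)))
                   (rest (cddr args)))
          (if (null? rest)
              acc
              ;; Wrap what has been built so far with the next operand.
              (loop (list op acc (car rest)) (cdr rest)))))))

(nary->binary '(+ a b c))             ; => (+ (+ a b) c)
```

After this rewrite, every arithmetic operation the back-end sees has at
most two operands, which is what makes a two-operand dispatch sufficient.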

The strategy taken was to generate in-line code for + and -,
by far the most common operations. The in-line code first checks that
both operands are fixnums; if they are, the operation is performed
in-line. If the tag check fails or the operation overflows, a millicode
call is made to a subroutine for the operation.
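
The control flow of that fast path can be modeled in Scheme itself. The
sketch below is a model, not the code Larceny emits. It assumes a 32-bit
word in which a fixnum is a word whose two low bits are zero, so the
fixnum n is represented by the word 4n. The names fixnum-word?,
millicode-add, and generic-add are hypothetical, and millicode-add is
only a stand-in that records that the slow path was taken.

```scheme
;; Model of the in-line fast path for +, under the assumptions above.
;; Machine words are represented here by exact integers.
(define word-min (- (expt 2 31)))        ; smallest signed 32-bit word
(define word-max (- (expt 2 31) 1))      ; largest signed 32-bit word

;; Assumption: a word is a fixnum when its two low bits are zero.
(define (fixnum-word? w) (zero? (modulo w 4)))

;; Hypothetical stand-in for the millicode subroutine (the slow path).
(define (millicode-add a b) (list 'millicode-add a b))

(define (generic-add a b)
  (if (and (fixnum-word? a) (fixnum-word? b))     ; tag check
      (let ((sum (+ a b)))                        ; add the tagged words
        (if (and (<= word-min sum) (<= sum word-max))
            sum                                   ; fast path: result is a fixnum
            (millicode-add a b)))                 ; overflow: slow path
      (millicode-add a b)))                       ; tag check failed: slow path

(generic-add 8 12)    ; words for fixnums 2 and 3 => 20, the word for 5
```

Because the sum of two words with zero low bits also has zero low bits,
the fast path in this model needs no retagging step: the only work beyond
the addition is the two checks.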

In addition, in-line code is generated for the arithmetic ordering
predicates and for some other predicates, although the exact set of
optimized primitive operations has been changing over time; there is
also the question of what kinds of optimizations the compiler does. For
example, if the compiler replaces (abs x) with the equivalent of

    (let ((t x))
      (if (< t 0)
          (- t)
          t))

then writing code generator macros that generate in-line code for
abs is pretty silly, since the back-end won't ever get to use them.

In the future, we may also generate in-line code for *,
quotient, and remainder; when Larceny was first designed
it was pointless to do so because the systems of the time did not
implement multiplication and division in hardware.

Note, in the assembly code below, the handling of mixed flonum/compnum
arithmetic where the imaginary part of the compnum is 0.0.

Because the dispatch code is carefully hand-coded in assembly language and
tuned, the dispatch is reasonably efficient and is not the dominating
factor in, for example, adding two flonums (the time there being spent
loading data and allocating the resulting flonum on the heap). For the
really curious, here is the assembly code for generic add from Larceny
v0.25.