Log of /sml/trunk/src/MLRISC/backpatch

Removed the native COPY and FCOPY instructions
from all the architectures and replaced them with the
explicit COPY instruction from the previous commit.
It is now possible to simplify many of the optimization
modules that manipulate copies. This has not been
done in this change.

Changed the representation of instructions from being fully abstract
to being partially concrete. That is to say:
from
    type instruction
to
    type instr                  (* machine instruction *)
    datatype instruction =
        LIVE of {regs: C.cellset, spilled: C.cellset}
      | KILL of {regs: C.cellset, spilled: C.cellset}
      | COPYXXX of {k: CB.cellkind, dst: CB.cell list, src: CB.cell list}
      | ANNOTATION of {i: instruction, a: Annotations.annotation}
      | INSTR of instr
This makes the handling of certain special instructions that appear on
all architectures easier and more uniform.
LIVE and KILL say that a set of registers is live or killed at the
program point where they appear. No spill code is generated when an
element of the 'regs' field is spilled; instead the register is moved to
the 'spilled' field (which is present more for debugging than anything else).
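As a rough illustration of the spill behaviour just described, here is a
minimal sketch. It is not the register allocator's code: cells are modelled
as ints, a cellset as an int list, and the names 'move', 'spill' and OTHER
are hypothetical.

```sml
(* Hypothetical stand-ins: a cell is an int, a cellset is an int list. *)
type cell = int
type cellset = cell list

datatype instruction =
    LIVE of {regs: cellset, spilled: cellset}
  | KILL of {regs: cellset, spilled: cellset}
  | OTHER                       (* placeholder for every other instruction *)

(* If r is in 'regs', move it to 'spilled'; otherwise leave the sets alone.
   No spill code is produced either way. *)
fun move (r: cell) {regs: cellset, spilled: cellset} =
    if List.exists (fn x => x = r) regs
    then {regs = List.filter (fn x => x <> r) regs, spilled = r :: spilled}
    else {regs = regs, spilled = spilled}

(* Hypothetical helper: the effect of spilling r on a single instruction. *)
fun spill (r: cell) (LIVE s) = LIVE (move r s)
  | spill r (KILL s) = KILL (move r s)
  | spill _ OTHER = OTHER
```

For example, spill 2 (LIVE {regs=[1,2,3], spilled=[]}) evaluates to
LIVE {regs=[1,3], spilled=[2]} in this sketch.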
LIVE replaces the (now deprecated) DEFFREG instruction on the alpha.
We used to generate:
    DEFFREG f1
    f1 := f2 + f3
    trapb
but now generate:
    f1 := f2 + f3
    trapb
    LIVE {regs=[f1,f2,f3], spilled=[]}
Furthermore, the DEFFREG hack required that every floating point instruction
use all registers mentioned in the instruction. Therefore f1 := f2 + f3
defines f1 and uses [f1,f2,f3]! This hack is no longer required, resulting
in a cleaner alpha implementation. (Hopefully, Intel will not get rid of
this architecture).
COPYXXX is intended to replace the parallel COPY and FCOPY available on
all the architectures. This will result in further simplification of the
register allocator, which must be aware of them for coalescing purposes, and
will also simplify certain aspects of the machine description, which provides
callbacks related to parallel copies.
ANNOTATION should be obvious, and now INSTR represents the honest to God
machine instruction set!
The <arch>/instructions/<arch>Instr.sml files define certain utility
functions for making porting easier -- essentially converting upper case
to lower case. All machine instructions (of type instr) are in upper case,
and the lower case form generates an MLRISC instruction. For example on
the alpha we have:
    datatype instr =
        LDA of {r:cell, b:cell, d:operand}
      | ...
    val lda : {r:cell, b:cell, d:operand} -> instruction
    ...
where lda is just (INSTR o LDA), etc.
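To make the upper case / lower case convention concrete, here is a small
self-contained sketch. The cell and operand types, the TRAPB constructor and
the string annotation are stand-ins assumed for illustration; they are not
the definitions in the alpha instruction files.

```sml
(* Hypothetical stand-ins for the real cell and operand types. *)
type cell = int
datatype operand = IMMop of int | REGop of cell

(* Machine instructions proper: upper case constructors. *)
datatype instr =
    LDA of {r: cell, b: cell, d: operand}
  | TRAPB

(* The partially concrete MLRISC instruction (LIVE, KILL, etc. omitted).
   A string stands in for Annotations.annotation here. *)
datatype instruction =
    INSTR of instr
  | ANNOTATION of {i: instruction, a: string}

(* Lower case forms build an MLRISC instruction from a machine instruction. *)
val lda : {r: cell, b: cell, d: operand} -> instruction = INSTR o LDA
val trapb : instruction = INSTR TRAPB
```

With these definitions, lda {r=1, b=2, d=IMMop 8} is the same value as
INSTR (LDA {r=1, b=2, d=IMMop 8}), so client code can build instructions
without spelling out the INSTR wrapper.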

Implemented a complete redesign of MLRISC pseudo-ops. Now there
ought to never be any question of incompatibilities with
pseudo-op syntax expected by host assemblers.
For now, only modules supporting GAS syntax are implemented
but more should follow, such as MASM, and vendor assembler
syntax, e.g. IBM as, Sun as, etc.

Fix for a backpatching bug reported by Allen.
Because the boundary between short and long span-dependent
instructions is +/- 128, there are an astounding number of
span-dependent instructions whose size is overestimated.
Allen came up with the idea of letting the size of span-dependent
instructions be non-monotonic for up to maxIter iterations,
after which the size must be monotonically increasing.
This table shows the number of span-dependent instructions
whose size was over-estimated as a function of maxIter, for the
file Parse/parse/ml.grm.sml:
    maxIter    # of instructions
      10         687
      20         438
      30         198
      40           0
In compiling the compiler, there is no significant difference in
compilation speed between maxIter=10 and maxIter=40. Actually,
my measurements showed that maxIter=40 was a tad faster than
maxIter=10! Also, 96% of the files in the compiler reach a fixed
point within 13 iterations, so fixing maxIter at 40, while high,
is okay.
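The maxIter idea can be sketched roughly as follows. This is an illustrative
toy, not the backpatch module: the item type, the 2 and 5 byte jump sizes,
measuring the displacement from the start of the jump, and starting every
jump at its long size are all assumptions made for the example.

```sml
(* A code item: fixed size in bytes, or a jump to the item at an index. *)
datatype item = FIXED of int | JUMP of int

val shortSize = 2 and longSize = 5     (* assumed encodings *)

(* Start address of every item under the current size estimates. *)
fun addresses sizes =
    let fun go ([], _, acc) = rev acc
          | go (s :: ss, a, acc) = go (ss, a + s, a :: acc)
    in Vector.fromList (go (sizes, 0, [])) end

(* Size an item needs, given the current addresses. *)
fun sizeOf _ (_, FIXED n) = n
  | sizeOf addrs (i, JUMP t) =
      let val d = Vector.sub (addrs, t) - Vector.sub (addrs, i)
      in if d >= ~128 andalso d <= 127 then shortSize else longSize end

(* Iterate to a fixed point. During the first maxIter iterations a size
   may shrink or grow; afterwards it may only grow, which bounds the loop. *)
fun layout maxIter items =
    let
      fun step (iter, sizes) =
          let
            val addrs = addresses sizes
            fun pick (i, item :: rest, old :: olds) =
                  let val new = sizeOf addrs (i, item)
                      val s = if iter < maxIter then new else Int.max (old, new)
                  in s :: pick (i + 1, rest, olds) end
              | pick _ = []
            val sizes' = pick (0, items, sizes)
          in
            if sizes' = sizes then sizes else step (iter + 1, sizes')
          end
      val init = map (fn FIXED n => n | JUMP _ => longSize) items
    in step (0, init) end
```

In this toy, layout 40 [JUMP 2, FIXED 100, FIXED 1] settles on a short jump,
while layout 0 on the same input never lets the jump shrink and leaves it at
the long size, which is the kind of overestimate counted in the table above.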

A CVS update record!
Changed type cell from int to datatype, and numerous other changes.
Affects every client of MLRISC. Lal says this can be bootstrapped on all
machines. See smlnj/HISTORY for details.
Tag: leunga-20001207-cell-monster-hack