This is the HISTORY file for the Yale SML/NJ CVS repository.
An entry should be made for _every_ commit to the repository.
The entries in this file will be used when creating the README
for new versions, so keep that in mind when writing the
description.
The form of an entry should be:
Name:
Date:
Tag:
Description:
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/08/20 15:35:00 EDT
Tag: blume-20010820-debugprof
Description:
!!!! NEW BOOTFILES !!!!
This is another round of reorganizing the compiler sources. This
time the main goal was to factor out all the "instrumentation"
passes (for profiling and backtracing) into their own library.
The difficulty was to do it in such a way that it does not depend
on elaborate.cm but only on elabdata.cm.
Therefore there have been further changes to both elaborate.cm and
elabdata.cm -- more "generic" things have been moved from the former
to the latter. As a result, I was forced to split the assignment
of numbers indicating "primtyc"s into two portions: SML-generic and
SML/NJ-specific. Since it would have been awkward to maintain,
I bit the bullet and actually _changed_ the mapping between these
numbers and primtycs. The bottom line of this is that you need
a new set of bin- and bootfiles.
I have built new bootfiles for all architectures, so doing a fresh
checkout and config/install.sh should be all you need.
The newly created library's name is
$smlnj/viscomp/debugprof.cm
and its sources live under
src/compiler/DebugProf
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/08/15 17:15:00 EDT
Tag: blume-20010815-compreorg
Description:
This is a first cut at reorganizing the CM libraries that make up the
core of the compiler. The idea is to separate out pieces that could
be used independently by tools, e.g., the parser, the typechecker, etc.
The current status is a step in this direction, but it is not quite
satisfactory yet. Expect more changes in the future.
Here is the current (new) organization...
What used to be $smlnj/viscomp/core.cm is now divided into
six CM libraries:
$smlnj/viscomp/basics.cm
/parser.cm
/elabdata.cm
/elaborate.cm
/execute.cm
/core.cm
The CM files for these libraries live under src/system/smlnj/viscomp.
All these libraries are proxy libraries that contain precisely
one CM library component. Here are the locations of the components
(all within the src/compiler tree):
Basics/basics.cm
Parse/parser.cm
ElabData/elabdata.cm
Elaborator/elaborate.cm
Execution/execute.cm
core.cm
[This is the same organization that has already been used
for a while for the architecture-specific parts of the visible
compiler and for the old version of core.cm.]
As you will notice, many source files have been moved from their
respective original locations to a new home in one of the above
subtrees.
The division of labor between the new libraries is the following:
basics.cm:
- Simple, basic definitions that pertain to many (or all) of
the other libraries.
parser.cm:
- The SML parser, producing output of type Ast.dec.
- The type family for Ast is also defined and exported here.
elabdata.cm:
- The datatypes that describe input and output of the elaborator.
This includes types, absyn, and static environments.
elaborate.cm:
- The SML/NJ type checker and elaborator.
This maps an Ast.dec (with a given static environment) to
an Absyn.dec (with a new static environment).
- This library implements certain modules that used to be
structures as functors (to remove dependencies on FLINT).
execute.cm:
- Everything having to do with executing binary code objects.
- Dynamic environments.
core.cm:
- SML/NJ-specific instantiations of the elaborator and MLRISC.
- Top-level modules.
- FLINT (this should eventually become its own library)
Notes:
I am not 100% happy with the way I separated the elaborator (and its
data structures) from FLINT. Three instances of the same problem:
1. Data structures contain certain fields that carry FLINT-specific
information. I hacked around this using exn and the property list
module from smlnj-lib. But the fact that there are middle-end
specific fields around at all is a bit annoying.
2. The elaborator calculates certain FLINT-related information. I tried
to make this as abstract as I could using functorization, but, again,
the fact that the elaborator has to perform calculations on behalf
of the middle-end at all is not nice.
3. Having to use exn and property lists is unfortunate because it
weakens type checking. The other alternative (parameterizing
nearly *everything*) is not appealing, though.
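The trade-off described in notes 1 and 3 can be illustrated with a
minimal sketch. All structure, type, and function names below are
hypothetical and are not the compiler's actual definitions; the sketch
only shows the general technique of using exn as an extensible
"universal" slot, and why it weakens static checking.

```sml
(* Hypothetical sketch, not the actual compiler code. *)
structure ElabSide = struct
  (* The elaborator-side record has an opaque slot of type exn,
     so it does not need to mention any middle-end type. *)
  type tycInfo = { name : string, extra : exn ref }
  exception NoExtra
  fun mkTyc name : tycInfo = { name = name, extra = ref NoExtra }
end

structure MiddleEnd = struct
  (* The middle end declares its own constructor to inject its data;
     the int merely stands in for FLINT-specific information. *)
  exception FlintInfo of int
  fun attach (t : ElabSide.tycInfo, n) = #extra t := FlintInfo n
  (* Projection can fail at run time: the type checker cannot tell
     which constructor is stored in the slot. *)
  fun lookup (t : ElabSide.tycInfo) =
      case !(#extra t) of
          FlintInfo n => SOME n
        | _ => NONE
end
```

The run-time case analysis in lookup is exactly the kind of check that
a fully parameterized design would let the type checker do statically.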
I removed the "rebinding =" warning hack because the new organization
made it awkward to maintain. As a result, the compiler now issues some of
these warnings when compiling init.cmi during bootstrap compilation. On
the plus side, you also get a warning when you do, for example:
val op = = Int32.+
which was not the case up to now.
I placed "assign" and "deref" into the _Core structure so that the
code that deals with the "lazy" keyword can find them there. This
removes the need for having access to the primitive environment
during elaboration.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/08/13
Tag: blume-20010813-closures
Description:
This fix was sent to us by Zhong Shao. It is supposed to improve the
performance of certain loops by avoiding needless closure allocation.
----------------------------------------------------------------------
Name: Lal George
Date: 2001/07/31 10:03:23 EDT
Tag: george-20010731-x86-fmalloc
Description: Fixed bug in x86 calls
There was a bug where call instructions would mysteriously
vanish. The call instruction had to be one that returned
a floating point value.
----------------------------------------------------------------------
Name: Lal George
Date: 2001/07/19 16:36:29 EDT
Tag: george-20010719-simple-cells
Description:
I have dramatically simplified the interface for CELLS in MLRISC.
In summary, the cells interface is broken up into three parts:
1. CellsBasis : CELLS_BASIS
CellsBasis is a top level structure and common for all
architectures. It contains the definitions of basic datatypes
and utility functions over these types.
2. functor Cells() : CELLS
Cells generates an interface for CELLS that incorporates the
specific resources on the target architecture, such as the
presence of special register classes, their number and size,
and various useful substructures.
3. <ARCH>CELLS
e.g. SparcCells: SPARCCELLS
<ARCH>CELLS usually contains additional bindings for special
registers on the architecture, such as:
val r0 : cell (* register zero *)
val y : cell (* Y register *)
val psr : cell (* processor status register *)
...
The structure returned by applying the Cells functor is opened
in this interface.
The main implication of all this is that the datatypes for cells are
split between CellsBasis and CELLS -- a fairly simple change for user
code.
In the old scheme the CELLS interface had a definitional binding of
the form:
signature CELLS = sig
structure CellsBasis = CellsBasis
...
end
With all the sharing constraints that go on in MLRISC, this old
design quickly leads to errors such as:
"structure definition spec inside of sharing ... "
and appears to require an unacceptable amount of sharing and where
constraint hackery.
I think this error message (the interaction of definitional specs and
sharing) requires more explanation on our web page.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/07/19 15:00:00 EDT
Tag: blume-20010719-libreorg
Description:
This update puts together a fairly extensive but straightforward change
to the way the libraries that implement the interactive system are
organized:
The biggest change is the elimination of structure Compiler. As a
replacement for this structure, there is now a CM library
(known as $smlnj/compiler.cm or $smlnj/compiler/current.cm)
that exports all the substructures of the original structure Compiler
directly. So instead of saying Compiler.Foo.bar one now simply
says Foo.bar. (The CM libraries actually export a collection of
structures that is richer than the collection of substructures of
structure Compiler.)
To make the transition smooth, there is a separate library called
$smlnj/compiler/compiler.cm which puts together and exports the
original structure Compiler (or at least something very close to it).
There are five members of the original structure Compiler
that are not exported directly but which instead became members
of a new structure Backend (described by signature BACKEND). These are:
structure Profile (: PROFILE), structure Compile (: COMPILE), structure
Interact (: INTERACT), structure Machine (: MACHINE), and val
architecture (: string).
Structure Compiler.Version has become structure CompilerVersion.
Cross-compilers for alpha32, hppa, ppc, sparc, and x86 are provided
by $smlnj/compiler/<arch>.cm where <arch> is alpha32, hppa, ppc, sparc,
or x86, respectively.
Each of these exports the same frontend structures that
$smlnj/compiler.cm exports. But they do not have a structure Backend
and instead export some structure <Arch>Backend where <Arch> is Alpha32,
Hppa, PPC, Sparc, or X86, respectively.
Library $smlnj/compiler/all.cm exports the union of the exports of
$smlnj/compiler/<arch>.cm
There are no structures <Arch>Compiler anymore, use
$smlnj/compiler/<arch>.cm instead.
Library host-compiler-0.cm is gone. Instead, the internal library
that instantiates CM is now called cm0.cm. Selection of the host
compiler (backend) is no longer done here. (Responsibility for it
now lies with $smlnj/compiler/current.cm. This seems to be more
logical.)
Many individual files have been moved or renamed. Some files have
been split into multiple files, and some "dead" files have been deleted.
Aside from these changes to library organization, there are also changes
to the way the code itself is organized:
Structure Binfile has been re-implemented in such a way that it no
longer needs any knowledge of the compiler. It exclusively deals
with the details of binfile layout. It no longer invokes the
compiler (for the purpose of creating new prospective binfile
content), and it no longer has any knowledge of how to interpret
pickles.
Structure Compile (: COMPILE) has been stripped down to the bare
essentials of compilation. It no longer deals with linking/execution.
The interface has been cleaned up considerably.
Utility routines for dealing with linking and execution have been
moved into their own substructures.
(The ultimate goal of these changes is to provide a light-weight
binfile loader/linker (at least for, e.g., stable libraries) that
does not require CM or the compiler to be present.)
CM documentation has been updated to reflect the changes to library
organization.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/07/10 17:30:00 EDT
Tag: Release_110_34
Description:
Minor tweak to 110.34 (re-tagged):
- README.html file added to CVS repository
- runtime compiles properly under FreeBSD 3.X and 4.X
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/07/10 17:30:00 EDT
Tag: Release_110_34
Description:
New version number (110.34). New bootfiles.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/07/09 16:00:00 EDT
Tag: blume-20010709-more-varargs
Description:
I changed the handling of varargs in ml-nlffigen again:
The ellipsis ... will now simply be ignored (with an accompanying warning).
The immediate effect is that you can actually call a varargs function
from ML -- but you can't actually supply any arguments beyond the ones
specified explicitly. (For example, you can call printf with its format
string, but you cannot pass additional arguments.)
This behavior is only marginally more useful than the one before, but
it has the advantage that a function or, more importantly, a function
type never gets dropped on the floor, thus avoiding follow-up problems with
other types that refer to the offending one.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/07/09 11:25:00 EDT
Tag: blume-20010709-varargs
Description:
1. ckit-lib.cm now exports structure Error
2. ml-nlffigen reports occurrences of "..." (i.e., varargs function types)
with a warning accompanied by a source location. Moreover, it
merely skips the offending function or type and proceeds with the
rest of its work. As a result, one can safely feed C code containing
"..." to ml-nlffigen.
3. There are some internal improvements to CM, providing slightly
more general string substitutions in the tools subsystem.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/27 15:10:00 EDT
Tag: blume-20010627-concur
Description:
Fixed a small bug in CM's handling of parallel compilation.
(You could observe the bug by Control-C-interrupting an ordinary
CMB.make or CM.stabilize and then attaching some compile servers.
The result was that all of a sudden the previously interrupted
compilation would continue on its own. This was because of
an over-optimization: CM did not bother to clean out certain queues
when no servers were attached "anyway", so the contents
of these queues grabbed control when new servers did get attached.)
There is also another minor update to the CM manual.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/26 16:15:00 EDT
Tag: blume-20010626-cmdoc
Description:
Minor typo fixed in CM manual (syntax diagram for libraries).
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/25 22:55:00 EDT
Tag: blume-20010625-x86pc
Description:
Fixed a nasty bug in the X86 assembly code that caused signal
handlers to fail (crash) randomly.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/25 12:05:00 EDT
Tag: blume-20010625-nlffigen
Description:
This update fixes a number of minor bugs in ml-nlffigen as reported by
Nick Carter.
1. Silly but ok typedefs of the form "typedef void myvoid;" are now accepted.
2. Default names for generated files are now derived from the name of
the C file *without its directory*. In particular, this causes generated
files to be placed locally even if the C file is in some system directory.
3. Default names for generated signatures and structures are also derived
from the C file name without its directory. This avoids silly things
like "structure GL/GL".
(Other silly names are still possible because ml-nlffigen does not do
a thorough check of whether generated names are legal ML identifiers.
When in doubt, use command line arguments to force particular names.)
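The directory-stripping rule in items 2 and 3 can be sketched with the
Basis Library's OS.Path functions. The function name defaultStem is
hypothetical, and how ml-nlffigen turns the stem into file, signature,
and structure names is not shown here.

```sml
(* Hypothetical sketch: derive a default name stem from a C file
   name by dropping first the directory and then the extension. *)
fun defaultStem (cfile : string) : string =
    OS.Path.base (OS.Path.file cfile)

(* On a Unix-style path the directory part no longer leaks into the
   stem, so no "/" can end up in a generated identifier. *)
val stem = defaultStem "/usr/include/GL/gl.h"
```

As the parenthetical note above says, a stem obtained this way may
still fail to be a legal ML identifier.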
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/21 12:25:00 EDT
Tag: blume-20010621-eXene
Description:
eXene now compiles and (sort of) works again.
The library name (for version > 110.33) is $/eXene.cm.
I also added a new example in src/eXene/examples/nbody. See the
README file there for details.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/20 16:40:00 EDT
Tag: blume-20010620-cml
Description:
CML now compiles and works again.
Libraries (for version > 110.33):
$cml/cml.cm Main CML library.
$cml/basis.cm CML's version of $/basis.cm.
$cml/cml-internal.cm Internal helper library.
$cml/core-cml.cm Internal helper library.
$cml-lib/trace-cml.cm Tracing facility.
$cml-lib/smlnj-lib.cm CML's version of $/smlnj-lib.cm
The installer (config/install.sh) has been taught how to properly
install this stuff.
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/19 17:55:00 EDT
Tag: blume-20010619-instantiate
Description:
This un-breaks the fix for bug 1432.
(The bug was originally fixed in 110.9 but I broke it again some
time after that.)
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/19 17:25:00 EDT
Tag: blume-20010619-signals
Description:
This should (hopefully) fix the long-standing signal handling bug.
(The runtime system was constructing a continuation record with an
incorrect descriptor which would cause the GC to drop data on the floor...)
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/15 15:05:00 EDT
Tag: blume-20010615-moresparc
Description:
Here is a short late-hour update related to Sparc c-calls:
-- made handling of double-word arguments a bit smarter
-- instruction selection phase tries to collapse certain clumsily
constructed ML-Trees; typical example:
ADD(ty,ADD(_,e,LI d1),LI d2) -> ADD(ty,e,LI(d1+d2))
This currently has no further impact on SML/NJ since mlriscGen does
not seem to generate such patterns in the first place, and c-calls
(which did generate them in the beginning) has meanwhile been fixed
so as to avoid them as well.
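The rewrite rule above can be sketched on a toy expression type. The
datatype below is a simplified stand-in, not the real ML-Tree
definition (in particular, LI here carries a plain int), and the
function name collapse is hypothetical. Like the rule as written, the
sketch ignores the type of the inner ADD and does not consider
overflow.

```sml
(* Toy stand-in for a fragment of ML-Tree; not the MLRISC datatype. *)
datatype rexp
  = REG of int
  | LI of int
  | ADD of int * rexp * rexp   (* first component: operand type/width *)

(* Bottom-up application of
     ADD(ty,ADD(_,e,LI d1),LI d2) -> ADD(ty,e,LI(d1+d2))  *)
fun collapse (ADD (ty, a, b)) =
      (case (collapse a, collapse b) of
           (ADD (_, e, LI d1), LI d2) => ADD (ty, e, LI (d1 + d2))
         | (a', b') => ADD (ty, a', b'))
  | collapse e = e
```

Because subtrees are simplified first, a chain of several constant
additions folds into a single ADD with one immediate operand.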
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/15 15:05:00 EDT
Tag: blume-20010615-sparc
Description:
The purpose of this update is to provide an implementation of NLFFI
on Sparc machines.
Here are the changes in detail:
* src/MLRISC/sparc/c-calls/sparc-c-calls.sml is a new file containing
the Sparc implementation of the c-calls API.
* The Sparc backend of SML/NJ has been modified to uniformly use %fp
for accessing the ML frame. Thus, we have a real frame pointer and
can freely modify %sp without need for an omit-frame-ptr phase.
The vfp logic in src/compiler/CodeGen/* has been changed to accommodate
this case.
* ml-nlffigen has been taught to produce code for different architectures
and calling conventions.
* In a way similar to what was done in the x86 case, the Sparc
backend uses its own specific extension to mltree. (For example,
it needs to be able to generate UNIMP instructions which are part
of the calling convention.)
* ml-nlffi-lib was reorganized to make it more modular (in particular,
to make it easier to plug in new machine- and os-dependent parts).
There are some other fairly unrelated bug fixes and cleanups as well:
* I further hacked the .cm files for MLRISC tools (like MDLGen) so
that they properly share their libraries with existing SML/NJ libraries.
* I fixed a minor cosmetic bug in CM, suppressing certain spurious
follow-up error messages.
* Updates to CM/CMB documentation.
TODO items:
* MLRISC should use a different register as its asmTemp on the Sparc.
(The current %o2 is a really bad choice because it is part of the
calling conventions, so things might interfere in unexpected ways.)
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/07
Tag: blume-20010607-calls
Description:
A number of internal changes related to C calls and calling conventions:
1. ML-Tree CALL statements now carry a "pops" field. It indicates the
number of bytes popped implicitly (by the callee). In most cases
this field is 0 but on x86/win32 it is some non-zero value. This
is information provided for the benefit of the "omit-frameptr" pass.
2. The CALL instruction on the x86 carries a similar "pops" field.
The instruction selection phase copies its value from the ML-Tree
CALL statement.
3. On all other architectures, the instruction selection phase checks
whether "pops=0" and complains if not.
4. The c-calls implementation for x86 now accepts two calling conventions:
"ccall" and "stdcall". When "ccall" is selected, the caller cleans
up after the call and pops is set to 0. For "stdcall", the caller
does nothing, leaving the cleanup to the callee; pops is set to
the number of bytes that were pushed onto the stack.
5. The cproto decoder (compiler/Semant/types/cproto.sml) now can
distinguish between "ccall" and "stdcall".
6. The UNIMP instruction has been added to the supported Sparc instruction
set. (This is needed for implementing the official C calling convention
on this architecture.)
7. I fixed some of the .cm files under src/MLRISC/Tools to make them
work with the latest CM.
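The relationship between calling convention and the "pops" field
(items 1, 3, and 4 above) can be summarized in a small sketch. The
names conv, convOf, pops, and requireNoPops are hypothetical and do
not correspond to the actual c-calls or instruction selection code.

```sml
(* Hypothetical sketch of the rules described in items 1, 3 and 4. *)
datatype conv = CCALL | STDCALL

(* Decode the convention name; anything else is rejected. *)
fun convOf "ccall" = SOME CCALL
  | convOf "stdcall" = SOME STDCALL
  | convOf _ = NONE

(* Number of bytes popped implicitly by the callee, given the number
   of bytes the caller pushed onto the stack. *)
fun pops (CCALL, _) = 0                       (* caller cleans up *)
  | pops (STDCALL, argBytes : int) = argBytes (* callee cleans up *)

(* Check for architectures other than x86: pops must be 0. *)
fun requireNoPops 0 = ()
  | requireNoPops n = raise Fail ("unexpected pops=" ^ Int.toString n)
```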
----------------------------------------------------------------------
Name: Matthias Blume
Date: 2001/06/05 15:10:00 EDT
Tag: blume-20010605-cm-index
Description:
0. The "lambdasplit" parameter for class "sml" in CM has been documented.
1. CM can now generate "index files". These are human-readable files
that list on a per-.cm-file basis each toplevel symbol defined or
imported. The location of the index file for