xtc Release History and Notes

2.4.0 (8/17/14)

Major feature release.

This release significantly improves Blink. The Jeannie expression
evaluator has been greatly simplified to reduce the number of commands
sent to the component debuggers. The intermediate agent has been
enhanced to remove hard-coded limitations such as the maximum number
of threads and synchronized accesses to shared data structures. This
version is the one used for all experiments in the SPE ’14
paper “Debugging Mixed-Environment Programs with Blink”
by Byeongcheol Lee, Martin Hirzel, Robert Grimm, and Kathryn
S. McKinley.

This release also improves SuperC's AST node names, fixes various
bugs in AST creation, and adds further regression tests.

Clang's C structure layout appears to be different from that of
gcc, leading to a regression test failure on Mac OS X 10.9.

2.3.1 (4/4/12)

Minor bug fix release. This release fixes incorrect license
headers in several source files and replaces 2.3.0 as described
below.

2.3.0 (3/25/12)

Major feature release.

This release significantly improves SuperC again.
SuperC's performance has been tuned, the regression tests have been
expanded, numerous bugs have been fixed, and the scripts for running
and evaluating SuperC have been enhanced. All SuperC code is now
released under the GPL version 2.0. This version is the one used for
all experiments in the PLDI ’12 paper “SuperC: Parsing All of C
by Taming the Preprocessor” by Paul Gazzillo and Robert Grimm.

2.2.0 (11/19/11)

Major feature release.

This release significantly improves SuperC. The
parsing algorithm, Fork-Merge LR, has been completely reimplemented.
It is now based on the novel token follow-set, which captures the
actual variability of static conditionals independent of how they are
nested within each other and appended to each other. The parser also
includes three optimizations, shared reductions, lazy forking, and
early reductions, which further decrease the number of forked
subparsers.

Both the SuperC preprocessor and parser are now language-independent.
To support a new language, a user needs to provide an annotated JFlex
lexer definition, an annotated Bison grammar, and, optionally, a Java
implementation of semantic actions.

Several new scripts help with running SuperC and collecting
experimental data. They include a script to distribute the processing
of Linux kernel source files across machines and scripts to compute
summary statistics from SuperC's raw data output, e.g., a CDF of the
number of subparsers.

The new SuperC technical manual can be found
in src/xtc/lang/cpp. It documents basic SuperC usage,
its scripts, and the format of its statistics output. The manual is
built by invoking make manual.

This release also fixes a bug in the Jeannie
regression test harness, which failed on some Linux distributions.
Thanks to Jacob Shufro and Martin Hirzel for their help in identifying
and resolving this bug.

2.1.1 (9/9/11)

Minor bug fix release.

This release removes support for type checking the simply typed
lambda calculus from xtc.lang.TypedLambda, since it
depends on the already discontinued Typical compiler. Thanks to
Thomas Huston for identifying this bug.

This release also fixes a C type checker regression test so that it
no longer uses a deprecated preprocessor feature. Thanks to Jacob
Shufro for identifying this bug.

2.1.0 (9/7/11)

Minor feature and bug fix release.

This release improves SuperC by adding
significantly more regression tests and introducing attendant bug
fixes. It also adds more code comments and fixes code formatting.

This release adds support for parsing and pretty
printing Java 7. To support Java 7's
try-with-resources statements, the AST for regular try-catch
statements has been changed, even when using parsers for earlier Java
versions. The Java analyzer, xtc.lang.JavaAnalyzer, has
been updated accordingly. To support Java 7's underscores in numeric
literals, the definition of Java constants
in Rats!' xtc.lang.JavaConstant has been
modified to more closely follow the language specification.
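
For readers unfamiliar with the two Java 7 features named above, the
following plain Java sketch shows what they look like in source code.
It does not use any xtc classes; the class name, constants, and helper
method are invented for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException; // Java 8; used only to keep the helper unchecked

// Hypothetical demo class; not part of xtc.
final class Java7SyntaxDemo {
  // Underscores in numeric literals (Java 7) are visual separators only.
  static final int ONE_MILLION = 1_000_000;
  static final int LOW_WORD_MASK = 0xFF_FF;

  // try-with-resources (Java 7): the reader is closed automatically,
  // whether or not readLine() throws.
  static String firstLine(String text) {
    try (BufferedReader in = new BufferedReader(new StringReader(text))) {
      return in.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(ONE_MILLION);
    System.out.println(firstLine("hello\nworld"));
  }
}
```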

Tokens are now extensible; the corresponding
class, xtc.tree.Token, is not final anymore but has
become abstract. Rats!' support for parse trees has been
updated to utilize the new concrete
subclass xtc.tree.TextToken. All parsers distributed
with xtc have been updated accordingly.

To support Mac OS X 10.7 (Lion), this release adds
support for the stpncpy_chk() builtin function.

Finally, this release removes a residual dependency on the
Typical-generated type checker from xtc.lang.C. Thanks
to Thomas Huston for identifying this bug.

2.0.0 (7/20/11)

Major feature release.

This release introduces a preview of SuperC, a
new tool for parsing C code with arbitrary preprocessor usage. SuperC
first lexes C code, then uses a new configuration-preserving
preprocessor to resolve all directives modulo conditionals, and
finally uses a novel variant of LR parsing to generate a well-formed
AST containing static choice nodes for conditionals. The
corresponding Java package is xtc.lang.cpp.

This release removes the following unmaintained code: the ANTLR
and JavaCC parsers for Java; the C4 compiler; the Overlog compiler;
the Typical compiler; and the XForm AST query engine. The last
release containing this code is xtc version 1.15.0 (with
corresponding testsuite).

This release removes the unnecessary dependency on the "perfctr"
library from Jinn. Thanks to Mengtao Sun for pointing out this
bug.

1.15.0 (6/14/10)

Major feature release. This release
introduces Jinn, a dynamic bug detector for the Java
Native Interface (JNI). It currently supports HotSpot and J9 running
on the x86 version of Linux. Support for other OS and processor
combinations is under development. The source directory
is src/xtc/lang/blink/agent and the make target
is agent. Please direct any feedback to Byeong Lee.

This release also removes the unnecessary analyzers
target from the make file in src/xtc/lang/blink. Thanks
to Tony Sloane for pointing out this bug.

1.14.4 (9/29/09)

Minor bug fix release:

Added rats.manifest to the source distribution.
Thanks to Tony Sloane for pointing out this omission.

Cleaned up xtc.tree.Visitor
and xtc.lang.C to avoid raw type warnings.

Changed parser generator to correctly handle top-level repetitions
and options in productions that are marked as resetting or stateful.
Thanks to Christoff Bürger for helping to identify this
issue.

Changed parser regression test to eliminate a non-ASCII character,
which may lead to test failures depending on a system's default
encoding. Thanks to Christopher Mangus for helping to identify this
issue.

Cleaned up limits.c to support 64-bit architectures
and eliminate compiler warnings.

Changed the Java and Typical type checkers for C to support
different structure layout for packed bitfields in gcc 4.2 and later.
Thanks to Christopher Mangus, Martin Hirzel, and Anh Le for helping to
resolve this issue.

Changed Jeannie build scripts to also
use include/win32 as an include file path for Cygwin,
which is necessary for Sun's JDK 1.6. Thanks to Martin Hirzel for
resolving this issue.

Changed Jeannie regression tests to support 64-bit pointers.
Thanks to Martin Hirzel for orchestrating this change.

Updated xtc.lang.c.ml.Machdep to reflect changed
constants in xtc.Limits. Thanks to Mike Chrzanowski for
identifying this bug.

Due to many of the above changes, xtc now passes all regression
tests on Apple's Mac OS X Snow Leopard (10.6), whose C compiler and
Java virtual machine default to 64-bit.

1.14.3 (4/6/09)

Minor feature and bug fix release.

Rats! has been updated as follows:

Text-only productions may now contain syntactic predicates in
addition to semantic predicates.

The source locations for errors caused by any character constants
or visibility attributes are now correctly reported. Thanks to Chris
Capel for raising this issue.

The code generator now emits break statements for
character switches in optional expressions. Thanks to Chris Capel and
the C# compiler for identifying this bug.

The Jeannie grammar has been updated to eliminate a bug that caused
null pointer exceptions. Thanks to Matt Renzelmann for identifying
this bug.

The Blink debugger has been updated to perform dynamic consistency
checks on the arguments to JNI functions. For example, it detects
when NULL is passed to NewStringUTF and
reports this invalid argument.

1.14.2 (10/18/08)

Minor feature and bug fix release.

Rats! has been updated as follows:

A code generator bug for repetitions nested in options nested
in repetitions has been fixed.

Based on feedback by Marek Gilbert and Sukyoung Ryu, error
messages for string literals, string matches, and optional/repeated
elements have been improved. Additionally, the new per-production
explicit attribute instructs Rats! to generate
error messages relative to the marked production's name (and thus
ignore any already generated parse errors).

Generic productions that are directly left-recursive and
explicitly assign yyValue in recursive alternatives are
now rejected, since Rats! cannot deduce their semantic value.
Thanks to Chris Capel for raising this issue.

Productions containing expressions lifted from public productions
do not inherit public visibility anymore. Thanks to Chris Capel for
raising this issue.

Rats! now has its own JAR file, rats.jar.
Thanks to Adrian Quark for raising this issue.

The regression tests now check for errors and warnings as well as
correct inputs and outputs.

C support has been improved as follows:

The C grammar has been changed to avoid stack overflow errors on
some Java virtual machines when parsing very long character or string
literals. Thanks to Eric Hielscher for raising this issue.

The C type checker has been updated to type check programs read in
with the parsetree option, which preserves all
formatting. Thanks to Eric Hielscher for raising this issue.

The C grammar and type checker now support
the __thread specifier provided by gcc for the ELF object
format. Thanks to Matt Renzelmann for raising this issue and aiding
in its resolution.

The new printFeatures option
for xtc.lang.C prints major GCC extensions used by the
code being processed.

Please remember to run make configure to recreate
the appropriate xtc.Limits for your hardware, OS, and
compiler. Thanks to BK Lee's tireless help, configuration now also
works with Microsoft's Visual C.

The Blink inter-language debugger has been
improved as follows:

Blink now natively supports Microsoft Windows using Microsoft CDB
as a component debugger. It features breakpoints, call stack tracing,
and single stepping across Java and native code. However,
mixed-language Jeannie expressions are not (yet) working, since
Microsoft CDB does not support complete C/C++ expression evaluation
and convenience variables.

Blink is now more robust by wrapping calls from Java to C through
JVMTI, the JVM Tool Interface, which is available in JVMs such as
Sun's HotSpot and IBM's J9. Previously, Blink relied on native
breakpoints inside the JVM to interpose on the transition from Java to
native code. This approach can lead to surprising behavior during
inter-language single stepping and has thus been replaced.

1.14.1 (7/31/08)

Bug fix release.

Rats! has been updated as follows:

User-specified bindings in the base cases of directly
left-recursive productions are now preserved (instead of being
renamed). Thanks to Chris Capel for identifying this bug.

Left-recursive productions are now recognized in time linear to a
grammar's number of productions. Thanks to Chris Capel for raising
the issue of parser generator performance.

If an alternative contains only a predicate, it will not result in
an unreachable alternative error anymore. Similarly, if an option
contains only a predicate, it will not result in a matching empty
input warning anymore. Thanks to Chris Capel for identifying these
bugs.

If an alternative starting with character and/or string literals
is a prefix of a subsequent alternative, the latter alternative will
now be reported as unreachable (instead of Rats! generating
unreachable Java code). Thanks to Janus Dam Nielsen for identifying
this issue.

Voided null literals are now rejected with an error message (as
they serve no purpose). Thanks to Eric Hielscher for raising this
issue.

Java primitive types and keywords are now reported as invalid
types for productions.

Optional or repeated actions now result in a single error message
instead of two messages.

1.14.0 (7/26/08)

Major feature release.

This release introduces Blink, a portable
mixed-mode Java/native debugger. It currently supports Sun's Java
virtual machine running on the x86 versions of Linux and Cygwin, with
support for other JVM, OS and processor configurations under
development. Please direct any feedback
to BK Lee.

1.13.3 (5/14/08)

Minor feature and bug fix release.

Rats! has been updated as follows:

Variant typing now supports constructors with the same simple name
appearing in different variants. The corresponding generic nodes must
be created in different modules. Furthermore, polymorphic variants
may not reference different monomorphic variants containing
constructors with the same simple name.

Variant typing now better handles generic productions that do not
pass the value through and whose generic nodes have already been
assigned to monomorphic variants. Additionally, a bug in processing
such productions has been fixed.

String literals in string match expressions are now properly
escaped. Thanks to Dejan Jovanović for identifying this
bug.

The runtime JAR file now contains nested and anonymous classes as
well. Thanks to Chris Jones for identifying this bug.

Due to improvements in variant typing, Rats! now statically
types the Jeannie grammar, requiring no
additional variant annotations.

The Typical compiler has been updated to
support fun expressions, and the translation
of let expressions has been optimized. Additionally,
bugs in the exhaustiveness checking for match expressions
and when explicitly matching bottom have been fixed.
Thanks to Christopher Conway for reporting these bugs.

The Java, Typical, and O'Caml type checkers for
C have been updated to:

correctly track the compile-time constant values of enumerators
that are defined in terms of other enumerators within the same
enumeration,

warn on (in)equality comparisons between integers and pointers
(instead of reporting errors).

Thanks to Matt Renzelmann for identifying the last two issues.

To track size, alignment, and offset values, the C type checkers
now include a re-engineered version of gcc's structure layout
algorithm. The local system's C configuration
in xtc.Limits has been improved to support it.
Run make configure to recreate the appropriate
version for your hardware and operating system.

The syntax for Jeannie top-level compilation
units has changed. The package and import declarations now come
before the initial `.C {…} block instead of
after it. That way, top-level C code can use simple instead of fully
qualified names when referring to Java entities.

Internally, the Jeannie grammar and AST for array declarators has
been updated to create "variable length" nodes, just like the C
grammar and AST in release 1.13.0. Furthermore, the compiler has been
updated to address several bugs, mostly thanks to helpful reporting by
Matt Renzelmann.

Support for Overlog has been extended with a
translator targeting Java. The corresponding runtime is being
developed by Nalini Belaramani at UT Austin; the necessary JAR file is
available here.
Additionally, the Overlog language has been extended with tuple and
function type declarations, the Overlog grammar has been cleaned up,
and a bug in the inference of function return types has been fixed.
The corresponding Java package has been renamed
to xtc.lang.overlog (from xtc.lang.p2).

1.13.2 (12/1/07)

Minor feature and bug fix release.

The Jeannie compiler now supports backticked Java
primitive types, e.g., `boolean or `int, as
C type specifiers. This change eliminates the need for using the
equivalent JNI types, e.g., jboolean
or jint, in C contexts. This release also includes
various bug fixes to the Jeannie compiler and a user guide.

The Typical compiler now supports
the guard construct for protecting
against bottom values in arbitrary expressions. It also
incorporates various bug fixes, including mapping bottom
to bottom in optimized pattern matches.

This release includes three type checkers for C.
The first is the previously released version, which is written in Java
and used by the Jeannie compiler. The second is new to this release
and written in Typical. It is invoked through
the -analyze and -typical options to the C
driver xtc.lang.C. Just like the type checker written in
Java, the type checker written in Typical passes all of gcc 4.1.1's
regression tests. Both type checkers also process the entire Linux
2.6 kernel. To this end, the handwritten C type checker now:

correctly tracks compile-time constant addresses across all
address of expressions,

treats offsetof expressions as having a compile-time constant
value,

suppresses duplicate errors when processing case labels.

The third type checker for C is new to this release as well and
written in O'Caml. It re-uses the parser and AST representation
of CIL and is
contained in the src/xtc/lang/c/ml directory. Like the
other two type checkers, the O'Caml version processes the entire Linux
2.6 kernel; though it does not recognize C99's variable length
arrays.

xtc now includes support for type inference and concurrency
analysis of Overlog programs; the corresponding code
lives in the xtc.lang.p2 package.

Rats! has been updated as follows:

Variant typing now supports modularized AST definitions, i.e., it
now supports variants with the same simple name but in different
modules. It also performs stricter error checking.

Support for the rawTypes attribute has been fixed; it
does not result in a class cast exception anymore. However, support
for this attribute has been deprecated and will be removed in a future
release.

All tools now support a -no-exit
option for not exiting a Java virtual machine. As a result, tools can
now be invoked by other Java code in the same JVM without terminating
the JVM after tool completion.
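
The general pattern behind such an option can be sketched as follows.
This is only an illustration of the idea, not xtc's implementation:
the class and both helper methods are hypothetical, and the only
detail taken from the text above is the -no-exit option name.

```java
// Hypothetical tool skeleton; not part of xtc.
final class NoExitDemo {
  /** Do the tool's work and return a status; never calls System.exit. */
  static int run(String[] args) {
    for (String a : args) {
      // Treat any option other than -no-exit as unknown in this sketch.
      if (a.startsWith("-") && !a.equals("-no-exit")) return 1;
    }
    return 0;
  }

  /** Determine whether the JVM should be terminated after the run. */
  static boolean shouldExit(String[] args) {
    for (String a : args) {
      if (a.equals("-no-exit")) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    int status = run(args);
    if (shouldExit(args)) System.exit(status);
  }
}
```

Other Java code in the same JVM can then invoke main with -no-exit
among the arguments and continue running afterwards.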

The licensing of most classes
in xtc.util has been changed to the LGPL version 2.1. As
before, the complete list of LGPL-ed classes can be found
in overview.html.

1.13.1 (10/16/07)

Bug fix and minor feature release.

This release makes the following changes
to Rats!:

Parsers generated with the withLocation option now
start counting columns at 1 (instead of 0) for consistency with most
modern development environments. The following code fixes Emacs'
column number mode:

Thanks to Martin Hirzel for updating Emacs' original hook. The start
column is now defined by xtc.Constants.FIRST_COLUMN;
the xtc.tree.Printer utility has been updated to use this
constant.
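
As a rough illustration of what 1-based column counting means, the
sketch below maps a character offset to a line and column. It is not
taken from xtc: the class, the locate method, and the FIRST_LINE
constant are assumptions, and FIRST_COLUMN merely mirrors the role
of xtc.Constants.FIRST_COLUMN.

```java
// Hypothetical helper; not part of xtc.
final class ColumnDemo {
  static final int FIRST_LINE = 1;   // assumption for this sketch
  static final int FIRST_COLUMN = 1; // columns start at 1, as described above

  /** Return {line, column} of the character at the given offset. */
  static int[] locate(String text, int offset) {
    int line = FIRST_LINE;
    int column = FIRST_COLUMN;
    for (int i = 0; i < offset; i++) {
      if (text.charAt(i) == '\n') {
        line++;
        column = FIRST_COLUMN; // a new line restarts at the first column
      } else {
        column++;
      }
    }
    return new int[] { line, column };
  }
}
```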

Parsers generated with the withLocation option now
correctly annotate nodes resulting from directly left-recursive
generic productions with their source locations (again). Release
1.12.0 introduced a regression, which annotated nodes with a source
location past the position of the recursive nonterminal. This release
restores an optimized version of the correct approach introduced in
release 1.8.0.

A parser's internal state for tracking source locations can now be
updated through
the xtc.parser.ParserBase.setLocation(int,String,int,int)
method. The C, C4, and Jeannie grammars utilize this method to update
the corresponding parsers' source location based on gcc line markers
in the preprocessed input. As a result, all error messages now report
the original file name and line number; though the column number may
be inaccurate due to macro expansion.
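
The idea of following gcc line markers can be sketched in a few lines
of Java. This is a simplified, hypothetical stand-in and not the code
in xtc's grammars, which call setLocation on the generated parser.
It assumes markers of the form # <line> "<file>" followed by optional
numeric flags, and it ignores escaped quotes in file names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of xtc.
final class LineMarkerTracker {
  // Matches markers such as:  # 42 "foo.c" 1 3
  private static final Pattern MARKER =
      Pattern.compile("^#\\s+(\\d+)\\s+\"([^\"]*)\"(?:\\s+\\d+)*\\s*$");

  private String file;
  private int line;

  LineMarkerTracker(String initialFile) {
    this.file = initialFile;
    this.line = 1;
  }

  /**
   * Consume one line of preprocessed input. Returns "file:line" for the
   * original location of an ordinary line, or null if the line is a marker.
   */
  String consume(String text) {
    Matcher m = MARKER.matcher(text);
    if (m.matches()) {
      // The marker gives the location of the line that follows it.
      line = Integer.parseInt(m.group(1));
      file = m.group(2);
      return null;
    }
    String location = file + ":" + line;
    line++;
    return location;
  }
}
```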

Parsers containing generic productions now include a
static toText() helper method that returns a string. For
regular parsers, the method is the identity function for strings. For
parsers generated with the withParseTree option, the
method takes an annotated token as its only argument and returns the
corresponding string. The C and Java grammars have been rewritten to
utilize this method instead of various kludges for converting
annotated tokens to strings.

Parsers generated with the withParseTree option now
correctly preserve formatting in list-valued productions.
Furthermore, they now correctly preserve formatting in some generic
productions that are not directly left-recursive and end with a
sequence consisting only of formatting; Rats! also does not
split such productions any more.

The new noinline attribute for productions prevents
inlining even if the production is marked as or recognized
as transient. Furthermore, the new memoized
attribute for productions prevents productions from being treated
as transient.

Variant typing now performs stricter error checking before
assigning a polymorphic variant to a production. It also generates
more consistent constructor names for polymorphic variants. Finally,
it now correctly assigns some generic nodes to variants that were
previously ignored.

The accuracy of production voiding, which voids productions whose
semantic values are never bound, has been improved.

The Typical compiler now supports the
hierarchical syntax tree definitions generated by Rats!,
including polymorphic variants and the 'a var type.
The type describing the syntax tree's root defaults
to node but can be overridden through
the -node command line flag. Additional changes to
Typical include:

The Typical type checker itself is now built with all
optimizations enabled: pattern matches are optimized through switch
statements, let scopes are collapsed where possible, and the unused
type record is optimized away.

The implementation of the reduce construct now
correctly follows its semantics.

Similarly, the implementation of the parent
and ancestor built-ins now follows their semantics.

The Jeannie compiler has been updated to reflect
the language described in the OOPSLA paper. In particular, it now
supports with statements for non-primitive arrays,
declarations in with statement initial clauses, and
compound initializers. Additional changes include:

Keywords and built-ins new to Jeannie can now be written without
leading underscores. However, to avoid a name clash with the standard
C library, abort (or _abort) has been
renamed to cancel (or _cancel). The
new -underscores command line option overrides this new
default behavior, reverting to the underscored versions.

The compiler now emits line marker comments in generated Java code
of the form

//#line <line> <file>

and indents both generated C and Java code identically to the source.
The new -pretty command line option overrides this new
default behavior, reverting to the Java and C pretty printers.

The new jeannie.sh shell script
in src/xtc/lang/jeannie manages the entire build process
from Jeannie source code to Java and C binaries.

The C regression tests have been updated to include all relevant
tests from GCC version 4.1.1. The C type checker has
been updated accordingly. In particular, it now explicitly checks
for:

structs or unions not being redefined within themselves,

variables, fields, and parameters not being declared as void
(instead of reporting the types as incomplete),

void parameter type lists not having a storage class, qualifier,
or function specifier,

identifiers with internal linkage not being redeclared with
external linkage in a block-level declaration,

labels being declared but not defined and labels being defined but
not used,

the type of the target in gcc's computed goto statements being a
non-float scalar.

Additionally, the processing of block-level extern declarations has
been much improved.

The limits.c utility for determining a local
system's C configuration has been improved to more accurately
determine the local pointer difference, size, and wide character
types. The corresponding xtc.Limits class included in
the source distribution is valid for 32-bit x86-based Mac OS X
systems, but differs in endianness from PowerPC-based Mac OS X systems
and in the definitions for size and wide character types from Linux
and Windows systems. The new configure target for the
global Makefile rebuilds xtc.Limits
and xtc.type.C (whose constants depend
on Limits) for a local system.

Thanks to Thomas Moschny, the implementation of for expressions in
the XForm AST query and transformation engine has
been fixed to properly iterate over nested sequences. Also thanks to
Thomas Moschny, a bug causing a null pointer exception has been
fixed.

All tools now support
a -diagnostics option to print tool internal state.
Given this option, the C driver now prints the local system's
configuration parameters (as determined by limits.c
— see above).

Finally, the Java and C drivers now support
the -locateAST command line option to print each node's
source location when printing the AST with the -printAST
option.

1.13.0 (8/31/07)

Major feature and bug fix release.

Starting with this release, xtc
includes Typical, a domain-specific language and
compiler for implementing semantic analysis including type checking.
The Typical language builds on the functional core of ML and extends
it with novel declarative constructs specifically designed for
implementing type checkers. The package description
for xtc.typical provides an overview and introduction.
Examples included with xtc are a type checker for the simply typed
lambda calculus in src/xtc/lang/TypedLambda.tpcl and for
the Typical language itself in src/xtc/lang/Typical.tpcl.
A type checker for C written in Typical is under development. The
main developers for Typical are Laune Harris and Anh Le.

Starting with this release, xtc also includes "a compiler
contributed to xtc" a.k.a. Jeannie, which integrates
Java with C. In Jeannie, Java and C code are nested within each other
at the level of individual statements and expressions and compile down
to JNI, the Java platform's standard foreign function interface. By
combining the two languages' syntax and semantics, Jeannie eliminates
verbose boiler-plate code, enables static error detection across the
language boundary, and simplifies dynamic resource management.
The OOPSLA '07
paper by Martin Hirzel and Robert Grimm describes both language and
compiler in detail; the package description
for xtc.lang.jeannie provides instructions on how to
compile source code to binaries.

Instead of using strings, Rats! now
relies on xtc.type.Type and its subclasses to internally
represent the types of semantic values. The first new feature to
leverage this improved internal representation is variant
typing for grammars. When the -ast command line
option is combined with the new -variant
option, Rats! automatically determines ML-style variant
types representing a grammar's generic AST. To facilitate type
inference, Rats! relies on the new variant
attribute for productions, which indicates that all generic nodes
returned by a production are members of the same variant type, named
after the production. The C, Java, Typical, and simply typed lambda
calculus grammars have been updated accordingly.

The Java grammar and AST for this
expressions have been improved. Instead of accepting any primary and
postfix expression, the grammar now recognizes only a qualified
identifier with a trailing dot before the this keyword.
For well-formed inputs, this change replaces zero or more nested
selection expression nodes as a this expression node's first child
with an optional qualified identifier.

The C grammar and AST have also been improved.
The "*" string denoting variable-length arrays in array
declarator nodes and direct abstract declarator nodes has been
replaced with a dedicated "variable length" node. Next, the
identifier string in structure designators has been replaced by a
primary identifier node. Finally, goto statement nodes now have two
children. A "*" string as the first child now indicates
a computed goto statement. The second child always is a node, with a
primary identifier providing a regular goto statement's label.

1.12.0 (7/18/07)

Major feature and bug fix release.

As described below, Rats!' handling of list values in
generic productions has changed. If your grammar contains generic
productions and you do not want to update your AST processing code,
add the flatten option to your grammar.

For grammars with the new withParseTree
attribute, Rats! rewrites generic, list-valued, text-only,
and void productions as well as productions that pass the value
through to generate parsers that preserve all formatting as
annotations. Annotations are instances of the new
class xtc.tree.Formatting, which replaces the generic
annotations introduced in version 1.9.0.

The embedded AST generally has the same structure as for parsers
generated without the withParseTree attribute. The only
exception is strings, which are represented as instances
of xtc.tree.Token. Additionally, generic nodes include
additional children (consisting of Formatting annotating
a null value) if a voided expression or void nonterminal
appears between two list-valued expressions.

By default, visitors continue to ignore all annotations and
process only AST nodes, thus ensuring that the same visitors can
process both parse trees and abstract syntax trees.
The Token.test and Token.cast methods
can be used to test for and cast to strings, irrespective of whether
the tree is a parse tree or abstract syntax tree.

The new xtc.tree.ParseTreePrinter prints parse trees
including formatting, and the
new xtc.tree.ParseTreeStripper strips all formatting and
tokens, extracting the embedded AST (but preserving any other
annotations).

The C and Java drivers have been updated with
a -parsetree option to use parse trees instead of
abstract syntax trees. Furthermore, the -strip option
removes all formatting and tokens from a parse tree again.

The interface to abstract syntax tree nodes has
been improved as follows:

Access to a node's source location has been factored into its own
interface xtc.tree.Locatable. The corresponding field
in xtc.tree.Node has been marked private. Rats!
now uses this interface for parsers with the withLocation
attribute, thus removing the dependency on xtc's node
representation.

All nodes now support a write(Appendable) method for
incrementally creating a human-readable
representation. Node.toString() now utilizes this
method. Similar functionality for classes in xtc.type
has been modified to utilize this generalized version.

Nodes' support for children that are lists, i.e., instances
of xtc.util.Pair, has been improved. In particular, the
new Node.getList method returns a node's child as a list,
and the new Node.isList and Node.toList
methods test for and cast to lists of nodes, respectively.
Additionally, the
new Visitor.iterate, Visitor.map,
and Visitor.mapInPlace methods apply a visitor to all
nodes on a list.

The representation of programming language types
in xtc.type has been cleaned up and expanded:

Type annotations, i.e., a type's source location, language, scope,
constant value, memory shape, and attributes, can now be stored
in each type instead of only a dedicated wrapped type.
Correspondingly, the wrapped types for source locations, constant
values, and memory shapes have been removed;
though AnnotatedT is still available to annotate a type
without directly modifying it.

The wrapped types for variables, fields, and C's struct and union
members have been folded into a single wrapped
type xtc.type.VariableT.

The new UnitT, VariantT,
and TupleT classes model the corresponding types in
functional languages such as ML or Haskell. The latter two classes
replace the ListT, OptionT,
and ProductT classes introduced in version 1.10.0.

Types can now be parameterized through the Parameter
and Wildcard classes representing named parameters and
wildcards, respectively. The new wrapped
types ParameterizedT and InstantiatedT
capture a type's declared parameters and its instantiation with
concrete types, respectively.

Type.Tag now defines a Java enumeration over all type
classes. Each instance's tag is accessible through the
Type.tag() and Type.wtag() methods (with
invocations of the former method being forwarded across wrapped
types). As a result, it is now possible to implement switch
statements for types.
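As a minimal sketch of what such a switch can look like, the following uses hypothetical stand-in classes (SketchType, IntegerSketch, PointerSketch, WrappedSketch), not the actual xtc.type classes. It only assumes what is stated above: tag() is forwarded across wrapped types, while wtag() reports the wrapper itself.

```java
// Hypothetical stand-ins for illustration; these are NOT the xtc.type classes.
abstract class SketchType {
  /** One constant per kind of type, so that clients can switch over types. */
  enum Tag { INTEGER, POINTER, WRAPPED }

  /** The tag after looking through any wrappers. */
  abstract Tag tag();

  /** The tag of this very instance, wrappers included. */
  Tag wtag() { return tag(); }
}

final class IntegerSketch extends SketchType {
  @Override Tag tag() { return Tag.INTEGER; }
}

final class PointerSketch extends SketchType {
  final SketchType target;
  PointerSketch(SketchType target) { this.target = target; }
  @Override Tag tag() { return Tag.POINTER; }
}

final class WrappedSketch extends SketchType {
  final SketchType wrapped;
  WrappedSketch(SketchType wrapped) { this.wrapped = wrapped; }
  @Override Tag tag() { return wrapped.tag(); }   // forwarded to the wrapped type
  @Override Tag wtag() { return Tag.WRAPPED; }    // not forwarded
}

final class TagDemo {
  /** Remove all wrappers to reach the underlying type. */
  static SketchType strip(SketchType t) {
    while (t instanceof WrappedSketch) t = ((WrappedSketch) t).wrapped;
    return t;
  }

  /** Describe a type with a switch statement instead of instanceof chains. */
  static String describe(SketchType t) {
    switch (t.tag()) {
    case INTEGER:
      return "integer";
    case POINTER:
      return "pointer to " + describe(((PointerSketch) strip(t)).target);
    default:
      return "unknown";
    }
  }
}
```

With these stand-ins, a wrapped pointer to an integer is described as "pointer to integer", even though its wtag() is WRAPPED.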

The Tag interface for C's enum, struct, and union
types has been renamed to Tagged in order to avoid
confusion with the new Type.Tag enumeration.
The Constant interface for types' constant values has
been replaced with a concrete implementation.

All C-specific operations have been factored into a separate
class, xtc.type.C.

The new class xtc.type.AST contains common constants
and operations for typing abstract syntax trees.

The C and Java type checkers have been updated to utilize the
modified package. Rats! has also been updated to
utilize xtc.type, though the conversion is not yet
complete.

The Java grammar and AST have been re-engineered
to (mostly) eliminate the need for a separate AST simplification
phase. Notably, the AST for postfix and primary expressions has been
significantly cleaned up. The Java type checker has been updated
accordingly.

Additionally, xtc now includes a grammar for
Java 5. The Java 5 grammar is implemented
as a modification of the Java 1.4 grammar, and ASTs for the two
versions are compatible, i.e., every valid Java 1.4 AST also is a
valid Java 5 AST. The Java pretty printer has been updated to
support both versions. Furthermore, the FactoryFactory
concrete syntax tool has been updated to use the Java 5 grammar.
Since ASTs for the two language versions are compatible, the concrete
syntax tool will create Java 1.4 ASTs as long as the input only
uses Java 1.4 features.

The C type checker now verifies that external
declarations without initializers are complete only at the end of a
translation unit, thus correctly allowing for the definition of a
struct or union type after it has been used in an external
declaration. It also adds support for three more GCC extensions:

Global register variables (but without checking the register
names),

extern and inline functions, which
effectively are macros and may be defined in the same translation unit
before a regular function definition,

structures with trailing incomplete arrays as struct member types
and array element types.

As a result, the C type checker now passes all GCC regression tests
ported to xtc.

In addition to supporting the generation of parse trees and using
the new Locatable
interface, Rats! has been improved as
follows:

Rats! now deduces the semantic value of productions of
type Pair<T>, automatically creating a list from
the values of each alternative's component expressions. If the last
component expression has a list value, that value becomes the tail of
the production's list value. If the only component expression has a
list value, that value becomes the production's value. For
example,

creates a list of nodes, automatically consing the first expression's
value onto the list of expression nodes. In contrast,

Pair<Node> TwoExpressions = Expression Expression ;

also creates a list of nodes, but by consing the two expressions'
nodes onto the empty list.

Rats! now supports a null literal, which
simply provides a null value. Previously, the C
and Java grammars used a production

Node Null = ;

to generate null values; the null literal provides a more direct and
efficient alternative. The old xtc.util.Null
and xtc.util.NullNode modules have been removed.

When a component expression of a generic production has a list
value, Rats! now directly adds the list value as the generic
node's child. Previously, Rats! flattened the list by adding
the list's elements as children of the generic node. The old behavior
is still available through the grammar-wide flatten
attribute.

The implementation of the -ast command line option,
which instructs Rats! to print a formal definition of a
grammar's abstract syntax tree, has been rewritten (again) to produce
a more accurate definition. It now uses the optional
modifier to indicate that an AST node's child may be null
and the variable modifier to indicate that a child may
not even be present. This feature remains under active
development.

The new location optimization (-Olocation)
causes Rats! to (1) use simpler code for updating a node's
source location where possible and (2) omit updates altogether where
possible. This optimization is enabled by default.

The detection of malformed voided expressions, bindings, and
string matches has been improved. Notably, string matches on bindings
and other string matches are now flagged as grammar errors, though
bindings of string matches are still legal. This well-formedness
check prevents a ClassCastException during code
generation.

The detection of unreachable alternatives has been generalized,
thus improving error reporting for malformed grammars.

A bug in the code generation for options nested within repetitions
in transient productions has been fixed. With this bug, the option's
semantic value always was null, even if the option was
matched in the input; furthermore, the repetition was not matched
completely. Thanks to Eclipse for raising the "unused variable
binding" leading to the bug's discovery.

The AttributeList
and MalformedNodeException classes
in xtc.tree have been removed. All code using the former
has been changed to use a List<Attribute>; there
was no code using the latter.

Finally, this release incorporates several fixes to minor bugs
identified by Eclipse and
by FindBugs.

1.11.0 (5/14/07)

Major feature and bug fix release.

The licensing of several classes has been
changed. The Node, GNode,
and Annotation classes in xtc.tree and
the Action and State classes
in xtc.util are now licensed under the LGPL version 2.1
instead of the GPL version 2. Consequently, parsers generated from
grammars with generic or stateful productions are not covered by the
GPL anymore.

This release simplifies the interface between nodes and
visitors. Processing methods cannot be specified as part of
nodes anymore; i.e., visitWith(Visitor) methods are not
recognized by dispatch() anymore. Furthermore, if a
visit method has void as its return
type, dispatch() now returns null; i.e., it
does not return the specified node anymore. The first feature has
been removed because it has not been used in over 1 1/2 years; the
second feature has been removed because it is inconsistent with Java
reflection and programmer expectations about void methods (while also
having some runtime overhead).

Other changes to nodes and visitors include:

If dispatch() cannot identify an appropriate visit
method, it now invokes the new unableToVisit(Node)
method. That method's default implementation simply raises a visitor
exception, thus resulting in the already familiar behavior. However,
visitors can override this method and thus implement their own error
handling strategies. Note that dispatch() caches
resolutions to unableToVisit(), just like it caches
resolved visit methods.
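The dispatch behavior described in this release can be sketched with plain Java reflection. The classes below (SketchNode, SketchVisitor, EvalVisitor) are hypothetical stand-ins, not xtc.tree.Node or xtc.tree.Visitor, and the lookup order (exact node class first, then its superclasses) is an assumption made for illustration. The sketch shows three points from the text: a void visit method yields null, a missing visit method falls back to unableToVisit(), and both kinds of resolution are cached.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins; not xtc.tree.Node or xtc.tree.Visitor.
class SketchNode {
  final String name;
  SketchNode(String name) { this.name = name; }
}

class AddNode extends SketchNode { AddNode() { super("Add"); } }
class NopNode extends SketchNode { NopNode() { super("Nop"); } }
class OddNode extends SketchNode { OddNode() { super("Odd"); } }

class SketchVisitor {
  /** Cache of resolved methods, including resolutions to unableToVisit(). */
  private final Map<Class<?>, Method> cache = new HashMap<Class<?>, Method>();

  public final Object dispatch(SketchNode n) {
    Method m = cache.get(n.getClass());
    if (m == null) {
      m = resolve(n.getClass());
      cache.put(n.getClass(), m);
    }
    try {
      // Method.invoke() returns null when the invoked method is void.
      return m.invoke(this, n);
    } catch (InvocationTargetException x) {
      Throwable cause = x.getCause();
      if (cause instanceof RuntimeException) throw (RuntimeException) cause;
      throw new IllegalStateException(cause);
    } catch (IllegalAccessException x) {
      throw new IllegalStateException(x);
    }
  }

  /** Find visit(C) for the node's class or a superclass, else the fallback. */
  private Method resolve(Class<?> nodeClass) {
    for (Class<?> c = nodeClass;
         c != null && SketchNode.class.isAssignableFrom(c);
         c = c.getSuperclass()) {
      try {
        Method m = getClass().getMethod("visit", c);
        m.setAccessible(true);
        return m;
      } catch (NoSuchMethodException x) {
        // Try the superclass next.
      }
    }
    try {
      Method m = getClass().getMethod("unableToVisit", SketchNode.class);
      m.setAccessible(true);
      return m;
    } catch (NoSuchMethodException x) {
      throw new AssertionError(x);
    }
  }

  /** Default error handling: signal an exception. Visitors may override. */
  public Object unableToVisit(SketchNode n) {
    throw new IllegalStateException("No method to visit " + n.name);
  }
}

class EvalVisitor extends SketchVisitor {
  int nops = 0;
  public Integer visit(AddNode n) { return 42; }
  public void visit(NopNode n) { nops++; }
  @Override public Object unableToVisit(SketchNode n) { return "skipped " + n.name; }
}
```

In this sketch, dispatching an OddNode to an EvalVisitor returns "skipped Odd" instead of raising an exception, because EvalVisitor overrides the fallback.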

Both VisitorException
and VisitingException now inherit from a common
superclass TraversalException. That class removes stack
trace elements corresponding to dispatch() and Java
reflection invocations from a stack trace, thus resulting in less
clutter when printing the stack trace. Thanks to Martin Hirzel for
raising this issue.
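The stack trace pruning can be approximated as follows. This is a hypothetical sketch, not the actual TraversalException; in particular, the class-name prefixes used to recognize reflection frames are assumptions about JDK internals and vary between Java versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual xtc.tree.TraversalException.
class SketchTraversalException extends RuntimeException {
  SketchTraversalException(String message, Throwable cause) {
    super(message, cause);
    // Replace the recorded trace with a pruned copy.
    setStackTrace(prune(getStackTrace()));
  }

  /** Drop frames belonging to dispatch() and to Java reflection. */
  static StackTraceElement[] prune(StackTraceElement[] trace) {
    List<StackTraceElement> kept = new ArrayList<StackTraceElement>();
    for (StackTraceElement e : trace) {
      if (!isNoise(e)) kept.add(e);
    }
    return kept.toArray(new StackTraceElement[0]);
  }

  /** Assumed heuristics for recognizing uninteresting frames. */
  static boolean isNoise(StackTraceElement e) {
    String c = e.getClassName();
    return c.startsWith("java.lang.reflect.")
        || c.startsWith("sun.reflect.")
        || c.startsWith("jdk.internal.reflect.")
        || e.getMethodName().equals("dispatch");
  }
}
```

Printing such an exception then shows only the visit methods and their callers, which is the reduction in clutter described above.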

Nodes that support generic traversal now also support
indexOf(),
lastIndexOf(), and contains() operations
consistent with the Java collections framework.

Rats! has been improved as follows:

With the above-mentioned relicensing in place, the functionality
of FullParserBase has been rolled
into ParserBase, thus eliminating the need to
differentiate a parser's base class according to license.

Generic productions may now contain so-called node markers, which
specify the names of automatically created generic nodes, overriding
the production's name. Node markers are written as
"@Name"; the last node marker in a sequence
specifies the created generic node's name. Node markers are
especially useful for expressing different left-associative operators
that have the same precedence with a single directly left-recursive
production. Where possible, explicit semantic actions in the C
grammar have been replaced with node markers.

The new profile attribute instructs Rats!
to include code for profiling the usage of the memoization table. For
grammars with this attribute, Rats! includes a counter for
every field (i.e., memoized production) in the memoization table. The
parser then increments the appropriate counter on every table
access. Rats! also includes
a profile(xtc.tree.Printer) method, which prints
the maximum value for all of a production's fields across all
memoized productions. If that number consistently is 1 over a sampling
of representative inputs, the corresponding production should probably
not be memoized (i.e., marked as transient). The C and Java drivers
have been updated to support parsers generated with this
attribute.

The new factory attribute instructs Rats!
to use a class different from xtc.tree.GNode for creating
generic nodes.

The implementation of the verbose attribute has been
rewritten to produce considerably more informative traces of a
parser's execution. In particular, the parser now traces when it (1)
enters a production, (2) exits a production (with either a match or
parse error), and (3) looks up a previously memoized result.

The new nowarn attribute instructs Rats! to
suppress warnings for a production or the entire grammar.

The implementation of the -ast command line option,
which instructs Rats! to print a formal definition of a
grammar's abstract syntax tree, has been generalized (and simplified)
to produce a more accurate definition.

The interface for turning parser results into either values (on
success) or exceptions (on failure) has been simplified through the
new value(Result), format(ParseError),
and signal(ParseError) methods of the parser base
class xtc.parser.ParserBase. The old error reporting
code has been removed from ParserBase and
ParseException; all tools have been updated accordingly.
The easiest way to use a parser with the updated interface is:

parser.value(parser.pNonterminal(0))

This expression tries to recognize nonterminal Nonterminal,
starting at the beginning of the input, and either returns the
corresponding semantic value or signals a parse exception.

The new -valued command line option instructs
Rats! to reduce a grammar to only those expressions that
directly contribute to the abstract syntax tree and to then print the
reduced grammar. Like the -ast option, it helps
developers understand a grammar's abstract syntax tree without them
needing to understand the complete grammar.

Rats! now checks that all alternatives in an ordered
choice are reachable, i.e., are not preceded by an alternative that
accepts the empty input. Otherwise, it reports an error. Thanks to
Petar Maymounkov for reminding me of this issue.

Rats! now checks that the semantic value of each base
case in a directly left-recursive generic production is a node.
If Rats! determines that the value definitely is not a node,
it reports an error. If it determines that the value possibly may not
be a node, it reports a warning.

The global state object for stateful grammars is now allocated per
parser instance and not per parser class anymore. As a result,
several instances of the same parser can be used concurrently.

When sole nonterminals are inlined, Rats! now preserves the
source location of the original nonterminal, leading to more
informative error locations. Thanks to Petar Maymounkov for raising
this issue.

A bug in the code generation for directly left-recursive generic
productions has been fixed; with this bug, Rats! generated
malformed Java code for recursive alternatives that result in generic
nodes with a single child.

A bug in the pretty printing of grammars, which printed the any
character constant as a dot instead of an underscore, has been
fixed.

xtc now supports concrete syntax for creating
Java and C abstract syntax trees. The
new xtc.lang.FactoryFactory tool reads in a factory
declaration, which includes one or more snippets of Java or C code,
and creates the corresponding factory class. That class has one
method per snippet, with each method creating the abstract syntax tree
representing the code snippet. Code snippets may be declarations,
statements, or expressions; they may also contain pattern variables,
which are bound on method invocation.

The Java grammar has been improved by
introducing a distinct production for variable declarations and by not
recognizing constructor, method, and field declarations inside method
bodies anymore. At the same time, the AST fragment for variable
declarations has the same structure as that for field declarations;
i.e., both nodes have the same name ("FieldDeclaration") and one or
more children indicating the modifiers.

Additionally, the pretty printing of Java ASTs
has been improved: synchronized statements now include parentheses
around their expressions, compilation units and class bodies do not
contain unnecessary blank lines anymore, and the spacing of class
declarations, catch clauses, and new expressions has been improved.
Thanks to Martin Hirzel and Laune Harris for identifying several of
these issues.

Thanks to Martin Hirzel, xtc now includes a type checker
for Java (version 1.4). Comparable to the C type checker,
the Java type checker is invoked through the -analyze
command line option to xtc.lang.JavaDriver.
The -printSymbolTable option instructs the Java driver to
print the symbol table after analysis. Note that the Java type
checker requires a simplified AST, as indicated by
the -simplifyAST option.

Support for processing C programs has been
improved as follows:

As already indicated above, the C grammar has been modified to
utilize node markers for recognizing function and array declarators as
well as for recognizing postfix expressions.

Line marker, source identity, and pragma annotations in C program
ASTs are now annotated with the correct source location
information.

The pretty printer now supports lining up printed output with a
node's original source location; it also supports GNU-like formatting
of braces in addition to Java-like formatting.
The CDriver exposes these features through
the -preserveLines and -formatGNU command
line options.

The C type checker (xtc.lang.CAnalyzer) now correctly
type checks variable declarations with compound initializers, even if
the -markAST command line option is specified. Under
certain conditions, it previously aborted with an exception indicating
that a node already has a type. Thanks to Martin Hirzel for
identifying this issue.

Tool support for I/O has been improved. In
particular, xtc.util.Runtime now manages input/output
directories and can open character streams.
Furthermore, xtc.util.Tool now allows for the
specification of character encodings on the command line. As a
result, Rats! now supports user-specified character
encodings. Thanks to Steven Foster for raising the issue of character
encodings.

This release fixes the following bugs in XForm,
the AST query and transformation engine:

When creating new AST nodes in nested expressions, XForm now
correctly sets the newly created nodes' parents.

A duplicate call to Iterator.next() has been removed
when processing ASTs.

A duplicate loop when processing for expressions has
been removed.

Duplicate processing of inputs in the XForm driver has been
removed.

Thanks to Karen Osmond for identifying several of these issues,
and thanks to Laune Harris for fixing them.

Finally, this release makes the following miscellaneous
changes:

xtc.util.Pair now has improved support for treating
pairs as lists. In particular, the following methods have been
added: hashCode() to determine a list's
hashcode, equals() to test for list equality,
toString() to determine a list's human-readable
representation, get() and set() to access a
list's elements, contains() and consists()
to test for a list's elements, and setLastTail()
and append() to append two lists (either destructively or
not).

Pair has also been changed to
implement Iterable<T>, thus enabling the use of
pairs in Java's enhanced for loop. Thanks to Petar Maymounkov for
suggesting this change.
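To illustrate why implementing Iterable<T> matters for a cons list, here is a minimal sketch. SketchPair is a hypothetical stand-in written for this note, not the actual xtc.util.Pair; its field and method names are assumptions. It shows a non-destructive append and iteration with the enhanced for loop.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical cons list for illustration; not the actual xtc.util.Pair.
final class SketchPair<T> implements Iterable<T> {
  /** The shared empty list, which terminates every list. */
  private static final SketchPair<?> EMPTY = new SketchPair<Object>(null, null);

  final T head;
  final SketchPair<T> tail;

  SketchPair(T head, SketchPair<T> tail) {
    this.head = head;
    this.tail = tail;
  }

  @SuppressWarnings("unchecked")
  static <T> SketchPair<T> empty() { return (SketchPair<T>) EMPTY; }

  boolean isEmpty() { return this == EMPTY; }

  /** Non-destructive append: copies this list and shares the other one. */
  SketchPair<T> append(SketchPair<T> other) {
    return isEmpty() ? other : new SketchPair<T>(head, tail.append(other));
  }

  /** Walk the list from head to tail. */
  public Iterator<T> iterator() {
    return new Iterator<T>() {
      private SketchPair<T> rest = SketchPair.this;

      public boolean hasNext() { return !rest.isEmpty(); }

      public T next() {
        if (rest.isEmpty()) throw new NoSuchElementException();
        T value = rest.head;
        rest = rest.tail;
        return value;
      }

      public void remove() { throw new UnsupportedOperationException(); }
    };
  }
}
```

A client can then write "for (String s : list) ..." directly over a SketchPair<String>, without obtaining an iterator by hand.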

The new printHeader() method
in xtc.util.Tool prints a header appropriate for
machine-generated code. Rats! has been updated to use this
method.

The xtc.type.TypePrinter now tracks already printed
compound types and prints just a reference on subsequent encounters
(instead of printing the complete type). This change avoids an
infinite recursion when a complex type references itself, e.g., a C
structure containing a pointer to itself.
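The technique of tracking already printed types can be sketched as follows. StructSketch and SketchTypePrinter are hypothetical stand-ins, not the xtc.type classes, and the output format is invented for illustration. The printer remembers each structure by identity and emits only a short reference when it meets the same structure again.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-ins; not xtc.type classes or xtc.type.TypePrinter.
class StructSketch {
  final String name;
  /** Models a member of type "pointer to struct"; null if absent. */
  StructSketch pointee;

  StructSketch(String name) { this.name = name; }
}

class SketchTypePrinter {
  /** Structures printed so far, compared by identity rather than equals(). */
  private final Map<StructSketch, Boolean> printed =
    new IdentityHashMap<StructSketch, Boolean>();

  String print(StructSketch s) {
    // On a repeated encounter, print just a reference to stop the recursion.
    if (printed.containsKey(s)) return "struct " + s.name;
    printed.put(s, Boolean.TRUE);

    StringBuilder out = new StringBuilder("struct " + s.name + " { ");
    if (s.pointee != null) {
      out.append("pointer(").append(print(s.pointee)).append(") next; ");
    }
    return out.append("}").toString();
  }
}
```

Without the identity map, printing a structure whose member points back to the structure itself would recurse until the stack overflows.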

xtc.tree.Printer now supports close().
Furthermore, a NullPointerException when
invoking reset() on a Printer that does not
buffer the output has been fixed. Thanks to Patrick Winters for
identifying the latter issue.

A NullPointerException when
invoking dump() on
a xtc.util.SymbolTable.Scope containing
a null value has been fixed. Thanks to Laune Harris for
identifying this issue.

The unnecessarily complex fmt()
and msg() methods of xtc.util.Utilities have
been removed after refactoring the error reporting code
in xtc.parser.ParserBase
and xtc.util.Runtime.

1.10.0 (12/24/06)

Major feature and bug fix release.

All code is now compiled with Java 5:

Most classes in the xtc.util, xtc.tree,
xtc.parser, and xtc.type packages have been
updated to utilize the new language features, notably generics. As
part of the conversion process, many classes have been simplified,
notably by replacing explicit iterations with for-each loops.

Most code in the xtc.lang and xtc.xform
packages still needs to be updated and thus results in "unchecked
operation" warnings.

The ANTLR- and JavaCC-generated Java parsers
in xtc.lang.antlr and xtc.lang.javacc have
been annotated with "@SuppressWarnings("unchecked")" to
avoid unnecessary warnings; since they depend on external tools, they
will not be updated to Java 5.

Tools such
as Retroweaver
or Retrotranslator can
be used to backport compiled binaries to version 1.4 virtual machines.
Furthermore, parsers generated with the rawTypes grammar
attribute will still compile with previous versions of the runtime
classes (after removing one annotation, see below).

Documentation generation has been fixed to automatically link to
Sun's web site for version 1.5 of the Java platform libraries.

This release makes the following changes
to Rats!:

If a repetition, option, or nested choice that is not the last
expression in a sequence contains only nonterminals referencing void
productions, voided expressions, or predicates, Rats! now
automatically voids the entire repetition, option, or nested choice.
For example, if nt references a void production, then
"nt*" is now treated as "void:(nt*)".

The declared type of a production's semantic value may now be a
parameterized type such as "Set<Integer>" or
"Map<String, Integer>". Note
that Rats! does not recognize wildcards. It does, however,
allow white space (but not comments) between typenames, type argument
brackets, and commas.

The new rawTypes grammar attribute
instructs Rats! not to use generics and to include a
"@SuppressWarnings("unchecked")" annotation in the
generated parser. Otherwise, Rats! now leverages xtc's new
support for Java 1.5. Performance measurements of the Java parser
show that (1) there is no difference in throughput or heap
utilization between the version using generic types and the version
using raw types and (2) both versions running on the Java 5 virtual
machine are 4-6% slower than previous versions of xtc running on the
Java 1.4 virtual machine for Mac OS X.

The untyped set grammar attribute has been replaced
by the more specific setOfString grammar attribute.
Other type-specific set attributes will be added as needed.

The new "-ast" command line option
instructs Rats! to print a description of a grammar's
abstract syntax tree as an ML-like type definition; it only considers
generic productions.

Parsers now correctly import xtc.tree.Node even if
the dump option is specified. Thanks to Sukyoung Ryu for
identifying this bug.

Semantic predicates spanning several source lines in a grammar now
result in correct code (instead of each line being terminated with
") {"). Thanks to Sukyoung Ryu for identifying this
bug.

Ordered choices in syntactic predicates within generic productions
are now correctly lifted into void productions instead of into
productions returning Object as the semantic value.

A voided expression nested within another voided expression
without parentheses, i.e. "void:void:expr", is
now parsed as a voided expression and not a voided binding to the
(illegal) identifier "void". The redundant voiding
operator is ignored.

Parser actions do not result in "last element in alternative
without semantic value" errors anymore.

If an alternative's first expression is a followed-by syntactic
predicate and that predicate is followed by a string literal, string
match, or parser action, the parser generator now creates correct code
instead of using the yyBase variable without declaring
it. Thanks to Sukyoung Ryu for identifying this bug.

Parsers do not include casts to Pair anymore when
creating generic nodes. These casts became unnecessary with the
improved deduction of semantic values' types in release 1.9.0; this
release (1.10.0) further refines type deduction, notably for the types
of repeated expressions.

In addition to the conversion to Java 5, Rats!' code has
been significantly cleaned up. Notably, the
production's element field, which had
type Element, has been replaced by the more specific
choice field (of type OrderedChoice). Next,
an ordered choice's alternatives are now sequences (and not arbitrary
elements anymore). Finally, all properties used by the parser
generator are now collected in the new Properties
class.

A bug in the implementation of generic nodes has
been fixed: GNode.ensureVariable() does not reverse the
children anymore if it is invoked on a generic node with a fixed
number of children.

Support for language tools has been improved by
adding two new methods to xtc.util.Tool:
The process(String) method recursively processes the file
with the specified name and the wrapUp() method is called
after all files have been processed. Thanks to Hunter Freyer for
suggesting these improvements.

The Java grammar has been changed to support an
optional comma in array initializers and to allow single-line comments
to be terminated by the end-of-file. Thanks to Martin Hirzel for
identifying and fixing these issues.

The Java simplifier now correctly processes this()
and super() call expressions. Thanks to William Moy for
identifying this bug.

Finally, this release changes the C type checker
to correctly use composite types for function definitions following
one or more declarations.

1.9.3 (9/20/06)

Minor bug fix release.

This release fixes bugs when pretty printing switch, case, and
default constructs for Java ASTs. Thanks to William Moy for pointing
out this issue.

Thanks to Martin Hirzel, this release also improves the
documentation for the Java AST simplifier.

1.9.2 (9/12/06)

Minor bug fix release.

Thanks to Martin Hirzel, this release includes further fixes for
simplifying and printing Java abstract syntax trees.

1.9.1 (9/7/06)

Minor bug fix release.

Thanks to Martin Hirzel, this release fixes a bug when processing
assignments during simplification of abstract syntax trees for
Java.

1.9.0 (9/5/06)

Major feature and bug fix release.

xtc now requires JDK 1.5 to build and run.
While xtc still is written in version 1.4 of the Java language, it now
uses classes and interfaces from version 1.5 of the platform
libraries. Notably, all uses of StringBuffer have been
replaced with StringBuilder.

The interface to abstract syntax tree nodes has
been generalized by moving the methods for generic tree traversal and
for adding/removing children from xtc.tree.GNode up
to xtc.tree.Node. As part of that
move, hasChildren() was renamed to isEmpty()
and children() to iterator() to be more
consistent with the Java platform libraries.

To avoid forcing every subclass into implementing these methods,
Node provides default implementations for all methods,
which effectively signal unsupported operation exceptions. Code using
nodes can determine whether a node actually supports generic tree
traversal through the hasTraversal() method and
adding/removing children through the hasVariable()
method. To support generic tree traversal, a subclass only needs to
implement the size(), get(int), and
set(int, Object) methods. To support
adding/removing children, a subclass only needs to implement the
add(Object), add(int, Object)
and remove(int) methods.
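The following sketch illustrates this design with hypothetical stand-in classes (TraversableSketch, LeafSketch, ListNodeSketch), not the actual xtc.tree.Node hierarchy; only a subset of the methods named above is modeled. The base class signals unsupported operations by default, and client code checks hasTraversal() before walking a node's children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a node base class; not xtc.tree.Node.
abstract class TraversableSketch {
  boolean hasTraversal() { return false; }
  boolean hasVariable() { return false; }

  // Defaults signal that the operation is unsupported.
  int size() { throw new UnsupportedOperationException("no generic traversal"); }
  Object get(int index) { throw new UnsupportedOperationException("no generic traversal"); }
  Object set(int index, Object value) { throw new UnsupportedOperationException("no generic traversal"); }
  void add(Object child) { throw new UnsupportedOperationException("fixed number of children"); }

  // Derived operations only rely on the methods above.
  boolean isEmpty() { return size() == 0; }
}

/** A node that opts out of generic traversal by overriding nothing. */
final class LeafSketch extends TraversableSketch { }

/** A node that supports both traversal and a variable number of children. */
final class ListNodeSketch extends TraversableSketch {
  private final List<Object> children = new ArrayList<Object>();

  @Override boolean hasTraversal() { return true; }
  @Override boolean hasVariable() { return true; }
  @Override int size() { return children.size(); }
  @Override Object get(int index) { return children.get(index); }
  @Override Object set(int index, Object value) { return children.set(index, value); }
  @Override void add(Object child) { children.add(child); }
}

final class TraversalDemo {
  /** Count all nodes in a tree, descending only where traversal is supported. */
  static int count(TraversableSketch n) {
    if (!n.hasTraversal()) return 1;
    int total = 1;
    for (int i = 0; i < n.size(); i++) {
      Object child = n.get(i);
      if (child instanceof TraversableSketch) total += count((TraversableSketch) child);
    }
    return total;
  }
}
```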

Support for AST annotations has been improved,
with xtc.tree.Annotation now supporting generic
annotations through the before1(), after1(),
round1(), and variable() factory methods.
Furthermore, the new node type xtc.tree.Token supports
the representation of source file symbols as nodes.

In the presence of annotations and tokens, instance tests and
casts on objects returned from an AST node may not work as expected.
Code processing trees should use getString() to access
string children and getGeneric() to access generic nodes.
Furthermore, it should use Token.test()
and Token.cast() to test for and cast to strings and
GNode.test() and GNode.cast() to test for
and cast to generic nodes.

All code using generic nodes has been updated to reflect the new
interface. Furthermore, xtc.tree.Printer.format() now
accepts any node and uses generic traversal to print that node.

xtc now includes working support for semantic
analysis of C. xtc.lang.CAnalyzer provides a
type checker for C99 and commonly used GCC extensions. While it
successfully passes most of GCC's regression tests, its support for
C99's variable length arrays is not yet complete. It also does not
support GCC's extern inline functions and variables in
specified registers. In support of CAnalyzer,
the xtc.type package has been significantly improved,
notably with a class hierarchy of references to model the memory
layout of lvalues. Several bugs have also been fixed. Furthermore,
the creation of fresh symbols in xtc.util.SymbolTable has
been fixed so that symbols are, in fact, fresh.

The new type checker is invoked through the -analyze
command line option to xtc.lang.CDriver.
The -strict option instructs the C driver to disable
GCC's extensions. The -markAST option instructs the C
driver to annotate AST nodes with their types. Finally,
the -printSymbolTable option instructs the C driver to print the
symbol table after analysis.

The C grammar has been extended with support for
unnamed struct and union fields within structs and unions.
Furthermore, an initialized declarator now starts with an optional
attribute specifier list, shifting all previous component expressions.
Next, the C grammar now recognizes
GCC's __builtin_offsetof() function
and __complex__ as an alternative to
C99's _Complex. Finally, the order of identifiers and
constants in PrimaryExpression has been reversed, so that
wide C character and string constants are now correctly recognized.

This release makes the following changes
to Rats!:

In generic productions, alternatives with semantic actions that
assign yyValue are now treated just like bindings
to yyValue: the parser uses the explicitly specified
value instead of creating a new generic node.

The parser generator now deduces that the semantic value of a
sequence without any elements is null. Module
xtc.util.Null has been updated accordingly, removing the
explicit semantic action.

The parser generator now more precisely deduces the type of
productions containing an automatically recognized null
value. In particular, productions representing desugared options now
have the type of the optional expression and not
necessarily Object anymore.

The parser generator now checks that an expression appearing in a
repetition does not match the empty input. Otherwise, Rats!
reports an error. This check prevents infinite recursions during
parser execution. Thanks to Christine Flood for identifying this
issue.

Similarly, the parser generator now checks that an expression
appearing in an option does not match the empty input.
Otherwise, Rats! reports a warning.
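The two checks on repetitions and options can be sketched as a simple analysis over expression trees. The classes below (PegExpr and its subclasses, PegCheck) are hypothetical and do not reflect Rats!' internal grammar representation. Nonterminals are left out, since handling them would require a fixed-point computation over productions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical PEG expression classes; not Rats!' grammar representation.
abstract class PegExpr {
  /** Can this expression succeed without consuming any input? */
  abstract boolean matchesEmpty();
}

final class PegLiteral extends PegExpr {
  final String text;
  PegLiteral(String text) { this.text = text; }
  @Override boolean matchesEmpty() { return text.isEmpty(); }
}

final class PegSequence extends PegExpr {
  final List<PegExpr> elements;
  PegSequence(PegExpr... elements) { this.elements = Arrays.asList(elements); }
  @Override boolean matchesEmpty() {
    for (PegExpr e : elements) if (!e.matchesEmpty()) return false;
    return true;
  }
}

final class PegOption extends PegExpr {
  final PegExpr body;
  PegOption(PegExpr body) { this.body = body; }
  @Override boolean matchesEmpty() { return true; }
}

/** Zero-or-more repetition. */
final class PegRepetition extends PegExpr {
  final PegExpr body;
  PegRepetition(PegExpr body) { this.body = body; }
  @Override boolean matchesEmpty() { return true; }
}

final class PegCheck {
  /** Collect errors for repetitions and warnings for options over empty matches. */
  static void check(PegExpr e, List<String> errors, List<String> warnings) {
    if (e instanceof PegRepetition) {
      PegExpr body = ((PegRepetition) e).body;
      // A repeated empty match would never make progress.
      if (body.matchesEmpty()) errors.add("repeated expression matches empty input");
      check(body, errors, warnings);
    } else if (e instanceof PegOption) {
      PegExpr body = ((PegOption) e).body;
      // An optional empty match is redundant but harmless.
      if (body.matchesEmpty()) warnings.add("optional expression matches empty input");
      check(body, errors, warnings);
    } else if (e instanceof PegSequence) {
      for (PegExpr child : ((PegSequence) e).elements) check(child, errors, warnings);
    }
  }
}
```

For instance, the sketch reports an error for the equivalent of ("a"?)* because the option inside the repetition can match without consuming input.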

The parser generator now checks that every alternative in a
production sets the semantic value, either because the grammar
specifies the value or because Rats! has deduced the value.
Otherwise, Rats! reports an error. This check preempts Java
compiler errors reporting that "variable yyValue might not have been
initialized".

Module resolution has been modified to preserve the source
location of nonterminals. It does not replace each nonterminal with
the nonterminal from the defining production anymore.

If a grammar has the genericAsVoid attribute,
productions with type Node are now automatically voided
as well.

For generic productions, yyValue is now declared as
Node and not as GNode.

Code generation has been modified so that, if the semantic value
of an optional expression in a generic production is a list (i.e.,
xtc.util.Pair), parsers now add the list's values to the
production's generic node only if the list is not null.
As a result, parsers for grammars containing such expressions do not
fail with a null pointer exception anymore. Thanks to Uwe Simm for
identifying this issue.

The old transformer phase (see release notes for 1.8.2) and any
supporting code have been removed from Rats!.

All grammars have been updated to use Node (instead
of GNode) as the type of productions that pass generic
node values through. That way, they can accommodate annotated
nodes.

The XForm AST query and transformation engine
now supports add and remove operations. For example, "add
Child<> to //Parent" adds a Child node to
all Parent nodes in the AST, and "remove
//SomeName" removes all SomeName nodes from the
AST. Additionally, an out of range or otherwise malformed integer
predicate no longer causes a runtime exception; rather, an empty
sequence is returned.

1.8.2 (8/8/06)

Minor feature and bug fix release.

This release improves Rats! by
featuring a completely rewritten Transformer phase. This
phase deduces semantic values, lifts nested choices, repetitions, and
options, and desugars repetitions and options. The rewritten code is
more modular and (hopefully) more easily maintainable. It also is
more accurate in deducing semantic values and more uniform in
processing (deeply) nested choices, repetitions, and options. As a
result, the rewritten code also fixes a regression identified by
Thomas Moschny.

A set of regression tests for Rats! has been added. The
tests are invoked by typing make check-rats in the
top-level directory of the distribution.

The old version of the transformer phase is still available
through the -oldTransform command line option
to Rats!. However, it is deprecated and will be removed in
the near future.

Error checking of grammars has been improved. In particular:

Rats! now checks that the (unqualified) name of the
top-level module is consistent with the (unqualified) name of its
file.

Rats! also checks that a production does not have both
inline and transient attributes,
since inline subsumes transient.

Finally, Rats! now checks that bindings are not bound
again and that predicates are neither bound nor matched.

The folding of equal sequences has been modified so that it does
not result in a trailing choice of empty alternatives anymore.

Code generation has been modified to avoid declaring and
assigning the yyPredIndex variable if the variable's
value is never used. Thanks to Thomas Moschny (and Eclipse) for
pointing out this issue.

This release improves the Java grammar by adding
support for empty declarations (a semicolon by itself), assert
statements, and class selection expressions. Thanks to Terence Parr
for identifying these issues.

This release also contains a snapshot of the on-going effort
towards supporting semantic analysis. Notably,
the xtc.type package has been significantly improved and
xtc.lang.CAnalyzer has been updated accordingly.
However, for now, typing of C programs still is buggy and
incomplete.

Finally, unnecessary import declarations have been removed
throughout xtc, including from parsers generated
by Rats!.

1.8.1 (6/10/06)

Minor bug fix release.

This release renames xtc.parser.BaseParser to
ParserBase and xtc.parser.PackratParser
to FullParserBase.
Additionally, FullParserBase now inherits from
ParserBase to avoid code duplication.

Next, this release makes the following changes to Rats!'
code generator:

Parser classes now inherit from the
renamed ParserBase and FullParserBase
classes.

Self-assignments of index variables are now suppressed. Thanks to
Thomas Moschny for reporting this issue.

Bindings for optional expressions are now declared with the types
of the bound values, even if several optional expressions with
different types appear in a sequence. Rats!' own grammar has
been updated accordingly, removing now unnecessary explicit casts.
Thanks to Thomas Moschny for reporting this issue.

This release also fixes a bug
in xtc.lang.JavaAstSimplifier
and xtc.lang.JavaPrinter that caused a null pointer
exception when pretty printing simplified method declarations. The
fixed version of JavaAstSimplifier preserves the number
of children in MethodDeclaration AST nodes.

1.8.0 (6/6/06)

Major feature and bug fix release.

This release considerably improves xtc's support for
the semantic analysis of programs. In particular,
the new xtc.util.SymbolTable class implements a scoped
symbol table that easily integrates with AST traversal through xtc's
visitors. The new xtc.type package provides
representations for a program's types. It currently covers all of C's
and Java's types (as of JDK 1.4). The
new xtc.lang.CAnalyzer visitor leverages the new classes
to fill in the symbol table for a program and to check semantic
correctness along the way. However, CAnalyzer is still
incomplete and buggy.
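
To illustrate the idea of a scoped symbol table, here is a minimal sketch in present-day Java. It is not the API of xtc.util.SymbolTable; the class name ScopedTableSketch and its methods (enter, exit, define, lookup) are hypothetical, and types are represented as plain strings for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the API of xtc.util.SymbolTable.
final class ScopedTableSketch {
  // The head of the deque is the innermost scope.
  private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

  ScopedTableSketch() {
    scopes.push(new HashMap<>()); // the global scope
  }

  // Open a nested scope, e.g., when a visitor enters a block.
  void enter() {
    scopes.push(new HashMap<>());
  }

  // Close the innermost scope, e.g., when a visitor leaves a block.
  void exit() {
    if (scopes.size() == 1) {
      throw new IllegalStateException("cannot exit the global scope");
    }
    scopes.pop();
  }

  // Define a name in the innermost scope.
  void define(String name, String type) {
    scopes.peek().put(name, type);
  }

  // Look up a name, searching from the innermost to the outermost scope.
  String lookup(String name) {
    for (Map<String, String> scope : scopes) {
      String type = scope.get(name);
      if (type != null) return type;
    }
    return null; // not defined in any enclosing scope
  }
}
```

A visitor would typically call enter() before traversing a block's children and exit() afterwards, so that inner definitions shadow outer ones only while the block is being processed.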

The new interface xtc.Limits specifies
the integer range limits for a local system's C
compiler. The version distributed with xtc's release is consistent
with GCC for Mac OS X on the PowerPC and for Mac OS X, Linux, and
Windows on x86 processors. limits.c in the same package
can be used to generate the correct limits for other operating systems
and architectures.

Next, the C grammar has been changed as
follows:

Typedef's enum constants are now treated as regular identifiers
and not as type aliases anymore.

Redefinitions of variables, functions, and typedef names within
the same scope are now ignored by the C parser's internal state, as
they are erroneous anyway.

Each component of an integer or floating point type specifier and
each kind of storage class specifier now results in the creation of a
separate AST node. The pretty printer has been changed accordingly.
This change simplifies semantic analysis.

Array qualifiers are now represented by regular type
specifiers/qualifiers instead of dedicated array qualifier nodes. The
pretty printer has been changed accordingly.

The recognition of GCC attributes now is more accurate. The
productions and AST nodes for initialized declarators and bit fields
have been modified, and a new production and AST node for attributed
abstract declarators have been added. The pretty printer has been
changed accordingly.

Support for imaginary numbers has been removed, since they
(thankfully) are not part of the C standard anymore.

xtc.lang.CParserState, which is used to disambiguate
typedef names from object, function, or enum constant
names, has been changed to support subclassing and thus to simplify
the implementation of extensions to the C language.

Next, the Java grammar has been improved by using
more descriptive names for a large number of productions, by
optimizing several productions, and by eliminating the creation of
unnecessary AST nodes. The Java printer has been updated
accordingly.

Both the recognizer-only and the AST-building Java parsers are
now generated from the same grammar through the
new genericAsVoid grammar attribute (see below). The
top-level module for both versions is xtc.lang.Java and
the corresponding parsers now are xtc.lang.JavaRecognizer
(no AST) and xtc.lang.JavaParser (AST).

To better evaluate and compare parser performance, the Java
driver can now generate ASTs when using JavaCC- or
ANTLR-generated parsers. The AST-building JavaCC grammar has been
generated with Java Tree
Builder (version 1.2.2) from the original JavaCC grammar (dated
5/5/02). The AST-building ANTLR grammar is distributed by the ANTLR
project, with the recognizer-only version being manually derived from
the original. Both versions of the ANTLR grammar have been updated to
version 1.21.

The xtc distribution now contains support
for SDF- and Elkhound-generated Java
parsers (again to evaluate and compare parser performance):

SDF

The new top-level glr directory contains Java 1.5 and
1.4 grammars for SDF. The 1.5 version is the grammar from the
java-front
0.8 distribution (with a differently named top-level module) and
the 1.4 version has been derived from the former by removing support
for generics, the enhanced for loop, typesafe enums, varargs, static
imports, and metadata. The glr/buildsdf.sh script is
used to generate the corresponding parse tables and
the data/sdf.sh script is used to perform a performance
evaluation. The buildsdf.sh script depends on the
pack-sdf and sdf2table tools, while the
sdf.sh script depends on the sglr
and sglri tools.

Elkhound

The Elkhound-based Java parser, called Ella, is contained in the
glr/ella directory. It includes the corresponding
lexical, syntactic, and AST specifications as well as any supporting
C++ code. Ella depends on
the smbase, ast, elkhound, and
elsa packages from Elkhound's source distribution. It
can be built by copying the corresponding directories into
the glr directory and then
executing ./configure and make in that
directory. The data/ella.sh script is used to evaluate
Ella's performance.

This release makes the following changes
to Rats!:

The grammar-wide reserved attribute has been replaced
with the new set attribute, which results in the
generation of a static final set with the attribute's value as its
name. It also results in the inclusion of a convenience
method add(Set,Object[]) for filling this set. The
XForm, C, and Java grammars have been modified accordingly.

The new grammar-wide genericAsVoid attribute can be
used to generate a parser that only recognizes a language but does not
build an AST from the same tree-building grammar. It is now used for
generating the recognizer-only Java parser from
the xtc.lang.Java module.

Nonterminals may contain the underscore character again, but only
within a name and not at its beginning or end.

The cost (-Ocost), choices2
(-Ochoices2), and prefixes (-Oprefixes)
optimizations are now enabled by default. The choices2 optimization
now only inlines productions that have been marked with the
new inline attribute. Otherwise, this attribute is
semantically equivalent to transient.

The new gnodes optimization (-Ognodes) leverages
xtc.tree.GNode's new factory methods to create leaner
generic nodes. It is enabled by default.

The new -lgpl option generates parsers that are not
restricted by the GPL. Parsers generated with this option use the
new xtc.parser.BaseParser base class, which, unlike
xtc.parser.PackratParser, does not reference any classes
released under the GPL.

A performance bug in the select optimization
(-Oselect) has been fixed. Thanks to Laune Harris for
helping to identify and fix this issue.

A long-standing bug in the application of actions
(xtc.util.Action) has been fixed. Actions are used to
construct left-recursive data structures from right-recursive
productions. But their application did not annotate nodes with
source code locations; this has been fixed through the new
PackratParser.apply(Pair, Object, int) method.
Furthermore, the Action class has been turned into an
interface.

A bug in the code generator has been fixed so that bindings with
the same name occurring in subsequent repeated or optional expressions
do not result in compiler errors anymore.

As a result of these changes, the throughput of the AST-building Java
parser has improved by 31.5% and the throughput of the C parser has
improved by 52%.

Thanks to Laune Harris, this release makes the following major
changes to XForm:

XForm's expressivity has been significantly improved. Notably,
XForm now supports node insertion ("insert before" and
"insert after") and set difference
("differ"). Next, arbitrary expressions including
function calls can now appear in predicates. Finally, function
arguments can now be sequences, strings, or integers instead of just
integers.

The XForm function library has been extended, including support
for sequence and string manipulation. Functions defined by XForm do
not need to be explicitly imported anymore.

The XForm driver supports more flexible command line options,
including for specifying the language parser and pretty printer as
well as for measuring engine performance.

New example queries have been added to
the xform/samples directory. In addition, an example Java
language extension has been added
to xform/samples/javaproperty.

Additionally, several minor XForm bugs have been fixed.

Java's access control is now disabled for xtc's visitor
dispatch. As a result, visitors can now be specified as
anonymous inner classes. For example, xtc.lang.CAnalyzer
uses this feature to analyze declaration specifiers and
declarators.

Generic nodes now need to be created through a
set of factory methods; look for the create() methods
in xtc.tree.GNode. Several of these methods directly
accept a generic node's children and return generic nodes that are
specialized for the specified number of children. As a result, such
fixed size nodes do not support
the add(), addAll(),
and remove() methods defined
by xtc.tree.GNode. They can be distinguished from
variable sized nodes through isVariable() and converted
to variable sized nodes through ensureVariable(GNode).
Rats!' new gnodes optimization (see above) utilizes these
factory methods to reduce the memory and performance overhead of
parsers with generic productions.
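
The distinction between fixed size and variable sized nodes can be sketched as follows. This is a simplified illustration, not xtc.tree.GNode itself: the class NodeSketch and its nested classes are hypothetical, only a two-child specialization is shown, and the assumption that ensureVariable returns a copy for fixed size nodes is ours.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generic node; this is not xtc.tree.GNode.
abstract class NodeSketch {
  final String name;

  NodeSketch(String name) { this.name = name; }

  abstract int size();
  abstract Object get(int index);
  abstract boolean isVariable();

  // Fixed size nodes inherit this default and reject structural changes.
  void add(Object child) {
    throw new UnsupportedOperationException("fixed size node: " + name);
  }

  // Factory for a node with a variable number of children.
  static NodeSketch create(String name) { return new Variable(name); }

  // Factory for a node specialized for exactly two children.
  static NodeSketch create(String name, Object c1, Object c2) {
    return new Fixed2(name, c1, c2);
  }

  // Return the node itself if it is variable, or a variable sized copy
  // otherwise (copying is an assumption of this sketch).
  static NodeSketch ensureVariable(NodeSketch node) {
    if (node.isVariable()) return node;
    NodeSketch copy = create(node.name);
    for (int i = 0; i < node.size(); i++) copy.add(node.get(i));
    return copy;
  }

  // Stores its children in fields, avoiding the overhead of a list.
  private static final class Fixed2 extends NodeSketch {
    private final Object c1, c2;
    Fixed2(String name, Object c1, Object c2) {
      super(name);
      this.c1 = c1;
      this.c2 = c2;
    }
    int size() { return 2; }
    Object get(int index) {
      switch (index) {
        case 0: return c1;
        case 1: return c2;
        default: throw new IndexOutOfBoundsException("index: " + index);
      }
    }
    boolean isVariable() { return false; }
  }

  // Stores its children in a growable list.
  private static final class Variable extends NodeSketch {
    private final List<Object> children = new ArrayList<>();
    Variable(String name) { super(name); }
    int size() { return children.size(); }
    Object get(int index) { return children.get(index); }
    boolean isVariable() { return true; }
    @Override void add(Object child) { children.add(child); }
  }
}
```

Code that needs to add children to a node of unknown origin would first pass it through ensureVariable and then work with the returned node.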

This release introduces improved support for building
language tools with xtc. In particular, the
new xtc.util.Runtime class manages command line options,
errors and warnings, and output to the standard console. The
new xtc.util.Tool class provides a skeleton tool
implementation, including support for several default command line
options. Rats! and the C, Java, and XForm drivers have been
rewritten to utilize both classes. Note that, as a result of this
rewrite, some command line options for these tools have changed.

This release also introduces our first unit
tests. We rely
on JUnit as our unit
testing framework and JUnit's binary release (junit.jar)
must be in the classpath. Thanks to Anh Le, this release also
introduces our first regression tests, based on GCC's
regression tests. Just like GCC, we rely
on expect
and DejaGnu to
perform these tests. The
description of our development setup
and the sample shell scripts (setup.bat
and setup.sh) have been updated accordingly.

xtc now builds with JDK 1.5 by passing
the -source 1.4 flag to the javac compiler.
All sources remain at Java version 1.4.

xtc's licensing has been changed: Most of the
code is now released under the GNU General Public License (GPL)
version 2. The exceptions
are xtc.parser.BaseParser, xtc.parser.Column,
xtc.parser.Result, xtc.parser.SemanticValue,
xtc.parser.ParseError, xtc.tree.Location,
and xtc.util.Pair, which are released under the GNU
Lesser General Public License (LGPL) version 2.1. The main licensing
change is that the option of using later versions of the GPL and LGPL
has been removed.

Thanks to Marco Yuen and Marc Fiuczynski, this release
incorporates C4, the
CrossCutting C Compiler. C4 makes aspect-oriented
software development techniques available to C programmers, with the
goal of simplifying the development of software variants, notably for
the Linux kernel.

1.7.1 (8/17/05)

Minor feature and bug fix release.

This release makes the following changes to Rats!:

The new grammar-wide reserved attribute results in
the generation of a static final set of reserved
identifiers RESERVED and a convenience
method reserve(String[]) for filling this set. This
attribute eliminates the need for explicitly defining this set in a
body action (though the set still has to be filled in an action).

The new grammar-wide flag attribute results in the
generation of a static final boolean with the attribute's value as its
name. This attribute eliminates the need for explicitly defining such
a flag in a body action.

To effectively support the new flag attribute, the
processing of attributes has been updated. As a result, attributes
such as transient, whose values used to be ignored, now
must not have values. Internally, the
class xtc.tree.AttributeList has been added and
xtc.tree.Attribute.equals() has been changed to take an
attribute's value into account.

A modifying module's attributes now override the modified module's
attributes, with the exception of any
stateful, reserved, or flag
attributes, which are preserved. When pretty printing modules with
the -html command line option, globally effective
attributes are now highlighted (assuming
the grammar.css
stylesheet contained in the source distribution's root directory also
is in the same directory as the HTML files).

A modifying module's header, body, and footer actions are now
combined with the modified module's actions.

Modules may now contain no productions at all. This is useful for
separating header, body, and footer actions as well as globally
effective attributes
(i.e., stateful, reserved,
and flag).

Stateful or resetting productions may now appear in a module if
any of the dependent modules has a grammar-wide stateful
attribute and not just the module itself.

Public productions in dependent modules are not treated as
top-level productions anymore.

Qualified nonterminals are now resolved correctly, even if the
corresponding production is defined in a module modified by a module
that is imported by the referencing module. Furthermore, the speed of
look-ups in presence of multiple definitions (across all grammar
modules) has been improved.

Sequence names are now preserved when copying sequences.
Furthermore, productions are now correctly removed
by Analyzer.remove(Module). Finally, ambiguous
nonterminals are now always detected. As a result of these bug fixes,
it is now possible to apply multiple independent modifications to the
same base module. Thanks to Martin Hirzel for identifying the first
bug (whose resolution triggered discovery of the other two).

Error locations are now formatted as
filename:linenumber:column-number
to better integrate with Emacs. Thanks to Martin Hirzel for
suggesting this improvement.

The C, Java, and XForm grammars have been modified to utilize the
new attributes. Additionally, the C and Java grammars have been
further modularized, up to the respective top-level module, which now
simply modifies another, parameterized module.

Additionally, this release makes the following changes to xtc's C
support:

xtc.lang.CSymbolTable has been renamed
to CParserState to emphasize that it does not implement a
full symbol table.

xtc.lang.CCounter can now print its own statistics
through the print(xtc.tree.Printer) method. It has also
been updated to reflect recent changes in the C grammar. The C driver
has been updated accordingly.

xtc.tree.GNode's interface has been improved. In
particular, numberOfChildren() has been renamed
to size(), addAll(List) has been changed
to addAll(Collection), and add(int,Object),
addAll(int,Pair), and addAll(int,Collection)
have been added.

A bug in XForm, which caused the result of a query to contain
internal item objects, has been fixed.

1.7.0 (8/9/05)

Major feature release.

In short, this release adds a module system to Rats!, adds
support for building and printing an AST in the Java driver, fixes
several bugs in the C parser and printer, and includes a significantly
improved XForm, our AST query and transformation engine.

In more detail, this release introduces a simple yet powerful
module system for Rats!. The module system
supports basic modules to factor grammars into re-usable units. It
supports module modifications to concisely specify extensions.
Finally, it supports module parameters to easily compose different
extensions with each other. As a result, the format of grammar
specifications has been changed and grammars not distributed with this
release need to be modified. The module system is described in detail
in the package documentation for xtc.parser.

To get a peek at modules, execute the following command
in src/xtc/lang:

java xtc.parser.Rats -in ../.. -instantiated -html C.rats

Then open the resulting xtc.lang.C.html file in your web
browser and explore.

This release makes the following, additional changes
to Rats!:

The search path for grammar modules can be explicitly specified
from the command line by using one or more -in options.
If no such options are present, the search path is the current
directory.

To help understand and debug grammars, Rats! can now
print all modules after loading (through the -loaded
command line option), after instantiating
(-instantiated), after applying modifications
(-applied), and after all processing
(-processed). If the -html command line
option is present (as illustrated above), the last three printing
options will generate hyperlinked HTML in Rats!' output
directory (which can be set with -out). The
corresponding stylesheet is grammar.css.

Grammar-wide attributes can be specified from the command line by
using one or more -option command line options. Most
attributes have also been renamed. Notably, debug is
now verbose, constantBinding is now
constant, state is now stateful,
reset is now resetting, ignoreCase
is now ignoringCase, and location is now
withLocation. Furthermore, mainMethod is
now main and usePrinter is
now printer.

The character constant has been changed from '.' to
'_'. Nonterminals may not contain underscores
anymore.

A NullPointerException when processing undefined
nonterminals in xtc.parser.TextTester has been
eliminated.

A NullPointerException when processing optional
sequences with no bindable value
in xtc.parser.MetaDataSetter has been eliminated. Thanks
to Stacey Kuznetsov for identifying this bug.

Voided repetitions and options do not result in unnecessary
warnings (indicating a lack of a bindable element) anymore. Thanks to
Stacey Kuznetsov for identifying this bug.

The processing of nested repetitions and options in generic
productions has been improved so that repeated or optional elements
are only bound if strictly necessary.

The voiding of unbound productions has been improved to recognize
more opportunities.

The reporting of parse errors has been improved. The new methods
PackratParser.format(ParseError)
and PackratParser.print(ParseError) simplify the display
of parse errors, while the new
exception xtc.parser.ParseException simplifies the
propagation of parse errors.

The Java driver can now optionally build an
abstract syntax tree and also pretty print that tree. Thanks to
Stacey Kuznetsov for implementing the necessary changes.

The C grammar and pretty printer have been
improved as follows:

GCC attributes can now appear at the end of parameter
declarations. Thanks to Marco Yuen for identifying this bug and
suggesting a fix.

Obsolete GCC field designations when initializing structures and
unions are now parsed correctly and printed as standard C field
designations. Thanks to Marco Yuen for identifying this
shortcoming.

Compiler directives such as line markers nested in structures and
unions are now added as annotations to the correct nodes and printed
correctly.

Field names in structures and unions do not shadow type aliases
anymore. Thanks to Marco Yuen for identifying this bug.

XForm, the query and transformation engine, has
been improved as follows. Thanks to Joe Pamer for realizing these
changes.

XForm now supports the or and and
logical operators.

ASTs can now be traversed inside-out (or bottom-up), instead of
only outside-in (or top-down), by using the inside_out
operator.

The structure of results as lists of lists (for example, when
evaluating comma-separated expressions) is now preserved. If
necessary, XForm iterates over individual elements as if such lists of
lists were flat.

Replacement expressions now return the inserted items instead of
the AST's root.

The engine implementation has been rewritten for efficiency, with
considerable savings in heap utilization. In our experiments, the
number of xtc.xform.Item objects allocated while
performing a query has been reduced by a factor of 90.

1.6.1 (6/11/05)

Minor bug fix release. This release eliminates
NullPointerExceptions
in xtc.lang.CPrinter.visitStructureDeclarationList() and
in xtc.xform.Item.equals().

1.6.0 (6/11/05)

Performance tuning release. This release focuses on improving
performance and a corresponding code clean-up; as a
result, this release may break existing code.
Performance tests on a 2002 iMac (with an 800 MHz PowerPC G4 processor
and 1 GB of RAM) show that Java driver throughput has improved by 49%,
from 256 KB/s up to 382 KB/s, and heap utilization has improved by
25%, from 58:1 (i.e., 58 bytes of heap per 1 byte in the input) down
to 43:1. C driver performance for parsing and pretty printing the
entire Linux 2.6.10 kernel (~1,000 files) has improved by 35%, from
211 minutes down to 137 minutes. Improvements are similar for a
faster machine: C driver performance for parsing and pretty printing
the Linux kernel on a 2004 PowerMac (with two 2.5 GHz PowerPC G5
processors and 1 GB of RAM) has improved by 34%, from 56 minutes down
to 37 minutes. All our C driver experiments used a Java heap size of
512 MB (both minimum and maximum size); performance improvements for
configurations with smaller heaps are likely to be much more
pronounced.

In detail, this release makes the following performance-related
improvements:

Input files are not buffered in their entirety anymore. The
corresponding -buffer and -nobuffer command
line options for Rats! and the C and Java drivers have been
removed. Parse error reporting now uses a new method
(xtc.parser.PackratParser.lineAt()), as input lines are
not directly available
anymore. xtc.util.Utilities.msg(), which is used for
error printing, has been changed accordingly.

All output is now buffered and not flushed on each newline
anymore. xtc.tree.Printer's constructors have been
changed accordingly.

Productions that are only referenced once within a grammar are now
automatically marked as transient. While such productions were not
memoized before, repetitions appearing in such productions were
desugared into the equivalent right-recursions. With this change,
repetitions are not desugared anymore. The command line option
corresponding to this optimization is -Onontransient (for
"optimize non-transient productions"). This optimization is enabled
by default. Since this optimization creates new transient
productions, the -Oerrors2 optimization is now disabled
by default.

The C and Java grammars have been modified to optimize the
recognition of hierarchical syntax. Based on a simple, albeit manual,
analysis, most productions are now marked as transient (unless they are
only referenced once in a grammar and thus automatically recognized as
transient; see above). The analysis compares the tokens appearing
before all occurrences of a nonterminal. If they are all different,
the corresponding production is marked as transient.

The C and Java grammars have also been modified to optimize the
recognition of identifiers and keywords. Additionally, the Java
grammar now relies on Character.isJavaIdentifierStart()
and Character.isJavaIdentifierPart() instead of
(incorrectly specified) explicit character classes.

Where possible, options are not desugared and lifted into their
own productions anymore, but rather implemented directly. The
corresponding command line option for Rats!
is -Ooptional. This optimization is enabled by default.
Note, however, that this optimization may result in a loss of accuracy
for deducing the type of a binding. For example,
if foo:Foo? and bar:Bar? both appear in the
same production, with Foo having String as
its type and Bar having Pair, then the
declared type for both foo and bar is the
common supertype Object.

Direct left-recursive productions are now transformed into
equivalent (transient) iterations instead of (memoized)
right-recursions. The corresponding command line option
for Rats! is -Oleft2. This optimization is
enabled by default. The previously supported transformation into
right-recursions is still available through the -Oleft1
command line option.
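
The effect of this transformation can be illustrated with a small hand-written parser. The sketch below is not code generated by Rats!; the class LeftRecursionSketch and the example grammar are hypothetical. It recognizes the directly left-recursive production Expr <- Expr '-' Number / Number as the equivalent iteration Number ('-' Number)*, while still producing a left-associative result.

```java
// Hypothetical hand-written parser; not code generated by Rats!.
final class LeftRecursionSketch {
  private final String input;
  private int pos = 0;

  LeftRecursionSketch(String input) { this.input = input; }

  // Expr <- Expr '-' Number / Number, recognized as Number ('-' Number)*.
  // The loop folds each operand into the running value from the left,
  // so "10-3-2" is evaluated as (10-3)-2.
  int parseExpr() {
    int value = parseNumber();
    while (pos < input.length() && input.charAt(pos) == '-') {
      pos++; // consume '-'
      value = value - parseNumber();
    }
    return value;
  }

  // Number <- [0-9]+
  private int parseNumber() {
    int start = pos;
    while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
    if (start == pos) {
      throw new IllegalArgumentException("digit expected at index " + pos);
    }
    return Integer.parseInt(input.substring(start, pos));
  }

  // Parse the entire input and return the value of the expression.
  static int evaluate(String input) {
    LeftRecursionSketch parser = new LeftRecursionSketch(input);
    int value = parser.parseExpr();
    if (parser.pos != input.length()) {
      throw new IllegalArgumentException("unexpected input at index " + parser.pos);
    }
    return value;
  }
}
```

Because the loop keeps no per-operand stack frame and no memoized intermediate result, long operator chains are processed in constant stack space.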

The parser code generated for string matches has been optimized.
The corresponding command line option for Rats!
is -Omatches. This optimization is enabled by
default.

The parser code generated for selecting the more specific parse
error has been optimized. The corresponding command line option
for Rats! is -Oselect. This optimization is
enabled by default.

The parser code for setting a node's location has been optimized
to avoid unnecessary instanceof tests and casts. To this
end, Rats! now interprets any import statements in a
grammar's header and tries to analyze the corresponding
classes.

The C and Java drivers now print overall performance statistics
based on a least squares fit of individual data points. The
statistics are the parser throughput in KB/s and the memory
utilization in bytes per byte in the input.
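
As a rough illustration of how a throughput figure can be derived from a least squares fit, consider the sketch below. It is not the drivers' actual code: the class LeastSquaresSketch is hypothetical, and it assumes that each data point pairs an input size in KB with a parse time in seconds and that the fitted line includes an intercept for fixed per-file overhead.

```java
// Hypothetical illustration; not the code used by the C and Java drivers.
final class LeastSquaresSketch {
  // Slope b of the least squares line y = a + b * x.
  static double slope(double[] x, double[] y) {
    if (x.length != y.length || x.length < 2) {
      throw new IllegalArgumentException("need at least two matching data points");
    }
    int n = x.length;
    double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
    for (int i = 0; i < n; i++) {
      sumX += x[i];
      sumY += y[i];
      sumXY += x[i] * y[i];
      sumXX += x[i] * x[i];
    }
    return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
  }

  // Throughput in KB/s: the inverse of the fitted seconds-per-KB slope.
  static double throughput(double[] sizesKB, double[] seconds) {
    return 1.0 / slope(sizesKB, seconds);
  }
}
```

Fitting a line instead of averaging per-file ratios keeps small files, whose times are dominated by fixed overhead, from skewing the overall figure.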

The new grammar-wide dumpTable attribute results in
the generation of a method, dump(xtc.tree.Printer), to
print the memoization table in a human-readable format. The dump can
be used for analyzing allocation patterns. The C and Java drivers
include a corresponding command line option (-memo),
which causes the memoization table to be printed after a successful
parse. Note, however, that the C and Java grammars do not include the
dumpTable attribute by default. The
new xtc.parser.TableAnalyzer utility collects and prints
(minimal for now) statistics for a previously dumped memoization
table.

Thanks to Adam Kravetz for helping to identify several
opportunities for optimizations.

This release also cleans up the interface between nodes and
visitors. In particular, dispatch can now only be initiated by
calling Visitor.dispatch(Node) (instead
of Node.accept(Visitor)). Furthermore, processing
methods specified as part of nodes are now
named Node.visitWith(Visitor) (instead
of Node.process(Visitor)). In contrast
to accept(), dispatch()
handles null nodes, doing nothing and
returning null. Furthermore, if the
selected visit() or visitWith() method
has void as its return type, dispatch()
returns the specified node (instead of null).
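
The dispatch behavior described above can be approximated with a few lines of reflection. The following sketch is not xtc's implementation: NodeStub and VisitorSketch are hypothetical classes, method lookup is deliberately naive and uncached, and the call to setAccessible is an addition of this sketch so that visitors written as anonymous classes work.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical node type; not xtc.tree.Node.
class NodeStub {
  final String text;
  NodeStub(String text) { this.text = text; }
}

// Hypothetical visitor base class; not xtc.tree.Visitor.
class VisitorSketch {
  // Dispatch on the node: null yields null, a void visit() method yields
  // the node itself, and any other visit() method yields its result.
  Object dispatch(NodeStub node) {
    if (node == null) return null;
    Method method = findVisit(node.getClass());
    if (method == null) {
      throw new IllegalStateException("no visit() method for " + node.getClass().getName());
    }
    try {
      method.setAccessible(true); // allow non-public visitors (assumption of this sketch)
      Object result = method.invoke(this, node);
      return method.getReturnType() == void.class ? node : result;
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException(e);
    }
  }

  // Search this visitor's class hierarchy for a one-argument visit() method
  // that accepts the node's type.
  private Method findVisit(Class<?> nodeType) {
    for (Class<?> c = getClass(); c != null; c = c.getSuperclass()) {
      for (Method m : c.getDeclaredMethods()) {
        if (m.getName().equals("visit")
            && m.getParameterCount() == 1
            && m.getParameterTypes()[0].isAssignableFrom(nodeType)) {
          return m;
        }
      }
    }
    return null;
  }
}
```

A subclass only needs to declare visit(NodeStub) methods; callers always go through dispatch(), which is what makes the null handling and the void convention uniform.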

Rats!' internal visitors have been updated to utilize
dispatch(). Additionally, many visitors have been
refactored to utilize a common
superclass, xtc.parser.GrammarVisitor, which reduces code
bloat across Rats!' internal visitors. All visitors
in xtc.lang were already
using dispatch().

Furthermore, this release makes the following changes
to Rats!:

A grammar's global state object now must be reset
explicitly by marking the corresponding productions with
the reset attribute. Rats!' own grammar and the
C grammar have been modified accordingly.

Top-level nonterminals may now be declared as void. The Java
grammar has been modified accordingly.

Rats!-generated parsers now support incremental parsing
through the resetTo(int) and isEOF(int)
methods. Incremental parsing is useful for processing interactive or
very large inputs. It is now used by
the xtc.lang.CDriver by default (and disabled through
the -noincr command line option).

Direct left-recursions in void and text-only productions are now
automatically transformed into the corresponding right-recursions,
comparable to the transformation of direct left-recursions in generic
productions. The speed of recognizing transformable productions
in Rats! has also been improved. The Java grammar's
productions for recognizing expressions have been rewritten
accordingly (i.e., left-associative operators are now expressed
through left-recursive productions).

Support for generic text productions (i.e., productions with
pseudo-type "gstring") has been removed. They
provide little benefit with considerable complexity (and code
duplication). The two generic text productions in the C grammar have
been rewritten as regular generic productions.

The new ignoreCase attribute instructs Rats!
to perform comparisons for string matches in a case-insensitive
manner. It applies to either the entire grammar or individual
productions and is useful for languages with case-insensitive
keywords. Note that comparisons for string literals, character
literals, and character classes continue to be case-sensitive, even in
the presence of this attribute. Thanks to Ken Britton for suggesting
this feature and providing me with a prototype
implementation.
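
Conceptually, the attribute changes how the text matched by an expression is compared with the expected string. The sketch below is only an approximation of that idea, not code generated by Rats!; the class StringMatchSketch and its method are hypothetical.

```java
// Hypothetical illustration; not code generated by Rats!.
final class StringMatchSketch {
  // Compare the text matched by an expression with the expected string,
  // optionally ignoring case as requested by the attribute.
  static boolean stringMatch(String expected, String matchedText, boolean ignoreCase) {
    return ignoreCase
        ? expected.equalsIgnoreCase(matchedText)
        : expected.equals(matchedText);
  }
}
```

With ignoreCase set, a keyword such as "select" would be accepted whether the input spells it "select", "SELECT", or "Select".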

Rats! now generates an error if transient is
used as a production's type and a warning if any other per-production
attribute is used as a production's type. In either case, the actual
type is probably missing from the production.

The simplification of grammars has been improved, with the result
that fewer repetitions and options need to be lifted into their own
productions.

If the semantic value of a bound repetition in a transient
production cannot be determined, Rats! now issues a warning
and does not fail with a NullPointerException anymore.
If the value of the repeated element is the value of a nonterminal,
the corresponding production is not voided anymore, even if the
repetition is automatically recognized as a production's semantic
value.

Bound repetitions or options in text-only productions are now
always lifted into their own productions, independent of whether they
are desugared or not. As a result, they now produce the correct
semantic value, namely the matched input as a string.

String matches against repetitions or options, which are not
desugared (e.g., because the production is transient), are now
processed correctly by lifting the repetitions or options into their
own productions. As a result, they do not result in a
ClassCastException anymore.

Reference counting now counts nonterminals appearing within
once-or-more repetitions in non-transient productions twice, as the
nonterminal will appear twice in the desugared version. Furthermore,
the recursive nonterminal in directly left-recursive productions is
not counted at all anymore, which is consistent with the transformed
version. As a result of these changes, reference counting can now be
performed before applying transformations, i.e., the Transformer,
DirectLeftRecurser, and Generifier
visitors.

The folding of duplicate productions now takes into account
whether productions are recognized as text-only.

A production containing a lone nonterminal is now inlined only if
it is transient. That way, a production containing a lone nonterminal
can be used to force memoization of another, transient
production.

This release also adds support for local label declarations to
the C grammar and pretty printer. Additionally, the C grammar, symbol
table, and pretty printer have been modified, so that annotations
encapsulating regular AST nodes now represent the compiler directives
preceding that node's text in the input (instead of the other way
around).

The new xtc.xform package provides a facility for
querying and transforming abstract syntax trees (ASTs). The query
language is inspired by XPath 2.0, but has some significant
differences, notably the ability to destructively modify ASTs. Thanks
to Joe Pamer for implementing the query and transformation engine.

1.5.2 (3/7/05)

Minor feature and bug fix release. This release
changes Rats! so that all repetitions appearing in
transient productions are implemented through iterations and are not
desugared into the corresponding recursive expressions. This behavior
can be used to avoid stack overflow errors for long sequences of
expressions. This release also fixes a bug in Rats! that
caused repeated sequences to be lifted too aggressively.
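
The difference between the two implementation strategies can be
illustrated with a minimal, hand-written sketch. It is not
Rats!-generated code; the class and method names are hypothetical, and
the repetition 'a'* stands in for an arbitrary repeated expression.
The recursive form uses one stack frame per matched element, so very
long inputs may exhaust the stack, while the iterative form runs in
constant stack depth.

```java
import java.util.Arrays;

// Hypothetical illustration only; not produced by Rats! and not part of xtc.
class RepetitionSketch {

  /** Desugared form, As <- 'a' As / empty: one stack frame per match. */
  static int matchRecursive(String s, int index) {
    if (index < s.length() && s.charAt(index) == 'a') {
      return matchRecursive(s, index + 1);
    }
    return index; // The empty alternative: stop and report the position.
  }

  /** Iterative form of 'a'*: a plain loop with constant stack depth. */
  static int matchIterative(String s, int index) {
    while (index < s.length() && s.charAt(index) == 'a') {
      index++;
    }
    return index;
  }

  public static void main(String[] args) {
    // Build a long run of 'a' characters.
    char[] cs = new char[1_000_000];
    Arrays.fill(cs, 'a');
    String longInput = new String(cs);

    // The loop handles the long input without deep recursion.
    System.out.println(matchIterative(longInput, 0));

    // The recursive form is only exercised on a short input here.
    System.out.println(matchRecursive("aaab", 0));
  }
}
```

Both methods return the index just past the last matched character,
so they agree on every input that the recursive form can process.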

All tools now return appropriate exit codes: 0 on successful
execution and 1 on error conditions.

This release also improves the C grammar and pretty printer. In
particular, it fixes bugs in:

the recognition of structure and union declarations
(in particular, typedef declarations now only introduce
the identifiers in the declarator list as type names and field names
now properly override type names when preceded by a type
specifier),

the recognition of goto
statements,

the recognition of long
and long long constant suffixes (such
as LL),

the pretty printing of unary expressions with
the +, -, and &
operators,

the pretty printing of postdecrement
expressions,

the pretty printing of compound literals.

For some nested expressions (such as arithmetic expressions appearing
as operands for the bitwise or operator), the pretty printer now emits
parentheses to avoid warnings when compiling the resulting code with
GCC under the -Wparentheses command line option.

Additionally, the C grammar and pretty printer now support the
following (GCC) extensions:

#ident directives in preprocessed
code,

empty external definitions (i.e., a semicolon by
itself),

structures and unions with no members,

extra semicolons in structure and union member
declarations,

typeof (and underscored variations) as
a type specifier,

underscored variations of the signed
type specifier,

ranges in case labels,

labels without statements at the end of compound
statements,

assembly statements,

statements and declarations in expressions,

ranges in array designators,

__alignof__ as an expression,

the __builtin_va_arg() function (which
takes a type name as its second argument),

labels as values,

attributes (the full GCC mess),

underscored variations of
the const, volatile
and restrict type qualifiers,

the __extension__ specifier.

The C grammar now accepts source files with just white space and
comments. Furthermore, Rats!-generated parsers, when created
with an explicit file size argument to the constructor, now accept
empty files (i.e., of length 0).

The overall effect is that the C driver
(xtc.lang.CDriver) now parses and pretty prints the
entire Linux kernel (version 2.6.8). The resulting source code
compiles with GCC under the -Wall command line option
without warnings.

Thanks to Marc Fiuczynski for identifying most of the bugs and
missing language constructs and for testing the C driver against the
Linux kernel.

This release also changes the format of pretty printed ASTs to be
more compact (and to be consistent with the AST query language
currently being developed). Pairs (xtc.util.Pair) are
now mutable, but should still be treated as immutable if they are
memoized by a Rats!-generated parser.

1.5.1 (12/16/04)

Bug fix release.

It makes the following changes:

The new visible attribute supports the generation of
parsers that are package private (instead of public).

The interface to the global state
object xtc.util.State has been changed to reflect that
state modifications are modeled as lightweight transactions.

A memoization bug resulting in
an ArrayIndexOutOfBoundsException has been fixed; thanks
to Robin Lee Powell for identifying this bug.

A bug resulting in an ArrayIndexOutOfBoundsException
when printing a default parse error (returned by a transient
production under the errors2 optimization) has been fixed; thanks to
Robin Lee Powell for identifying this bug.

The accuracy of error messages under the errors2 optimization has
been improved (by no longer returning default parse errors).

A new command line option (-out) to select the output
directory for parsers generated by Rats! has been added.
Also, Rats! now prints only errors to the error console.
Both changes improve integration with Ant; thanks to Yonas Jongkind
for suggesting them.

1.5.0 (11/11/04)

Performance tuning and bug fix release. Parsers generated
by Rats! now use arrays of read-in characters and memoized
results instead of a linked list of parser objects. The current
parser position now is an explicit index into these arrays instead of
a reference to a parser object. Performance tests with the Java
parser show that, compared with previous versions, the parser consumes
only half the memory and takes only 80% of the time when recognizing
Java source files.

Note that this release changes the basic parser interface
and is not backwards-compatible. In particular, parsing
methods now take an explicit index argument
(named yyStart), and the character() method
returns an int instead of a Result.
Furthermore, parsers perform best if they are created with the
three-argument constructor, which includes the length of the input.
For example, the following code snippet parses a file
named fileName of size fileSize with
reader in and top-level production TopLevel:
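
A minimal, self-contained sketch of that calling pattern follows. It
is a hand-written stand-in, not Rats! output and not part of xtc: the
class name ParserInterfaceSketch, the toy TopLevel production (one or
more digits followed by the end of input), and the plain int return
value of pTopLevel() are all assumptions made for illustration. What
it mirrors from the description above is the three-argument
constructor, the explicit yyStart index argument, and a character()
method that returns an int.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for a generated parser; names are illustrative only.
class ParserInterfaceSketch {

  private final Reader in;
  private final String fileName;
  private char[] data;  // Characters read so far.
  private int count;    // Number of valid entries in data.
  private boolean eof;

  /** Three-argument constructor: reader, file name, and input length. */
  ParserInterfaceSketch(Reader in, String fileName, int fileSize) {
    this.in = in;
    this.fileName = fileName;
    // Knowing the size up front lets the array be allocated once.
    this.data = new char[Math.max(0, fileSize)];
  }

  /** Returns the character at the given index, or -1 at the end of input. */
  int character(int index) {
    try {
      while (!eof && index >= count) {
        int c = in.read();
        if (c < 0) {
          eof = true;
        } else {
          if (count == data.length) {
            // The size was only a hint; grow if the input is longer.
            data = Arrays.copyOf(data, Math.max(16, 2 * count));
          }
          data[count++] = (char) c;
        }
      }
    } catch (IOException x) {
      throw new UncheckedIOException("error reading " + fileName, x);
    }
    return index < count ? data[index] : -1;
  }

  /** Toy production TopLevel <- [0-9]+ EndOfInput; -1 signals an error. */
  int pTopLevel(int yyStart) {
    int yyIndex = yyStart;
    while (true) {
      int c = character(yyIndex);
      if (c < '0' || c > '9') break;
      yyIndex++;
    }
    if (yyIndex == yyStart) return -1;  // No digits matched.
    return character(yyIndex) == -1 ? yyIndex : -1;
  }

  /** Convenience wrapper that parses a string from index 0. */
  static int parse(String text) {
    Reader in = new StringReader(text);
    ParserInterfaceSketch parser =
        new ParserInterfaceSketch(in, "<string>", text.length());
    return parser.pTopLevel(0);
  }

  public static void main(String[] args) {
    System.out.println(parse("2004"));
  }
}
```

In this sketch the parser position is just an int that is passed
around explicitly, which corresponds to the array-based design
described for this release.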

The chunking optimization is now correctly performed when it is
the only optimization (i.e., when using the "-Ochunks"
command line flag; bug fix).

1.4.2 (9/23/04)

Performance tuning and bug fix release:

The inlining of transient productions into choices has been
generalized; it now works for all types of productions, not just void
and text-only productions. However, since JIT-based Java virtual
machines do not seem to compile the resulting, possibly large methods
aggressively enough, this optimization can reduce performance and is
disabled by default. The corresponding command line option
for Rats! is -Ochoices2. The original (though
slightly improved) optimization for void and text-only productions is
still available under the -Ochoices1 command line option
and enabled by default. Note that the choices2
optimization includes the choices1
optimization.

A new optimization avoids the creation of error objects when
transient productions do not match the input. The corresponding
command line option for Rats! is -Oerrors2.
This optimization is complementary to the previously available error
object optimization, which is now controlled through
the -Oerrors1 command line option. Both optimizations
are enabled by default.

xtc.tree.GNode now uses less memory for generic nodes
with zero or one children.

Nested choices appearing as the last element in another, repeated
or optional choice are now correctly lifted (bug fix).

A new driver, xtc.lang.CDriver, for parsing and
printing C has been added. It provides control over whether to print
parsed files and also supports the collection of runtime performance
statistics. Additionally, the invocation syntax for the main class,
xtc.Main, has been changed, now using
the -util command line option to control tool
selection.

The C grammar, c.rats, has been tuned so that
alternatives that are more likely to appear in the input are parsed
first. For example, declarations are now parsed before function
definitions.

1.4.1 (9/16/04)

Minor feature and bug fix release:

Rats! now allows semantic actions in text-only
productions, though yyValue may still not be
referenced.

Rats! now recognizes the -Oleft command line
option to control the automatic transformation of direct
left-recursions into right-recursions. It also prints additional
messages under the -verbose command line option.

An index-out-of-bounds condition for sequences containing a
! syntactic predicate on a character constant or class,
followed by the any character constant, followed by any element has
been eliminated (bug fix).

Once-or-more repetitions containing a terminal are now correctly
desugared into the equivalent right-recursive expressions (bug fix).

The C grammar now correctly handles enumeration constants,
unsigned chars, chars, and string constants. It also recognizes
several GCC extensions, notably attributes. Additionally, it
recognizes pragmas and GCC line markers, which may be present in C
preprocessor output.

The C pretty printer now correctly parenthesizes expressions,
observing both precedence and associativity. The handling of spacing
(notably, newlines and indentation) has also been improved.

Nested choices, repetitions, and optional expressions in a generic
production are now treated just like they are in regular productions
(instead of resulting in separate generic nodes). In other words, if
the semantic value of such an expression cannot be automatically
determined, it needs to be specified in explicit actions.
Furthermore, the values resulting from a repeated expression are now
directly added as individual children to a generic production's
xtc.tree.GNode.

An option in a generic production's ordered choice can now pass
the semantic value of a component expression through by explicitly
binding to yyValue (instead of always resulting in a new
generic node).

A component expression's value can now be omitted from a
production's generic node, even if it is not a character terminal or
void nonterminal, by prefacing it with void:. This new
prefix operator has lower precedence than all other operators,
including regular prefix operators, with the exception of the ordered
choice operator /.

Direct left-recursions in a generic production are now
automatically transformed into the corresponding right-recursions,
with the resulting generic nodes preserving left-associativity.
Non-generic productions can achieve similar results by using the newly
added xtc.util.Action class.
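
The idea behind preserving left-associativity after such a
transformation can be sketched as follows. This is a hand-written
illustration under stated assumptions, not Rats!-generated code:
java.util.function.UnaryOperator stands in for an action object, and
no claim is made about the actual interface of xtc.util.Action. The
left-recursive rule Sub <- Sub '-' Digit / Digit is rewritten as
Sub <- Digit Tail with Tail <- '-' Digit Tail / empty. Each tail match
records an action that maps the value to its left onto a new value,
and the actions are then applied in order, starting from the first
operand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; UnaryOperator stands in for an action.
class LeftAssocSketch {

  /** Sub <- Digit Tail; returns a fully parenthesized expression. */
  static String parse(String input) {
    if (input.isEmpty() || !Character.isDigit(input.charAt(0))) {
      throw new IllegalArgumentException("expected digit at index 0");
    }
    String seed = String.valueOf(input.charAt(0));

    // The right-recursive tail collects one action per "'-' Digit".
    List<UnaryOperator<String>> actions = new ArrayList<>();
    pTail(input, 1, actions);

    // Applying the actions in order rebuilds the left-leaning tree.
    String value = seed;
    for (UnaryOperator<String> action : actions) {
      value = action.apply(value);
    }
    return value;
  }

  /** Tail <- '-' Digit Tail / empty. */
  static void pTail(String input, int index,
                    List<UnaryOperator<String>> actions) {
    if (index == input.length()) return;  // The empty alternative.
    if (input.charAt(index) != '-' || index + 1 >= input.length()
        || !Character.isDigit(input.charAt(index + 1))) {
      throw new IllegalArgumentException(
          "expected '-' digit at index " + index);
    }
    String right = String.valueOf(input.charAt(index + 1));
    actions.add(left -> "(" + left + "-" + right + ")");
    pTail(input, index + 2, actions);
  }

  public static void main(String[] args) {
    System.out.println(parse("7-2-1"));
  }
}
```

For the input 7-2-1 the result groups as ((7-2)-1), whereas a naive
right-recursive grammar would produce the grouping (7-(2-1)).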

Newly added generic text productions simplify the recognition of
text within generic productions. A generic text production has
gstring (for "generic string") as its type and a generic
node as its semantic value, whose only child is the text matched in
the input.

Additionally, the newly added state attribute and the
corresponding xtc.util.State interface help with writing
grammars that are context-sensitive and require global state. The
state attribute, as well as the debug,
location, and constantBinding attributes can
now also be specified on a per-production basis, simply by including
them before the production's type. Next, sequences can now be named;
the name is specified as the first element in a sequence by including
it between less-than < and greater-than
> signs. Furthermore, the readability of printed
grammars and generated code has been improved through new
line-wrapping facilities in xtc.tree.Printer. Finally,
identifiers may now contain underscores (_).

Almost all of the newly added features are utilized by the new
grammar for C and the corresponding pretty printer (in the
xtc.lang package). Parser and pretty printer can be
tested by executing "java xtc.lang.CParser
<file>".

1.3.0 (4/21/04)

Feature release. This release adds the ability to automatically
generate abstract syntax trees (ASTs) in Rats!. A production
that should result in a generic node (xtc.tree.GNode) as
its semantic value has
generic as its type. The corresponding generic node has
the same name as the production, and the children of the generic node
are the semantic values of all component expressions in the matched
sequence, with the exception of character terminals and nonterminals
referencing void productions.

1.2.2 (4/16/04)

Internal release. Fixed a bug in the desugaring of repeated
sequences for Rats!; thanks to Robin Lee Powell for
identifying this bug.

1.2.1 (4/13/04)

Bug fix release. Fixed a bug in Rats!, under which
options were too aggressively simplified; thanks to Robin Lee Powell
for pointing out the incorrect behavior resulting from this bug. Also
fixed two bugs in the processing of nested choices. Finally, fixed a
bug in the handling of bindings to nested choices.

1.2.0 (4/9/04)

Feature release. This release improves the reflection-based
dynamic dispatch for nodes and visitors by also allowing functionality
to be expressed as part of nodes: While visit() methods
in visitors are selected based on the type of the node, the
corresponding process() methods in nodes are selected
based on the type of the visitor. The dynamic dispatch mechanism
first tries to locate a process() method and, if none can
be found, tries to locate the corresponding visit()
method.
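
The lookup order described above can be sketched with plain Java
reflection. This is a simplified illustration, not the xtc
implementation: the classes Node, Visitor, Literal, Name, Printer, and
Counter and the helper methods find() and dispatch() are hypothetical,
and xtc's actual lookup rules may differ in detail. The sketch first
searches the node for a process() method matching the visitor's type
and only then searches the visitor for a visit() method matching the
node's type.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical illustration of process()-before-visit() dispatch.
class DispatchSketch {

  static class Node {}
  static class Literal extends Node {}
  static class Name extends Node {
    // Selected based on the type of the visitor.
    public String process(Printer p) { return "Name.process(Printer)"; }
  }

  static class Visitor {}
  static class Printer extends Visitor {
    // Selected based on the type of the node.
    public String visit(Literal n) { return "Printer.visit(Literal)"; }
    public String visit(Name n) { return "Printer.visit(Name)"; }
  }
  static class Counter extends Visitor {
    public String visit(Name n) { return "Counter.visit(Name)"; }
  }

  /** Finds owner.name(T) for argType or its nearest superclass T. */
  static Method find(Class<?> owner, String name, Class<?> argType) {
    for (Class<?> t = argType; t != null; t = t.getSuperclass()) {
      try {
        return owner.getMethod(name, t);
      } catch (NoSuchMethodException x) {
        // Fall through and try the superclass.
      }
    }
    return null;
  }

  /** Tries the node's process() first, then the visitor's visit(). */
  static Object dispatch(Visitor v, Node n) {
    try {
      Method m = find(n.getClass(), "process", v.getClass());
      if (m != null) return m.invoke(n, v);
      m = find(v.getClass(), "visit", n.getClass());
      if (m != null) return m.invoke(v, n);
    } catch (IllegalAccessException | InvocationTargetException x) {
      throw new IllegalStateException(x);
    }
    throw new IllegalArgumentException(
        "no process() or visit() method for "
        + n.getClass().getSimpleName());
  }

  public static void main(String[] args) {
    System.out.println(dispatch(new Printer(), new Name()));
    System.out.println(dispatch(new Counter(), new Name()));
    System.out.println(dispatch(new Printer(), new Literal()));
  }
}
```

In this sketch, Name handles the Printer visitor itself through
process(), while the Counter visitor, for which Name defines no
process() method, falls back to its own visit() method.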

This release also makes the following improvements
to Rats!:

Added support for parser actions, which are written like regular
actions prepended by a caret ^ and contain low-level code
that parses languages not expressible by parsing expression grammars.
This feature has been motivated by Sameer Ajmani and Bryan Ford.

Added an optimization that folds common prefixes in the general
case. This optimization is disabled by default because it may
increase heap utilization of generated parsers.

Added support for the mainMethod grammar attribute,
which causes a main method to be generated that parses files specified
on the command line.

Fixed a bug in the detection of left-recursive productions.
Thanks to Robin Lee Powell for identifying the bug.

This release also includes support for generic abstract syntax tree
nodes, though they cannot yet be generated automatically.

1.1.0 (2/3/04)

Minor feature and bug fix release. This release
improves Rats! by fixing a bug in the processing of syntactic
predicates and by adding a new optimization that avoids stack overflow
errors on some Java virtual machines. The Rats!
tool, rats, now supports command line flags to control
which optimizations to perform. The Java parser
tool, pjava, now supports the printing of parser
statistics in white-space delimited format (to make it easier to import
data into spreadsheets).