Library Source and StaticLang

Q: I've looked at the source for Token, Language, Lexer and Parser,
and OMG, what the hell is that mess?! (Especially Token! Geez!)

A: Heh :) Put simply, it's a temporary kludge. The StaticLang tool
generates static-style languages directly from Goldie's dynamic-style
library source. That mess is all the directives used to help instruct
StaticLang in how to do the conversion.

To read the dynamic-style source, just ignore all the weird
/+...+/ directives
and anything inside version(Goldie_StaticStyle)
(it will NOT be defined when you compile).

To read the static-style source, look at StaticLang's output
and remember that version(Goldie_StaticStyle) WILL be
defined when compiling.
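
As a rough illustration of how such a version block behaves, here is a minimal sketch. The function name styleName is made up for this example and is not part of Goldie; only the version identifier Goldie_StaticStyle comes from the text above.

```d
// Hypothetical helper, not part of Goldie: reports which style
// this module was compiled as.
string styleName()
{
    version (Goldie_StaticStyle)
        return "static";   // only compiled when Goldie_StaticStyle is defined
    else
        return "dynamic";  // compiled when the identifier is not defined
}

unittest
{
    // Holds as long as the compiler is not given -version=Goldie_StaticStyle.
    assert(styleName() == "dynamic");
}
```

When reading the dynamic-style source, everything in the first branch of such a block can be skipped.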

To read the dynamic-style source, and see how StaticLang will
transform it, see the explanation of StaticLang's preprocessing
below.

I do intend to clean that all up.

Q: Goldie is already a language processing tool. Why don't you just use
Goldie and a D grammar to do the conversion? Or use D's fantastic
metaprogramming abilities?

A: I do intend to do something like that.

StaticLang's Preprocessing

Tag Structure

StaticLang's preprocessing system is based on some basic tags:

/+LETTER:TAG_NAME+/
/+LETTER:TAG_NAME+/DUMMY

Where:

LETTER is one of: P, S, or E

P tags stand for "Place" and are used by themselves.
These tags are replaced by some other text.

S and E tags stand for "Start" and "End"
and are used in pairs surrounding code. Both tags and all the code
in between are replaced.

The text DUMMY can optionally be appended whenever needed to make D's parser happy.
This is usually only useful for P tags.

Note that, aside from the DUMMY text, the tags are seen as
nested comments by D.
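
To make the "nested comments" point concrete, here is a small sketch using the STATIC tag from the list of defined tags. The class and member names are invented for illustration and are not taken from Goldie's source.

```d
// Hypothetical class, not taken from Goldie's source.
class ExampleLexer
{
    // D treats the tag on the next line as a nested comment and ignores it,
    // so in the dynamic-style build this is an ordinary member function.
    // StaticLang would replace the tag with the text "static".
    /+P:STATIC+/ int tokenCount() { return 3; }
}

unittest
{
    auto lexer = new ExampleLexer;
    assert(lexer.tokenCount() == 3);
}
```

Because the tag is a comment, the same file compiles unchanged as dynamic-style D.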

Defined Tags

You'll notice that some of the following are not yet documented.
Full documentation of StaticLang's preprocessing is delayed because I
may make use of D's metaprogramming to take over much of the work currently
done by StaticLang.

There are three classes of tags:
cvars (Constant Vars), svars (Store Vars), and mvars (Modify Vars)

cvars (Constant Vars):

These tags are replaced with constant predetermined text that is always
the same for a given language. (But it may differ from one language to another.)

If S and E tags are used, all the code inside
the start and end tags is replaced.
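
The replacement behaviour described above could be sketched roughly as follows. This is not StaticLang's actual implementation; the function names replacePlaceTag and replaceSpanTag are assumptions made for this example, and it only handles plain text substitution.

```d
import std.array : replace;
import std.string : indexOf;

// Replaces every P tag with the given name by `text`.
string replacePlaceTag(string src, string name, string text)
{
    return src.replace("/+P:" ~ name ~ "+/", text);
}

// Replaces every S/E pair with the given name, including the tags
// themselves and all code between them, by `text`.
string replaceSpanTag(string src, string name, string text)
{
    string startTag = "/+S:" ~ name ~ "+/";
    string endTag   = "/+E:" ~ name ~ "+/";
    string result;
    while (true)
    {
        auto s = src.indexOf(startTag);
        if (s < 0) break;
        auto e = src.indexOf(endTag, s + startTag.length);
        if (e < 0) break; // unmatched start tag: leave the rest untouched
        result ~= src[0 .. s] ~ text;
        src = src[e + endTag.length .. $];
    }
    return result ~ src;
}

unittest
{
    assert(replacePlaceTag("/+P:STATIC+/ int x;", "STATIC", "static")
        == "static int x;");
    assert(replaceSpanTag("a /+S:REM+/ junk /+E:REM+/b", "REM", "") == "a b");
}
```

The second assertion mirrors the REM tag described below: the tags and everything between them disappear.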

REM

Comments. Short for "remark", just like in BASIC and BATCH. These are
removed completely by StaticLang, and replaced with absolutely nothing.

Generally, it's preferred to use
version(Goldie_DynamicStyle) instead of this.

STATIC

Replaced with the text static

OVERRIDE

Replaced with the text override

VERSION

Replaced with Goldie's version number.

PACKAGE

Replaced with the package name of the language.

For instance, if StaticLang is run with the command line parameter
-pack:myApp.langs.myLang, then this tag is
replaced with the text myApp.langs.myLang.

SHORT_PACKAGE

Replaced with the name of the language, i.e., the last part of the package name.

For instance, if StaticLang is run with the command line parameter
-pack:myApp.langs.myLang, then this tag is
replaced with the text myLang.
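
The relationship between the two tags can be sketched as below. The function name shortPackage is an assumption for illustration; how StaticLang computes the value internally is not documented here.

```d
import std.string : lastIndexOf;

// Returns the part of a package name after its last dot,
// or the whole name if it contains no dot.
string shortPackage(string pack)
{
    auto dot = pack.lastIndexOf('.');
    return dot < 0 ? pack : pack[dot + 1 .. $];
}

unittest
{
    assert(shortPackage("myApp.langs.myLang") == "myLang");
    assert(shortPackage("myLang") == "myLang");
}
```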

svars (Store Vars):

First, the form /+S:TAG_NAME:STORE+/ code here /+E:TAG_NAME:STORE+/
is used to tell StaticLang to "store" the code inside.
The enclosed code ("code here" above) is stored, but the tags and
everything between them are removed and replaced with nothing.

Then, the form /+P:TAG_NAME+/ is used. The stored code
is inserted into some internally defined boilerplate code and repeated as
many times as necessary (determined by the TAG_NAME
and the language). Surrounding boilerplate is then added, and
the final result is inserted in place of the tag.

Within the "stored" code, sub-tags such as ACCEPT_TERM:TOKEN_CLASSNAME and
REDUCE:TOKEN_CLASSNAME (described below) can be used to insert certain
data that's different for each repetition.
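
The store-then-repeat idea can be sketched as follows. This is only a model of the behaviour described above, not StaticLang's code: the function name expandStored is invented, the surrounding boilerplate is omitted, and the "{CLASSNAME}" placeholder is a stand-in chosen for this example rather than real StaticLang tag syntax.

```d
import std.array : replace;

// Repeats the stored code once per value, substituting the
// per-repetition placeholder each time.
string expandStored(string stored, string placeholder, string[] values)
{
    string result;
    foreach (value; values)
        result ~= stored.replace(placeholder, value) ~ "\n";
    return result;
}

unittest
{
    // "{CLASSNAME}" is a made-up placeholder, not StaticLang syntax.
    auto expanded = expandStored("return new {CLASSNAME}();",
                                 "{CLASSNAME}", ["A", "B"]);
    assert(expanded == "return new A();\nreturn new B();\n");
}
```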

ACCEPT_TERM

ACCEPT_TERM:STORE

Inside the lexer, this stores/inserts the code to accept and create a new
terminal token of the appropriate type.

This is repeated (along with boilerplate) for each possible terminal
token type, such as
Token_myLang!(SymbolType.Whitespace, "Whitespace")
or
Token_myLang!(SymbolType.Terminal, "Number").

ACCEPT_TERM:TOKEN_CLASSNAME

Inserts the class name for a particular type of terminal token, such as
Token_myLang!(SymbolType.Whitespace, "Whitespace")
or
Token_myLang!(SymbolType.Terminal, "Number").
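
For readers less familiar with D templates, the sketch below shows why each of these class names denotes a distinct type. The enum and class here are simplified stand-ins written for this example, not Goldie's real definitions, and they cover only the terminal form shown above.

```d
// Simplified stand-ins, not Goldie's actual declarations.
enum SymbolType { Terminal, Whitespace }

class Token_myLang(SymbolType type, string name)
{
    enum symbolType = type;
    enum symbolName = name;
}

alias NumberToken     = Token_myLang!(SymbolType.Terminal, "Number");
alias WhitespaceToken = Token_myLang!(SymbolType.Whitespace, "Whitespace");

// Different template arguments produce different class types.
static assert(!is(NumberToken == WhitespaceToken));
static assert(NumberToken.symbolName == "Number");

unittest
{
    auto tok = new NumberToken;
    assert(tok.symbolType == SymbolType.Terminal);
}
```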

REDUCE

REDUCE:STORE

Inside the parser, this stores/inserts the code to reduce a group of
subtokens into a new nonterminal token of the appropriate type.

This is repeated (along with boilerplate) for each possible
Token_{languageName}!{rule} type, such as
Token_myLang!("<Mult Exp>", 5) (where "5" is the ruleId).

REDUCE:TOKEN_CLASSNAME

Inserts the class name for a particular type of nonterminal rule token,
such as
Token_myLang!("<Mult Exp>", 5) (where "5" is the ruleId).

mvars (Modify Vars):

These take the code between the S and E
tags, and modify it in some predetermined way.

STATIC_IDENT

Converts an identifier such as fooBar to staticFooBar.

LANG_IDENT

Converts an identifier such as fooBar to fooBar_nameOfLang.
For instance, if the language name (i.e., the SHORT_PACKAGE from above)
is myLang, then
/+S:LANG_IDENT+/fooBar/+E:LANG_IDENT+/
becomes fooBar_myLang.
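
The two identifier conversions above could be modelled like this. The function names staticIdent and langIdent are assumptions for illustration; this is a sketch of the documented results, not StaticLang's implementation.

```d
import std.ascii : toUpper;

// fooBar -> staticFooBar
string staticIdent(string ident)
{
    if (ident.length == 0) return ident;
    return "static" ~ cast(char) toUpper(ident[0]) ~ ident[1 .. $];
}

// fooBar -> fooBar_myLang (for a language named myLang)
string langIdent(string ident, string langName)
{
    return ident ~ "_" ~ langName;
}

unittest
{
    assert(staticIdent("fooBar") == "staticFooBar");
    assert(langIdent("fooBar", "myLang") == "fooBar_myLang");
}
```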