Studying Axiom with a Context-Free Grammar

Axiom is implemented in Common Lisp and parses its source
with a modified Pratt parser. A preparsing step inserts
parentheses and semicolons to implement indentation
sensitivity.
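
To make the preparsing idea concrete, here is a minimal sketch in
Python. It is not Axiom's actual preparser. It assumes a simplified,
hypothetical rule: a deeper-indented line opens a parenthesised
block, lines at the same depth are joined with semicolons, and a
shallower line closes the open blocks. The name preparse is made up
for this illustration.

```python
def preparse(source: str) -> str:
    """Turn indentation into explicit parentheses and semicolons.

    Simplified, hypothetical rule (not Axiom's real one): deeper
    indentation opens "(", equal indentation inserts ";", and
    shallower indentation closes the open blocks with ")".
    """
    out = []
    stack = [0]   # indentation widths of the currently open blocks
    first = True  # no separator is needed before the first line
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines carry no structure
        indent = len(raw) - len(raw.lstrip(" "))
        if indent > stack[-1]:
            # A deeper line starts a new parenthesised block.
            stack.append(indent)
            out.append("(")
        else:
            # Close every block that is deeper than this line.
            while indent < stack[-1]:
                stack.pop()
                out.append(")")
            if not first:
                out.append(";")
        out.append(raw.strip())
        first = False
    # Close whatever is still open at the end of the input.
    out.extend(")" for _ in stack[1:])
    return " ".join(out)
```

With this rule, the four lines "f x ==", "  a", "  b", "g" become
the single line "f x == ( a ; b ) ; g", which a parser without any
notion of indentation can then consume.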

This works, but it is difficult to find out from the contents
of a hand-written parser how the language ought to be
parsed.

I wrote a context-free grammar for Axiom. Ultimately, I am
attempting to interpret this language and learn some details
about its innards.

The properties of the grammar

The language described by this grammar is not equivalent
to the one parsed by the original implementation. I did
ensure that it unambiguously parses the source code of the
packages that come with Axiom.

I am not certain that the whole parse is correct because
there are small details that were difficult to understand.

Preliminary

I assume that you know what context-free grammars are.
If the notation is unfamiliar, a short introduction may
help. The following grammar includes the string
"little mary had a lamb".

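As a stand-in for such a grammar, here is a minimal sketch in Python.
The rule names (sentence, subject and so on) are invented for
illustration. Each nonterminal maps to a list of alternatives, and a
small recognizer checks whether a string belongs to the language.

```python
# Hypothetical toy grammar: each nonterminal maps to its alternatives,
# and each alternative is a sequence of nonterminals or literal words.
GRAMMAR = {
    "sentence": [["subject", "verb", "object"]],
    "subject": [["adjective", "name"], ["name"]],
    "object": [["article", "noun"]],
    "adjective": [["little"]],
    "name": [["mary"]],
    "verb": [["had"]],
    "article": [["a"]],
    "noun": [["lamb"]],
}


def derive(symbol, words, pos):
    """Yield every position where `symbol` can end when started at `pos`.

    This naive recognizer would loop forever on left-recursive rules;
    the toy grammar above has none.
    """
    if symbol not in GRAMMAR:  # a literal word
        if pos < len(words) and words[pos] == symbol:
            yield pos + 1
        return
    for alternative in GRAMMAR[symbol]:
        ends = [pos]
        for part in alternative:
            ends = [e2 for e in ends for e2 in derive(part, words, e)]
        yield from ends


def accepts(text):
    """Return True if the whole text derives from `sentence`."""
    words = text.split()
    return len(words) in derive("sentence", words, 0)
```

Here accepts("little mary had a lamb") and accepts("mary had a lamb")
are both true, while accepts("lamb had a mary") is false.
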
A grammar's purpose is to reveal the structure of the strings
that are valid in the language it represents. The production
rules are generative, so a string may be ambiguous, i.e. it
may parse in multiple ways.
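
Ambiguity can be made visible by counting parse trees. The sketch
below uses a hypothetical one-rule grammar, expr: expr "-" expr | "n",
which is not taken from Axiom, and counts in how many ways a token
list can be derived from it.

```python
from functools import lru_cache


def count_parses(tokens):
    """Count parse trees of `tokens` under  expr: expr "-" expr | "n"."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(start, end):
        # Second alternative: the span is a single "n" token.
        total = 1 if end - start == 1 and tokens[start] == "n" else 0
        # First alternative: split the span at each "-" that leaves
        # a non-empty expression on both sides.
        for mid in range(start + 1, end - 1):
            if tokens[mid] == "-":
                total += count(start, mid) * count(mid + 1, end)
        return total

    return count(0, len(tokens))
```

count_parses("n - n".split()) is 1, but count_parses("n - n - n".split())
is 2, because the rule allows both (n - n) - n and n - (n - n). An
unambiguous grammar yields at most one tree for every input.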

Acquisition

Obtaining a grammar from a hand-written parser is difficult.
I tried several ways to obtain a grammar from the Pratt
parser until I found a way that worked.

Next, you search for the topmost elements that exist in the
language and put them in their respective places in the
grammar. Eventually I figured out how to put together
everything that appears below.

Semicolon separators

Semicolons divide the file into statements; commas do the
same within parentheses.
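
The role of the two separators can be sketched with a small helper
that splits text at a separator only where it is not nested inside
parentheses. The function name split_top_level is hypothetical, and
the sketch ignores strings and comments.

```python
def split_top_level(text, separator):
    """Split `text` at `separator` characters outside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == separator and depth == 0:
            # A separator at nesting depth zero ends the current part.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts
```

Splitting "a := f(x, y); b := (c; d)" at ";" gives the two statements
"a := f(x, y)" and "b := (c; d)". Splitting the argument text
"x, g(y, z)" at "," gives "x" and "g(y, z)".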

The with and add constructs are weird but important parts of
this language: with provides properties for domains, whereas
add provides implementations for the operations in
domains.

The category and domain constructors are allowed to contain
conditionals and loops. This hints that domains and
categories are not parametric in themselves; instead they are
always constructed by a function that takes parameters.

Expressions

The "or" and "and" here are left-recursive. It may be
necessary to see in practice which precedence rules Axiom
actually uses.
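
What left recursion means for the resulting tree can be sketched as
follows. The sketch assumes that "and" binds tighter than "or", which
is exactly the kind of detail left open above, and the rule names in
the docstrings are hypothetical.

```python
def parse_or(tokens):
    """or_expr: or_expr "or" and_expr | and_expr

    A left-recursive rule corresponds to a loop that keeps folding
    the tree built so far into the left operand.
    """
    tree, rest = parse_and(tokens)
    while rest and rest[0] == "or":
        right, rest = parse_and(rest[1:])
        tree = ("or", tree, right)
    return tree, rest


def parse_and(tokens):
    """and_expr: and_expr "and" atom | atom"""
    tree, rest = tokens[0], tokens[1:]
    while rest and rest[0] == "and":
        tree = ("and", tree, rest[1])
        rest = rest[2:]
    return tree, rest
```

For the tokens of "a or b and c or d" this produces
("or", ("or", "a", ("and", "b", "c")), "d"): the operators group to
the left, and the assumed precedence puts "and" below "or" in the
tree.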

Various tight binding operators

It took a lot of effort to figure out the correct order of
the following operators, and I don't think I got it entirely
right. For example, the opsuffix sometimes appears in a
place that hints it isn't structured the way I believe it
is.

Trouble with the dollar sign

The dollar sign is troublesome. It's there to select the
source package for an operation when that may be ambiguous.
In practice it also seems to be used to mark an expression
that should be evaluated in the Common Lisp environment,
and it also means something when it appears alone or at the
end of an expression.

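The nud/led naming comes from the Pratt parser. A nud (null
denotation) handles a token that begins an expression, such
as a prefix operator, while a led (left denotation) handles
a token that follows an already parsed left operand.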
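
The nud/led split is easiest to see in a tiny Pratt parser. The
following is a minimal sketch, not Axiom's parser: the operators and
their binding powers are made up for illustration. The minus sign has
both roles, negation as a nud and subtraction as a led.

```python
import re

# Hypothetical binding powers for this sketch, not Axiom's table.
BINDING_POWER = {"+": 10, "-": 10, "*": 20}
PREFIX_POWER = 30  # prefix minus binds tighter than the infix operators


def tokenize(text):
    return re.findall(r"[A-Za-z_]\w*|\d+|[-+*()]", text)


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def nud(self, token):
        """Null denotation: the token starts an expression."""
        if token == "-":
            return ("neg", self.expression(PREFIX_POWER))
        if token == "(":
            inner = self.expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if token is None or token == ")" or token in BINDING_POWER:
            raise SyntaxError(f"unexpected token {token!r}")
        return token  # a name or a number

    def led(self, token, left):
        """Left denotation: the token continues an expression after `left`."""
        return (token, left, self.expression(BINDING_POWER[token]))

    def expression(self, min_power):
        left = self.nud(self.advance())
        # Keep extending `left` while the next operator binds tighter.
        while self.peek() in BINDING_POWER and BINDING_POWER[self.peek()] > min_power:
            left = self.led(self.advance(), left)
        return left
```

Parser("-a + b * c").expression(0) returns
("+", ("neg", "a"), ("*", "b", "c")), and "a - b - c" groups to the
left as ("-", ("-", "a", "b"), "c").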

Additionally, you can apply items without parentheses: a b
is the same as a(b). This easily causes ambiguities because
parentheses are also used for grouping, so the production
rules for parentheses have to be carefully crafted.
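
One way to keep this unambiguous is to give parentheses a single
role, grouping, and to treat a(b) as a applied to the grouped item
(b). The sketch below does that. It is an illustration, not the
grammar described here, and it assumes that application nests to the
right, so f g x reads as f(g(x)); whether Axiom does the same is not
something this sketch establishes.

```python
import re


def parse_application(text):
    """Parse names, grouping parentheses and application by juxtaposition."""
    tokens = re.findall(r"\w+|[()]", text)
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            # Parentheses only group; they never mean "call" by themselves.
            inner = application()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return inner
        if token == ")":
            raise SyntaxError("unexpected ')'")
        return token

    def application():
        # Assumption of this sketch: application nests to the right.
        head = primary()
        if pos < len(tokens) and tokens[pos] != ")":
            return ("apply", head, application())
        return head

    tree = application()
    if pos != len(tokens):
        raise SyntaxError("unexpected ')'")
    return tree
```

Both parse_application("a b") and parse_application("a(b)") return
("apply", "a", "b"), while "(a b) c" yields
("apply", ("apply", "a", "b"), "c"). Because every input has exactly
one derivation, the double use of parentheses causes no ambiguity in
this small setting.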