This copyright notice applies to the text and source
code of this web site, and the graphics that appear on it.
The software described in this text has its own copyright notice and
license, which can be found in the distribution itself.

Permission is granted to copy, distribute, and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts. A copy of this license is included in the file
COPYING.TXT

The GNU 3DLDF language is implemented by means of a
parser. This is actually a C++ function called
yyparse(), which is generated by a package called
GNU Bison from a parser input file.

Parser input files for Bison have a special form. They consist
largely of rules written in a special syntax with
associated actions, written in C or C++. Bison reads such
an input file and generates a file C or C++ code with the definition
of yyparse().

Bison expects a single input file. In the case of 3DLDF, this file is
generated by ctangle from a large set of CWEB files. These are
the files in the directory 3DLDF-1.2.0/CWEB/
with names of the pattern p*.w, where * stands
for any sequence of characters. Names that fit this pattern are, for
example, parser.w, pbsndecl.w, pldfdcl.w, etc.

There are a couple of steps involved, but when ctangle and
Bison are finished, there's a file of C++ called
parser.c++ and two header files called parser.h and
parser.h++. The header files are included in some of the
other source files, and parser.c++ is compiled and linked
with the other object files to create the executable 3dldf.

Bison also generates the file
parser.output as a by-product of its operation.
It's not needed for building the package, but I include it in the
distribution as a convenience.
It contains information about the parser function, including a list of
the grammar rules in Backus-Naur form. It is an important source of
information for anybody who wants to learn to use the interactive
version of GNU 3DLDF, since I have not yet rewritten the manual.
See Introduction to the Grammar
for more information.

If you want to start learning about the source code of the parser, the place to start is
parser.w.
This file includes all of the other source files for the parser.

The file
parser.output may seem intimidating at first glance, but it's not
that hard to learn to interpret. And once you do, the grammar of the
3DLDF language will be an open book to you.

Please note that parser.output changes with every change
to the grammar, and this happens frequently. So the examples on this page
may not correspond exactly to what's in the most recent version of
parser.output. However, I will mostly be talking about general
principles here, so it shouldn't matter.

parser.output starts out with a copyright notice and a license.
After that, you'll find a line with the text
Terminals which are not used followed by a long list of words
in capital letters. Another long list of lines beginning with the
word State follows. Finally, you get to a line containing only
the word Grammar. We've arrived!

The text in this section is a numbered list of the grammar rules of
the 3DLDF language in a special form called Backus-Naur
form. Rule 0 is automatically generated by Bison. Rule 1 is the
so-called start symbol of the language. It looks like this:

1 program: statement_list END

This means that in 3DLDF, a program is a statement list
followed by the tokenEND.

In Bison, a token, also known as a
terminal symbol or just terminal for short,
is a basic unit of a language. They are conventionally written in
capital letters. Other parts of speech are called
non-terminal symbols or just non-terminals.
They are built up out of tokens in the course of parsing input.
The tokens come from the scanner, which is a function
called yyscan(), but that's another story.

In Rule 1, program and statement_list are written in
lowercase letters, so they are non-terminal symbols. In addition,
there is a colon with program on the left-hand side
(LHS) and statement_list END on the right-hand side (RHS).
The colon can be read as is, i.e.,
program is statement_list followed by END.
Another way to read this is
statement_list ENDreduces toprogram.
This is really what yyparse() is doing: It tries to match the
sequence of tokens it receives from the scanner to rules. When it does,
it goes into a particular state. These states pile up on a
stack until they can be reduced.
Then, they can begin to pile up again. Ultimately,
a valid input will reduce to the state corresponding to Rule 1, i.e.,
the rule for the start symbol, program.

The next two rules look like this:

2 statement_list: /* empty */
3 | statement_list statement

This tells us what a statement_list is: It's either empty,
or a statement_list followed by a statement.
/* and */ enclose a comment, so Rule 2 really has nothing
at all on its right-hand side. The | symbol means or.

Rules 2 and 3 are an example of recursion. Let's say yyparse()
starts parsing an input and it gets a statement (whatever that is).
It can now try to reduce to the state corresponding to Rule 3.
All we have is a statement, but it matches, because nothing
is a valid statement_list.

Now we scan another statement. We can match Rule 3 again,
because we've got a statement_list from before, followed by our
current statement. So, in effect, we're appending statements
onto a statement_list.

To stop, all we have to do is get the scanner to pass an END token
to the parser. This is easy. The string end, written as a separate word
in the input, following a statement, will do the trick.

In fact, this works for most tokens: There's usually a character or string
that corresponds or maps to them.
For example, these are the rules for statement:

The token SEMI_COLON maps to the character ; and END_INPUT maps to
the string end_input. In fact, END_INPUT also maps to endinput,
so end_input and endinput can be used interchangeably.
In other words, they are synonyms.

The exception is tokens used for declaring objects of types defined in
the 3DLDF language. For example, when an input contains the line
point p;
the string point doesn't map to the token POINT,
which does exist, but rather to the token POINT_DECLARATOR.
The same applies to boolean, transform, path, and
all of the other types defined in the 3DLDF language.

So what strings do map to POINT, PATH, etc.?
Well, none at all. When one is needed, the parser passes it to the scanner,
which passes it back to the parser. I call this
faking a token, although such tokens are no less real than
ones created by scanning input. The reason for this is the separation
between the rules and the actions. Information about the type of
a variable can be found in an action, but it's not possible to
influence the course of parsing except by means of the stream of tokens
passed to the parsing function. Faking tokens is a means
of doing this.

Most kinds of statements will require their own chapter.
However, a few of them are so simple that I can explain them here.

4 statement: SEMI_COLON
This is the magic rule that makes it possible to have extra
semi-colons between statements.

5 statement: END_INPUT
When the parser receives an END_INPUT token, it stops
reading from the current input stream. This rule is actually a bit more complicated
than it appears at first glance. The behavior of 3DLDF depends on what the
current input source is and whether it's using multiple threads to read input.
I will explain about this elsewhere.

6 statement: equation SEMI_COLON
Equations à la METAFONT do not yet exist in 3DLDF, so this rule is
currently non-functional.