BNF Parser Generator - Homepage

The BNF parser generator takes a syntax not unlike BNF and generates a "C" parser for it, a parser that can parse either strings or files. This is a flexible tool, ment for smaller parsing tasks where bison+flex are just to big to use.

BNF input grammar

The BNF input grammar, as the name suggests follows the estabishled BNF
syntax quite closely, and if you're familliar with that grammar you won't have
very much trouble understanding the following. All grammar rules are written
in the format as the bnf program recognizes them, so they also function
as examples.

grammar ::= [{ assignment }] eoi;

A BNF grammar is an optional repetition of assignments, followed by eoi
which matches with the end of the input.

assignment ::= name ('::=' | '=') expression;

An assignment is a name followed by '=' or '::=' followed by
an expression. The expression is the meat of the grammar.

expression = term [{ '|' term }];

An expression is a list of terms, seperated by a '|', meaning or.

term = factor [{ white factor }];

A term is a list of whitespace seperated factors.

factor = IO | name | '[' expression ']' | '&{' expression '&}';

A factor is either an IO term, or its a name, or its
an optional expression, or its an repetition of an
expression.

IO = '\'' string '\'' | '"' string '"' | '`' string '`';

With the IO rule we handle the input that the generated parser reads
and the output it produces.
IO are strings, quoted in three different ways, either
forward quoted
to denote that it matches a string in the input only,
doublequoted
meaning that the string should be copied verbatim to the output stream,
or
backward quoted meaning that this string should be copied to
the output by the parser.

A simple example

As a first example we want a program that reads a repetition of the
word "foo" and replaces it with a single word "bar". Here is the grammar
you feed to the BNF program.

sample = [{ {'foo'} `bar` | . }] eoi;

There is something new here, the dot, which simply means 'match
and echo any character'.
When you place the above grammar in a sample.bnf file and generate
the output "C" parser, bnf defines a "C" function called sample()
and the body of that function parses the input we want. We need to write
an additional "C" file containing a call to this function. That "C" file
looks like this:

The __BNF_fil_io_parser() function, like the sample()
function are defined in the generated "C" file. Now all there is left
to do is compile the two files and link them together. There is also a
__BNF_str_io_parser() function that uses strings instead of files,
so you can also use the parser in the situation that files are unapproperiate.

A fundamental property of parsers generated with BNF is that they are simple
"C" functions, so you can intermix calls to ordinairy "C" functions that
you define yourself within the grammar. This is the fundamental way of
making parsers that do more than simple echoing of the input.
You can make stuff like:

Returning 0 in such a function denotes OK, which means to to parser can
continue with processing input. If you return a non-zero value, the parser
will 'reject' it. Ofcourse you could make the call an optional element in
the grammar like this: foo = [somefunc] "foo";. Now the returnvalue
of somefunc() is always ignored.

A simple compile body like the one above could be used and then you'll
be able to verify weather a certain expression is correct or not. Now that's
exciting, isn't it? We would also like to calculate the value of the
expression ofcourse, and that's exactly where things get ugly with calls
to "C" functions cluttered everywhere around the above syntax.

Therefore, before actually start cluttering your code, always check if the
bare grammar parses what you want and as a next stage, start implementing
'what to do' with the grammar. To play with this you should know that there
is an important (global) variable: extern char* __BNF_input that
always points to 'where you are' in the input stream/string.