The useful

Peter's module allow you to do this,
and all the examples in this blog post were
run using it.

For those new to this blog
Marpa is something new in parsing --
it parses anything you can write in BNF and,
if your grammar is in one of the classes currently in practical use,
parses it in linear time.
Marpa's parse engine is written in optimized C,
so that Marpa's speed is competitive with parsers of far less power.
Marpa's stable version is
Marpa::XS.

The playful

The grammar allows every possible pairing of 2 adjacent symbols
in a grammar.
Which takes this post on a brief excursion into the playful.
Combinatoric issues are often easily expressed as grammars,
and Marpa makes it easy to count parses.
Clearly there is only one way in which 2 symbols can be paired.
And it's not too hard to see that 3 symbols can be
paired in 2 ways.
The counting rapidly gets difficult and error-prone,
but I can convince myself with pencil and paper that
4 symbols can be paired in 5 ways.
At this point I let Marpa take over.
The code is
online,
and this is the result:

1 2 5 14 42 132 429 1430 4862

For further analysis, I do what any red-blooded mathematician
would do: I type the series into a Google seach bar and hit
return.
These numbers are the
Catalan series,
one which pops up all over the place.
Because my issue is set up as a pairing problem, I didn't
obtain the first two numbers, but the standard Catalan series
is this:

0 1 1 2 5 14 42 132 429 1430 4862 16796 58786 208012 742900 2674440

Confirmation that the Catalan series is the right
one can be found in Concrete Mathmematics,
Graham, Knuth, and Patashnik,
pp. 343-344,
or
on Wikipedia.

The useful revisited

Playing with these series can be fun,
but for my Marpa work they are more than useful,
they are necessary.
Marpa claims to accurately iterate all the parses
of an ambiguous parse.
But to claim this,
I need to test it.

I can't do serious testing based on parse counts
by hand.
Because even if you can find the right answer
with pencil and paper, that's not enough --
you need to demonstrate conclusively that the
right answer IS the right answer.
Matching Marpa's answer is not enough --
it might simply be a case of me and
Marpa both counting wrong the same way.

For serious testing, I need
to match the parse counts with a
verified series.
I can verify a series in two ways.
I prefer the easy way --
finding a case,
like that of the Catalan numbers just described,
where it's already well-known that my parse
counts and the series should match.
But I also do it the hard way when I have to.
In what follows I'll show an example of each.

The easy

This next grammar allows every one of 10 symbols to be either 'a' or nulled.

S ::= A A A A A A A A A A
A ::= a
A ::= E
E ::= Null

This is equivalent to the very
well-known (and mercifully well-understood) problem of
picking k items out of n.
Here are the parse counts:

1 10 45 120 210 252 210 120 45 10 1

These are the binomial coefficients for n=10.
(Binomals are one of the many places this series pops up.)

A nice feature of Peter's MarpaX::Simple::Rules is that Peter's
rules can be mixed with Marpa's "native" rules.
This will be useful here, for generalizing from n=10 to
0<=n<=12.
The code is online as a Github gist.
And the result,
as the math-savvy among my
readers will be expecting,
is
Pascal's triangle:

The hard

The previous two examples were good, but also quite
"pure" mathematically -- there's symmetry all over the place.
Real grammars tend to be a bit more "pear-shaped",
and I wanted an example which,
while combinatorially hard,
reflected this.

Operators tend to be the most complex part of practical language
grammars, so I focused on them.
Their ambiguity is considerably tamed by using precedence, so I
chose to ignore precedence.
Few languages have a richer, and more potentially ambiguous,
set of operators than Perl, so I chose the Perl operators.
And the fastest way to generate operator ambiguity in Perl is with
a series of minus signs.
Things like:

$a--------$b

Let me emphasize that this is not real Perl syntax -- it's a
wildly ambiguous variant of Perl invented to make a good test case.

The code is
online,
and these, for minus sign counts from 1 to 12,
are our parse counts:

1 1 3 4 8 12 21 33 55 88 144 232

This series is not one of the well-known ones,
but it is known -- it is
A052952
, a sort of rag-time Fibonacci variant.
It has no name,
and I like to call it the Wall series,
as a tribute to Larry,
and because it's based on his Perl operators.
With this one I could not simply look it up
by name and confirm that A05952
is the series generated by the parse counts.
But I was able to come up with
my own proof.

The beautiful

The Pythagoreans
believed that the world was created out of beauty
and order;
that mathematics was the basis of all things;
and that each of these beliefs implied the other.
Their beliefs gained special credibility
when they showed that much of the power
of music comes down to ratio.

Avowed Pythagoreans,
who were probably never thick on the ground,
are hard to find these days.
But in another sense, Pythagoreanism dominates
our world.
That special reverence and belief in the power
of science that characterizes us in the West --
it comes from them.
We claim
many philosophies
and religions,
but in fact it seems that
we cannot
stop ourselves from thinking like Pythagoreans,
and these other creeds are more our aspiration
than our conviction.

So does the secret of a happy and virtuous life
lie in contemplating the harmonies of mathematics?
That is an idea that seems at once too hard
and too easy to be true.
But it is easy to appreciate these series
for their beauty,
and hard not to be struck by the way that,
when you avoid order, order emerges.