Marpa resources

Fri, 07 Oct 2011

Before going further in this series,
I want to touch on a question I have so far
avoided:
Are all Perl programs parseable?

Most languages do not pose this question.
It was Adam Kennedy, in creating PPI, who
first ran up
against it.
Adam conjectured that in fact
Perl parsing may be undecidable.
While in the planning stages for Marpa,
I found Adam's conjecture
and
turned it into a formal proof.

Being the first to formalize this result, I took the
initial heat over it, and now I get the credit.
The dust has pretty much settled,
and to my mild surprise, the result has proved of
interest to the academic community.
The initial revulsion at the idea of undecidable
parsing has subsided,
even to the extent that others now want to get into
the act.
I read on Wikipedia that LISP parsing is also undecidable.
So all the cool kids seem to be doing it.

Undecidability is actually something we are quite comfortable
with -- all the general-purpose CPU chips we use have this
property,
and we put up with it as the price for their power.
In the case of general-purpose CPU chips,
the downside is real and occasionally serious.
Some of the things our computers cannot do are reasonable
things for us to want them to do.
As one example,
infinite loop detection can never be 100% accurate.
As another, very serious issue,
no program will ever be able to determine
with 100% accuracy whether an arbitrary program
is or is not a virus.
Virus detection must always involve a degree of guesswork.

The undecidability of
Perl parsing differs from the usual situation in one major
respect:
In return for the additional power of Perl's parser,
we seem to be giving up absolutely nothing.
Nobody has ever been able to show a practical downside.

True, as I have proved, there are Perl programs
that no parser,
including the perl interpreter itself,
can parse.
But all the examples that anyone,
including me, has been able
to come up are contrived cases.
None of them are Perl programs that
someone might care about as a practical matter.

In future posts, I plan to discuss various strategies
for parsing Perl.
I will not mention the
undecidability issue again,
unless it has some relevance in context --
something that will seldom happen.