Re: LR(1) paser generator based on Pager's algorithm

From:

Joel E. Denny

Subject:

Re: LR(1) paser generator based on Pager's algorithm

Date:

Mon, 2 Apr 2007 01:45:08 -0400 (EDT)

On Sun, 1 Apr 2007, Xin Chen wrote:
> I don't know if this prefix by adding "%left a" is viable, because the
> canonical LR(1) parsing machine does not accept all strings in the given
> grammar with the prefix.
Yes, I consciously eliminated "aaaa" by specifying %left 'a'.
> I think the reason for this is that the grammar is not a LR(1) grammar,
> so the canonical LR(1) algorithm cannot handle it.
Right, canonical LR(1) cannot handle the grammar alone. However, I'm
specifically addressing the case of non-LR(1) grammars that have been
refined by precedence declarations such that a canonical LR(1) parser
could handle it.
> To handle this
> grammar, we need to use GLR algorithm, to fork at state 5 and try the
> two possibilities.
If I wanted to accept "aaaa", that's correct, but I don't....
I'm imagining a grammar author who wants to re-use A in several places (in
perhaps a much larger grammar than the toy example I gave) but finds it
creates a S/R conflict in one place. He decides that the reduce is always
correct at that point in the language, so he adds %left 'a'. Since
grammar authors seem to think naturally in LR(1) not LALR(1), he might not
notice that this %left 'a' should have any effect on the part of the
language ("baab") that it ends up breaking.
In my mind, the major advantage of switching from LALR(1) to LR(1) would
be to make grammar development conceptually easier by eliminating all the
mysteries of LALR(1). If the spread of an existing conflict can still
happen without warning, will the developer know to rewrite the grammar or
to turn on GLR or whatever? Is it a good idea to given him false
confidence?
> shift/shift conflict.
I assume that's a typo.
> As for Dr. Pager's algorithm, I guess it can be patched like this: in
> grammars where precedence rule is used to solve shift/reduce conflict,
> combine two states iff they are weakly compatble AND none has a
> shift/reduce conflict.
So, if we decide we can't merge them, how many predecessor states will we
have to split in order to successfully separate the merged lookaheads?
The beauty of Pager's weak compatibility definition is that it (for LR(1)
grammars) foresees whether merging successor states can produce real
conflicts by checking the current states for potential conflicts.
Unfortunately for my problem, the very mechanism of this foresight is to
allow merging when potential conflicts already existed in the current
states, and those may be real conflicts for non-LR(1) grammars.
Pager's lane-tracing algorithm proposes splitting after constructing LR(0)
states, but supposedly that's much slower.
> These are just the things came to my mind for now.
I appreciate your and Paul's comments.