I define a parser as a context-free grammar with semantic actions for each production. The order in which the semantic actions run is not specified, except that the semantic actions of the non-terminals in a production run before the action of the production itself.

The concept of semantic actions is (to me) a clear way of making a grammar actionable, and it explains why two grammars that define the same language are not necessarily equivalent.

This is why deterministic parsing (e.g. LL(1) or LR(1)) is unattractive to me, as you end up massaging your grammar to the wishes of the parser generator you're using.

As an example, if you use an unrestricted parser such as an Earley parser, you no longer have to massage your grammar to fit whatever parser generator you're using and can instead rely on the full behavior of the grammar. For instance, E -> E '+' P is left-associative and X -> N '^' X is right-associative, which you can only express directly if both left and right recursion are available.
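To make the associativity point concrete, here is a minimal sketch in Python. It is not an Earley parser, just a naive memoized span parser that happens to accept both left and right recursion. The `GRAMMAR` encoding, the `num` terminal, and the tree-building actions are my own assumptions for illustration.

```python
from functools import lru_cache

# Hypothetical encoding: nonterminal -> list of (right-hand side, semantic action).
# Any symbol that is not a key of GRAMMAR is a terminal; "num" matches any int token.
GRAMMAR = {
    "E": [(("E", "+", "P"), lambda e, _, p: ("+", e, p)),  # left recursion
          (("P",), lambda p: p)],
    "P": [(("num",), lambda n: n)],
    "X": [(("N", "^", "X"), lambda n, _, x: ("^", n, x)),  # right recursion
          (("N",), lambda n: n)],
    "N": [(("num",), lambda n: n)],
}

def parse(start, tokens):
    """Return the semantic value of every derivation of tokens from start.

    Assumes no empty productions and no cyclic unit productions (A -> A),
    so every recursive call works on a strictly smaller problem.
    """
    tokens = tuple(tokens)

    def symbol(sym, i, j):
        # Values of a single grammar symbol matching tokens[i:j].
        if sym in GRAMMAR:
            return derive(sym, i, j)
        if j - i != 1:
            return ()
        tok = tokens[i]
        if sym == "num":
            return (tok,) if isinstance(tok, int) else ()
        return (tok,) if tok == sym else ()

    def sequence(rhs, i, j):
        # All value tuples for rhs matching tokens[i:j]; each symbol is non-empty.
        if not rhs:
            return ((),) if i == j else ()
        head, rest = rhs[0], rhs[1:]
        out = []
        for k in range(i + 1, j - len(rest) + 1):
            for value in symbol(head, i, k):
                for values in sequence(rest, k, j):
                    out.append((value,) + values)
        return tuple(out)

    @lru_cache(maxsize=None)
    def derive(nonterminal, i, j):
        # Child actions have already run when the production's own action runs.
        out = []
        for rhs, action in GRAMMAR[nonterminal]:
            for values in sequence(rhs, i, j):
                out.append(action(*values))
        return tuple(out)

    return list(derive(start, 0, len(tokens)))

print(parse("E", [1, "+", 2, "+", 3]))  # [('+', ('+', 1, 2), 3)]
print(parse("X", [2, "^", 3, "^", 2]))  # [('^', 2, ('^', 3, 2))]
```

Both inputs have exactly one derivation, and the nesting of the resulting trees follows directly from which side of the production recurses.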

However, if you give up deterministic parsing then CFGs can be ambiguous, so the above concept is still incomplete to me. Some people are interested in all ambiguous parses (e.g. natural language), but I am interested in augmenting a CFG parser as above in a way that resolves ambiguity.
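As a small illustration of why ambiguity matters once semantic actions are attached, the sketch below enumerates every parse of a token list under the ambiguous grammar E -> E '-' E | num and runs the obvious subtraction action on each. The grammar and the function name `all_values` are assumptions I picked for the example.

```python
def all_values(tokens):
    """Values of every parse of tokens under E -> E '-' E | num (ambiguous)."""
    if len(tokens) == 1:
        # Production E -> num.
        return [tokens[0]] if isinstance(tokens[0], int) else []
    values = []
    for k in range(1, len(tokens) - 1):
        if tokens[k] != "-":
            continue
        # Use the '-' at position k as the root of the parse tree.
        for left in all_values(tokens[:k]):
            for right in all_values(tokens[k + 1:]):
                values.append(left - right)
    return values

print(all_values([10, "-", 3, "-", 2]))  # [9, 5]
```

The two results correspond to 10 - (3 - 2) and (10 - 3) - 2, so the same input yields different semantic values depending on which parse is chosen.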

What methods and techniques exist for augmenting an ambiguous grammar with additional info (which might already exist, e.g. the order in which productions are defined) or routines that allow ambiguous parses to be resolved/clarified to a unique unambiguous parse?

Just as a side issue: LR parsers handle both left- and right-recursive productions just fine, although right recursion requires more temporary storage. It's really not a big deal. (And an Earley parser will use more temporary storage than an LR(1) parser parsing a completely right-recursive grammar.)
– rici Dec 22 '18 at 4:56

@rici I am aware, it was just an example of something that $LL(1)$ parsers or PEG parsers can't handle, which requires you to change the semantics of your grammar. With $LR(1)$ you have the dreaded reduce/reduce or shift/reduce conflicts. An Earley parser will always use more memory AFAIK (because it is a chart parser), but "A general context-free parsing algorithm running in linear time on every LR(k) grammar without using lookahead" does describe an optimization for Earley parsers that makes sure it runs in linear time and memory for any $LR(k)$ grammar.
– orlp Dec 22 '18 at 8:56

So what parsing algorithm has trouble with right recursion? I get your point, but I think it's important to understand the real issues. Ambiguity needs to be addressed by making the grammar non-ambiguous, not because of the requirements of a parser generator, but because of the requirements of programmers who need to know what their programs mean. The annoyance with LR(1) is the 1, not the LR. Most "dreaded" conflicts are the result of grammar bugs, but sometimes it makes sense to design a syntax which requires more lookahead...
– rici Dec 22 '18 at 14:20

... and those are the cases in which my current advice is to find a GLR/GLL parser generator, because those will work without having to modify or annotate the grammar. I honestly see little value in non-deterministic grammars for languages intended for precise descriptions (unlike, say, human languages).
– rici Dec 22 '18 at 14:24

My goal is to write a parser generator that can handle ambiguity internally, but with mechanisms to resolve that ambiguity. "Ambiguity needs to be addressed by making the grammar non-ambiguous" <citation needed>. I'm interested in exactly those approaches that address ambiguity not by changing the grammar, but by deciding between ambiguous parses after the parse has completed (e.g. precedence of rules). My question is asking which such approaches exist. Also, a perfectly reasonable non-LR(k) language is one that allows the user to escape quotes by repeating the start/end quote: """test"ing""".
– orlp Dec 22 '18 at 15:48