On Wednesday, March 26, 2014 7:57:11 PM UTC-5, federat...@netzero.com wrote:
> First in terms of the syntax, the best advice is to *normalize* the
> syntax. At the assembly level, the most notorious example [of an
> unnormalized syntax] is MASM and the Intel syntax it is associated with.

> There are a half-dozen instances of the phrase category "Directive" in
> the phrase structure grammar whose only distinctions are semantic in
> nature: i.e. there is an attempt to implement semantic constraints by
> cleaving this single category into a half-dozen shards.

In fact, based on the reference for MASM 6.1, one could "reintegrate" the
directive syntax by first classifying all their contexts:

(c) Statement: inside a .IF/.ELSEIF/.ELSE statement group, inside a statement
group headed by any of the IF*/ELSEIF*/ELSE keywords, or at the TopLevel.

(d) SegmentBody: inside a statement group (1) begun by a ".CODE", ".DATA",
".DATA?", ".CONST", ".FARDATA", ".FARDATA?", ".STACK" or "SEGMENT" directive
and ended by its matching "ENDS" directive; or (2) begun by a "PROC" directive
and ended by its matching "ENDP" directive.

(e) Inside a statement group headed by STRUC/STRUCT/UNION.

Then it suffices to lay out one (large) set of phrase structure rules for the
directives ("Dir"), with the constraints noted separately. It is much easier
for a parser to check these constraints than to try to force-fit them into the
grammar. In addition, organizing things this way may lead to simplifications:
some of the constraints may turn out to be unnecessary and can simply be
removed.
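
A rough sketch of that separation in Python, for illustration only: the grammar
would recognize every directive through the single "Dir" category, and a
table-driven check would then enforce the context constraints. The table
entries below are made up for the example and are not taken from the MASM 6.1
reference; "StructBody" is a hypothetical label for context (e), which is left
unnamed above, and check_directive is likewise a hypothetical helper.

```python
# Context names follow the classification above. "StructBody" is a
# hypothetical label for context (e), which the post does not name.
STATEMENT = "Statement"
SEGMENT_BODY = "SegmentBody"
STRUCT_BODY = "StructBody"

# Illustrative constraint table only. These entries are assumptions made
# for the example, NOT the actual MASM 6.1 rules.
ALLOWED = {
    "PROC": {SEGMENT_BODY},
    "INVOKE": {SEGMENT_BODY},
    "STRUCT": {STATEMENT, SEGMENT_BODY, STRUCT_BODY},
}


def check_directive(keyword, context):
    """Return an error message, or None if the directive is allowed here.

    Keywords missing from the table are treated as unconstrained, so
    dropping an unneeded constraint just means deleting its table entry.
    """
    allowed = ALLOWED.get(keyword.upper())
    if allowed is None or context in allowed:
        return None
    return f"{keyword} not allowed in {context}"


if __name__ == "__main__":
    # The parser accepts any "Dir" anywhere; the check runs afterwards.
    for kw, ctx in [("PROC", SEGMENT_BODY), ("INVOKE", STATEMENT)]:
        print(kw, ctx, "->", check_directive(kw, ctx) or "ok")
```

With this arrangement the grammar stays a single rule set, and relaxing or
removing a constraint is a one-line change to the table rather than a change
to the phrase structure rules.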

Notice, by the way, in the list below that the assembler does
"quasi-compilation" (that is, it compiles run-time statements for loops and
function calls). I don't know whether GAS does that much. But a decent
assembler might have to match this level of functionality.

For MASM syntax, the classification would lead to the following groups.