johnm@non.net (John D. Mitchell) writes:
> On 20 Nov 1997 22:38:40 -0500, Albert Theo Hofkamp
> <hat@se-46.wpa.wtb.tue.nl> wrote:
> [...]
>>I am aware of the capability of creating states in Lex, but I do not
>>really like it, since it heavily depends on the exact moment of executing
>>when YACC statements (at the end of a rule).
>
> Or you could switch to an LL(k) tool like PCCTS/ANTLR v1.x so you don't
> have to worry so much about action placement.

Sorry, I should have stated my concern more clearly. If I introduced
states in the scanner so that it returns only the `right' tokens, and
controlled that state from within yacc, then the exact moment at which
the parser executes the change of state is what I consider tricky.

I do not understand why LL(k) would solve this problem. It is true
that it is easier to predict which action is executed, but the exact
moment is still quite dependent on the implementation imho.
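
A minimal sketch of the timing concern, in Python rather than lex/yacc.
Everything here is hypothetical: the two scanner modes, the `@' marker
after which INT.INT is wanted, and the toy driver loop. It only shows
that if the parser has already fetched its lookahead token before the
action runs, that token was scanned in the old state.

```python
import re

class ModalScanner:
    """Toy scanner with two hypothetical modes, 'real' and 'index'."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.mode = "real"  # switched by the parser's action

    def next_token(self):
        # Skip blanks.
        while self.pos < len(self.text) and self.text[self.pos] == " ":
            self.pos += 1
        if self.pos >= len(self.text):
            return ("EOF", "")
        # Only 'real' mode knows the digits-dot-digits pattern.
        pattern = r"\d+\.\d+|\d+|\." if self.mode == "real" else r"\d+|\."
        m = re.compile(pattern).match(self.text, self.pos)
        if m is None:
            ch = self.text[self.pos]
            self.pos += 1
            return ("CHAR", ch)
        self.pos = m.end()
        lexeme = m.group()
        if lexeme == ".":
            return ("DOT", lexeme)
        return ("REAL" if "." in lexeme else "INT", lexeme)

def run(text, prefetch):
    """Toy driver: the action for '@' switches the scanner to 'index' mode.

    prefetch=True models a parser that reads its lookahead token
    before executing the action; prefetch=False runs the action first.
    """
    scanner = ModalScanner(text)
    seen = []
    tok = scanner.next_token()
    while tok[0] != "EOF":
        seen.append(tok)
        if prefetch:
            nxt = scanner.next_token()  # scanned in the old mode
        if tok == ("CHAR", "@"):
            scanner.mode = "index"      # the state-changing action
        if not prefetch:
            nxt = scanner.next_token()  # scanned in the new mode
        tok = nxt
    return seen

print(run("@ 1.5", prefetch=True))
print(run("@ 1.5", prefetch=False))
```

With prefetch the 1.5 still arrives as one REAL token, without it the
same input arrives as INT DOT INT. Which of the two a real generated
parser does at a given point is exactly the implementation detail in
question.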

>>I'd rather want to know what tokens the parser is expecting, and recognize
>>only those tokens (maybe with an exception on keywords) in the scanner.
>
> You're asking for a predictive parser rather than a synthesizing parser.

No, I am asking for a scanner that interprets its input within the
limits of what the parser expects. In other words, when the parser
doesn't want a real and the scanner gets a real as input, it will not
return a real, but two integers separated by a dot.
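
A rough sketch of that idea in Python. The function name `scan', the
token names, and the `expected' set are assumptions made for
illustration; how a generated parser would hand its set of acceptable
tokens to the scanner is left open here.

```python
import re

NUMBER_OR_DOT = re.compile(r"\d+\.\d+|\d+|\.")

def scan(text, pos, expected):
    """Scan one lexeme at pos, guided by what the parser would accept.

    expected is a (hypothetical) set of token kinds supplied by the
    parser. Returns (list_of_tokens, new_pos).
    """
    m = NUMBER_OR_DOT.match(text, pos)
    if m is None:
        raise ValueError("unexpected character at position %d" % pos)
    lexeme = m.group()
    if lexeme == ".":
        return [("DOT", lexeme)], m.end()
    if "." not in lexeme:
        return [("INT", lexeme)], m.end()
    if "REAL" in expected:
        return [("REAL", lexeme)], m.end()
    # The parser does not want a real here: reinterpret the same
    # characters as integer, dot, integer.
    whole, frac = lexeme.split(".")
    return [("INT", whole), ("DOT", "."), ("INT", frac)], m.end()

print(scan("1.5", 0, {"REAL", "INT"}))
print(scan("1.5", 0, {"INT"}))
```

The sketch returns all three tokens at once for simplicity. A scanner
that hands over one token per call would have to queue the remaining
two.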

Obviously :-), there are some serious consequences to this
approach. Is there any literature about this?

I am looking at PCCTS, but for a different reason: it looks as if
yacc is not powerful enough for another grammar I wrote.

>>[Well, disregarding the question of whether it's a good idea to write
>>languages with lexical puns that practically beg people to write code
>>that the compiler will misinterpret, I'd have the lexer return
>>integers and dots as separate tokens and put reals together in the
>>parser. -John]

Yes, I considered this, and discarded the idea for the reason you
mentioned. My boss thinks otherwise though :-(

Maybe a solution is to return a real to the parser and, in the
expressions where INT.INT is needed, convert the real back to that
format.
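
A small sketch of that conversion in Python, under one assumption worth
stating loudly: the REAL token must still carry its original text. The
helper name `real_to_int_pair' is hypothetical. Once only the numeric
value is kept, 1.10 and 1.1 can no longer be told apart, so the two
integers cannot be recovered from the value alone.

```python
def real_to_int_pair(lexeme):
    """Split the source text of a real, e.g. "1.10", into (1, 10).

    Works on the lexeme, not on the converted floating-point value.
    """
    whole, frac = lexeme.split(".")
    return int(whole), int(frac)

# The numeric values are equal, but the INT.INT readings differ.
print(float("1.10") == float("1.1"))
print(real_to_int_pair("1.10"), real_to_int_pair("1.1"))
```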