Darius Blasbans (darius@phidani.be) wrote:
: Oliver Plohmann wrote:
: > When I first looked into a book about compilers I saw a parse tree with
: > integers representing the respective keyword. I thought you could
: > alternatively make up the parse tree of objects, like Statement,
: > ConditionalBlock, Loop, etc. A ConditionalBlock would contain the
: > condition and the respective code to be executed. Then you could put the
: > knowledge of how the assembler code would look like for
: > aConditionalBlock into that class. Same with the other classes to
: > distribute responsibilities among objects to reduce complexity.

: You can even take this idea further by having a NonTerminal root
: class, a Statement class derived from NonTerminal, an IfStatement
: class derived from Statement, etc...

: You can then have semantical analysis and code generation methods
: defined as virtual (deferred) in the intermediate classes, and
: redefined in the leafs.

: You are entirely right when stating that this approach would
: reduce complexity. It does match what we gained from our own
: experience in the field.

In my experience this is not always the case. I did a compiler for the
Action semantics description language in this OO style, with essentially
a class for each non-terminal in the grammar.

The problem that I saw with this approach is the proliferation of
lots of unimportant classes that exist only because the concrete
syntax had to be LALR(1) (I was using yacc). I got into a maze
of lots of little methods, all alike but syntactically dispersed
because each of them needed to be a method.
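
A minimal sketch of what that style tends to look like, in Python purely
for illustration. The class names, the gen method and the stack-machine
opcodes are all hypothetical and are not taken from the compiler
described above:

```python
# Hypothetical sketch: one class per non-terminal, each carrying its
# own code generation method.

class Node:
    def gen(self):
        # Deferred in the root class; redefined in every leaf class.
        raise NotImplementedError


class Num(Node):
    def __init__(self, value):
        self.value = value

    def gen(self):
        return [f"push {self.value}"]


class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def gen(self):
        return self.left.gen() + self.right.gen() + ["add"]


class Sub(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def gen(self):
        # Identical to Add.gen except for the opcode.
        return self.left.gen() + self.right.gen() + ["sub"]


# (1 + 2) - 3
code = Sub(Add(Num(1), Num(2)), Num(3)).gen()
```

Add.gen and Sub.gen differ in one word, yet they live in two separate
classes. Every further pass (type checking, pretty printing, ...) adds
another such pair.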

In my view it would have been better if I had had a class hierarchy
corresponding to a more abstract view of the syntax. However, this
imposes more of a burden on the parser/AST constructor.

Another problem with the pure OO AST representation is that it is
extremely tedious to code yet another traversal of the AST. In the
pure approach one has to implement, in the worst case, a method for
each non-terminal. One solution to this problem is to look into the
notion of adaptive programming as advocated by the Demeter group
at Northeastern University (http://www.ccs.neu.edu/home/lieber/demeter.html).
Another is to have a list of sub-trees in an abstract superclass...
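
The second option, a list of sub-trees in the superclass, might look
roughly like the sketch below. The names (Node, children, walk) are
assumptions made for the example. This is not the Demeter approach,
just the simpler alternative:

```python
# Hypothetical sketch: the abstract superclass owns the list of
# sub-trees, so a generic traversal is written once for all node kinds.

class Node:
    def __init__(self, *children):
        self.children = list(children)

    def walk(self):
        """Yield this node and all its descendants in pre-order."""
        yield self
        for child in self.children:
            yield from child.walk()


class Num(Node):
    def __init__(self, value):
        super().__init__()  # a literal has no sub-trees
        self.value = value


class Add(Node):
    pass


class Sub(Node):
    pass


# (1 + 2) - 3
tree = Sub(Add(Num(1), Num(2)), Num(3))

# A new traversal needs no new method in any node class.
literals = [n.value for n in tree.walk() if isinstance(n, Num)]
```

Collecting the literals here required no change to Num, Add or Sub,
which is the point of keeping the sub-trees in one place.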

What I've been using lately, in a project that analyzes C code, is
a small class hierarchy of no more than four classes that describe
generalized trees with a few annotations/tags, and these
trees admit various kinds of map and foldr operations.
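
As a rough illustration of the idea (not the actual classes from that
project), here is a sketch with a single generic Tree class carrying a
tag, plus a map and a bottom-up fold. The names Tree, tree_map and
tree_fold are hypothetical, and the fold is a plain tree fold rather
than a list foldr:

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    tag: str                  # node kind, e.g. "+", "-", "num"
    value: object = None      # optional annotation or payload
    kids: list = field(default_factory=list)


def tree_map(f, t):
    """Rebuild the tree, applying f to each node's (tag, value) pair."""
    tag, value = f(t.tag, t.value)
    return Tree(tag, value, [tree_map(f, k) for k in t.kids])


def tree_fold(f, t):
    """Bottom-up fold: f gets a node and the results for its kids."""
    return f(t, [tree_fold(f, k) for k in t.kids])


# (1 + 2) - 3
expr = Tree("-", kids=[
    Tree("+", kids=[Tree("num", 1), Tree("num", 2)]),
    Tree("num", 3),
])

# Count the nodes with a fold.
size = tree_fold(lambda t, results: 1 + sum(results), expr)

# Double every literal with a map; the other nodes pass through.
doubled = tree_map(
    lambda tag, v: (tag, v * 2 if tag == "num" else v), expr)
```

Each new analysis is then one small function passed to tree_map or
tree_fold, instead of a method added to every node class.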

I also find that I prefer to have the analysis code for + and -
separated only by a "case '-':" line instead of putting it in two
separate methods.