###        "The theoretical broadening which comes
###         from having many humanities subjects
###         on the campus is offset by the general dopiness
###         of the people who study these things."
###
###                                -- Richard P. Feynman

#
# This defines the abstract syntax of a text with text_items in it.
# I won't bore you with a BNF, but roughly the syntax is as follows:
#
# elemStart (nm, a1, ... an) is <nm a1 ... an>
#  -- start of an "element" in SGML-speak.
#  Note there must be _no space_ between
#  the opening < and the name nm, and nm has to start with a _letter_;
#  a1 to an are the arguments of the element
#
# elemEnd nm is <\nm>
#  -- the end of an element
#
# escape e is &str;
#  -- the escape e denoted by str
#
# quote str is just the string str
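#
# A hypothetical illustration of the syntax above (not part of the
# original description; the element name "em" and the escape name "amp"
# are assumed purely for the example). The text
#
#     <em>fish &amp; chips<\em>
#
# would roughly decompose into the sequence
#
#     elemStart (em)
#     quote "fish "
#     escape amp
#     quote " chips"
#     elemEnd em
#
# whereas "< em>" (space after the opening <) or "<1em>" (name not
# starting with a letter) would not count as the start of an element.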

# *******************************************************************
#
# The parser
#
# The parser is extremely tolerant; it doesn't generate any errors, and
# if it can't decipher something it will just leave it as a verbal
# quote.

package parser = package {

include substring;

# The lexical elements

Lexem = OPEN_EL | OPEN_END_EL | OPEN_ESC;
    # corresponding to <, <\ and &; plus > and ;
    # which only become a lexem after one of these

# Parse an "element", i.e. a thingy enclosed in '<' ... '>'.
# i is supposed to be the index into t right after the opening bracket.
# parse_el returns the representation of the rest of t.
#
fun parse_el t i = { unto = scan_close_el t i;

# The four components of the consEl's second argument are the following:
# - the first is the stack of unprocessed open elements, along with their
#   position within the text;
# - the second is the current position within the text;
# - the third is the text content up to here;
# - and the last is the list of text_items built up to here.
#
# As it stands, opening elements with no matching close are discarded.
# This can be changed easily.