Name: pappy
Version: 0.1
Synopsis: Packrat parsing; linear-time parsers for grammars in TDPL.
Description:
Packrat parsing is a novel and practical method for implementing
linear-time parsers for grammars defined in Top-Down Parsing
Language (TDPL). While TDPL was originally created as a formal
model for top-down parsers with backtracking capability, this
thesis extends TDPL into a powerful general-purpose notation for
describing language syntax, providing a compelling alternative to
traditional context-free grammars (CFGs). Common syntactic idioms
that cannot be represented concisely in a CFG are easily
expressed in TDPL, such as longest-match disambiguation and
"syntactic predicates," making it possible to describe the
complete lexical and grammatical syntax of a practical
programming language in a single TDPL grammar.
.
Packrat parsing is an adaptation of a 30-year-old tabular parsing
algorithm that was never put into practice until now. A packrat
parser can recognize any string defined by a TDPL grammar in
linear time, providing the power and flexibility of a
backtracking recursive descent parser without the attendant risk
of exponential parse time. A packrat parser can recognize any
LL(k) or LR(k) language, as well as many languages requiring
unlimited lookahead that cannot be parsed by shift/reduce
parsers. Packrat parsing also provides better composition
properties than LL/LR parsing, making it more suitable for
dynamic or extensible languages. The primary disadvantage of
packrat parsing is its storage cost, which is a constant multiple
of the total input size rather than being proportional to the
nesting depth of the syntactic constructs appearing in the input.
.
Monadic combinators and lazy evaluation enable elegant and direct
implementations of packrat parsers in recent functional
programming languages such as Haskell. Three different packrat
parsers for the Java language are presented here, demonstrating
the construction of packrat parsers in Haskell using primitive
pattern matching, using monadic combinators, and by automatic
generation from a declarative parser specification. The prototype
packrat parser generator developed for the third case itself uses
a packrat parser to read its parser specifications, and supports
full TDPL notation extended with "semantic predicates," allowing
parsing decisions to depend on the semantic values of other
syntactic entities. Experimental results show that all of these
packrat parsers run reliably in linear time, efficiently support
"scannerless" parsing with integrated lexical analysis, and
provide the user-friendly error-handling facilities necessary in
practical applications.
Homepage: http://pdos.csail.mit.edu/~baford/packrat/thesis/
License: BSD3
License-file: LICENSE
Author: Bryan Ford
Maintainer: Chris Done
Stability: Alpha
Category: Text
Build-type: Simple
Cabal-version: >=1.2
Executable pappy
Main-is: Main.hs
Hs-source-dirs: src
Build-depends: base >= 4 && < 5