{-# LANGUAGE FlexibleContexts #-}

{-|

Module to create indentation aware tokenisers. Despite the simplicity
of parser combinators, getting tokenisers for common language
constructs right is tricky. The parsec way of handling this involves
the following steps.

    * Define the description of the language via the
      @`Text.Parsec.Language.LanguageDef`@ record.

    * Apply the 'Text.Parsec.Token.makeTokenParser' combinator to get
      hold of a @'Text.Parsec.Token.TokenParser'@. The actual
      tokenisers are the fields of this record.
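
A minimal sketch of these two steps in plain parsec is given below. The
names @tokP@ and @assignment@ are illustrative only and are not part of
parsec or of this module; the language description used is parsec's
own 'Text.Parsec.Language.haskellDef'.

> import Text.Parsec (parse)
> import Text.Parsec.String (Parser)
> import qualified Text.Parsec.Language as L
> import qualified Text.Parsec.Token as T
>
> -- Step 1: choose a language description (here parsec's haskellDef).
> -- Step 2: generate the record of tokenisers from it.
> tokP :: T.TokenParser ()
> tokP = T.makeTokenParser L.haskellDef
>
> -- The tokenisers are then obtained as fields of the record.
> assignment :: Parser (String, Integer)
> assignment = do name  <- T.identifier tokP
>                 T.reservedOp tokP "="
>                 value <- T.natural tokP
>                 return (name, value)
>
> -- Example use: parse assignment "" "x = 42"

None of the tokenisers obtained this way knows anything about
indentation.
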

This module provides a similar interface for generating indentation
aware tokenisers. There are a few specific things that an indentation
aware tokeniser should be careful about:

    1. All tokenisers should be indentation aware.

    2. Whitespace and comments should be skipped irrespective of which
       indentation mode one is in.

    3. The tokenisers should themselves be lexeme parsers and should
       skip trailing whitespace.

Getting all this working can often be tricky.

-}

module Text.Parsec.IndentParsec.Token
       ( -- * Usage.
         -- $usage
         GenIndentTokenParser
       , IndentTokenParser
       , identifier
       , operator
       , reserved
       , reservedOp
       , charLiteral
       , stringLiteral
       , natural
       , integer
       , float
       , naturalOrFloat
       , decimal
       , hexadecimal
       , octal
       , symbol
       , lexeme
       , whiteSpace
       , semi
       , comma
       , colon
       , dot
       , parens
       , parensBlock
       , braces
       , bracesBlock
       , angles
       , anglesBlock
       , brackets
       , bracketsBlock
       , semiSep
       , semiSepOrFoldedLines
       , semiSep1
       , semiSepOrFoldedLines1
       , commaSep
       , commaSepOrFoldedLines
       , commaSep1
       , commaSepOrFoldedLines1
       ) where

import Text.Parsec.IndentParsec.Prim
import Text.Parsec.IndentParsec.Combinator

import Text.Parsec (many)
import qualified Text.Parsec.Token as T
import Text.Parsec (Stream)
import Text.Parsec.Combinator hiding (between)

{- $usage

This module exports a combinator @foo@ for every field @foo@ of the
'Text.Parsec.Token.TokenParser' record, with essentially the same
semantics except that the returned parser is indentation aware. There
are also certain new combinators that are defined specifically for
parsing indentation based syntactic constructs:

[Grouping Parsers] A grouping parser takes an input parser @p@ and
returns a parser that parses @p@ between two /grouping delimiters/.
There are two flavours of grouping parsers: @foo@ and @fooBlock@, where
@foo@ can be one of @angles@, @braces@, @parens@, @brackets@. For
example, the parser @'braces' tokP p@ parses @p@ delimited by
'{' and '}'. In this case @p@ does not care about indentation. The
parser @'bracesBlock' tokP p@ is like @braces tokP p@ but, if no
explicit delimiting braces are given, parses @p@ within an indented
block.

> bracesBlock tokP p = braces tokP p <|> blockOf p

[Separator Parsers] A separator parser takes as input a parser, say
@p@, and returns a parser that parses a list of @p@ separated by a
separator. The module exports the combinators @fooSep@, @fooSep1@,
@fooSepOrFoldedLines@ and @fooSepOrFoldedLines1@, where @foo@ is
either @semi@ (in which case the separator is a semicolon) or @comma@
(in which case the separator is a comma).

To illustrate the use of this module we now give, as an incomplete
example, a parser for a where clause in Haskell.

> import qualified Text.Parsec.Language as L
> import qualified Text.Parsec.Token as T
> import qualified Text.Parsec.IndentParsec.Token as IT
>
> tokP = T.makeTokenParser L.haskellDef
> mySemiSep = IT.semiSepOrFoldedLines tokP
> myBraces = IT.bracesBlock tokP
> identifier = IT.identifier tokP
> ....
> symbol = IT.symbol tokP
>
> binding = mySemiSep bind
> bind = do id <- identifier
>           symbol "="
>           e <- expr
>           return (id,e)
>
> whereClause = do reserved "where"; myBraces binding

-}

type GenIndentTokenParser i s u m = T.GenTokenParser s u (IndentT i m)
type IndentTokenParser s u m = GenIndentTokenParser HaskellLike s u m

-- | Indentation aware tokeniser to match a valid identifier.
identifier :: (Indentation i, Monad m)
           => GenIndentTokenParser i s u m
           -> GenIndentParsecT i s u m String
identifier = tokeniser . T.identifier

-- | Indentation aware tokeniser to match an operator.
operator :: (Indentation i, Monad m)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
operator = tokeniser . T.operator

-- | Indentation aware tokeniser to match a reserved word.
reserved :: (Indentation i, Monad m)
         => GenIndentTokenParser i s u m
         -> String -- ^ The reserved word.
         -> GenIndentParsecT i s u m ()
reserved tokP = tokeniser . T.reserved tokP

-- | Indentation aware parser to match a reserved operator of the
-- language.
reservedOp :: (Indentation i, Monad m)
           => GenIndentTokenParser i s u m
           -> String -- ^ The reserved operator to be matched.
           -> GenIndentParsecT i s u m ()
reservedOp tokP = tokeniser . T.reservedOp tokP

-- | Indentation aware parser to match a character literal (the syntax
-- is assumed to be that of Haskell, which matches that of most
-- programming languages).
charLiteral :: (Indentation i, Monad m)
            => GenIndentTokenParser i s u m
            -> GenIndentParsecT i s u m Char
charLiteral = tokeniser . T.charLiteral

-- | Indentation aware parser to match a string literal (the syntax is
-- assumed to be that of Haskell, which matches that of most
-- programming languages).
stringLiteral :: (Indentation i, Monad m)
              => GenIndentTokenParser i s u m
              -> GenIndentParsecT i s u m String
stringLiteral = tokeniser . T.stringLiteral

-- | Indentation aware parser to match a natural number.
natural :: (Indentation i, Monad m)
        => GenIndentTokenParser i s u m
        -> GenIndentParsecT i s u m Integer
natural = tokeniser . T.natural

-- | Indentation aware parser to match an integer.
integer :: (Indentation i, Monad m)
        => GenIndentTokenParser i s u m
        -> GenIndentParsecT i s u m Integer
integer = tokeniser . T.integer

-- | Indentation aware tokeniser to match a floating point number.
float :: (Indentation i, Monad m)
      => GenIndentTokenParser i s u m
      -> GenIndentParsecT i s u m Double
float = tokeniser . T.float

-- | Indentation aware tokeniser to match either a natural number or
-- a floating point number.
naturalOrFloat :: (Indentation i, Monad m)
               => GenIndentTokenParser i s u m
               -> GenIndentParsecT i s u m (Either Integer Double)
naturalOrFloat = tokeniser . T.naturalOrFloat

-- | Indentation aware tokeniser to match an integer in decimal.
decimal :: (Indentation i, Monad m)
        => GenIndentTokenParser i s u m
        -> GenIndentParsecT i s u m Integer
decimal = tokeniser . T.decimal

-- | Indentation aware tokeniser to match an integer in hexadecimal.
hexadecimal :: (Indentation i, Monad m)
            => GenIndentTokenParser i s u m
            -> GenIndentParsecT i s u m Integer
hexadecimal = tokeniser . T.hexadecimal

-- | Indentation aware tokeniser to match an integer in octal.
octal :: (Indentation i, Monad m)
      => GenIndentTokenParser i s u m
      -> GenIndentParsecT i s u m Integer
octal = tokeniser . T.octal

-- | Indentation aware tokeniser that is equivalent to @`string`@.
symbol :: (Indentation i, Monad m)
       => GenIndentTokenParser i s u m
       -> String
       -> GenIndentParsecT i s u m String
symbol tokP = tokeniser . T.symbol tokP

-- | Creates a lexeme tokeniser. The resultant tokeniser is indentation
-- aware and skips trailing white spaces/comments.
lexeme :: (Indentation i, Monad m)
       => GenIndentTokenParser i s u m
       -> GenIndentParsecT i s u m a
       -> GenIndentParsecT i s u m a
lexeme tokP = tokeniser . T.lexeme tokP

-- | The parser whiteSpace skips spaces and comments. This does not
-- care about indentation as skipping spaces should be done
-- irrespective of the indentation.
whiteSpace :: (Indentation i, Monad m)
           => GenIndentTokenParser i s u m
           -> GenIndentParsecT i s u m ()
whiteSpace = T.whiteSpace

-- | Matches a semicolon and returns ";".
semi :: (Indentation i, Monad m)
     => GenIndentTokenParser i s u m
     -> GenIndentParsecT i s u m String
semi = tokeniser . T.semi

-- | Matches a comma and returns ",".
comma :: (Indentation i, Monad m)
      => GenIndentTokenParser i s u m
      -> GenIndentParsecT i s u m String
comma = tokeniser . T.comma

-- | Matches a colon and returns ":".
colon :: (Indentation i, Monad m)
      => GenIndentTokenParser i s u m
      -> GenIndentParsecT i s u m String
colon = tokeniser . T.colon

-- | Matches a dot and returns ".".
dot :: (Indentation i, Monad m)
    => GenIndentTokenParser i s u m
    -> GenIndentParsecT i s u m String
dot = tokeniser . T.dot

-- Opening and closing delimiters used by the grouping parsers below.
lparen   :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
rparen   :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
lbrace   :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
rbrace   :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
langle   :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
rangle   :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
lbracket :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String
rbracket :: (Monad m, Indentation i)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m String

lparen   tokP = symbol tokP "("
rparen   tokP = symbol tokP ")"
lbrace   tokP = symbol tokP "{"
rbrace   tokP = symbol tokP "}"
langle   tokP = symbol tokP "<"
rangle   tokP = symbol tokP ">"
lbracket tokP = symbol tokP "["
rbracket tokP = symbol tokP "]"

-- | Match the input parser @p@ within a pair of parentheses.
parens :: ( Indentation i, Show i
          , Monad m, Stream s (IndentT i m) t, Show t
          )
       => GenIndentTokenParser i s u m
       -> GenIndentParsecT i s u m a
       -> GenIndentParsecT i s u m a
parens tokP = lparen tokP `between` rparen tokP

{-|

Same as `parens` but if no explicit parentheses are given, matches @p@
inside an indented block.

-}
parensBlock :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
            => GenIndentTokenParser HaskellLike s u m
            -> GenIndentParsecT HaskellLike s u m a
            -> GenIndentParsecT HaskellLike s u m a
parensBlock tokP = lparen tokP `betweenBlock` rparen tokP

-- | Match the input parser @p@ within a pair of braces.
braces :: ( Indentation i, Show i
          , Monad m, Stream s (IndentT i m) t, Show t
          )
       => GenIndentTokenParser i s u m
       -> GenIndentParsecT i s u m a
       -> GenIndentParsecT i s u m a
braces tokP = lbrace tokP `between` rbrace tokP

{-|
Same as `braces` but if no explicit braces are given, matches @p@
inside an indented block.
-}
bracesBlock :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
            => GenIndentTokenParser HaskellLike s u m
            -> GenIndentParsecT HaskellLike s u m a
            -> GenIndentParsecT HaskellLike s u m a
bracesBlock tokP = lbrace tokP `betweenBlock` rbrace tokP

{-|
Match the input parser @p@ within a pair of angular brackets, i.e. '<'
and '>'.
-}
angles :: ( Indentation i, Show i
          , Monad m, Stream s (IndentT i m) t, Show t
          )
       => GenIndentTokenParser i s u m
       -> GenIndentParsecT i s u m a
       -> GenIndentParsecT i s u m a
angles tokP = langle tokP `between` rangle tokP

{-|

Same as `angles` but if no explicit angular brackets are given,
matches @p@ inside an indented block.

-}
anglesBlock :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
            => GenIndentTokenParser HaskellLike s u m
            -> GenIndentParsecT HaskellLike s u m a
            -> GenIndentParsecT HaskellLike s u m a
anglesBlock tokP = langle tokP `betweenBlock` rangle tokP

-- | Match @p@ within a pair of square brackets, i.e. '[' and ']'.
brackets :: ( Indentation i, Show i
            , Monad m, Stream s (IndentT i m) t, Show t
            )
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m a
         -> GenIndentParsecT i s u m a
brackets tokP = lbracket tokP `between` rbracket tokP

{-|

Same as `brackets` but if no explicit brackets are given, matches @p@
inside an indented block.

-}
bracketsBlock :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
              => GenIndentTokenParser HaskellLike s u m
              -> GenIndentParsecT HaskellLike s u m a
              -> GenIndentParsecT HaskellLike s u m a
bracketsBlock tokP = lbracket tokP `betweenBlock` rbracket tokP

-- | Parse zero or more @p@ separated by a semicolon.
semiSep :: (Indentation i, Monad m, Stream s (IndentT i m) t, Show t)
        => GenIndentTokenParser i s u m
        -> GenIndentParsecT i s u m a
        -> GenIndentParsecT i s u m [a]
semiSep tokP p = sepBy p $ semi tokP

-- | Parse one or more @p@ separated by a semicolon.
semiSep1 :: (Indentation i, Monad m, Stream s (IndentT i m) t, Show t)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m a
         -> GenIndentParsecT i s u m [a]
semiSep1 tokP p = sepBy1 p $ semi tokP

{-|

Parse zero or more @p@ separated by a semicolon or a new line. Long
lines are continued using line folding.

-}
semiSepOrFoldedLines :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
                     => GenIndentTokenParser HaskellLike s u m
                     -> GenIndentParsecT HaskellLike s u m a
                     -> GenIndentParsecT HaskellLike s u m [a]
semiSepOrFoldedLines tokP p = fmap concat . many . foldedLinesOf
                            . sepEndBy1 p $ semi tokP

{-|

Parse one or more @p@ separated by a semicolon or a new line. Long
lines are continued using line folding.

-}
semiSepOrFoldedLines1 :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
                      => GenIndentTokenParser HaskellLike s u m
                      -> GenIndentParsecT HaskellLike s u m a
                      -> GenIndentParsecT HaskellLike s u m [a]
semiSepOrFoldedLines1 tokP p
  = do first <- foldedLinesOf . sepEndBy1 p $ semi tokP
       rest  <- semiSepOrFoldedLines tokP p
       return (first ++ rest)

-- | Parse zero or more @p@ separated by a comma.
commaSep :: (Indentation i, Monad m, Stream s (IndentT i m) t, Show t)
         => GenIndentTokenParser i s u m
         -> GenIndentParsecT i s u m a
         -> GenIndentParsecT i s u m [a]
commaSep tokP p = sepBy p $ comma tokP

-- | Parse one or more @p@ separated by a comma.
commaSep1 :: (Indentation i, Monad m, Stream s (IndentT i m) t, Show t)
          => GenIndentTokenParser i s u m
          -> GenIndentParsecT i s u m a
          -> GenIndentParsecT i s u m [a]
commaSep1 tokP p = sepBy1 p $ comma tokP

{-|

Parse zero or more @p@ separated by a comma or a new line. Long lines
are continued using line folding.

-}
commaSepOrFoldedLines :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
                      => GenIndentTokenParser HaskellLike s u m
                      -> GenIndentParsecT HaskellLike s u m a
                      -> GenIndentParsecT HaskellLike s u m [a]
commaSepOrFoldedLines tokP p = fmap concat . many . foldedLinesOf
                             . sepEndBy1 p $ comma tokP

{-|

Parse one or more @p@ separated by a comma or a new line. Long lines
are continued using line folding.

-}
commaSepOrFoldedLines1 :: (Monad m, Stream s (IndentT HaskellLike m) t, Show t)
                       => GenIndentTokenParser HaskellLike s u m
                       -> GenIndentParsecT HaskellLike s u m a
                       -> GenIndentParsecT HaskellLike s u m [a]
commaSepOrFoldedLines1 tokP p
  = do first <- foldedLinesOf . sepEndBy1 p $ comma tokP
       rest  <- commaSepOrFoldedLines tokP p
       return (first ++ rest)