IsRegex, PCRE and TDFA

The IsRegex re tx class provides regex methods for the RE type re (belonging to either the TDFA or PCRE back end) and a text type tx that the re back end accepts. The Text.RE.TDFA and Text.RE.PCRE API modules provide functions that work over all of the text types, together with the match operators.

regex provides some classic tools that have quickly proven themselves in the examples and scripts used to maintain regex itself.

The regex Tools

The classic tools associated with regular expressions have inspired some regex counterparts.

Text.RE.Tools.Grep: takes a regular expression and a file or lazy ByteString (depending upon the variant) and returns all of the matching lines.

Text.RE.Tools.Lex: takes an association list of REs and token-generating functions, plus the input text, and returns a list of tokens. It should not be used where performance is important (use Alex instead), other than as a development prototype.

Text.RE.Tools.Sed (using Text.RE.Tools.Edit): takes an association list of regular expressions and substitution actions, together with some input text. For each line of the text that matches one of the REs, it invokes the associated action and substitutes the text returned from the action in the output stream.

Text.RE.Tools.Find: scans a directory tree in the file system, executing an action against all of the files that match the RE.

These tools are built on top of the core library and act as good examples of how to use the regex library as well as useful tools.
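As a rough illustration of the association-list idea behind Lex, the following sketch pairs recognizers with token-generating functions and takes the first rule that matches at each position. It does not use the regex library: Recognizer, spanning, anyChar, lexSketch, demoRules and the Token type are all hypothetical names, and plain predicates on characters stand in for REs.

```haskell
import Data.Char (isAlpha, isDigit, isSpace)

-- Hypothetical token type for the demonstration.
data Token = TNum Int | TIdent String | TSym Char
  deriving (Eq, Show)

-- Stand-in for an RE: returns the matched prefix and the remaining input.
-- Assumption: a successful match always consumes at least one character.
type Recognizer = String -> Maybe (String, String)

-- Recognize a non-empty run of characters satisfying the predicate.
spanning :: (Char -> Bool) -> Recognizer
spanning p s = case span p s of
  ([], _)   -> Nothing
  (m, rest) -> Just (m, rest)

-- Recognize any single character.
anyChar :: Recognizer
anyChar (c : cs) = Just ([c], cs)
anyChar []       = Nothing

-- Try the rules in order at the head of the input; the first recognizer
-- that matches wins. A token function returning Nothing drops the match.
lexSketch :: [(Recognizer, String -> Maybe Token)] -> String -> [Token]
lexSketch rules = go
  where
    go [] = []
    go s =
      case [ (mk m, rest) | (recog, mk) <- rules, Just (m, rest) <- [recog s] ] of
        ((mt, rest) : _) -> maybe id (:) mt (go rest)
        []               -> go (drop 1 s) -- no rule matched: skip one character

demoRules :: [(Recognizer, String -> Maybe Token)]
demoRules =
  [ (spanning isSpace, const Nothing)          -- discard white space
  , (spanning isDigit, Just . TNum . read)
  , (spanning isAlpha, Just . TIdent)
  , (anyChar, \m -> case m of
                      (c : _) -> Just (TSym c)
                      []      -> Nothing)
  ]
```

Under these assumptions, lexSketch demoRules "x = 42+y" yields an identifier, a symbol, a number, a symbol and an identifier. Trying every rule at every position is simple but slow, which is consistent with the advice to treat this style of scanner as a prototype.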

The following sections will present some of the internal library code used to build the tools as well as some code from the example programs. These fragments work best as starting points for studying these tools.

Sed and Edit

Edits scripts are applied to each line of the text by the sed functions.

-- | an 'Edits' script will, for each line in the file, either perform
-- the action selected by the first RE in the list, or perform all of the
-- actions on line, arranged as a pipeline
data Edits m re s
  = Select ![Edit m re s]   -- ^ for each line select the first @Edit@ to match each line and edit the line with it
  | Pipe   ![Edit m re s]   -- ^ for each line apply every edit that matches in turn to the line

-- | each Edit action specifies how the match should be processed
data Edit m re s
  = Template !(SearchReplace re s)
      -- ^ replace the match with this template text, substituting ${capture} as appropriate
  | Function !re REContext !(LineNo->Match s->RELocation->Capture s->m (Maybe s))
      -- ^ use this function to replace the 'REContext' specified captures in each line matched
  | LineEdit !re !(LineNo->Matches s->m (LineEdit s))
      -- ^ use this function to edit each line matched

-- | a LineEdit is the most general action that can be performed on a line
-- and is the only means of deleting a line
data LineEdit s
  = NoEdit            -- ^ do not edit this line but leave as is
  | ReplaceWith !s    -- ^ replace the line with this text (terminating newline should not be included)
  | Delete            -- ^ delete this line altogether
  deriving (Functor,Show)

sed' applies the script in its first argument to each line in the text in its second argument.
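The difference between Select and Pipe scripts can be seen in a small model that runs without the regex library. This is only a sketch under stated assumptions: SimpleEdit, SimpleEdits, runEdit, applyEdits, sedSketch and demoScript are hypothetical names, a predicate on the line stands in for the RE, the edits are pure rather than monadic, and lines are numbered from 1.

```haskell
import Control.Monad (foldM)
import Data.List (isInfixOf, isPrefixOf)
import Data.Maybe (catMaybes)

-- Simplified stand-in for the library's 'LineEdit' (fixed to String).
data LineEdit = NoEdit | ReplaceWith String | Delete
  deriving (Eq, Show)

-- Hypothetical simplified edit: a predicate plays the role of the RE.
data SimpleEdit = SimpleEdit
  { editMatches :: String -> Bool
  , editAction  :: Int -> String -> LineEdit
  }

data SimpleEdits = Select [SimpleEdit] | Pipe [SimpleEdit]

-- Apply one edit to a line; Nothing means the line was deleted.
runEdit :: Int -> String -> SimpleEdit -> Maybe String
runEdit n s e
  | editMatches e s = case editAction e n s of
      NoEdit        -> Just s
      ReplaceWith t -> Just t
      Delete        -> Nothing
  | otherwise = Just s

-- Select: only the first matching edit is applied.
-- Pipe: every edit is tried in turn against the output of the previous one.
applyEdits :: SimpleEdits -> Int -> String -> Maybe String
applyEdits (Select es) n s =
  case filter (\e -> editMatches e s) es of
    []      -> Just s
    (e : _) -> runEdit n s e
applyEdits (Pipe es) n s = foldM (runEdit n) s es

-- Apply the script to each numbered line, dropping deleted lines.
sedSketch :: SimpleEdits -> String -> String
sedSketch script =
  unlines . catMaybes . zipWith (applyEdits script) [1 ..] . lines

-- Delete comment lines and prefix lines containing "foo" with their number.
demoScript :: SimpleEdits
demoScript = Pipe
  [ SimpleEdit ("#" `isPrefixOf`) (\_ _ -> Delete)
  , SimpleEdit ("foo" `isInfixOf`) (\n s -> ReplaceWith (show n ++ ": " ++ s))
  ]
```

In this model a deleted line short-circuits the rest of a Pipe, since there is nothing left for the later edits to match.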

Grep

The grepFilter function takes an RE and a text and returns the result of matching the RE against every line of the text.

-- | returns a 'Line' for each line in the argument text, enumerating
-- all of the matches for that line
grepFilter :: IsRegex re s => re -> s -> [Line s]
grepFilter rex = grepWithScript [(rex,mk)] . linesR
  where
    mk i mtchs = Just $ Line i mtchs

-- | 'grepLines' returns a 'Line' for each line in the file, listing all
-- of the 'Matches' for that line
data Line s = Line
  { getLineNumber  :: LineNo      -- ^ the 'LineNo' for this line
  , getLineMatches :: Matches s   -- ^ all the 'Matches' of the RE on this line
  }
  deriving (Show)
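The shape of the grepFilter result, one record per input line whether or not it matched, can be sketched with a literal substring search in place of an RE. Everything here is a hypothetical simplification: SimpleLine, occurrences, grepFilterSketch and matchingLines are illustrative names, matches are recorded as start offsets only, overlapping occurrences are all reported (a regex matcher would not do that), and line numbers are assumed to start at 1.

```haskell
import Data.List (isPrefixOf, tails)

-- Simplified stand-in for 'Line': the line number and the start offsets
-- of every occurrence found on that line.
data SimpleLine = SimpleLine
  { lineNumber  :: Int
  , lineMatches :: [Int]
  }
  deriving (Eq, Show)

-- Start offsets of every (possibly overlapping) occurrence of the needle.
occurrences :: String -> String -> [Int]
occurrences needle hay =
  [ i | (i, t) <- zip [0 ..] (tails hay), needle `isPrefixOf` t ]

-- One 'SimpleLine' per input line, matched or not.
grepFilterSketch :: String -> String -> [SimpleLine]
grepFilterSketch needle = zipWith mk [1 ..] . lines
  where
    mk i l = SimpleLine i (occurrences needle l)

-- Classic grep behaviour: keep only the lines with at least one match.
matchingLines :: [SimpleLine] -> [SimpleLine]
matchingLines = filter (not . null . lineMatches)
```

Returning a record for every line and filtering afterwards keeps the line numbering intact, which is what lets a caller decide separately which lines to report.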

The sortImports utility lives in the TestKit utility module used by the scripts and example programs. It uses grep to sort all of the imports by module name into a single block located at the position of the first import statement in the module, where each import statement is in a standard form matched by a regex.

It is used by the re-sort-imports program to discover all of the Haskell scripts in the regex source tree and sort their import statements into a standard order (ultimately using the above-mentioned sortImports function).
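The import-sorting behaviour described for sortImports can be approximated without the regex library. This is a minimal sketch and not the TestKit implementation: sortImportsSketch is a hypothetical name, an import statement is assumed to be any line starting with "import " at column zero, and the sort key is the first word after import that is not qualified.

```haskell
import Data.List (isPrefixOf, sortOn)

-- Gather every import line into one block, sorted by module name, placed
-- where the first import appeared; all other lines keep their order.
sortImportsSketch :: String -> String
sortImportsSketch src =
  unlines (before ++ sortOn moduleName imports ++ rest)
  where
    isImport = ("import " `isPrefixOf`)

    -- Lines before the first import, then everything from it onwards.
    (before, fromFirst) = break isImport (lines src)

    imports = filter isImport fromFirst
    rest    = filter (not . isImport) fromFirst

    -- The module name is the first word after "import", skipping the
    -- optional "qualified" keyword.
    moduleName l =
      case filter (/= "qualified") (drop 1 (words l)) of
        (m : _) -> m
        []      -> ""
```

A text with no import lines passes through with its lines unchanged, because break then leaves every line in the leading segment.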