Base types

The type of an iterator-enumerator, which transcodes data from
some input type tIn to some output type tOut. An Inum acts
as an Iter when consuming data, then acts as an enumerator when
feeding transcoded data to another Iter.

At a high level, one can think of an Inum as a function from
Iters to IterRs, where an Inum's input and output types are
different. A simpler-seeming alternative to Inum might have
been:

type Inum' tIn tOut m a = Iter tOut m a -> Iter tIn m a

In fact, given an Inum object inum, it is possible to construct
a function of type Inum' with (inum .|). But sometimes one
might like to concatenate Inums. For instance, consider a
network protocol that changes encryption or compression modes
midstream. Transcoding is done by Inums. To change transcoding
methods after applying an Inum to an iteratee requires the
ability to "pop" the iteratee back out of the Inum so as to be
able to hand it to another Inum. Inum's return type (Iter tIn
m (IterR tOut m a) as opposed to Iter tIn m a) allows the
monadic bind operator >>= to accomplish this popping in
conjunction with the tryRI and reRunIter functions.
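
The "popping" idea can be modeled in miniature. The sketch below is a self-contained toy, not the library's implementation: the Iter type, ToyInum, transcodeUntil, and switchModes are hypothetical stand-ins, there is no monad, and input arrives as a plain list. It only illustrates why returning the target's state (rather than a fused iteratee) lets a second transcoder take over midstream.

```haskell
import Data.Char (toUpper)

-- Toy iteratee (an assumption for illustration, not the library's Iter):
-- finished with a result, or waiting for input. `More step atEOF` carries
-- the result to use if the input ends at this point.
data Iter t a = Done a | More (t -> Iter t a) a

-- Collect every input item into a list.
collect :: Iter t [t]
collect = go []
  where go acc = More (\x -> go (x : acc)) (reverse acc)

-- Toy "Inum": feeds transcoded items to the target, never signals EOF,
-- and returns the target's state together with the unconsumed input.
type ToyInum tIn tOut a = Iter tOut a -> [tIn] -> (Iter tOut a, [tIn])

-- Transcode with f until the input runs out, the target finishes,
-- or `stop` holds for the next item (a logical message boundary).
transcodeUntil :: (tIn -> Bool) -> (tIn -> tOut) -> ToyInum tIn tOut a
transcodeUntil stop f = go
  where
    go (More step _) (x : xs) | not (stop x) = go (step (f x)) xs
    go iter rest = (iter, rest)

-- Signal EOF and extract the result.
finish :: Iter t a -> a
finish (Done a)   = a
finish (More _ a) = a

-- Upper-case lines up to the first blank line, then pass the rest unchanged.
switchModes :: [String] -> [String]
switchModes input =
  let (iter1, rest) = transcodeUntil null (map toUpper) collect input
      (iter2, _)    = transcodeUntil (const False) id iter1 (drop 1 rest)
  in finish iter2
```

In this toy, switchModes ["a", "b", "", "c"] yields ["A", "B", "c"]: the same collect iteratee receives output from both transcoders because the first one handed its state back instead of closing it with an EOF.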

All Inums must obey the following two rules.

An Inum may never feed a chunk with the EOF flag set to its target
Iter. Instead, upon receiving EOF, the Inum
should simply return the state of the inner Iter (this is how
"popping" the iteratee back out works--if the Inum passed
the EOF through to the Iter, the Iter would stop requesting
more input and could not be handed off to a new Inum).

An Inum must always return the state of its target Iter.
This is true even when the Inum fails, and is why the Fail
state contains a Maybe a field.

In addition to returning when it receives an EOF or fails, an
Inum should return when the target Iter returns a result or
fails. An Inum may also unilaterally return the state of the
iteratee at any earlier point, for instance if it has reached some
logical message boundary (e.g., many protocols finish processing
headers upon reading a blank line).

Inums are generally constructed with one of the mkInum or
mkInumM functions, which hide most of the error handling details
and ensure the above rules are obeyed. Most Inums are
polymorphic in the last type, a, in order to work with iteratees
returning any type.

An Onum t m a is just an Inum in which the input is
()--i.e., Inum () t m a--so that there is no meaningful input
data to transcode. Such an enumerator is called an
outer enumerator, because it must produce the data it feeds to
Iters by either executing actions in monad m, or from its own
internal pure state (as for enumPure).

As with Inums, an Onum should under no circumstances ever feed
a chunk with the EOF bit set to its Iter argument. When the
Onum runs out of data, it must simply return the current state of
the Iter. This way more data from another source can still be
fed to the iteratee, as happens when enumerators are concatenated
with the cat function.

Onums should generally be constructed using the mkInum or
mkInumM function, just like Inums, the only difference being
that for an Onum the input type is (), so executing Iters to
consume input will be of little use.

.|$ is a variant of |$ that allows you to apply an Onum
from within an Iter monad. This is often useful in conjunction
with enumPure, if you want to parse at some coarse-granularity
(such as lines), and then re-parse the contents of some
coarser-grained parse unit. For example:

Note the important distinction between (.|$) and (.|).
(.|$) runs an Onum and does not touch the current input, while
(.|) pipes the current input through an Inum. For instance, to
send the contents of a file to standard output (regardless of the
current input), you must say enumFile ".signature" .|$
stdoutI. But to take the current input, compress it, and send
the result to standard output, you must use .|, as in inumGzip .| stdoutI.

As suggested by the types, enum .|$ iter is roughly equivalent to
lift (enum |$ iter), except that the latter will call throw
on failures, causing language-level exceptions that cannot be
caught within the outer Iter. Thus, it is better to use .|$
than lift (... |$ ...), though in the less general case of
the IO monad, enum .|$ iter is equivalent to liftIO (enum |$
iter) as illustrated by the following examples:

Concatenate the outputs of two enumerators. For example,
enumFile "file1" `cat` enumFile "file2" produces an
Onum that outputs the concatenation of files "file1" and
"file2". Unless the first Inum fails, cat always invokes the
second Inum, as the second Inum may have monadic side-effects
that must be executed even when the Iter has already finished.
See lcat if you want to stop when the Iter no longer requires
input. If you want to continue executing even in the event of an
InumFail condition, you can wrap the first Inum with
inumCatch and invoke resumeI from within the exception handler.

cat (and lcat, described below) are useful in right folds.
Say, for instance, that files is a list of files you wish to
concatenate. You can use a construct such as:
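
As a self-contained illustration of the fold pattern, the sketch below uses a toy model rather than the library's types. Iter, ToyOnum, enumList, catToy, nullToy, and catAll are all hypothetical names: enumList stands in for an enumerator such as enumFile, catToy for cat, and nullToy for inumNull.

```haskell
-- Toy iteratee (an assumption for illustration): finished with a result, or
-- waiting for input. `More step atEOF` carries the result at end of input.
data Iter t a = Done a | More (t -> Iter t a) a

-- Toy outer enumerator: advances an iteratee's state without signalling EOF.
type ToyOnum t a = Iter t a -> Iter t a

feed :: Iter t a -> t -> Iter t a
feed (More step _) x = step x
feed done          _ = done

-- Stand-in for an enumerator such as enumFile: feeds a fixed list of items.
enumList :: [t] -> ToyOnum t a
enumList xs iter = foldl feed iter xs

-- Stand-in for cat: run the first enumerator, then pass the state to the second.
catToy :: ToyOnum t a -> ToyOnum t a -> ToyOnum t a
catToy first second = second . first

-- Stand-in for inumNull: the identity element for concatenation.
nullToy :: ToyOnum t a
nullToy = id

-- Right fold over a list of sources.
catAll :: [[t]] -> ToyOnum t a
catAll = foldr (catToy . enumList) nullToy

-- Sum all input items.
sumI :: Iter Int Int
sumI = go 0
  where go acc = More (\x -> go (acc + x)) acc

-- Signal EOF and extract the result.
finish :: Iter t a -> a
finish (Done a)   = a
finish (More _ a) = a

total :: Int
total = finish (catAll [[1, 2], [3], [4, 5]] sumI)
```

Because no source sends EOF, sumI sees the three lists as one stream and total evaluates to 15. With an empty list of sources, catAll reduces to nullToy and leaves the iteratee untouched.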

Note the use of inumNull as the starting value for foldr. This
is not to be confused with inumNop. inumNull acts as a no-op
for concatenation, producing no output analogously to
/dev/null. By contrast inumNop is the no-op for fusing (see
|. and .| below) because it passes all data through untouched.

Left-associative pipe operator. Fuses two Inums when the
output type of the first Inum is the same as the input type of
the second. More specifically, if inum1 transcodes type tIn to
tOut and inum2 transcodes tOut to tOut2, then inum1
|. inum2 produces a new Inum that transcodes from tIn to
tOut2.

Typically types i and iR are Iter tOut2 m a and IterR
tOut2 m a, respectively, in which case the second argument and
result of |. are also Inums.

Right-associative pipe operator. Fuses an Inum that transcodes
tIn to tOut with an Iter taking input type tOut to produce
an Iter taking input type tIn. If the Iter is still active
when the Inum terminates (either normally or through an
exception), then .| sends it an EOF.
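
The relationship between the two fusing operators can be sketched with a toy model. Everything below is hypothetical and much simpler than the library: transcoders are plain functions from one input item to a list of output items, fuseToy plays the role of .|, and pipeToy plays the role of |. (there are no monads, exceptions, or residual data).

```haskell
-- Toy iteratee (an assumption for illustration): finished with a result, or
-- waiting for input. `More step atEOF` carries the result at end of input.
data Iter t a = Done a | More (t -> Iter t a) a

feed :: Iter t a -> t -> Iter t a
feed (More step _) x = step x
feed done          _ = done

-- Signal EOF and extract the result.
finish :: Iter t a -> a
finish (Done a)   = a
finish (More _ a) = a

-- Toy analogue of .| : fuse a transcoder to an iteratee. When the fused
-- iteratee's input ends, the target is closed with EOF as well.
fuseToy :: (tIn -> [tOut]) -> Iter tOut a -> Iter tIn a
fuseToy _ (Done a) = Done a
fuseToy f iter     = More (\x -> fuseToy f (foldl feed iter (f x))) (finish iter)

-- Toy analogue of |. : fuse two transcoders into one.
pipeToy :: (tIn -> [tOut]) -> (tOut -> [tOut2]) -> tIn -> [tOut2]
pipeToy f g = concatMap g . f

sumI :: Iter Int Int
sumI = go 0
  where go acc = More (\x -> go (acc + x)) acc

runList :: Iter t a -> [t] -> a
runList iter = finish . foldl feed iter

lengths :: String -> [Int]
lengths w = [length w]

-- Two ways to split lines into words and sum the word lengths.
viaTwoFuses, viaPipe :: [String] -> Int
viaTwoFuses = runList (fuseToy words (fuseToy lengths sumI))
viaPipe     = runList (fuseToy (pipeToy words lengths) sumI)
```

Both viaTwoFuses and viaPipe return 6 for the input ["ab cde", "f"], which reflects the intended reading: fusing two transcoders first and then attaching the iteratee gives the same pipeline as attaching them one at a time.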

Note that `inumCatch` has the default infix precedence (infixl
9 `inumCatch`), which binds more tightly than any concatenation
or fusing operators.

As noted for catchI, exception handlers receive both the
exception thrown and the failed IterR. Particularly in the case
of inumCatch, it is important to re-throw exceptions by
re-executing the failed Iter with reRunIter, not passing the
exception itself to throwI. That way, if the exception is
re-caught, resumeI will continue to work properly. For example,
to copy two files to standard output and ignore file not found
errors but re-throw any other kind of error, you could use the
following:

Like resumeI, but if the Iter is resumable, also prints an
error message to standard error before resuming.

Simple enumerator construction function

The mkInum function allows you to create stateless Inums out of
simple transcoding Iters. As an example, suppose you are
processing a list of L.ByteStrings representing packets, and want
to concatenate them all into one continuous stream of bytes. You
could implement an Inum called inumConcat to do this as
follows:

A ResidHandler specifies how to handle residual data in an
Inum. Typically, when an Inum finishes executing, there are
two kinds of residual data. First, the Inum itself (in its role
as an iteratee) may have left some unconsumed data. Second, the
target Iter being fed by the Inum may have some residual data,
and this data may be of a different type. A ResidHandler allows
this residual data to be adjusted by untranslating the residual
data of the target Iter and sticking the result back into the
Inum's residual data.

The two most common ResidHandlers are pullupResid (to pull the
target Iter's residual data back up to the Inum as is), and
id (to do no adjustment of residual data).

Create a stateless Inum from a "codec" Iter that transcodes
the input type to the output type. The codec is invoked repeatedly
until one of the following occurs: The codec returns null data,
the codec throws an exception, or the underlying target Iter is
no longer active. If the codec throws an exception of type
IterEOF, this is considered normal termination and the error is
not further propagated.
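
The three stopping conditions can be made concrete with a toy loop. This is a sketch under simplifying assumptions, not the library's code: Codec, mkToyInum, concatTwo, and takeChars are hypothetical names, a codec is a pure function that returns Nothing at end of input (standing in for an IterEOF exception), and other exceptions are not modeled.

```haskell
-- Toy iteratee (an assumption for illustration): finished with a result, or
-- waiting for input. `More step atEOF` carries the result at end of input.
data Iter t a = Done a | More (t -> Iter t a) a

-- Toy codec: consume some input and return output plus the remaining input,
-- or Nothing at end of input.
type Codec tIn o = [tIn] -> Maybe ([o], [tIn])

-- Invoke the codec repeatedly, feeding its output to the target.
mkToyInum :: Codec tIn o -> Iter [o] a -> [tIn] -> (Iter [o] a, [tIn])
mkToyInum codec = go
  where
    go iter@(More step _) input =
      case codec input of
        Nothing          -> (iter, input)  -- end of input: normal termination
        Just ([], rest)  -> (iter, rest)   -- codec returned null data: stop
        Just (out, rest) -> go (step out) rest
    go iter input = (iter, input)          -- target no longer active: stop

-- Hypothetical codec: concatenate up to two packets per invocation.
concatTwo :: Codec String Char
concatTwo []      = Nothing
concatTwo packets = Just (concat (take 2 packets), drop 2 packets)

-- Target that finishes once it has seen at least n characters.
takeChars :: Int -> Iter String String
takeChars n = go ""
  where
    go acc
      | length acc >= n = Done (take n acc)
      | otherwise       = More (\s -> go (acc ++ s)) acc

finish :: Iter t a -> a
finish (Done a)   = a
finish (More _ a) = a

demo :: (String, [String])
demo =
  let (iter, leftover) = mkToyInum concatTwo (takeChars 3) ["ab", "cd", "ef", "gh", "ij"]
  in (finish iter, leftover)
```

Here demo evaluates to ("abc", ["ef", "gh", "ij"]): the loop stops as soon as the target finishes, and the packets the codec never touched remain available as input.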

mkInumC requires two other arguments before the codec. First, a
ResidHandler allows residual data to be adjusted between the
input and output Iter monads. Second, a CtlHandler specifies a
handler for control requests. For example, to pass up control
requests and ensure no residual data is lost when the Inum is
fused to an Iter, the inumConcat function given previously for
mkInum could be rewritten:

A simplified version of mkInum that passes all control requests
to enclosing enumerators. It requires a ResidHandler to describe
how to adjust residual data. (E.g., use pullupResid when tIn
and tOut are the same type.)

Pass all control requests through to the enclosing Iter monad.
The ResidHandler argument says how to adjust residual data. In
case some enclosing CtlHandler decides to flush pending input
data, it is advisable to un-translate any data in the output type
tOut back to the input type tIn.

Create a CtlHandler given a function of a particular control
argument type and a fallback CtlHandler to run if the argument
type does not match. consCtl is used to chain handlers, with the
rightmost handler being either noCtl or passCtl.

For example, to create a control handler that implements seek on
SeekC requests, returns the size of the file on SizeC
requests, and passes everything else out to the enclosing
enumerator (if any), you could use the following:
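
The chaining idea (try a handler for one argument type, otherwise fall back to the next) can be sketched independently of the library using Data.Dynamic from base. All names below are hypothetical stand-ins: SeekReq and SizeReq for SeekC and SizeC, consToy for consCtl, and noToy for noCtl. Real handlers run in a monad and return typed results, so this toy only shows the dispatch-by-type structure.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- Hypothetical request types.
newtype SeekReq = SeekReq Integer
data SizeReq = SizeReq

-- A toy handler answers a dynamically typed request, or declines.
type ToyHandler = Dynamic -> Maybe String

-- Rightmost element of a chain: declines every request.
noToy :: ToyHandler
noToy _ = Nothing

-- Handle requests of one particular type; pass anything else to the fallback.
consToy :: Typeable req => (req -> String) -> ToyHandler -> ToyHandler
consToy f fallback dyn = case fromDynamic dyn of
  Just req -> Just (f req)
  Nothing  -> fallback dyn

-- Right-associative so that handlers chain without parentheses.
infixr 9 `consToy`

handler :: ToyHandler
handler = (\(SeekReq off) -> "seek to " ++ show off)
          `consToy` (\SizeReq -> "size query")
          `consToy` noToy
```

With this definition, handler (toDyn (SeekReq 10)) gives Just "seek to 10", handler (toDyn SizeReq) gives Just "size query", and a request of any other type falls through to noToy and gives Nothing.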

Some basic Inums

inumNop passes all data through to the underlying Iter. It
acts as a no-op when fused to other Inums with |. or when fused
to Iters with .|.

inumNop is particularly useful for conditionally fusing Inums
together. Even though most Inums are polymorphic in the return
type, this library does not use the Rank2Types extension, which
means any given Inum must have a specific return type. Here is
an example of incorrect code:

inumNull feeds empty data to the underlying Iter. It pretty
much acts as a no-op when concatenated to other Inums with cat
or lcat.

There may be cases where inumNull is required to avoid deadlock.
In an expression such as enum |$ iter, if enum immediately
blocks waiting for some event, and iter immediately starts out
triggering that event before reading any input, then to break the
deadlock you can re-write the code as cat inumNull enum |$
iter.

Repeat an Inum until the input receives an EOF condition, the
Iter no longer requires input, or the Iter is in an unhandled
IterC state (which presumably will continue to be unhandled by
the same Inum, so there is no point in executing it again).

Enumerator construction monad

Complex Inums that need state and non-trivial control flow can be
constructed using the mkInumM function to produce an Inum out of a
computation in the InumM monad. The InumM monad implicitly keeps
track of the state of the Iter to which the Inum is feeding data,
which we call the "target" Iter.

InumM is an Iter monad, and so can consume input by invoking
ordinary Iter actions. However, to keep track of the state of the
target Iter, InumM wraps its inner monadic type with an
IterStateT transformer. Specifically, when creating an enumerator
of type Inum tIn tOut m a, the InumM action is of a type like
Iter tIn (IterStateT (InumState ...) m) (). That means that to
execute actions of type Iter tIn m a that are not polymorphic in
m, you have to transform them with the liftI function.

Output can be fed to the target Iter by means of the ifeed
function. As an example, here is another version of the inumConcat
function given previously for mkInum:

There are several points to note about this function. It reads data
in Chunks using chunkI, rather than just inputting data with
dataI. The choice of chunkI rather than dataI allows
inumConcat to see the eof flag and know when there is no more
input. chunkI also avoids throwing an IterEOF exception on end of
file, as dataI would. In contrast to mkInum, which gracefully
interprets IterEOF exceptions as an exit request, mkInumM by
default treats such exceptions as an Inum failure.

As previously mentioned, data is fed to the target Iter, which here
is of type Iter L.ByteString m a, using ifeed. ifeed returns
a Bool that is True when the Iter is no longer active. This
brings us to another point--there is no implicit looping or
repetition. We explicitly loop via a tail-recursive call to loop so
long as the eof flag is clear and ifeed returned False
indicating the target Iter has not finished.
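
The shape of such an explicit loop can be modeled in a few lines of plain IO. The sketch below is a toy, not InumM: the target's state lives in an IORef instead of monad state, ifeedToy and concatLoop are hypothetical names, and input is a list of chunks whose end stands in for the eof flag.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Toy iteratee (an assumption for illustration): finished with a result, or
-- waiting for input. `More step atEOF` carries the result at end of input.
data Iter t a = Done a | More (t -> Iter t a) a

isDone :: Iter t a -> Bool
isDone (Done _) = True
isDone _        = False

-- Toy analogue of ifeed: feed one item to the target held in the IORef and
-- return True once the target is no longer active.
ifeedToy :: IORef (Iter t a) -> t -> IO Bool
ifeedToy target x = do
  iter <- readIORef target
  case iter of
    Done _      -> return True
    More step _ -> do
      let next = step x
      writeIORef target next
      return (isDone next)

-- Explicit loop: stop at end of input or when the target has finished,
-- returning whatever chunks were not consumed.
concatLoop :: IORef (Iter String a) -> [[String]] -> IO [[String]]
concatLoop _      []             = return []
concatLoop target (chunk : rest) = do
  done <- ifeedToy target (concat chunk)
  if done then return rest else concatLoop target rest

-- Target that finishes once it has seen at least n characters.
takeChars :: Int -> Iter String String
takeChars n = go ""
  where
    go acc
      | length acc >= n = Done (take n acc)
      | otherwise       = More (\s -> go (acc ++ s)) acc

finish :: Iter t a -> a
finish (Done a)   = a
finish (More _ a) = a

demo :: IO (String, [[String]])
demo = do
  target   <- newIORef (takeChars 3)
  leftover <- concatLoop target [["a", "b"], ["cd"], ["ef"]]
  iter     <- readIORef target
  return (finish iter, leftover)
```

Running demo produces ("abc", [["ef"]]): the loop exits after the second chunk because ifeedToy reports that the target has finished, and nothing repeats unless the loop calls itself.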

What happens when eof or done is set? One possibility is to do
nothing. This is often correct. Falling off the end of the InumM
do-block causes the Inum to return the current state of the Iter.
However, it may be that the Inum has been fused to the target
Iter, in which case any left-over residual data fed to, but not
consumed by, the target Iter will be discarded. We may instead want
to put the data back onto the input stream. The ipopresid function
extracts any left-over data from the target Iter, while ungetI
places data back in the input stream. Since here the input stream is
a list of L.ByteStrings, we have to place resid in a list. (After
doing this, the list element boundaries may be different, but all the
input bytes will be there.) Note that the version of inumConcat
implemented with mkInum does not have this input-restoring feature.

The code above looks much clumsier than the version based on mkInum,
but several of these steps can be made implicit. There is an
AutoEOF flag, controllable with the setAutoEOF function, that
causes IterEOF exceptions to produce normal termination of the
Inum, rather than failure (just as mkInum handles such
exceptions). Another flag, AutoDone, is controllable with the
setAutoDone function and causes the Inum to exit immediately when
the underlying Iter is no longer active (i.e., the ifeed function
returns True). Both of these flags are set at once by the
mkInumAutoM function, which yields the following simpler
implementation of inumConcat:

withCleanup, demonstrated here, is a variant of addCleanup that
cleans up after a particular action, rather than at the end of the
Inum's whole execution. (At the outermost level, as used here,
withCleanup's effects are identical to addCleanup's.)

In addition to ifeed, the ipipe function invokes a different
Inum from within the InumM monad, piping its output directly to
the target Iter. As an example, consider an Inum that processes a
mail message and appends a signature line, implemented as follows:

The . between runI and enumFile is because Inums are
functions from Iters to IterRs; we want to apply runI to the
result of applying enumFile ".signature" to an Iter. Spelled
out, the type of enumFile is:

A monad in which to define the actions of an Inum tIn tOut m
a. Note InumM tIn tOut m a is a Monad of kind * -> *, where
a is the (almost always parametric) return type of the Inum. A
fifth type argument is required for monadic computations of kind
*, e.g.:

seven :: InumM tIn tOut m a Int
seven = return 7

Another important thing to note about the InumM monad, as
described in the documentation for mkInumM, is that you must call
lift twice to execute actions in monad m, and you must use
the liftI function to execute actions in monad Iter t m a.

Build an Inum out of an InumM computation. If you run
mkInumM inside the Iter tIn m monad (i.e., to create an
enumerator of type Inum tIn tOut m a), then the InumM
computation will be in a Monad of type Iter t tm where tm is
a transformed version of m. This has the following two
consequences:

If you wish to execute actions in monad m from within your
InumM computation, you will have to apply lift twice (as
in lift $ lift action_in_m) rather than just once.

If you need to execute actions in the Iter t m monad, you
will have to lift them with the liftI function.

The InumM computation you construct can feed output of type
tOut to the target Iter (which is implicitly contained in the
monad state), using the ifeed, ipipe, and irun functions.

Set the AutoEOF flag within an InumM computation. If this
flag is True, handle IterEOF exceptions like a normal but
immediate termination of the Inum. If this flag is False
(the default), then IterEOF exceptions must be manually caught or
they will terminate the thread.

Set the AutoDone flag within an InumM computation. When
True, the Inum will immediately terminate as soon as the
Iter it is feeding enters a non-active state (i.e., Done or a
failure state). If this flag is False (the default), the
InumM computation will need to monitor the results of the
ifeed, ipipe, and irun functions to ensure the Inum
terminates when one of these functions returns True.

Run an InumM with some cleanup action in effect. The cleanup
action specified will be executed when the main action returns,
whether normally, through an exception, because of the AutoDone
or AutoEOF flags, or because idone is invoked.

Note withCleanup also defines the scope of actions added by the
addCleanup function. In other words, given a call such as
withCleanup cleaner1 main, if main invokes addCleanup
cleaner2, then both cleaner1 and cleaner2 will be executed
upon main's return, even if the overall Inum has not finished
yet.
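
The scoping rule can be illustrated with a small IO model. This is only a sketch: withCleanupToy and addCleanupToy are hypothetical names, the pending cleanups live in an IORef rather than in the InumM state, and the order in which pending cleanups run is an assumption of the toy, not something the text above specifies.

```haskell
import Control.Exception (finally)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

-- Cleanup actions pending in the current scope.
type Cleanups = IORef [IO ()]

-- Register a cleanup in whatever scope is currently active.
addCleanupToy :: Cleanups -> IO () -> IO ()
addCleanupToy cleanups action = modifyIORef cleanups (action :)

-- Open a new scope containing `cleaner`, run the body, then run every
-- cleanup registered in that scope (even on exceptions) and restore the
-- enclosing scope.
withCleanupToy :: Cleanups -> IO () -> IO a -> IO a
withCleanupToy cleanups cleaner body = do
  outer <- readIORef cleanups
  writeIORef cleanups [cleaner]
  body `finally` do
    pending <- readIORef cleanups
    sequence_ pending
    writeIORef cleanups outer

demo :: IO [String]
demo = do
  logRef   <- newIORef []
  cleanups <- newIORef []
  let say msg = modifyIORef logRef (msg :)
  withCleanupToy cleanups (say "cleaner1") $ do
    addCleanupToy cleanups (say "cleaner2")
    say "main body"
  say "rest of the Inum"
  reverse <$> readIORef logRef
```

In this model demo returns ["main body", "cleaner2", "cleaner1", "rest of the Inum"]: both cleanups run when the scoped action returns, before the surrounding computation continues.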

Used from within the InumM monad to feed data to the target
Iter. Returns False if the target Iter is still active and
True if the Iter has finished and the Inum should also
return. (If the AutoDone flag is True, then ifeed,
ipipe, and irun will never actually return True, but
instead just immediately run cleanup functions and exit the
Inum when the target Iter stops being active.)

A variant of ifeed that throws an exception of type IterEOF
if the data being fed is null. Convenient when reading input
with a function (such as Data.ListLike's hget) that returns 0
bytes instead of throwing an EOF exception to indicate end of file.
For instance, the main loop of enumFile could be implemented
as:

Note that the applied Inum must handle all control requests. (In
other words, ones it passes on are not caught by whatever handler
is installed by setCtlHandler, but if the Inum returns the
IterR in the IterC state, as inumPure does, then requests
will be handled.)

Repeats an action until the Iter is done or an EOF error is
thrown. (Also stops if a different kind of exception is thrown, in
which case the exception propagates further and may cause the
Inum to fail.) irepeat sets both the AutoEOF and
AutoDone flags to True.

If the target Iter being fed by the Inum is no longer active
(i.e., if it is in the Done state or in an error state), this
function pops the residual data out of the Iter and returns it.
If the target is in any other state, returns mempty.

Immediately perform a successful exit from an InumM monad,
terminating the Inum and returning the current state of the
target Iter. Can be used to end an irepeat loop. (Use
throwI ... for an unsuccessful exit.)