Module documentation for 0.3.0

Streamly

Streaming Concurrently

Haskell lists express pure computations using composable stream operations like
:, unfold, map, filter, zip and fold. Streamly is exactly like
lists except that it can express sequences of pure as well as monadic
computations aka streams. More importantly, it can express monadic sequences
with concurrent execution semantics without introducing any additional APIs.

Streamly expresses concurrency using standard, well known abstractions.
Concurrency semantics are defined for list operations, semigroup, applicative
and monadic compositions. The programmer does not need to know any low level
notions of concurrency like threads, locking or synchronization. Concurrent
and non-concurrent programs are fundamentally the same. A chosen segment of
the program can be made concurrent by annotating it with an appropriate
combinator. We can choose a combinator for lookahead style or asynchronous
concurrency. Concurrency is automatically scaled up or down based on the
demand from the consumer application, so we can finally say goodbye to managing
thread pools and associated sizing issues. The result is truly fearless
and declarative monadic concurrency.

Where to use streamly?

Streamly is a general purpose programming framework. It can be used equally
efficiently for anything from a simple Hello World! program to a massively
concurrent application. The answer to the question “Where to use streamly?”
would be similar to the answer to “Where to use Haskell lists or the IO monad?”.
Streamly generalizes lists to monadic streams, and the IO monad to
non-deterministic and concurrent stream composition. The IO monad is a
special case of streamly; if we use single element streams the behavior of
streamly becomes identical to the IO monad. The IO monad code can be replaced
with streamly by just prefixing the IO actions with liftIO, without any other
changes, and without any loss of performance. Pure lists too are a special
case of streamly; if we use Identity as the underlying monad, streamly
streams turn into pure lists. Non-concurrent programs are just a special case
of concurrent ones; simply adding a combinator turns a non-concurrent program
into a concurrent one.

In other words, streamly combines the functionality of lists and IO, with
builtin concurrency. If you want to write a program that involves IO,
concurrent or not, then you can just use streamly as the base monad. In fact,
you could even use streamly for pure computations, as streamly performs on par
with pure lists or vector.

Why data flow programming?

If you need some convincing to use the streaming or data flow programming
paradigm itself, then try to answer this question: why do we use lists in
Haskell? It boils down to why we use functional programming in the first place.
Haskell is successful in enforcing the functional data flow paradigm for pure
computations using lists, but not for monadic computations. In the absence of a
standard and easy to use data flow programming paradigm for monadic
computations, and the IO monad providing an escape hatch to an imperative
model, we just love to fall into the imperative trap, and start asking the same
fundamental question again - why do we have to use the streaming data model?

Isn’t that magical? What’s going on here? Streamly does not introduce any new
abstractions, it just uses standard abstractions like Semigroup or
Monoid to combine monadic streams concurrently, the way lists combine a
sequence of pure values non-concurrently. The foldMap in the code
above turns into a concurrent monoidal composition of a stream of readdir
computations.

How does it perform?

Providing monadic streaming and high level declarative concurrency does not
mean that streamly compromises on performance in any way. The
non-concurrent performance of streamly competes with lists and the vector
library. The concurrent performance is as good as it gets; see concurrency
benchmarks for detailed performance results and a comparison with the async
package.

The following chart shows a summary of the cost of key streaming operations
processing a million elements. The timings for streamly and vector are in
the 600-700 microseconds range and therefore can barely be seen in the graph.
For more details, see streaming benchmarks.

Streaming Pipelines

The following snippet provides a simple stream composition example that reads
numbers from stdin, prints the squares of even numbers and exits if an even
number more than 9 is entered.
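A minimal sketch of such a pipeline, assuming the streamly 0.5 API: `runStream` from `Streamly`, the combinators from `Streamly.Prelude` imported qualified as `S`, and `&` from `Data.Function`. The `Int` annotation on `read` is an illustrative choice, not something the text prescribes.

```haskell
import Data.Function ((&))
import Streamly (runStream)
import qualified Streamly.Prelude as S

main :: IO ()
main = runStream $
       S.repeatM getLine                 -- read lines from stdin, forever
     & fmap (read :: String -> Int)      -- parse each line as a number
     & S.filter even                     -- keep only the even numbers
     & S.takeWhile (<= 9)                -- stop at the first even number above 9
     & fmap (\x -> x * x)                -- square what is left
     & S.mapM print                      -- print each square
```

Each stage is an ordinary function from stream to stream, so the stages are joined with plain reverse function application.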

Unlike pipes or conduit and like vector and streaming, streamly
composes stream data instead of stream processors (functions). A stream is
just like a list and is explicitly passed around to functions that process the
stream. Therefore, no special operator is needed to join stages in a streaming
pipeline, just the standard function application ($) or reverse function
application (&) operator is enough. Combinators are provided in
Streamly.Prelude to transform or fold streams.

Concurrent Streaming Pipelines

Use |& or |$ to apply stream processing functions concurrently. The
following example prints a “hello” every second; if you use & instead of
|& you will see that the delay doubles to 2 seconds instead because of serial
application.
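A minimal sketch of this example, assuming `|&` is exported by `Streamly` and `S.repeatM` and `S.mapM` by `Streamly.Prelude`. The value `1` carried through the stream is a placeholder; only the delays matter.

```haskell
import Control.Concurrent (threadDelay)
import Streamly (runStream, (|&))
import qualified Streamly.Prelude as S

main :: IO ()
main = runStream $
      -- producer stage: yields one element after a one second delay
      S.repeatM (threadDelay 1000000 >> return (1 :: Int))
      -- consumer stage: applied concurrently, it also takes one second
   |& S.mapM (\_ -> threadDelay 1000000 >> putStrLn "hello")
```

Replacing `|&` with `&` from `Data.Function` makes the two stages run one after the other for every element.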

Mapping Concurrently

Serial and Concurrent Merging

Semigroup and Monoid instances can be used to fold streams serially or
concurrently. In the following example we compose ten actions in the
stream, each with a delay of 1 to 10 seconds, respectively. Since all the
actions are concurrent we see one output printed every second:
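A minimal sketch of such a fold, assuming the `Monoid` instance of the parallel stream type selected by `parallely`, and `S.once` (listed in the changelog below) to lift an IO action into a singleton stream. The local name `delay` is an illustrative helper.

```haskell
import Control.Concurrent (threadDelay)
import Streamly (runStream, parallely)
import qualified Streamly.Prelude as S

main :: IO ()
main = runStream $ parallely $ foldMap delay [1 .. 10]
  where
    -- a singleton stream that sleeps for n seconds and then prints n
    delay n = S.once $ threadDelay (n * 1000000) >> print n
```

`foldMap` combines the ten singleton streams with `<>`, and the stream type chosen by the combinator decides whether that composition is serial or concurrent.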

Streams can be combined together in many ways. We provide some examples
below, see the tutorial for more ways. We use the following delay
function in the examples to demonstrate the concurrency aspects:
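A minimal sketch of such a helper, assuming `S.once` from `Streamly.Prelude` (listed in the changelog below) and the `IsStream` class exported by `Streamly`. The `main` shown here is only an illustrative serial use of the helper.

```haskell
import Control.Concurrent (myThreadId, threadDelay)
import Streamly
import qualified Streamly.Prelude as S

-- A singleton stream that sleeps for n seconds, then reports the thread
-- it ran on, which makes concurrent execution visible in the output.
delay :: IsStream t => Int -> t IO ()
delay n = S.once $ do
    threadDelay (n * 1000000)
    tid <- myThreadId
    putStrLn (show tid ++ ": Delay " ++ show n)

main :: IO ()
main = runStream $ delay 3 <> delay 2 <> delay 1
```

With the default serial stream type, each `delay` starts only after the previous one has finished, so the lines are printed in the order 3, 2, 1.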

Concurrent Nested Loops
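A minimal sketch of a nested loop written in the stream monad, assuming `S.fromFoldable` and `S.once` from `Streamly.Prelude`. The name `loops` matches the one used in the commands that follow, while the element values are illustrative.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Streamly
import qualified Streamly.Prelude as S

-- Kept polymorphic in the stream type so that the same definition can be
-- run under any of the type combinators.
loops :: (IsStream t, Monad (t IO)) => t IO ()
loops = do
    x <- S.fromFoldable [1, 2 :: Int]   -- outer loop
    y <- S.fromFoldable [3, 4 :: Int]   -- inner loop
    S.once $ putStrLn $ show (x, y)

-- Without a type combinator the stream is serial.
main :: IO ()
main = runStream loops
```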

To run the above code with lookahead style concurrency, i.e. each iteration in
the loop can run concurrently but the results are presented in the same
order as serial execution:

main = runStream $ aheadly $ loops

To run it with depth first concurrency yielding results asynchronously in the
same order as they become available (deep async composition):

main = runStream $ asyncly $ loops

To run it with breadth first concurrency and yielding results asynchronously
(wide async composition):

main = runStream $ wAsyncly $ loops

The above streams provide lazy/demand-driven concurrency which is automatically
scaled as per demand and is controlled/bounded so that it can be used on
infinite streams. The following combinator provides strict, unbounded
concurrency irrespective of demand:

main = runStream $ parallely $ loops

To run it serially but interleaving the outer and inner loop iterations
(breadth first serial):

main = runStream $ wSerially $ loops

Magical Concurrency

Streams can perform semigroup (<>) and monadic bind (>>=) operations
concurrently using combinators like asyncly, parallely. For example,
to concurrently generate squares of a stream of numbers and then concurrently
sum the square roots of all combinations of two streams:
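A minimal sketch of this computation, assuming `S.sum` from `Streamly.Prelude` and the `Monoid` and `Monad` instances of the stream type selected by `asyncly`. The range `[1 .. 100]` is an illustrative choice.

```haskell
import Streamly
import qualified Streamly.Prelude as S

main :: IO ()
main = do
    s <- S.sum $ asyncly $ do
        -- foldMap combines the singleton streams with (<>), which is
        -- concurrent for this stream type
        x2 <- foldMap (\x -> return $ x * x) [1 .. 100]
        y2 <- foldMap (\y -> return $ y * y) [1 .. 100]
        -- the monadic bind pairs every x2 with every y2
        return $ sqrt (x2 + y2)
    print (s :: Double)
```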

For some practical uses of rate control, see AcidRain.hs and
CirclingSquare.hs.
Concurrency of the stream is automatically controlled to match the specified
rate. Rate control works precisely even at throughputs as high as millions of
yields per second. For more sophisticated rate control see the haddock
documentation.
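A minimal sketch of rate control, assuming that `avgRate` (one of the rate limiting combinators listed in the changelog below) is exported by `Streamly` and takes the target rate in yields per second as a `Double`. The "tick" action is illustrative.

```haskell
import Streamly
import qualified Streamly.Prelude as S

main :: IO ()
main =
    runStream
        $ asyncly
        $ avgRate 1                      -- aim for one yield per second
        $ S.repeatM (putStrLn "tick")    -- an otherwise unthrottled stream
```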

Exceptions

From a library user's point of view, there is not much to learn or say about
exceptions. Synchronous exceptions work just the way they are supposed to work
in any standard non-concurrent code. When concurrent streams are combined
together, exceptions from the constituent streams are propagated to the
consumer stream. When an exception occurs in any of the constituent streams
other concurrent streams are promptly terminated. Exceptions can be thrown
using the MonadThrow instance.

There is no notion of explicit threads in streamly, therefore, no
asynchronous exceptions to deal with. You can just ignore the zillions of
blogs, talks, caveats about async exceptions. Async exceptions just don’t
exist. Please don’t use things like myThreadId and throwTo just for fun!

Reactive Programming (FRP)

Streamly is a foundation for first class reactive programming as well by virtue
of integrating concurrency and streaming. See AcidRain.hs for a console based
FRP game example and CirclingSquare.hs for an SDL based animation example.

Conclusion

Streamly, short for streaming concurrently, provides monadic streams, with a
simple API, almost identical to standard lists, and an in-built
support for concurrency. By using stream-style combinators on stream
composition, streams can be generated, merged, chained, mapped, zipped, and
consumed concurrently – providing a generalized high level programming
framework unifying streaming and concurrency. Controlled concurrency allows
even infinite streams to be evaluated concurrently. Concurrency is auto scaled
based on feedback from the stream consumer. The programmer does not have to be
aware of threads, locking or synchronization to write scalable concurrent
programs.

Streamly is a programmer first library, designed to be useful and friendly to
programmers for solving practical problems in a simple and concise manner. Some
key points in favor of streamly are:

Simplicity: Simple list like streaming API. If you know how to use lists
then you know how to use streamly. This library is built with simplicity
and ease of use as a design goal.

Concurrency: Simple, powerful, and scalable concurrency. Concurrency is
built-in and not intrusive; concurrent programs are written exactly the
same way as non-concurrent ones.

Performance: Streamly is designed for high performance. It employs stream
fusion optimizations for best possible performance. Serial performance is
equivalent to the venerable vector library in most cases and even better
in some cases. Concurrent performance is unbeatable. See streaming-benchmarks
for a comparison of popular streaming libraries on micro-benchmarks.

The basic streaming functionality of streamly is equivalent to that provided by
streaming libraries like vector, streaming, pipes, and conduit. In addition to
providing streaming functionality, streamly subsumes the functionality of list
transformer libraries like pipes or list-t, and also the logic programming
library logict. On the concurrency side, it subsumes the functionality of the
async package, and provides even higher level concurrent composition. Because
it supports streaming with concurrency we can write FRP applications similar in
concept to Yampa or reflex.

See the Comparison with existing packages section at the end of the
tutorial.

This library was originally inspired by the transient package authored by
Alberto G. Corona.

Changes

0.5.2

Bug Fixes

Clean up any pending threads when an exception occurs.

Fixed a livelock in ahead style streams. The problem manifests sometimes when
multiple streams are merged together in ahead style and one of them is a nil
stream.

As per expected concurrency semantics each forked concurrent task must run
with the monadic state captured at the fork point. This release fixes a bug,
which, in some cases caused an incorrect monadic state to be used for a
concurrent action, leading to unexpected behavior when concurrent streams are
used in a stateful monad e.g. StateT. In particular, this bug cannot affect
ReaderT.

0.5.1

0.5.0

Bug Fixes

Leftover threads are now cleaned up as soon as the consumer is garbage
collected.

Fix a bug in concurrent function application that in certain cases would
unnecessarily share the concurrency state resulting in incorrect output
stream.

Fix passing of state across parallel, async, wAsync, ahead, serial,
wSerial combinators. Without this fix combinators that rely on state
passing e.g. maxThreads and maxBuffer won’t work across these
combinators.

Enhancements

Added rate limiting combinators rate, avgRate, minRate, maxRate and
constRate to control the yield rate of a stream.

Add fromList and fromListM to generate streams from lists, faster than
fromFoldable and fromFoldableM

Add map as a synonym of fmap

Add scanlM', the monadic version of scanl'

Add takeWhileM and dropWhileM

Add filterM

0.3.0

Breaking changes

Some prelude functions, to which concurrency capability has been added, will
now require a MonadAsync constraint.

Bug Fixes

Fixed a race due to which, in a rare case, we might block indefinitely on
an MVar due to a lost wakeup.

Fixed an issue in adaptive concurrency. The issue caused us to stop creating
more worker threads in some cases due to a race. This bug would not cause any
functional issue but may reduce concurrency in some cases.

Enhancements

Added a concurrent lookahead stream type Ahead

Added fromFoldableM API that creates a stream from a container of monadic
actions

0.2.1

Bug Fixes

Fixed a bug that caused some transformation ops to return incorrect results
when used with concurrent streams. The affected ops are take, filter,
takeWhile, drop, dropWhile, and reverse.

0.2.0

Breaking changes

Changed the semantics of the Semigroup instance for InterleavedT, AsyncT
and ParallelT. The new semantics are as follows:

For InterleavedT, <> operation interleaves two streams

For AsyncT, <> now concurrently merges two streams in a left biased
manner using demand based concurrency.

For ParallelT, the <> operation now concurrently merges the two streams
in a fairly parallel manner.

To adapt to the new changes, replace <> with serial wherever it is used
for stream types other than StreamT.

Remove the Alternative instance. To adapt to this change replace any usage
of <|> with parallel and empty with nil.

Stream type now defaults to the SerialT type unless explicitly specified
using a type combinator or a monomorphic type. This change reduces puzzling
type errors for beginners. It includes the following two changes:

Change the type of all stream elimination functions to use SerialT
instead of a polymorphic type. This makes sure that the stream type is
always fixed at all exits.

Change the type combinators (e.g. parallely) to only fix the argument
stream type and the output stream type remains polymorphic.

Stream types may have to be changed or type combinators may have to be added
or removed to adapt to this change.

Change the type of foldrM to make it consistent with foldrM in base.

async is renamed to mkAsync and async is now a new API with a different
meaning.

ZipAsync is renamed to ZipAsyncM and ZipAsync is now ZipAsyncM
specialized to the IO Monad.

Remove the MonadError instance as it was not working correctly for
parallel compositions. Use MonadThrow instead for error propagation.

Remove Num/Fractional/Floating instances as they are not very useful. Use
fmap and liftA2 instead.

Deprecations

Deprecate and rename the following symbols:

Streaming to IsStream

runStreaming to runStream

StreamT to SerialT

InterleavedT to WSerialT

ZipStream to ZipSerialM

ZipAsync to ZipAsyncM

interleaving to wSerially

zipping to zipSerially

zippingAsync to zipAsyncly

<=> to wSerial

<| to async

each to fromFoldable

scan to scanx

foldl to foldx

foldlM to foldxM

Deprecate the following symbols for future removal:

runStreamT

runInterleavedT

runAsyncT

runParallelT

runZipStream

runZipAsync

Enhancements

Add the following functions:

consM and |: operator to construct streams from monadic actions

once to create a singleton stream from a monadic action

repeatM to construct a stream by repeating a monadic action

scanl' strict left scan

foldl' strict left fold

foldlM' strict left fold with a monadic fold function

serial run two streams serially one after the other

async run two streams asynchronously

parallel run two streams in parallel (replaces <|>)

WAsyncT stream type for BFS version of AsyncT composition

Add simpler stream types that are specialized to the IO monad

Put a bound (1500) on the output buffer used for asynchronous tasks

Put a limit (1500) on the number of threads used for Async and WAsync types

0.1.2

Enhancements

Add iterate, iterateM stream operations

Bug Fixes

Fixed a bug that caused unexpected behavior when pure was used to inject
values in Applicative composition of ZipStream and ZipAsync types.