1 General libraries and tools for data structures

This is the latest version of the Edison library of efficient data structures. There are also earlier version of Edison by Chris Okasaki. It provides sequences, finite maps, priority queues, and sets/bags. (overview paper).

A ranged set is a list of non-overlapping ranges. The ranges have upper and lower boundaries, and a boundary divides the base type into values above and below. No value can ever sit on a boundary. So you can have the set {2.0 < x <= 3.0, 5.3 < x < 6}

The Partial library provides a partial order class. It also provides routines for generating a Hasse diagram from a set and a partial order. Renderers are provided for the abstract Hasse diagram representation into LaTeX (via Xy-pic) and into dot, the format for AT&T's Graphviz tools. Since no horizontal sorting is done, the Xy-pic output is rather poor at present; dot does a much better job with its layout optimisation algorithm.

A heterogeneous collection is a datatype that is capable of storing data of different types, while providing operations for look-up, update, iteration, and others. There are various kinds of heterogeneous collections, differing in representation, invariants, and access operations.

Iavor Diatchki's library of monad transformers for Haskell. It enables the quick construction of monads --- abstract data types that capture common programming idioms, such as throwing and catching exceptions or continuations. In many programming languages such features are built into the language (if they're provided at all), but in Haskell they are user-programmable.

Pointless Haskell is library for point-free programming with recursion patterns defined as hylomorphisms. It also allows the visualization of the intermediate data structure of the hylomorphisms with GHood. This feature together with the DrHylo tool allows us to easily visualize recursion trees of Haskell functions.

Stefan Wehr's reactive objects library. Reactive objects are a convenient abstraction for writing programs which have to interact with a concurrent environment. A reactive object has two characteristics: the abandonment of all blocking operations and the unification of the concepts state and process. The former allows a reactive object to accept input from multiple sources without imposing any ordering on the input events. The latter prevents race conditions because the state of an object is only manipulated from the process belonging to the object.

A collection of random useful utility functions written in pure Haskell 98. In general, it tries to conform to the naming scheme put forth in the Haskell prelude and fill in the obvious omissions, as well as providing useful routines in general.

The persistent document abstraction takes care of dealing with a document you want to open from and save to disk and that supports undo. This functionality can be used by editors of arbitrary documents and saves you a lot of quite subtle coding.

2 Graphs

Part of the UMinho Haskell libraries, this library provides a representation and operations on relations. A special case of relations are graphs. The operations include graph chopping and slicing, strong connected component analysis, graphs metrics, and more.

3 IO

Streams is a feature-rich, flexible, extensible, backward-compatible and fast I/O library. It supports various stream types: files and legacy Handle type, string and memory buffers, pipes. There is also common functionality available for any stream: buffering, Char encoding, locking.

4 Mutable data

This is a short preprocessor for stateful Haskell programs. It aleviates the pain of performing single array lookup/write/update functions with some syntax to support it. It also supports hash table operations based on the HashTable implementation available from the author. Finally, it supports monadic if and monadic case constructions. It is lightweight in the sense that it performs no syntactic analysis and is essentially a character transformer.

5 Lists

6 Strings

The FPS library provides mmapped and malloc'd packed strings (byte arrays held by a ForeignPtr), along with a list interface to these strings. It lets you do extremely fast I/O in Haskell; in some cases, even faster than typical C implementations, as well as conserving space.

A well-known tree data structure for manipulating strings without (too many) memory copies, contrarily to what Data.ByteString does. Also, contrarily to the functions in Data.ByteString.Lazy, most operations on ropes are in O(log n).

This is very much the same like Data.ByteString extended to any Storable element type. There is a data structure for monolithic blocks and chunky sequences. Mutable access is also possible in ST monad.

PCRE-based regexes. This requires libpcre (currently works against version 6.6.0 as of January 2007). This Haskell library and libpcre are both BSD. This is very fast but has left branch semantics instead of POSIX leftmost longest semantics.

TRE-based regexes. This requires libtre (currently works against version 0.7.5 as of January 2007). Libtre aims to be POSIX compatible and is a much faster implementation that used by regex-posix. Note that libtre does have an outstanding bug in correctly interpreting some regular expressions. This Haskell package is BSD-3 licensed and libtre is LGPL.

Pure Haskell regexes implemented via parsec. The leftmost longest subexpression capture semantics of this library differ from the POSIX standard. This also has an option to prefer the left branch semantics that libpcre uses.

Pure Haskell DFA for regexes. This is licensed under LPGL (the above packages are all under BSD-3) due to derived code. This is faster than regex-parsec but does not return captured subexpressions. It also is being updated to handle repeated patterns that can match empty strings (the older version hangs, the new version rewrites the pattern internally).

There is a regex-tdfa packages in development (not released) that will be a pure Haskell and BSD-3 licensed implementation inspired by libtre. It aims to be a replacement for regex-posix.

Development versions (possible unstable or broken) of the above packages are also available under regex-unstable. For support, please use the haskell-cafe mailing list.

SerTH is a binary serialization library for Haskell. It supports serializing cyclic datatypes in a fast binary format. SerTH uses template haskell for deriving the serializing interface for new datatypes.

AltBinary is an exhaustive library that support binary I/O and serialization. It's part of Streams library, so serialization is possible to any I/O source, from String to memory-mapped file. It's also backward compatible with NewBinary library what makes translation of old code trivial. Very fast, very feature-rich, Hugs/GHC compatible, etc, etc...

YAML is a straightforward machine parsable data serialization format designed for human readability and interaction with dynamic languages. It is optimized for data serialization, configuration settings, log files, Internet messaging and filtering. Syck is an extension, written in C, for reading and writing YAML swiftly in popular scripting languages. It is part of core Ruby, and also has bindings for Perl 5, Python, Lua, Cocoa, and Perl 6. HsSyck provides Data.Yaml.Syck as an interface to YAML structures, using Data.ByteString for efficient textual data representation. Additionally, we provide a set of DrIFT rules to dump and load arbitrary Haskell data types in the YAML format.

GenericSerialize is a library which serializes data using the "Scrap Your Boilerplate" infrastructure. This means that while it cannot easily support data-structure-specific serialization, it can support many different data formats cheaply.

10 Benchmarking data structures

Auburn is Graeme Moss's kit for benchmarking implementations of lazy data structures. Give it several implementations of an ADT (abstract data type) and it will tell you which one is best for your particular application.

11 Generic traversals

The Recursion Library for Haskell provides a rich set of generic traversal strategies to facilitate the flexible specification of generic term traversals. The underlying mechanism is the Scrap Your Boilerplate (SYB) approach. Most of the strategies that are used to implement recursion operators are taken from Stratego.

12 Typesafe variants of

IntSet

and

IntMap

for enumerations

ChrisKuklewicz wrote a bit of boilerplate to re-use IntSet and IntMap with any Enum type: EnumSet EnumMap

an efficient implementation of EnumSet for sets which fit into a machine word can be found in Edison (see above)