Latest revision as of 19:28, 25 April 2013

Vector is a Haskell library for working with arrays. It has an emphasis on very high performance through loop fusion, whilst retaining a rich interface. The main data types are boxed and unboxed arrays, and arrays may be immutable (pure), or mutable. Arrays may hold Storable elements, suitable for passing to and from C, and you can convert between the array types. Arrays are indexed by non-negative Int values.

The vector library has an API similar to the famous Haskell list library, with many of the same names.

-- The empty vector
Prelude Data.Vector> empty
fromList [] :: Data.Vector.Vector
-- A vector of length one
Prelude Data.Vector> singleton 2
fromList [2] :: Data.Vector.Vector
-- A vector of length 10, filled with the value '2'
-- Note that to disambiguate names,
-- and avoid a clash with the Prelude,
-- we use the full path to the Vector module
Prelude Data.Vector> Data.Vector.replicate 10 2
fromList [2,2,2,2,2,2,2,2,2,2] :: Data.Vector.Vector

In general, you may construct new vectors by applying a function to the index space:
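As a minimal sketch of building a vector from its index space, the block below uses `generate` from Data.Vector, which takes a length and a function from index to element. The helper name `squares` is chosen only for illustration.

```haskell
import qualified Data.Vector as V

-- Element i of the result is i * i, for i in [0 .. n-1].
squares :: Int -> V.Vector Int
squares n = V.generate n (\i -> i * i)

main :: IO ()
main = print (V.toList (squares 5))  -- prints [0,1,4,9,16]
```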

The vector package provides several array types, with an identical interface. They have different flexibility with respect to the types of values that may be stored in them, and different performance characteristics.

In general:

End users should use Data.Vector.Unboxed for most cases

If you need to store more complex structures, use Data.Vector

If you need to pass to C, use Data.Vector.Storable

For library writers:

Use the generic interface, to ensure your library is maximally flexible: Data.Vector.Generic
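A small sketch of what writing against the generic interface might look like: the hypothetical function `mean` below is written once against Data.Vector.Generic and then used with both a boxed and an unboxed vector. `convert` is used to move between the array types.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import qualified Data.Vector as V
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U

-- Works for any vector type that can hold Doubles.
-- Assumes a non-empty input; an empty vector yields NaN.
mean :: G.Vector v Double => v Double -> Double
mean xs = G.sum xs / fromIntegral (G.length xs)

main :: IO ()
main = do
  let boxed   = V.fromList [1, 2, 3, 4] :: V.Vector Double
      unboxed = G.convert boxed         :: U.Vector Double
  print (mean boxed)    -- prints 2.5
  print (mean unboxed)  -- prints 2.5
```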

Simple, atomic types, and pair types can be stored in a more efficient manner: consecutive memory slots without pointers. The Data.Vector.Unboxed.Vector type provides unboxed arrays of types that are members of the Unbox class, including:

Bool

()

Char

Double

Float

Int

Int8, Int16, Int32, Int64

Word

Word8, Word16, Word32, Word64

Complex a, where a is in Unbox

Tuple types, where the elements are unboxable

Unboxed arrays should be preferred when you have unboxable elements, as they are generally more efficient.
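For illustration, here is a short sketch of an unboxed vector whose elements are pairs. The name `points` and its contents are made up for the example.

```haskell
import qualified Data.Vector.Unboxed as U

-- Pairs of unboxable types are themselves unboxable.
points :: U.Vector (Int, Double)
points = U.fromList [(1, 0.5), (2, 1.5), (3, 2.5)]

main :: IO ()
main = do
  -- Sum the second components.
  print (U.sum (U.map snd points))  -- prints 4.5
  -- Split into one vector of Ints and one of Doubles.
  let (ids, vals) = U.unzip points
  print (U.toList ids)   -- prints [1,2,3]
  print (U.toList vals)  -- prints [0.5,1.5,2.5]
```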

Storable arrays (Data.Vector.Storable) are pinned, and may be converted to and from pointers that can be passed to C functions, using a number of functions:

unsafeFromForeignPtr
  :: Storable a
  => ForeignPtr a
  -> Int
  -> Int
  -> Vector a
-- Create a vector from a ForeignPtr with an offset and a length. The data may
-- not be modified through the ForeignPtr afterwards.

unsafeToForeignPtr
  :: Storable a
  => Vector a
  -> (ForeignPtr a, Int, Int)
-- Yield the underlying ForeignPtr together with the offset to the data and its
-- length. The data may not be modified through the ForeignPtr.

unsafeWith
  :: Storable a
  => Vector a
  -> (Ptr a -> IO b)
  -> IO b
-- Pass a pointer to the vector's data to the IO action. The data may not be
-- modified through the Ptr.
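The following sketch shows how `unsafeWith` might be used. To stay self-contained it reads the elements back through the pointer in Haskell; in real code the pointer would typically be handed to a foreign function instead. The helper `sumViaPtr` is hypothetical.

```haskell
import qualified Data.Vector.Storable as S
import Foreign.C.Types (CDouble)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff)

-- Sum the elements by reading them through the raw pointer.
-- The pointer is only read, never written, as unsafeWith requires.
sumViaPtr :: S.Vector CDouble -> IO CDouble
sumViaPtr v = S.unsafeWith v (\p -> go p 0 0)
  where
    n = S.length v
    go :: Ptr CDouble -> Int -> CDouble -> IO CDouble
    go p i acc
      | i >= n    = return acc
      | otherwise = do
          x <- peekElemOff p i
          go p (i + 1) (acc + x)

main :: IO ()
main = sumViaPtr (S.fromList [1, 2, 3]) >>= print
```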

Arrays can be created and operated on in a mutable fashion -- using destructive updates, as in an imperative language. Once all operations are complete, the mutable array can be "frozen" to a pure array, which changes its type.

Mutable arrays plus freezing are quite useful for initializing arrays from data in the outside world.
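A minimal sketch of the allocate, write, freeze pattern, here run inside the ST monad so that the result is an ordinary pure vector. The function name `squaresST` is an illustrative choice.

```haskell
import Control.Monad.ST (runST)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as M

-- Allocate a mutable array, fill it with destructive writes,
-- then freeze it into an immutable vector.
squaresST :: Int -> U.Vector Int
squaresST n = runST $ do
  mv <- M.new n
  mapM_ (\i -> M.write mv i (i * i)) [0 .. n - 1]
  U.freeze mv

main :: IO ()
main = print (U.toList (squaresST 5))  -- prints [0,1,4,9,16]
```

`freeze` copies the array; `unsafeFreeze` avoids the copy but requires that the mutable array is not used afterwards.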

The most common way to generate a vector is via an enumeration function:

enumFromN

enumFromStepN

And the list-like:

enumFromTo

enumFromThenTo

The enumFrom*N functions are guaranteed to optimize well for any type. The enumFromTo functions might fall back to generating from lists if there is no specialization for your type. They are currently specialized to most Int/Word/Double/Float generators.
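The four enumeration functions can be compared side by side, as in this short sketch using Data.Vector.Unboxed:

```haskell
import qualified Data.Vector.Unboxed as U

main :: IO ()
main = do
  -- Start value and element count.
  print (U.toList (U.enumFromN (1 :: Int) 5))             -- prints [1,2,3,4,5]
  -- Start value, step, and element count.
  print (U.toList (U.enumFromStepN (0 :: Double) 0.5 4))  -- prints [0.0,0.5,1.0,1.5]
  -- List-like: [1 .. 5]
  print (U.toList (U.enumFromTo (1 :: Int) 5))            -- prints [1,2,3,4,5]
  -- List-like: [1, 3 .. 9]
  print (U.toList (U.enumFromThenTo (1 :: Int) 3 9))      -- prints [1,3,5,7,9]
```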

We often want to populate a vector using an external data file. The easiest way to do this is with bytestring IO, and Data.Vector.unfoldr (or the equivalent functions in Data.Vector.Unboxed or Data.Vector.Storable):
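As a sketch of this approach, the block below parses whitespace-separated integers from a strict ByteString into an unboxed vector with `unfoldr`. The input format and the name `parseInts` are assumptions for the example, and the input is built in memory so the block runs on its own; for a real file it would come from `B.readFile`.

```haskell
import Data.Char (isSpace)
import qualified Data.ByteString.Char8 as B
import qualified Data.Vector.Unboxed as U

-- Repeatedly skip whitespace and read one Int until parsing fails.
parseInts :: B.ByteString -> U.Vector Int
parseInts = U.unfoldr step
  where
    step s = B.readInt (B.dropWhile isSpace s)

main :: IO ()
main = do
  let input = B.pack "10 20 30\n"  -- stand-in for file contents
  print (U.toList (parseInts input))  -- prints [10,20,30]
```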

We might want to fill a vector with a monadic action, and have a pure vector at the end. The Vector API now contains a standard replicateM for this purpose, but if your monadic action is in IO, the following code is more efficient:
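One way such an IO-specific fill might look is sketched below: allocate a mutable vector, run the action once per slot, then freeze without copying. The name `replicateMIO` and the use of unboxed vectors are assumptions of this sketch, and it assumes `n` is non-negative.

```haskell
import Data.IORef (atomicModifyIORef', newIORef)
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as M

-- Run the action n times, writing each result directly into the array.
replicateMIO :: U.Unbox a => Int -> IO a -> IO (U.Vector a)
replicateMIO n action = do
  mv <- M.new n
  let fill i
        | i < n = do
            x <- action
            M.unsafeWrite mv i x  -- i is in [0, n), so no bounds check is needed
            fill (i + 1)
        | otherwise = return ()
  fill 0
  U.unsafeFreeze mv  -- safe here: mv is never touched again

main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  v <- replicateMIO 5 (atomicModifyIORef' counter (\k -> (k + 1, k)))
  print (U.toList v)  -- prints [0,1,2,3,4]
```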

For performance reasons you may wish to avoid bounds checks when you can prove that the slice or index will be in bounds. For these cases there are unsafe operations that let you skip the bounds check:
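A brief sketch of the unchecked variants: `unsafeIndex` and `unsafeSlice` behave like `(!)` and `slice` but omit the bounds check, so the caller must establish the bounds first. The helper `sumFirst` is hypothetical.

```haskell
import qualified Data.Vector.Unboxed as U

-- Sum the first k elements. The range is validated once up front,
-- so each individual index inside the loop is known to be in bounds.
sumFirst :: Int -> U.Vector Int -> Int
sumFirst k v
  | k < 0 || k > U.length v = error "sumFirst: k out of range"
  | otherwise               = go 0 0
  where
    go i acc
      | i >= k    = acc
      | otherwise = go (i + 1) (acc + U.unsafeIndex v i)

main :: IO ()
main = do
  let v = U.fromList [1, 2, 3, 4, 5]
  print (sumFirst 3 v)                    -- prints 6
  -- Start index 1, length 3; in bounds because 1 + 3 <= 5.
  print (U.toList (U.unsafeSlice 1 3 v))  -- prints [2,3,4]
```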