We can also use the vector-random package to generate new vectors initialized with the Mersenne Twister generator.

For example, to generate 100 million random Doubles and sum them:

<haskell>
import qualified Data.Vector.Unboxed as U
import System.Random.Mersenne
import qualified Data.Vector.Random.Mersenne as G

main :: IO ()
main = do
    -- Create a generator; no explicit seed is supplied
    g <- newMTGen Nothing
    -- Fill an unboxed vector with random Doubles
    a <- G.random g 100000000 :: IO (U.Vector Double) -- 100 M
    print (U.sum a)
</haskell>


Revision as of 21:54, 20 February 2010

Vector is a Haskell library for working with arrays, with an emphasis on raw performance, whilst retaining a rich interface. The main data types are boxed and unboxed arrays, and arrays may be immutable (pure) or mutable. Arrays are indexed by non-negative Int values.

The vector library has an API similar to the famous Haskell list library, with many of the same names.

-- The empty vector
Prelude Data.Vector> empty
fromList [] :: Data.Vector.Vector

-- A vector of length one
Prelude Data.Vector> singleton 2
fromList [2] :: Data.Vector.Vector

-- A vector of length 10, filled with the value '2'
-- Note that to disambiguate names,
-- and avoid a clash with the Prelude,
-- we use the full path to the Vector module
Prelude Data.Vector> Data.Vector.replicate 10 2
fromList [2,2,2,2,2,2,2,2,2,2] :: Data.Vector.Vector

In general, you may construct new vectors by applying a function to the index space:
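A minimal sketch of this, using the generate function from Data.Vector, which takes a length and a function from index to element. The name squares is only illustrative.

```haskell
import qualified Data.Vector as V

-- Build a vector of length 10 whose element at index i is i * i.
squares :: V.Vector Int
squares = V.generate 10 (\i -> i * i)

main :: IO ()
main = print (V.toList squares)  -- [0,1,4,9,16,25,36,49,64,81]
```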

2.2 Array Types

The vector package provides several array types, with an identical interface. They have different flexibility with respect to the types of values that may be stored in them, and different performance characteristics.

2.2.1 Boxed Arrays: Data.Vector

The most flexible type is Data.Vector.Vector, which provides *boxed* arrays: arrays of pointers to Haskell values.

Values of type Data.Vector.Vector are fully polymorphic: they can hold any valid Haskell type.

These arrays are suitable for storing complex Haskell types (sum types, or algebraic data types), but a better choice for simple data types is Data.Vector.Unboxed.
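As a small illustration of a boxed vector holding an algebraic data type, consider the sketch below. The Shape type and the area function are hypothetical, introduced only for this example.

```haskell
import qualified Data.Vector as V

-- A hypothetical sum type; boxed vectors can store values like this.
data Shape
    = Circle Double       -- radius
    | Rect Double Double  -- width and height
    deriving Show

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h

shapes :: V.Vector Shape
shapes = V.fromList [Circle 1.0, Rect 2.0 3.0]

main :: IO ()
main = do
    print (V.length shapes)
    -- Map over the boxed vector to get a vector of areas.
    print (V.toList (V.map area shapes))
```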

2.2.2 Unboxed Arrays: Data.Vector.Unboxed

Simple, atomic types and pair types can be stored in a more efficient manner: consecutive memory slots without pointers. The Data.Vector.Unboxed.Vector type provides unboxed arrays of types that are members of the Unbox class, including:

Bool

()

Char

Double

Float

Int

Int8, Int16, Int32, Int64

Word

Word8, Word16, Word32, Word64

Complex a, where a is in Unbox

Tuple types, where the elements are unboxable

Unboxed arrays should be preferred when you have unboxable elements, as they are generally more efficient.
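For instance, a vector of pairs can be stored unboxed, since both components are themselves unboxable. This is a sketch only, and the name points is illustrative.

```haskell
import qualified Data.Vector.Unboxed as U

-- Pairs of unboxable types are themselves members of Unbox.
points :: U.Vector (Int, Double)
points = U.fromList [(1, 0.5), (2, 1.5), (3, 2.5)]

main :: IO ()
main = do
    print (U.length points)
    -- Project out the second components and sum them.
    print (U.sum (U.map snd points))  -- 4.5
```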

2.2.3 Storable Arrays: passing data to C

Storable arrays (Data.Vector.Storable) are pinned in memory and can be converted to and from pointers, which may be passed to C functions, using the following functions:

unsafeFromForeignPtr
    :: Storable a
    => ForeignPtr a
    -> Int
    -> Int
    -> Vector a
-- Create a vector from a ForeignPtr with an offset and a length. The data may
-- not be modified through the ForeignPtr afterwards.

unsafeToForeignPtr
    :: Storable a
    => Vector a
    -> (ForeignPtr a, Int, Int)
-- Yield the underlying ForeignPtr together with the offset to the data and its
-- length. The data may not be modified through the ForeignPtr.

unsafeWith
    :: Storable a
    => Vector a
    -> (Ptr a -> IO b)
    -> IO b
-- Pass a pointer to the vector's data to the IO action. The data may not be
-- modified through the Ptr.
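As a sketch of how unsafeWith might be used, the example below reads the elements back through the raw pointer with peekElemOff, standing in for a C function that would normally receive the pointer and length. The helper sumViaPtr is hypothetical, and the data is only read, never written.

```haskell
import qualified Data.Vector.Storable as S
import Foreign.C.Types (CInt)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff)

-- Sum a storable vector by walking its underlying buffer.
-- A real FFI call would take the place of the 'go' loop.
sumViaPtr :: S.Vector CInt -> IO CInt
sumViaPtr v = S.unsafeWith v (\p -> go p 0 0)
  where
    n = S.length v

    go :: Ptr CInt -> Int -> CInt -> IO CInt
    go p i acc
        | i >= n    = return acc
        | otherwise = do
            x <- peekElemOff p i   -- read element i through the pointer
            go p (i + 1) (acc + x)

main :: IO ()
main = sumViaPtr (S.fromList [1, 2, 3, 4]) >>= print  -- 10
```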

2.2.4 Pure Arrays

2.2.5 Impure Arrays
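A minimal sketch of the mutable interface, assuming Data.Vector.Unboxed.Mutable and the create function from Data.Vector.Unboxed: a mutable buffer is filled in place and then returned as an immutable vector. The names filled and bumped are illustrative.

```haskell
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as M

-- Allocate a mutable vector, write to it in place, and hand it back
-- as an immutable vector once initialization is done.
filled :: U.Vector Int
filled = U.create (do
    mv <- M.new 5
    mapM_ (\i -> M.write mv i (i * 2)) [0 .. 4]
    return mv)

-- A pure operation yields a new vector and leaves 'filled' unchanged.
bumped :: U.Vector Int
bumped = U.map (+ 1) filled

main :: IO ()
main = do
    print (U.toList filled)  -- [0,2,4,6,8]
    print (U.toList bumped)  -- [1,3,5,7,9]
```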

2.2.6 Some examples

The most important attributes of an array are available in O(1) time, such as the size (length) and whether it is empty:

-- how big is the array?
Prelude Data.Vector> let a = fromList [1,2,3,4,5,6,7,8,9,10]
Prelude Data.Vector> Data.Vector.length a
10

-- is the array empty?
Prelude Data.Vector> Data.Vector.null a
False

2.3 Array Creation

2.3.1 Enumerations

The most common way to generate a vector is via an enumeration function:

enumFromN

enumFromStepN

And the list-like:

enumFromTo

enumFromThenTo

The enumFrom*N functions are guaranteed to optimize well for any type. The enumFromTo functions might fall back to generating from lists if there is no specialization for your type. They are currently specialized to most Int/Word/Double/Float generators.
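The four enumeration functions can be compared side by side in a short sketch, here using Data.Vector.Unboxed with Int elements:

```haskell
import qualified Data.Vector.Unboxed as U

main :: IO ()
main = do
    -- enumFromN x n: n elements starting at x, step 1
    print (U.toList (U.enumFromN (1 :: Int) 5))         -- [1,2,3,4,5]
    -- enumFromStepN x step n: n elements starting at x
    print (U.toList (U.enumFromStepN (0 :: Int) 10 4))  -- [0,10,20,30]
    -- list-like variants with explicit bounds
    print (U.toList (U.enumFromTo (1 :: Int) 5))        -- [1,2,3,4,5]
    print (U.toList (U.enumFromThenTo (1 :: Int) 3 9))  -- [1,3,5,7,9]
```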

When two traversals fuse into one, performance doubles because the number of traversals is halved. Fusion also avoids allocating any intermediate data structure.
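As a rough illustration of the kind of pipeline that fusion targets, the sketch below chains an enumeration, a map, and a sum. The assumption is that, when compiled with optimizations, the library can combine these stages into a single loop. The code itself does not demonstrate that this happens, and the name doubledSum is illustrative.

```haskell
import qualified Data.Vector.Unboxed as U

-- Three logical stages: generate, transform, reduce.
doubledSum :: Int -> Int
doubledSum n = U.sum (U.map (* 2) (U.enumFromTo 1 n))

main :: IO ()
main = print (doubledSum 100)  -- 10100
```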

2.3.2 An example: filling a vector from a file

We often want to populate a vector from an external data file. The easiest way to do this is with bytestring IO and Data.Vector.unfoldr (or the equivalent functions in Data.Vector.Unboxed or Data.Vector.Storable):

2.3.2.1 Parsing Ints

The simplest way to parse a file of Int or Integer types is with a strict or lazy ByteString, and the readInt or readInteger functions:
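A minimal sketch of this approach, assuming whitespace-separated integers and a lazy ByteString. The helper parseInts is hypothetical, and the input is given inline here so the example is self-contained; for a real file the contents would come from L.readFile.

```haskell
import Data.Char (isSpace)
import qualified Data.ByteString.Lazy.Char8 as L
import qualified Data.Vector.Unboxed as U

-- Repeatedly skip whitespace and read one Int until the input is
-- exhausted (or no further Int can be parsed).
parseInts :: L.ByteString -> U.Vector Int
parseInts = U.unfoldr step
  where
    step s = L.readInt (L.dropWhile isSpace s)

main :: IO ()
main = do
    let contents = L.pack "1 2 3\n4\n"
        v = parseInts contents
    print (U.toList v)  -- [1,2,3,4]
    print (U.sum v)     -- 10
```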