The nested parallel arrays of DPH could be used to model regular arrays, as we could simply either create segment information using replicate, or define regular arrays in terms of flat parallel arrays and separately stored dimensionality information, and define operations on these arrays in a library in terms of the nested operations. However, there are two main reasons why this is unsatisfactory: convenience and efficiency.

The library provides a layer on top of DPH unlifted arrays to support multi-dimensional arrays, and operations like maps, folds, permutations, shifts and so on. The interface for delayed arrays is similar, but in contrast to operations on strict arrays, an operation on a delayed array is not evaluated immediately. To force evaluation, the programmer has to explicitly convert a delayed array to a strict array.

=== Convenience ===

The current implementation of the library exposes some implementation details the user of the library shouldn't have to worry about. Once the design of the library is finalised, most of these will be hidden by distinguishing between internal types and representation types.

Languages like SaC, which provide high-level support for operations on multi-dimensional arrays, offer shape invariant operations. If we want to model this on a library level, we either have to give up type safety to a large extent (for example, by encoding the shape as a list of integer values whose length equals the dimensionality) or use sophisticated language features like GADTs, which may impede the usability of the library for inexperienced users.

== Strict Arrays, Delayed Array and Shape Data Type ==

Both strict and delayed arrays are parametrised with their shape, that is, their dimensionality and size.

Note that a `Shape` has to be in the type class `Elt` imported from `Data.Parallel.Array.Unboxed` so that it can be an element of `Data.Parallel.Array.Unboxed.Array`.

The following instances are defined:
{{{
instance Shape ()
instance Shape sh => Shape (sh :*: Int)
}}}

so we have inductively defined n-tuples of `Int` values to represent shapes. This somewhat unusual representation is necessary to be able to inductively define operations on `Shape`. It should, however, be hidden from the library user in favour of the common tuple representation.

For convenience, we provide type synonyms for dimensionality up to five:
{{{
type DIM0 = ()
type DIM1 = (DIM0 :*: Int)
....
}}}
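As a rough sketch of why the inductive representation is convenient, the following self-contained example defines operations on shapes by recursion over `()` and `sh :*: Int`. The method names `dim`, `size` and `toIndex` are hypothetical, the local `:*:` type only stands in for the one exported by DPH, and the linearisation order is an assumption chosen so that the last component is the outermost dimension.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Local stand-in for the pair constructor exported by DPH (assumption).
infixl 2 :*:
data a :*: b = a :*: b deriving (Eq, Show)

type DIM0 = ()
type DIM1 = DIM0 :*: Int
type DIM2 = DIM1 :*: Int

-- Hypothetical class; the method names are illustrative only.
class Shape sh where
  dim     :: sh -> Int        -- number of dimensions
  size    :: sh -> Int        -- total number of elements
  toIndex :: sh -> sh -> Int  -- shape, index -> offset into the flat data

-- Base case: the zero-dimensional shape holds exactly one element.
instance Shape () where
  dim ()        = 0
  size ()       = 1
  toIndex () () = 0

-- Inductive case: one more dimension on top of an existing shape.
instance Shape sh => Shape (sh :*: Int) where
  dim  (sh :*: _)               = dim sh + 1
  size (sh :*: n)               = size sh * n
  toIndex (sh :*: _) (ix :*: i) = i * size sh + toIndex sh ix
```

With these definitions, `size (() :*: 5 :*: 3 :: DIM2)` multiplies the extents of both dimensions, and each operation needs only two instance declarations regardless of the dimensionality.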

=== Efficiency ===

== Operations on Arrays and Delayed Arrays ==

Multidimensional arrays can be encoded either using segment descriptors or by storing the dimensions separately. In the first case, this would mean significant memory overhead proportional to the number of subarrays on each level. But even in the second case, segment descriptors have to be generated to call functions like segmented fold and scan. It is hard to predict the exact overhead of this step, as fusion might prevent the segment descriptor array from actually being built in many cases. More significant in terms of overhead is that, when using segment descriptors, parallel splits become significantly more complicated, as they require communication in the irregular case to determine the distribution, whereas the distribution of a regularly segmented array can be determined locally on each processor.
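The two costs discussed above can be illustrated with a small sketch. Both function names are hypothetical, plain lists stand in for unlifted arrays, and the block distribution is only one possible scheme: the segment descriptor of a regular array grows with the number of subarrays, whereas the rows owned by a processor follow from the shape alone.

```haskell
-- Segment descriptor for a regular rows-by-cols array: one length per
-- subarray, so its size is proportional to the number of subarrays.
segdForRegular :: Int -> Int -> [Int]
segdForRegular rows cols = replicate rows cols

-- Block distribution of a regular array over nProcs processors.
-- Processor p computes its (first row, number of rows) locally,
-- without communicating with the other processors.
localRows :: Int -> Int -> Int -> (Int, Int)
localRows nProcs rows p = (start, len)
  where
    (q, r) = rows `quotRem` nProcs
    start  = p * q + min p r
    len    = q + (if p < r then 1 else 0)
```

For an irregular array, the analogous split would first need the segment lengths held by other processors, which is where the communication cost arises.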

=== Array Creation and Conversion ===

== Language Support ==

The remainder of this document is a first design draft for SaC style language support of multidimensional arrays in the context of DPH. The implementation is not completed yet, and there are several open questions.

Strict arrays are simply defined as a record containing a flat data array and shape information:
{{{
data Array dim e where
  Array :: { arrayShape :: dim        -- ^ extent of dimensions
           , arrayData  :: U.Array e  -- ^ flat parallel array
           } -> Array dim e
  deriving Show

toArray   :: (U.Elt e, Shape dim) => dim -> U.Array e -> Array dim e
fromArray :: (U.Elt e, Shape dim) => Array dim e -> U.Array e
}}}
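A minimal, self-contained sketch of the strict array record and its two conversion functions is given below. It assumes a plain list (`FlatArray`, a hypothetical name) in place of `U.Array` and drops the `U.Elt` and `Shape` constraints, so it only illustrates the structure and not the performance characteristics of the library.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Local stand-in for the pair constructor exported by DPH (assumption).
infixl 2 :*:
data a :*: b = a :*: b deriving (Eq, Show)

-- Stand-in for the flat parallel array type U.Array (assumption).
type FlatArray e = [e]

data Array dim e = Array
  { arrayShape :: dim          -- extent of each dimension
  , arrayData  :: FlatArray e  -- elements in flat order
  } deriving Show

-- Attach shape information to flat data.
toArray :: dim -> FlatArray e -> Array dim e
toArray = Array

-- Forget the shape and return the flat data.
fromArray :: Array dim e -> FlatArray e
fromArray = arrayData
```

Both conversions leave the data untouched; only the shape component is added or dropped.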

Delayed arrays, in contrast, contain only the shape and a function which, given an index, yields the corresponding element.
{{{
data DArray dim e where
  DArray :: { dArrayShape :: dim
            , dArrayFn    :: dim -> e
            } -> DArray dim e
}}}

== The regular array type ==

=== SaC ===

In SaC, multidimensional arrays are represented by two vectors, the shape and the data vector, where vectors are one-dimensional arrays. Scalar values are viewed as 0-dimensional arrays. The function `reshape` takes a shape vector as its first argument and an array as its second, and creates an array with an identical data vector and the given shape vector. For example:
{{{
reshape ([3,2],[1,2,3,4,5,6])
}}}
produces a 3 times 2 matrix.
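The SaC representation can be modelled in a few lines of Haskell. This is only an illustrative model: the type `SacArray`, the helper `vector`, and the size check inside `reshape` are assumptions made for this sketch and say nothing about how SaC itself handles mismatched sizes.

```haskell
-- Hypothetical model of a SaC array: a shape vector plus a data vector.
data SacArray e = SacArray
  { sacShape :: [Int]
  , sacData  :: [e]
  } deriving (Eq, Show)

-- A plain vector is a one-dimensional array.
vector :: [e] -> SacArray e
vector xs = SacArray [length xs] xs

-- Keep the data vector and replace the shape vector.
-- Returns Nothing when the new shape does not cover the data exactly.
reshape :: [Int] -> SacArray e -> Maybe (SacArray e)
reshape sh (SacArray _ xs)
  | product sh == length xs = Just (SacArray sh xs)
  | otherwise               = Nothing
```

In this model, `reshape [3,2] (vector [1,2,3,4,5,6])` keeps the six data elements and only changes the shape vector.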

=== DPH ===

Regular parallel arrays are similar to arrays in SaC, with one major difference: SaC employs a mix of static and dynamic type checking, combined with a form of shape inference, whereas we use GHC's type checker to ensure certain domain restrictions are not violated.

'''Note:''' currently, we are only able to statically check that restrictions regarding the dimensionality of an array are met, but not with respect to the size. SaC is, to a certain extent, able to do so. I still need to check whether there are cases where the DPH approach would statically find dimensionality bugs that SaC wouldn't.

Delayed arrays can be converted to and from strict arrays:

(TODO: there needs to be a `DArray` constructor function accepting the shape and the function as arguments)
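The conversion between strict and delayed arrays could look roughly like the sketch below. The names `toDArray`, `fromDArray` and `mapD`, the `Shape` methods, and the use of lists instead of `U.Array` are all assumptions made for illustration. The point is that mapping over a delayed array only composes functions, and elements are computed when the array is converted back to a strict one.

```haskell
{-# LANGUAGE TypeOperators #-}

-- Local stand-in for the pair constructor exported by DPH (assumption).
infixl 2 :*:
data a :*: b = a :*: b deriving (Eq, Show)

-- Hypothetical shape class with just enough to enumerate indices.
class Shape sh where
  size      :: sh -> Int
  toIndex   :: sh -> sh -> Int   -- shape, index -> flat offset
  fromIndex :: sh -> Int -> sh   -- shape, flat offset -> index

instance Shape () where
  size ()        = 1
  toIndex () ()  = 0
  fromIndex () _ = ()

instance Shape sh => Shape (sh :*: Int) where
  size (sh :*: n)               = size sh * n
  toIndex (sh :*: _) (ix :*: i) = i * size sh + toIndex sh ix
  fromIndex (sh :*: _) k        =
    fromIndex sh (k `rem` size sh) :*: (k `quot` size sh)

-- Lists stand in for the flat parallel array type (assumption).
data Array  dim e = Array  { arrayShape  :: dim, arrayData :: [e] }
data DArray dim e = DArray { dArrayShape :: dim, dArrayFn  :: dim -> e }

-- Strict to delayed: wrap a lookup into the flat data.
toDArray :: Shape dim => Array dim e -> DArray dim e
toDArray (Array sh xs) = DArray sh (\ix -> xs !! toIndex sh ix)

-- Delayed to strict: apply the function at every index of the shape.
fromDArray :: Shape dim => DArray dim e -> Array dim e
fromDArray (DArray sh f) =
  Array sh [f (fromIndex sh k) | k <- [0 .. size sh - 1]]

-- Mapping over a delayed array computes no elements by itself.
mapD :: (a -> b) -> DArray dim a -> DArray dim b
mapD g (DArray sh f) = DArray sh (g . f)
```

Several `mapD` calls in a row build a single composed function, so no intermediate array is created before `fromDArray` is applied.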

What is called 'shape invariant programming' in SaC works differently in DPH. In particular, in DPH the dimensionality of an array (not its size, however) is encoded in its type.

A multidimensional array is parametrised with its dimensionality and its element type:
{{{
(Shape dim, U.Elt a) => Array dim a
}}}
The element type of multidimensional arrays is restricted to the type class `Elt` exported from `Data.Array.Parallel.Unlifted`, which contains all primitive types like `Bool`, `Int`, `Float`, and pairs thereof constructed with the type constructor `:*:`, also exported from the same module. Values of types in the class `Shape` describe the shape of a multidimensional array, but also indices into an array ('''Note:''' so, is `Shape` really the right name? `Ix`, however, also doesn't seem to be right, since it is too different from the standard `Ix` class defined in `Data.Ix`).

So, for example, a two-dimensional array of three vectors of length five has the shape `(() :*: 5) :*: 3`. This is a suitable internal representation, but it should be hidden from the user, who should be provided with a more familiar notation. For now, we will stick with the internal representation.

=== Shape Invariant Computations on Arrays ===

The library provides a range of operations where the dimensionality of the result depends on the dimensionality of the argument in a non-trivial manner, which we want to be reflected in the type system. Examples of such functions are generalised selection, which allows for extraction of subarrays of arbitrary dimension, and generalised replicate, which allows replication of an array in any dimension (or dimensions).

For selection, we can informally state the relationship between the dimensionality of the argument, the selector, and the result as follows:
{{{
select:: Array dim e -> <select dim' of dim array> -> Array dim' e
}}}

To express this relationship, the library provides the index GADT, which expresses a relationship between the initial and the projected dimensionality.
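As a rough illustration of how a GADT can relate the initial and the projected dimensionality of a selection, consider the following sketch. The type `Index`, its constructors and the function `projShape` are hypothetical names chosen for this example and are not taken from the library.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}

-- Local stand-in for the pair constructor exported by DPH (assumption).
infixl 2 :*:
data a :*: b = a :*: b deriving (Eq, Show)

-- Hypothetical selector: 'Index initial projected' records, per dimension,
-- whether the dimension is kept or fixed to a single position.
data Index initial projected where
  IndexNil   :: Index () ()
  IndexAll   :: Index i p -> Index (i :*: Int) (p :*: Int)  -- keep dimension
  IndexFixed :: Int -> Index i p -> Index (i :*: Int) p     -- fix dimension

-- Shape of the selected subarray; its type follows from the selector.
projShape :: Index i p -> i -> p
projShape IndexNil          ()         = ()
projShape (IndexAll ix)     (sh :*: n) = projShape ix sh :*: n
projShape (IndexFixed _ ix) (sh :*: _) = projShape ix sh
```

Under these assumptions, a selector such as `IndexFixed 1 (IndexAll IndexNil)` has a type that maps a two-dimensional shape to a one-dimensional one, so a dimensionality mismatch is rejected by the type checker rather than at run time.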

Note that the library provides no way to statically check the pre- and postconditions on the actual size of arguments and results. This has to be done at run time using assertions.
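Such a run-time check could be sketched as follows, using `assert` from `Control.Exception`. The function name `toArrayChecked`, the reduced `Shape` class and the list-based data representation are assumptions for this example. Note that GHC compiles assertions away when optimisation is enabled, so they act as a debugging aid and not as a guaranteed check.

```haskell
{-# LANGUAGE TypeOperators #-}
import Control.Exception (assert)

-- Local stand-in for the pair constructor exported by DPH (assumption).
infixl 2 :*:
data a :*: b = a :*: b deriving (Eq, Show)

-- Reduced, hypothetical version of the shape class.
class Shape sh where
  size :: sh -> Int

instance Shape () where
  size () = 1

instance Shape sh => Shape (sh :*: Int) where
  size (sh :*: n) = size sh * n

-- Lists stand in for the flat parallel array type (assumption).
data Array dim e = Array { arrayShape :: dim, arrayData :: [e] }
  deriving Show

-- The type fixes the dimensionality; the size is only checked at run time.
toArrayChecked :: Shape dim => dim -> [e] -> Array dim e
toArrayChecked sh xs = assert (size sh == length xs) (Array sh xs)
```

The type of `toArrayChecked` guarantees that the result has the dimensionality of `sh`, while the agreement between `size sh` and the amount of data is a precondition that only the assertion can catch.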

We use the following type synonyms to improve the readability of the code:
{{{
type DIM0 = ()
type DIM1 = DIM0 :*: Int
type DIM2 = DIM1 :*: Int
type DIM3 = DIM2 :*: Int
}}}

== Operations ==

=== Array Shapes ===

The `shape` function returns the shape of an n-dimensional array as an n-tuple: