The VECTORISE pragma

The vectoriser needs to know about all types and functions whose vectorised variants are directly implemented by the DPH library (instead of generated by the vectoriser), and it needs to know what the vectorised versions are. That is the purpose of the VECTORISE pragma (which comes in a number of flavours).

The basic VECTORISE pragma for values

Given a function f, the vectoriser generates a vectorised version f_v, which comprises the original, scalar version of the function and a second version lifted into array space. The lifted version operates on arrays of inputs and produces arrays of results in one parallel computation. The original function name is then rebound to the scalar version contained in f_v. This rebound version differs from the original in that it uses vectorised versions for any embedded parallel array computations.
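The pairing of a scalar and a lifted version can be pictured with a small model. This is only a sketch under simplifying assumptions: ordinary lists stand in for parallel arrays, the toy (:->) type below is not the DPH library's closure type (which also carries an environment), and the names Clo, applyS, applyL, inc and inc_v are hypothetical.

```haskell
{-# LANGUAGE TypeOperators #-}
module ClosureSketch where

-- Toy stand-in for a vectorised function: a scalar version paired with
-- a lifted version. Lists play the role of parallel arrays here.
data a :-> b = Clo (a -> b) ([a] -> [b])

-- Select and apply the scalar component.
applyS :: (a :-> b) -> a -> b
applyS (Clo s _) = s

-- Select and apply the lifted component to a whole "array" of inputs.
applyL :: (a :-> b) -> [a] -> [b]
applyL (Clo _ l) = l

-- An ordinary scalar function.
inc :: Int -> Int
inc x = x + 1

-- What a vectorised counterpart of inc could look like in this model.
inc_v :: Int :-> Int
inc_v = Clo inc (map inc)
```

In this model, applyS inc_v corresponds to the scalar version that the original name is rebound to, and applyL inc_v to the version lifted into array space.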

However, if a variable f is accompanied by a pragma of the form

{-# VECTORISE f = e #-}

then the vectoriser defines f_v = e and refrains from rebinding f. This implies that for f :: t, e's type is t vectorised (in particular, e's type uses the array closure type (:->) instead of the vanilla function space (->)). The vectoriser checks that e has the appropriate type.

IMPLEMENTATION RESTRICTION: Currently the right-hand side of the equation (i.e., e) may only be a simple identifier, and it must be at the correct type instance. More precisely, the Core type of the right-hand side must be identical to the vectorised version of t.

The NOVECTORISE pragma for values

If a variable f is accompanied by a pragma

{-# NOVECTORISE f #-}

then it is ignored by the vectoriser — i.e., no function f_v is generated and f is left untouched.

This pragma can only be used for bindings in the current module (exactly like an INLINE pragma).

Caveat: If f's definition contains bindings that are floated to the top level, those bindings will still be vectorised.
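A possible use is sketched below. The module and function names are hypothetical, and the pragma only has an effect in a GHC that includes the DPH vectoriser; other GHC versions should report it as an unrecognised pragma and ignore it.

```haskell
module DebugHelpers where

import Debug.Trace (trace)

-- Hypothetical debugging helper that should stay out of vectorised code.
debugValue :: Int -> Int
debugValue x = trace ("value: " ++ show x) x

-- Ask the vectoriser to skip debugValue: no debugValue_v is generated.
-- The pragma sits in the module that defines the binding.
{-# NOVECTORISE debugValue #-}
```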

The VECTORISE SCALAR pragma for functions

Functions that contain no array computations, especially if they are cheap (such as (+)), should not be vectorised, but applied by simply mapping them over an array. This could be achieved by using the VECTORISE pragma with an appropriate right-hand side, but that leads to repetitive code that we would rather have the compiler generate.

If a unary function f is accompanied by a pragma

{-# VECTORISE SCALAR f #-}

then the vectoriser generates

f_v = closure1 f (scalar_map f)

and keeps f unchanged.

For a binary function, it generates

f_v = closure2 f (scalar_zipWith f)

for a ternary function, it generates

f_v = closure3 f (scalar_zipWith3 f)

and so on. (The variable f must have a proper function type.)
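The shape of the generated code can be imitated with lists. The sketch below is an assumption-laden model, not the DPH implementation: the primed names (closure1', scalarMap', scalarZipWith') are hypothetical list analogues of closure1, scalar_map and scalar_zipWith, and the toy closure type omits the environment that real closures of higher arity need, so only the unary case is wrapped in a closure.

```haskell
{-# LANGUAGE TypeOperators #-}
module ScalarSketch where

-- Toy closure: scalar version plus lifted version (lists as "arrays").
data a :-> b = Clo (a -> b) ([a] -> [b])

-- Hypothetical list analogue of closure1: pair the two versions.
closure1' :: (a -> b) -> ([a] -> [b]) -> (a :-> b)
closure1' = Clo

-- Hypothetical list analogue of scalar_map.
scalarMap' :: (a -> b) -> [a] -> [b]
scalarMap' = map

-- Hypothetical list analogue of scalar_zipWith.
scalarZipWith' :: (a -> b -> c) -> [a] -> [b] -> [c]
scalarZipWith' = zipWith

-- Mirrors f_v = closure1 f (scalar_map f) for f = negate.
negateV :: Int :-> Int
negateV = closure1' negate (scalarMap' negate)

-- Select the lifted component of a toy closure.
lifted :: (a :-> b) -> [a] -> [b]
lifted (Clo _ l) = l

-- Lifted component of a binary scalar function: an elementwise zip.
plusLifted :: [Int] -> [Int] -> [Int]
plusLifted = scalarZipWith' (+)
```

The point of the pragma is that none of this wrapping has to be written by hand: the scalar function is left as it is and only mapped or zipped over its argument arrays.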

The basic VECTORISE pragma for type constructors

Without right-hand side

For a type constructor T, the pragma

{-# VECTORISE type T #-}

indicates that the type T should be vectorised and that it embeds no parallel arrays. This is similar to the case where the vectoriser automatically decides to vectorise a type, but no special vectorised representation needs to be generated, as the type embeds no arrays.

The type constructor T, together with its data constructors Cn, may be used in vectorised code, where T and the Cn represent themselves. An example is the treatment of 'Bool'. 'Bool' together with 'False' and 'True' may appear in vectorised code and they remain unchanged by vectorisation. (There is no need for a special representation as the values cannot embed any arrays.)

The type constructor T must be in scope, but it may be imported. 'PData' and 'PRepr' instances are automatically generated by the vectoriser.
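As an illustration, a user-defined enumeration could be declared in the same way as 'Bool'. The module and type names below are hypothetical, and the pragma is only meaningful to a GHC that includes the DPH vectoriser; other GHC versions should report it as an unrecognised pragma and ignore it.

```haskell
module Colour where

-- Hypothetical enumeration whose values cannot embed any arrays.
data Colour = Red | Green | Blue
  deriving (Eq, Show)

-- Colour and its constructors Red, Green and Blue represent themselves
-- in vectorised code.
{-# VECTORISE type Colour #-}
```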

TODO

Do we need to be able to specify that an imported type embedding arrays should be vectorised including the generation of a specialised right-hand side?

With right-hand side

{-# VECTORISE type T = ty #-}

TODO

This isn't fully implemented yet. (Implemented up to and including desugaring and being put into ModGuts, but not used in the vectoriser.)

The VECTORISE SCALAR pragma for type constructors

For a type constructor T, the pragma

{-# VECTORISE SCALAR type T #-}

indicates that the type is scalar; i.e., it has no embedded arrays. Note that the type cannot be parameterised, as we could not rule out that one of the type parameters at a usage site is an array type.

Due to this pragma declaration, T may be used in vectorised code, where T represents itself. However, the representation of T is opaque in vectorised code. An example is the treatment of Int. Ints can be used in vectorised code and remain unchanged by vectorisation. However, the representation of Int by the I# data constructor wrapping an Int# is not exposed in vectorised code. Instead, computations involving the representation need to be confined to scalar code.

The type constructor T must be in scope, but it may be imported. The PData and PRepr instances for T need to be manually defined. (For types that the vectoriser automatically determines not to need a vectorised version, instances for PData and PRepr are still generated automatically.)

TODO

For type constructors identified with this pragma, can we generate an instance of the Scalar type class automatically (instead of relying on it being in the library)?

Cross-module functionality

The various VECTORISE pragmas can be applied to imported variables and types. (For variables, this still needs to be implemented.) The vectorisation mappings will only be exported if the variable or type to which a pragma is applied is also exported. In other words, if we have

{-# VECTORISE SCALAR type Int #-}

where Int is imported from the standard Prelude and we want clients to treat Int as a scalar vectorised type, then Int needs to be re-exported. The re-export effectively exports the pragma.
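A module following this pattern might look as sketched here. The module name is hypothetical, and the pragma is only interpreted by a GHC that includes the DPH vectoriser.

```haskell
-- Int comes from the implicitly imported Prelude and is listed in the
-- export list, so the vectorisation mapping travels with it.
module ScalarInt (Int) where

{-# VECTORISE SCALAR type Int #-}
```

Had the export list omitted Int, clients importing this module would not see the mapping.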