An instance of vex::Reductor<T,ReduceOP=vex::SUM> allows one to
reduce an arbitrary vector expression to a single value of type T. Supported
reduction operations are vex::SUM, vex::MIN, and
vex::MAX. Reductor objects are stateful – they keep small
temporary buffers on compute devices and receive a list of command queues at
construction.

Reduction operations may be combined with the
vex::CombineReductors class. This way several
reduction operations are fused into a single compute kernel. The operations
should return the same scalar type, and the result of the combined reduction
operation is an appropriately sized OpenCL/CUDA vector type.

In the following example minimum and maximum values of the vector are computed
at the same time:
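As a host-only illustration in plain standard C++ (not the VexCL API), the sketch below shows what a fused minimum/maximum reduction computes: both results are accumulated in one pass over the data and returned together as a two-component value. The name `min_max_one_pass` is hypothetical and exists only for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper, not part of VexCL: computes the minimum and the
// maximum of a non-empty vector in a single pass over the data.
std::pair<double, double> min_max_one_pass(const std::vector<double> &x) {
    assert(!x.empty());

    // Component `first` plays the role of the MIN result,
    // component `second` the role of the MAX result.
    std::pair<double, double> r(x[0], x[0]);

    for (double v : x) {
        r.first  = std::min(r.first,  v);
        r.second = std::max(r.second, v);
    }

    return r;
}
```

Running two separate reductions would read the vector twice; fusing them reads each element once, which is the motivation for combining reductors.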

One of the most common operations in linear algebra is the matrix-vector
product. An instance of the vex::SpMat class holds a representation of
a sparse matrix. Its constructor accepts a sparse matrix in the common CRS format.
In the example below a vex::SpMat is constructed from an Eigen sparse
matrix:
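For readers unfamiliar with the CRS layout, here is a minimal host-side sketch in standard C++ of the three arrays that make up the format and of the matrix-vector product computed from them. The `CrsMatrix` struct and the `multiply` function are hypothetical names used for illustration only; they are not VexCL or Eigen types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side container for a matrix in CRS format.
struct CrsMatrix {
    std::size_t              rows;
    std::vector<std::size_t> row_ptr; // rows + 1 entries; row i owns nonzeros
                                      // [row_ptr[i], row_ptr[i + 1])
    std::vector<std::size_t> col;     // column index of each nonzero
    std::vector<double>      val;     // value of each nonzero
};

// Computes y = A * x on the host.
std::vector<double> multiply(const CrsMatrix &A, const std::vector<double> &x) {
    assert(A.row_ptr.size() == A.rows + 1);
    assert(A.col.size() == A.val.size());

    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t j = A.row_ptr[i]; j < A.row_ptr[i + 1]; ++j)
            y[i] += A.val[j] * x[A.col[j]];

    return y;
}
```

For the 3x3 matrix with rows (2, 0, 1), (0, 3, 0), (4, 0, 5), the arrays are `row_ptr = {0, 2, 3, 5}`, `col = {0, 2, 1, 0, 2}` and `val = {2, 1, 3, 4, 5}`.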

Matrix-vector products may be used in vector expressions. The only restriction
is that the expressions have to be additive. This is due to the fact that the
operation involves inter-device communication for multi-device contexts.

// Compute residual value for a system of linear equations:
Z = Y - A * X;
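To see where the communication comes from, consider a rough host-only model in standard C++. It assumes, for illustration, that the rows of a square CRS matrix and the entries of the input vector are split between devices in the same contiguous ranges. The hypothetical helper `remote_columns` lists the vector entries that one partition needs but does not own.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: for the rows [row_begin, row_end) owned by one device,
// collect the column indices that refer to vector entries owned by other
// devices. Assumes the vector is partitioned the same way as the matrix rows.
std::set<std::size_t> remote_columns(
        const std::vector<std::size_t> &row_ptr,
        const std::vector<std::size_t> &col,
        std::size_t row_begin, std::size_t row_end)
{
    assert(row_begin <= row_end);
    assert(row_end + 1 <= row_ptr.size());

    std::set<std::size_t> remote;
    for (std::size_t j = row_ptr[row_begin]; j < row_ptr[row_end]; ++j)
        if (col[j] < row_begin || col[j] >= row_end)
            remote.insert(col[j]);

    return remote;
}
```

For a 4x4 tridiagonal matrix split into two halves, the first half needs entry 2 and the second half needs entry 1 from the other partition. These values have to be exchanged before the local products can be completed.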

This restriction may be lifted for single-device contexts. In this case VexCL
does not need to worry about inter-device communication. Hence, it is possible
to inline a matrix-vector product into a normal vector expression with the help of
vex::make_inline():

When vex::make_inline() is applied to a matrix-vector product, the product becomes inlineable. That is, it may be used in any vector expression (not just additive expressions). The user has to guarantee that the function is only used in single-device expressions.

Sort and scan functions take an optional function object used for comparison
and summing of elements. The functor should provide the same interface as, e.g.
std::less for sorting or std::plus for summing; additionally, it should
provide a VexCL function for device-side operations.

Here is an example of such an object comparing integer elements in such a way
that even elements precede odd ones:
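The host-side half of such a functor can be sketched in standard C++ as follows. This shows only the `std::less`-like call operator; the device-side VexCL function is omitted. Ordering elements of equal parity in ascending order is an assumption made here so that the comparison is a strict weak ordering. `sort_even_first` is a hypothetical wrapper added for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Host-side part only: even values precede odd ones; values of equal parity
// are ordered ascending (an assumption made for this sketch).
struct even_first {
    bool operator()(int a, int b) const {
        bool a_odd = (a % 2) != 0;
        bool b_odd = (b % 2) != 0;

        // Different parity: a goes first exactly when a is even and b is odd.
        if (a_odd != b_odd) return b_odd;

        // Same parity: fall back to the usual ordering.
        return a < b;
    }
};

// Hypothetical wrapper: sorts a copy of the input on the host.
std::vector<int> sort_even_first(std::vector<int> v) {
    std::sort(v.begin(), v.end(), even_first());
    return v;
}
```

Applied to the values 5, 2, 7, 4, 1, 8, this ordering yields 2, 4, 8, 1, 5, 7.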

The need to provide both host-side and device-side parts of the functor comes
from the fact that multi-device vectors are first sorted partially on each of
the compute devices and then merged on the host.
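The two-stage scheme can be modeled on the host with standard C++ alone. In this sketch, contiguous chunks of a vector stand in for the per-device partitions: each chunk is sorted independently, then the sorted chunks are merged with the same comparator. The function name `sort_in_parts` and the chunking rule are assumptions made for illustration, not VexCL internals.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical host-only model of "sort per partition, then merge".
template <class Comp = std::less<int>>
std::vector<int> sort_in_parts(std::vector<int> v, std::size_t parts,
                               Comp comp = Comp())
{
    assert(parts > 0);

    // Chunk p covers the index range [bounds[p], bounds[p + 1]).
    std::vector<std::size_t> bounds;
    for (std::size_t p = 0; p <= parts; ++p)
        bounds.push_back(v.size() * p / parts);

    // Stage 1: every chunk is sorted on its own (the per-device step).
    for (std::size_t p = 0; p < parts; ++p)
        std::sort(v.begin() + bounds[p], v.begin() + bounds[p + 1], comp);

    // Stage 2: merge chunk p into the already merged prefix (the host step).
    for (std::size_t p = 1; p < parts; ++p)
        std::inplace_merge(v.begin(), v.begin() + bounds[p],
                           v.begin() + bounds[p + 1], comp);

    return v;
}
```

The comparator is used in both stages. In the real multi-device case the first stage runs on the devices and the second on the host, so the functor has to be callable in both places.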

Sorting algorithms may also take tuples of keys/values (in fact, any
Boost.Fusion sequence will do). One will have to explicitly specify the
comparison functor in this case. Both host and device variants of the
comparison functor should take 2n arguments, where n is the number of
keys. The first n arguments correspond to the left set of keys, and the
second n arguments correspond to the right set of keys. Here is an example
that sorts values by a tuple of two keys:
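The 2n-argument convention can be shown with a host-only sketch in standard C++ for n = 2 keys, here of types `int` and `float`. Only the host-side call operator is shown and the device-side VexCL function is omitted. The names `less2` and `sort_order` are hypothetical. `sort_order` returns the permutation that orders the key pairs, which could then be applied to a separate array of values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-side part only. With n = 2 keys the comparator takes 2n = 4 arguments:
// (a1, a2) are the keys of the left element, (b1, b2) those of the right one.
struct less2 {
    bool operator()(int a1, float a2, int b1, float b2) const {
        // Lexicographic order: compare first keys, break ties on second keys.
        return (a1 == b1) ? (a2 < b2) : (a1 < b1);
    }
};

// Hypothetical helper: returns the index permutation that sorts by (k1, k2).
std::vector<std::size_t> sort_order(const std::vector<int>   &k1,
                                    const std::vector<float> &k2)
{
    assert(k1.size() == k2.size());

    std::vector<std::size_t> idx(k1.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));

    less2 comp;
    std::sort(idx.begin(), idx.end(), [&](std::size_t i, std::size_t j) {
        // Left keys first, then right keys, as the convention requires.
        return comp(k1[i], k2[i], k1[j], k2[j]);
    });

    return idx;
}
```

For the key pairs (2, 0.5), (1, 3.0), (2, 0.25), (1, 1.0) the resulting order of indices is 3, 1, 2, 0.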