Original multidimensional array reference paper with
motivation, specification, and examples.

P0009r1

Revised with renaming from view to array_ref
and allow unbounded rank through variadic arguments.

P03310
(current)

Multidimensional array reference motivation and examples
moved from P0009.

References

P0009r2

Multidimensional array reference specification

P0332

Relaxed array declaration

P0122

span: bounds-safe views for sequences of objects

earlier related papers: N4512, N4355, N4300, N4222

1 Motivation for polymorphic multidimensional array reference

Multidimensional arrays are a foundational data structure
for science and engineering codes, as demonstrated by their
extensive use in FORTRAN for five decades.
A multidimensional array reference is a reference to a memory extent
through a layout mapping from a multi-index space (domain) to
that extent (range).
A array layout mapping may be bijective as in the case of a traditional
multidimensional array, injective as in the case of a subarray, or
surjective to express symmetry.

Traditional layout mappings have been specfied as part of the language.
For example, FORTRAN specifies column major layout and
C specifies row major layout.
Such a language-imposed specification requires signficant code refactoring
to change an array's layout, and requires significant code complexity to
implement non-traditional layouts such as tiling in modern linear algebra
or structured grid application domains. Such layout changes are required
to adapt and optimize code for varying computer architectures; for example,
to change a code from array of structures or row major storage
to structure of arrays or column major storage.
Furthermore, multiple versions of code must be maintained for
each required layout.

A multidimensional array reference abstraction
with a polymorphic layout is required
to enable changing array layouts without extensive code refactoring and
maintenance of functionally redundant code.
Layout polymorphism is a critical capability; however, it is not the only
beneficial form of polymorphism.

The Kokkos library (github.com/kokkos/kokkos) implements
multidimensional array references with polymorphic layout,
and other access properties as well.
Until recently the Kokkos implementation was limited
to C++1998 standard and is incrementally being refactored
to C++2011 standard.
Additionally, there is a standalone reference implementation of this proposal
which is publicly available on github (github.com/kokkos/array_ref).

2 Motivation for Syntax

2.1 One-Dimensional Array

A reference to a one-dimension array is anticipated to subsume the functionality
of a pointer to a memory extent combined with an array length.
For example, a one-dimensional array is passed to a function as follows.

The const-ness of an array_ref is analogous to the const-ness
of a pointer.
A const array_ref<D> is similar to a const-pointer in that the array_ref
may not be modifid but the referenced extent of memory may be modified.
A array_ref<const D> is similar to a pointer-to-const in that the
referenced extent of memory may not be modified. These are the same const-ness
semantics of unique_ptr and shared_ptr.

The T[] syntax has precedence in the standard; unique_ptr supports this
syntax to denote a unique_ptr which manages the lifetime of a dynamically
allocated array of objects.

2.2 Traditional Multidimensional Array with Static Dimensions

A traditional multidimensional array with static dimensions
(for example, an array of 3x3 tensors) is passed to a function as follows.

Support for static extents is an essential performance feature
of the proprosed array_ref.
First a compiler can optimize the index-to-object mapping computation
and second the array_ref implementation can eliminate storage for
static extents.
Consider the following example where L is only known at runtime.

The member access mapping can be optimized to a single integer
multiply-add A.ptr[i*9+7]
because the implementation of A(i,2,1) is A.ptr[((i)*3+j)*3+k]
and j=2 and k==1 are statically determined.
The sizeof(A) can be sizeof(double*)+sizeof(size_t)
because storage is required for only the pointer and dynamic extent.

2.3 Multidimensional Array with Multiple Dynamic Dimensions

The current multidimensional array type declaration in n4567 8.3.4.p3
restricts array declarations such that only the leading dimension
may be implicit.
Multidimensional arrays with multiple implicit (dynamic) dimensions as well as
static dimensions are supported with the dimension property.
The dimension property uses the "magic value" zero to denote an
implicit dimension.
The "magic value" of zero is chosen for consistency with std::extent.

2.4 Preferred Syntax

We prefer the following concise and intuitive syntax for arrays
with multiple implict dimensions.

array_ref<int[][][3]>y;// concise intuitive syntax

However, this syntax requires a
relaxation of the current multidimensional array type declaration
in n4567 8.3.4.p3, as proposed in P00332.
Furthermore, this concise and intuitive syntax eliminates the need
for array_property::dimension<...> and the associated "magic value"
of zero to denote an implicit dimension.

3 Array Properties

3.1 Layout Polymorphism

The array_ref::operator() maps the input multi-index from the array's
cartesian product multi-index domain space to a member in the array's range space.
This is the layout mapping for the referenced array.
For natively declared multidimensional arrays the layout mapping
is defined to conform to treating the multidimensional array as
an array of arrays of arrays ...; i.e., the size and span are
equal and the strides increase from right-to-left
(the layout specified in the C language).
In the FORTRAN language defines layout mapping with strides
increasing from left-to-right.
These native layout mappings are only two of many possible layouts.
For example, the basic linear algebra subprograms (BLAS) standard
defines dense matrix layout mapping with padding of the leading dimension,
requiring both dimensions and LDA parameters to fully declare a matrix layout.

A property template parameter specifies a layout mapping.
If this property is omitted the layout mapping of the array reference
conforms to a corresponding natively declared multidimensional array
as if implicit dimensions were declared explicitly.
The default layout is regular - the distance is constant between
entries when a single index of the multi-index is incremented.
This distance is the stride of the corresponding dimension.
The default layout mapping is bijective and the stride increases
monotonically from the right most to the left most dimension.

An initial set of layout properties are
layout_right, layout_left, and layout_stride,

namespace std {

namespace experimental {

namespace array_property {

struct layout_right ;

struct layout_left ;

struct layout_stride ;

}}}

A void (a.k.a., default or native) mapping is regular and bijective with
strides increasing from increasing from right most to left most dimension.
A layout_right mapping is regular and injective (may have padding) with
strides increasing from right most to left most dimension.
A layout_left mapping is regular and injective (may have padding) with
strides increasing from left most to right most dimension.
A layout_stride mapping is regular; however, it might
not be injective or surjective.

// The right and left layout mapping of a rank-four
// multidimensional array could be is as if implemented
// as follows. Note that padding is allowed but not required.
template<size_tN0,size_tN1,size_tN2,size_tN4>size_tright_padded_mapping(size_ti0,size_ti1,size_ti2,size_ti3){constsize_tS3=// stride of dimension 3
constsize_tP3=// padding of dimension 3
constsize_tP2=// padding of dimension 2
constsize_tP1=// padding of dimension 1
returni0*S3*(P3+N3)*(P2+N2)*(P1+N1)+i1*S3*(P3+N3)*(P2+N2)+i2*S3*(P3+N3)+i3*S3;}template<size_tN0,size_tN1,size_tN2,size_tN4>size_tleft_padded_mapping(size_ti0,size_ti1,size_ti2,size_ti3){constsize_tS0=// stride of dimension 0
constsize_tP0=// padding of dimension 0
constsize_tP1=// padding of dimension 1
constsize_tP2=// padding of dimension 2
returni0*S0+i1*S0*(P0+N0)+i2*S0*(P0+N0)*(P1+N1)+i3*S0*(P0+N0)*(P1+N1)*(P2+N2);}

3.2 Extensible Layout Polymorphism

The array_ref is intended to be extensible such that a user may supply
a customized layout mapping.
A user supplied customized layout mapping will be required to conform
to a specified interface; a.k.a., a C++ Concept.
Details of this extension point will be included in a subsequent
proposal.
Our current extensibility strategy is for
a user supplied layout property to implement an offset mapping.

Motivation: An important customized layout mapping is hierarchical tiling.
This kind of layout mapping is used in dense linear algebra matrices and
computations on Cartesian grids to improve the spatial locality
of array entries.
These mappings are bijective but are not regular.
Computations on such multidimensional arrays typically iterate
through tiles as subarray of the array.

Note that a tiled layout mapping is irregular and if padding is
required to align with tile boundarries then the span will exceed the size.
A customized layout mapping will have slightly different requirements
depending on whether the layout is regular or irregular.

3.3 Bounds Checking

Array bounds checking is an invaluable tool for debugging user code.
This functionality traditionally requires global injection through
special compiler support.
In large, long running code global array bounds checking introduces
a significant overhead that impedes the debugging process.
A member access array bounds checking array property allows
the selective injection of array bounds checking and removes
the need for special compiler support.
A high quality implementation of bounds checking would output the
array bounds, multi-index, and traceback of where the array bounds violation occured.

3.4 Future Possible Extensions

The array_ref abstraction and interface has utility
well beyond the multidimensional array layout property.
Other planned and prototyped properties include specification
of which memory space within a heterogeneous memory system
the referenced data resides on and algorithmic access intent properties.
Examples of access intent properties include

read-only random with locality such that member queries are
performed through GPU texture cache hardware for GPU memory spaces,

non-temporal such that member access operations can be overloaded
with non-caching reads and writes, and

restrict to guarantee non-aliasing of referenced data within the
current context.

4 Subarrays

The capability to easily extract subarrays of an array,
or subarrays of subarrays, is essential to many array-based
algorithms.

usingU=array_ref<int,array_properties::dimension<0,0,0>>;Ux(buffer,N0,N1,N2);// Using std::pair<int,int> for an integral range
autoy=subarray(x,std::pair<int,int>(1,N0-1),std::pair<int,int>(1,N1-1),1);assert(y.rank()==2);assert(y.extent(0)==N0-2);assert(y.extent(0)==N1-2);assert(&y(0,0)==&x(1,1,1));// Using initializer_list of size 2 as an integral range
autoz=subarray(x,1,{1,N1-1},1);assert(z.rank()==1);assert(&z(0)==&x(1,1,1));// Conveniently extracting subarray for all of a extent
// without having to explicitly extract the dimensions.
autox=subarray(x,array_property::all,1,1);

The subarray() function returns an unspecified instantiation
of array_ref<>.
Note that there is precedence in the standard for library functions with
unspecified return types
(e.g. bind()).

4.1 Subarray Type Deduction

The subarray function returns array_ref<deduced...>.
The return type is deduced from the input array_ref and the slicing argument pack.
The deduction rules must be defined to insure correctness and
should be defined for performance.
For example, a simple rule wuld define the returned type to always
have a strided layout. While correct there are many use cases
where a better performing layout can be deduced.

Subarray type deduction is necessarily dependent upon the layout.

4.2 Example Usage in an 8th Order Finite Difference Stencil

The subarray interface provides a powerful mechanism for accessing
3-dimensional data in numerical kernels in a fashion which utilizes performant
memory access patterns and is amenable to compiler-assisted vectorization.

The following code is an example of a typical finite difference stencil which
might be used in a computational fluid dynamics application. This code utilizes
operator splitting to avoid vector register pressure and moves through memory
in unit stride to facilitate optimal memory access patterns. With the addition
of compiler alignment hints (as well as padding and aligned allocations to make
those assumptions true) and compiler directives or attributes to indicate that
the input pointers do not alias each other, this code would vectorize well on a
traditional x86 platform.