One of the most fundamental operations in linear algebra is the transpose. As implemented on a computer, a transpose changes the shape of a matrix and repositions its values in memory.

In modern mathematical programming, a transpose function is especially useful because it converts between row-major and column-major data orderings, which is often required when mixing code from languages that use different orderings, such as C and Fortran. Despite its usefulness, until very recently LAPACK did not provide routines for this and other basic operations. One reason is that these functions are so simple to write that they never seemed to need a canonical definition. For example, consider the implementation of a transpose in the loop below:

There is not much in this loop that can go wrong. When programming for the GPU, however, this function becomes significantly more complicated. Compare the loops above to the transpose example from the CUDA SDK:
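The kernel below is a sketch in the spirit of the CUDA SDK's coalesced transpose sample, not a copy of it; the tile sizes and kernel name are illustrative. It shows why the GPU version is so much more involved: each thread block stages a tile through shared memory so that both the reads and the writes hit global memory in a coalesced pattern.

```cuda
#define TILE_DIM   32
#define BLOCK_ROWS  8

// Out-of-place transpose kernel. Each block stages a TILE_DIM x TILE_DIM
// tile of the input through shared memory so that both global-memory
// reads and writes are coalesced. The +1 padding on the shared array
// avoids shared-memory bank conflicts on the transposed accesses.
__global__ void transposeCoalesced(float *odata, const float *idata,
                                   int width, int height)
{
    __shared__ float tile[TILE_DIM][TILE_DIM + 1];

    int x = blockIdx.x * TILE_DIM + threadIdx.x;
    int y = blockIdx.y * TILE_DIM + threadIdx.y;

    // Read a tile of the input; each thread handles TILE_DIM/BLOCK_ROWS rows.
    for (int j = 0; j < TILE_DIM; j += BLOCK_ROWS)
        if (x < width && y + j < height)
            tile[threadIdx.y + j][threadIdx.x] = idata[(y + j) * width + x];

    __syncthreads();

    // Write the tile out transposed; the block indices are swapped so the
    // writes are contiguous in the output as well.
    x = blockIdx.y * TILE_DIM + threadIdx.x;
    y = blockIdx.x * TILE_DIM + threadIdx.y;
    for (int j = 0; j < TILE_DIM; j += BLOCK_ROWS)
        if (x < height && y + j < width)
            odata[(y + j) * height + x] = tile[threadIdx.x][threadIdx.y + j];
}
```

On top of the kernel itself, a real implementation also has to pick launch dimensions, handle matrices that aren't multiples of the tile size, and tune the tile and block-row constants for the target hardware.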

Obviously, this function is not trivial. The code here performs an out-of-place transpose; an in-place version would be even more complex. This isn't the kind of function you can quickly write inline when you need it. Instead, you should rely on a time-tested, high-performance library routine to do the job for you.

In the next release of CULA, we're going to include a transpose and other auxiliary functions directly, so that you don't have to write any of them yourself. Available functions will include copy, transpose, conjugate, nancheck, and others, and you can be confident that these functions have been fully validated by our extensive test suite.

With the inclusion of these new functions, we are continuing to make CULA easier and easier to use by rounding out our support for auxiliary functions, just as we did in R10 with the inclusion of Level 3 BLAS functions. What functions would make your life easier when using CULA? We’re always open to new ideas so feel free to drop by our forums and discuss any suggestions you have.