Reverse (.rev) Vector Components

I ran into a case where it would be really useful to reverse the vector components. To create fully compliant BLAS functions, one must account for the case where the steps between elements (e.g., incx) are negative. Since hardware vendors may not support implicit vectorization, elements may need to be explicitly packed into vector types.

To treat negative and non-unit increments, I copy the relevant elements from low-to-high global memory addresses to low-to-high local memory. Then I do a vload from local memory into private memory, where I currently shuffle (if needed), compute, and shuffle again (if needed) before doing a vstore to local memory, and then copy back into global memory.
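The packing step described above can be sketched in plain C as a host-side stand-in for the kernel code. This is only an illustration under stated assumptions: the helper names `pack_low_to_high` and `reverse_in_place` are hypothetical, `incx` is assumed to be nonzero, and the BLAS convention is assumed in which, for a negative increment, logical element `i` lives at offset `(n - 1 - i) * |incx|`. Copying from low to high addresses then yields the logical vector in reversed order, which is why a reversal is needed before and after the compute step.

```c
#include <stddef.h>

/* Hypothetical helper: gather the n elements of a strided vector x into a
 * contiguous buffer, reading memory from low to high addresses.
 * Assumes incx != 0. Returns 1 when incx < 0, meaning the packed buffer
 * holds the logical vector in reversed order. */
static int pack_low_to_high(const float *x, size_t n, int incx, float *packed)
{
    size_t step = (size_t)(incx < 0 ? -incx : incx);
    for (size_t i = 0; i < n; ++i)
        packed[i] = x[i * step];
    return incx < 0;
}

/* Hypothetical helper: reverse n packed elements in place, which is the
 * job a .rev component selector would do for one vector. */
static void reverse_in_place(float *v, size_t n)
{
    for (size_t i = 0; i < n / 2; ++i) {
        float tmp = v[i];
        v[i] = v[n - 1 - i];
        v[n - 1 - i] = tmp;
    }
}
```

For example, with n = 4 and incx = -2 the packed buffer holds the elements at offsets 0, 2, 4, 6, and `reverse_in_place` restores the logical order 6, 4, 2, 0.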

Since the OpenCL specification already includes .hi, .lo, .even, and .odd, I think .rev would be a natural addition. Of course, I can continue to use the built-in shuffle function, but then I need to create a reverse mask for each vector length. With .hi, .lo, .even, and .odd already in the spec, there is a reasonable argument for including .rev as well.
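To make the "reverse mask for each vector length" concrete: in OpenCL C, reversing a 4-wide float vector with the built-in would look like `shuffle(v, (uint4)(3, 2, 1, 0))`, and each other width needs its own mask literal. The plain C sketch below emulates that with arrays so it can be compiled anywhere. The names `make_reverse_mask` and `shuffle_emulated` are hypothetical, and the emulation assumes every mask entry is a valid index below n.

```c
#include <stddef.h>

/* Fill mask[0..n-1] with the reversing permutation n-1, n-2, ..., 0.
 * In a kernel this would be a separate constant mask per vector width. */
static void make_reverse_mask(unsigned *mask, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        mask[i] = (unsigned)(n - 1 - i);
}

/* Emulate a one-input shuffle on arrays: out[i] = in[mask[i]].
 * Assumes in and out do not overlap and all mask entries are < n. */
static void shuffle_emulated(const float *in, const unsigned *mask,
                             float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = in[mask[i]];
}
```

A .rev selector would fold both steps into the component syntax, so no mask would have to be written out per width.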

Re: Reverse (.rev) Vector Components

Reversing built-in vector types using shuffles and swizzles is explicit, but not readily portable across vector lengths, because function overloading is not included in the OpenCL C kernel language. Of course, it's not too difficult to reverse them manually, but it seems like a simple thing to add to the specification, and it would make things easier, faster, and less prone to error when dealing with swizzles.