I'm trying to use clEnqueueReadBufferRect to read back a sub-matrix (for use with clBlas), but I can't get the region parameter to work: despite being an array of three size_t values, it always copies a contiguous region.
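
For reference, and not the poster's test case: a minimal host-side sketch of which bytes clEnqueueReadBufferRect is specified to copy, so the result from a driver can be compared against it. The function name `read_rect_reference` is hypothetical and not part of OpenCL; the parameter names mirror the API's arguments. As in the API, `region[0]` and the x components of the origins are in bytes (so multiply by `sizeof(float)` for a float matrix), and a pitch of 0 falls back to the packed default.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reference helper (not an OpenCL function): copies the same
 * rectangle that clEnqueueReadBufferRect is specified to copy from `buffer`
 * into `host`. Origins and region are {x in bytes, y in rows, z in slices}. */
void read_rect_reference(const unsigned char *buffer, unsigned char *host,
                         const size_t buffer_origin[3],
                         const size_t host_origin[3],
                         const size_t region[3],
                         size_t buffer_row_pitch, size_t buffer_slice_pitch,
                         size_t host_row_pitch, size_t host_slice_pitch)
{
    assert(region[0] > 0 && region[1] > 0 && region[2] > 0);

    /* A pitch of 0 means "tightly packed", as in the OpenCL call. */
    if (buffer_row_pitch == 0)   buffer_row_pitch = region[0];
    if (buffer_slice_pitch == 0) buffer_slice_pitch = region[1] * buffer_row_pitch;
    if (host_row_pitch == 0)     host_row_pitch = region[0];
    if (host_slice_pitch == 0)   host_slice_pitch = region[1] * host_row_pitch;

    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            /* Byte offset = z * slice_pitch + y * row_pitch + x. */
            size_t src = (buffer_origin[2] + z) * buffer_slice_pitch
                       + (buffer_origin[1] + y) * buffer_row_pitch
                       + buffer_origin[0];
            size_t dst = (host_origin[2] + z) * host_slice_pitch
                       + (host_origin[1] + y) * host_row_pitch
                       + host_origin[0];
            memcpy(host + dst, buffer + src, region[0]); /* one row at a time */
        }
    }
}
```

If the device read returns one contiguous run instead of the rows this helper produces, the pitch arguments are being ignored somewhere between the call and the copy.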

E.g. the example below, compiled against the current version of APP on Xubuntu 12 / amd64:

The runtime doesn't handle Read/WriteRect with a pitch value for the DMA engine. The issue will be fixed in the new release. As a temporary workaround, try forcing the allocation into host memory: add the CL_MEM_USE_HOST_PTR flag when you create the buffer (you will need to allocate extra system memory and align it to 4K). Please note that your application will have slow kernel executions, but you can test the functionality. Please don't forget to remove the flag and the extra allocated memory once you move to the new drivers.
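
A rough sketch of the host-side part of that workaround, under some assumptions: "extra system memory" is read here as padding the size up to a multiple of 4096 bytes, and C11 `aligned_alloc` is available. The helper names `round_up_to_4k` and `alloc_host_backing` are made up for illustration. The OpenCL call itself is shown only in a comment, since it needs a live context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define HOST_ALIGN ((size_t)4096)

/* Hypothetical helper: round a byte count up to the next multiple of 4K. */
size_t round_up_to_4k(size_t nbytes)
{
    return (nbytes + HOST_ALIGN - 1) / HOST_ALIGN * HOST_ALIGN;
}

/* Hypothetical helper: allocate a 4K-aligned, 4K-padded block to back the
 * buffer. Returns NULL on failure; release the block with free(). */
void *alloc_host_backing(size_t nbytes, size_t *padded_size)
{
    assert(nbytes > 0 && padded_size != NULL);
    *padded_size = round_up_to_4k(nbytes);

    /* aligned_alloc requires the size to be a multiple of the alignment. */
    void *p = aligned_alloc(HOST_ALIGN, *padded_size);
    assert(p == NULL || (uintptr_t)p % HOST_ALIGN == 0);
    return p;
}

/* Intended use (assumes an existing cl_context `context`):
 *
 *   size_t padded;
 *   void *host_mem = alloc_host_backing(rows * cols * sizeof(float), &padded);
 *   cl_int err;
 *   cl_mem buf = clCreateBuffer(context,
 *                               CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR,
 *                               padded, host_mem, &err);
 *
 * Keep host_mem alive until after clReleaseMemObject(buf), then free() it.
 */
```

Keeping the allocation behind one helper should make it easy to drop both the flag and the extra memory after the driver update.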