The Khronos Group - a non-profit industry consortium to develop, publish and promote open standard, royalty-free media authoring and acceleration standards for desktop and handheld devices, combined with conformance qualification programs for platform and device interoperability.

Error propagation from devices to application level

Hi,

We are using NVIDIA's OpenCL 1.0 conformant SDK to develop an application. We compile the OpenCL code and get ".ptx" files as binaries (different devices have different binary formats). When we try to run the code we get a "CL_OUT_OF_RESOURCES (-5)" error, so it seems we are exceeding some device capability. How can we find out which limit caused the error? Is it possible to propagate device-specific errors to the application level?

Re: Error propagation from devices to application level

I think somebody from the GPU team posted a reply to a similar question. It is likely due to too many threads being spawned. Make sure you do not exceed the device limits. I use the following code to find out what my device (CPU) can do. It may need some tweaking for a GPU device.

Re: Error propagation from devices to application level

The reliable way to determine the maximum work-group size, i.e. the number of work-items that can be specified in the local_work_size argument to clEnqueueNDRangeKernel, is to call clGetKernelWorkGroupInfo for the specific kernel and device with param_name = CL_KERNEL_WORK_GROUP_SIZE. This returns the work-group size that can be used for that kernel on that device. Note that the answer can vary from kernel to kernel, so using just the device maximum values is not sufficient.
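As a rough illustration of how the queried value might be used, here is a minimal sketch in plain C. It assumes kernel_max holds the value that clGetKernelWorkGroupInfo returned for CL_KERNEL_WORK_GROUP_SIZE, and that, as in OpenCL 1.0, the global size must be evenly divisible by the local size. The helper name pick_local_size is hypothetical and not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an OpenCL API.
 * kernel_max stands in for the value obtained with something like
 *   clGetKernelWorkGroupInfo(kernel, device, CL_KERNEL_WORK_GROUP_SIZE,
 *                            sizeof(size_t), &kernel_max, NULL);
 * Returns the largest 1D local size that does not exceed kernel_max
 * and evenly divides global_size. */
size_t pick_local_size(size_t global_size, size_t kernel_max)
{
    assert(global_size > 0 && kernel_max > 0);

    /* Start from the smaller of the two values and walk down. */
    size_t local = kernel_max < global_size ? kernel_max : global_size;

    /* The loop always stops, because 1 divides every global size. */
    while (global_size % local != 0)
        --local;

    return local;
}
```

For example, with a global size of 1000 and a kernel limit of 128 this yields 125, which could then be passed as local_work_size to clEnqueueNDRangeKernel.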

Re: Error propagation from devices to application level

The problem, as far as we can understand, is that the kernel uses too many registers per thread. Thus, even when we set the block size to a small value such as 8 x 8, the kernel fails. I guess clGetKernelWorkGroupInfo is what we are looking for. Thanks for pointing it out.
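If an 8 x 8 block fails, the per-kernel limit may well be below 64 work-items. One possible way to fall back to a smaller square block is sketched below in plain C. It assumes kernel_max is the value reported for CL_KERNEL_WORK_GROUP_SIZE, and the helper name largest_square_edge is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not an OpenCL API.
 * Returns the largest edge such that an edge x edge block contains
 * at most kernel_max work-items. Returns 0 if kernel_max is 0. */
size_t largest_square_edge(size_t kernel_max)
{
    size_t edge = 0;

    /* Grow the edge while the next larger square still fits. */
    while ((edge + 1) * (edge + 1) <= kernel_max)
        ++edge;

    return edge;
}
```

The global size in each dimension would still have to be a multiple of the chosen edge under OpenCL 1.0, so this only narrows down the candidates.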

Re: Error propagation from devices to application level

Keep in mind that even if clGetKernelWorkGroupInfo returns a maximum size of, say, 128, you may be limited in how you can distribute it across dimensions. Use clGetDeviceInfo with CL_DEVICE_MAX_WORK_ITEM_SIZES to query the maximum size allowed in each dimension. For example, a device may allow a maximum work-group size of only 8 in the 3rd dimension, and you would have to obey that when enqueuing your kernel.
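Both limits can be checked on the host before enqueuing. The following is only a sketch in plain C: max_item_sizes is assumed to hold the array returned by clGetDeviceInfo for CL_DEVICE_MAX_WORK_ITEM_SIZES, kernel_max the value for CL_KERNEL_WORK_GROUP_SIZE, and the helper name local_size_fits is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not an OpenCL API.
 * Returns 1 if local[] respects both the per-dimension device limits
 * and the per-kernel total limit, 0 otherwise. */
int local_size_fits(unsigned dims, const size_t *local,
                    const size_t *max_item_sizes, size_t kernel_max)
{
    size_t total = 1;

    for (unsigned d = 0; d < dims; ++d) {
        /* Per-dimension limit reported by the device. */
        if (local[d] == 0 || local[d] > max_item_sizes[d])
            return 0;

        /* Total limit for this kernel: reject if total * local[d]
         * would exceed kernel_max, written as a division to avoid
         * overflowing size_t. */
        if (local[d] > kernel_max / total)
            return 0;

        total *= local[d];
    }
    return 1;
}
```

With per-dimension limits of {512, 512, 8} and a kernel limit of 128, a local size of {4, 4, 8} passes, while {2, 2, 32} is rejected because of the 3rd dimension even though its total is also 128.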