The Khronos Group - a non-profit industry consortium to develop, publish and promote open standard, royalty-free media authoring and acceleration standards for desktop and handheld devices, combined with conformance qualification programs for platform and device interoperability.

OpenCL Synchronization between workgroups.

I am calling an OpenCL kernel several times in a loop.

The kernel is enqueued inside a loop in the host program. In each iteration we wait for the commands to complete (using clFinish), copy the buffer values to another buffer (call it the previous-values buffer), and then continue to the next iteration, where the kernel is called again, as shown below.

loop n times
{
    1. Call the kernel (it uses the previous buffer values to update the current buffer values, including the 4 neighbours if we consider a 2D grid).
    2. Wait for the kernel command to finish.
    3. Copy the current buffer values to the previous buffer.
    4. Wait for all commands to finish.
}

The values in the previous-values buffer should be used to update the current buffer values. However, the current values are only updated from neighbouring values (in the previous-values buffer from the previous iteration) that lie within the same work-group (wavefront) or in the following work-groups; the values that belong to the preceding work-group are completely ignored. In theory, all the neighbours should be taken into account, including those in preceding work-groups when they exist. After each clFinish all the values in the current buffer have been updated, and only then do we copy them to the previous-values buffer, so these values are available in the next kernel call. My question is: why is this not working as expected, even though the previous-values buffer is declared global and read-only, so the kernel cannot write to it?

Sorry for such a long explanation; I wanted to make my problem clear. The problem should be clearer from the attached kernel code.

Re: OpenCL Synchronization between workgroups.

Why all the math using the work group size, etc? I'd think you could calculate the index of these values simply using the global_id.

Why the barrier at the end of the kernel? A barrier only synchronizes the work-items within one work-group, so at the end of the kernel it's not doing anything except slowing you down.
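To illustrate the indexing point: assuming a zero global offset and a uniform work-group size, get_global_id(d) is equal to get_group_id(d) * get_local_size(d) + get_local_id(d), so rebuilding the index from work-group values gains nothing. The plain C sketch below mirrors that arithmetic on the CPU. The helper names global_from_group and cell_index are made up for illustration and are not OpenCL functions, and the row-major grid layout is an assumption, since the original kernel is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Global index rebuilt from work-group coordinates.
   Assumes no global offset and a uniform work-group size. */
static size_t global_from_group(size_t group_id, size_t local_size, size_t local_id)
{
    assert(local_id < local_size);
    return group_id * local_size + local_id;
}

/* Row-major linear index of cell (gx, gy) in a grid `width` cells wide.
   In a kernel, gx and gy would come from get_global_id(0) and get_global_id(1). */
static size_t cell_index(size_t gx, size_t gy, size_t width)
{
    assert(gx < width);
    return gy * width + gx;
}
```

With this layout, the four neighbours of cell i are i - 1, i + 1, i - width and i + width, regardless of which work-group the cell or its neighbours fall in.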

If your goal is to iteratively run this simulation over and over, I'd suggest a double-buffer setup where you run from buffer 1 to buffer 2, then buffer 2 back to buffer 1, etc. Then you don't need to copy buffers.
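The double-buffer idea can be sketched on the CPU in plain C. This is only an illustration under assumptions: the update rule (mean of the four neighbours, borders copied unchanged) and the names stencil_step and run_ping_pong are hypothetical, since the original kernel is not shown. In an OpenCL host program the pointer swap would correspond to exchanging the two cl_mem handles passed to clSetKernelArg before each enqueue.

```c
#include <assert.h>
#include <stddef.h>

/* One update step: every interior cell of `cur` becomes the mean of its four
   neighbours in `prev`. Border cells are copied unchanged (an assumption). */
static void stencil_step(const float *prev, float *cur, size_t w, size_t h)
{
    /* Reading and writing the same buffer would mix old and new values. */
    assert(prev != cur);
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            size_t i = y * w + x;
            if (x == 0 || y == 0 || x == w - 1 || y == h - 1) {
                cur[i] = prev[i];
            } else {
                cur[i] = 0.25f * (prev[i - 1] + prev[i + 1] + prev[i - w] + prev[i + w]);
            }
        }
    }
}

/* Runs n steps, swapping the roles of the two buffers instead of copying.
   Returns the buffer that holds the newest values. */
static float *run_ping_pong(float *a, float *b, size_t w, size_t h, int n)
{
    float *prev = a;
    float *cur = b;
    for (int it = 0; it < n; ++it) {
        stencil_step(prev, cur, w, h);
        float *tmp = prev;  /* swap the roles, no copy */
        prev = cur;
        cur = tmp;
    }
    return prev;  /* after the swap, prev points at the latest result */
}
```

After an odd number of steps the result is in the second buffer, after an even number it is back in the first, so the caller should use the returned pointer rather than assume a fixed buffer.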

Avoid using the CPU to do any of the buffer management.

Pay attention to what "read only" and "write only" mean; with OpenCL buffers, the CL_MEM_READ_ONLY and CL_MEM_WRITE_ONLY flags describe how kernels running on the device access the buffer, not how the CPU does. I'd suggest starting with read/write buffers and only changing them after you get it working.

The best way to develop OpenCL kernels is to start with something simple that works and then make it more complex. I'd pare back your kernel to the simplest thing that works and then start making it do what you need it to do.

Like the other poster said, it's hard to debug this without the host code or the other kernel.