The Khronos Group - a non-profit industry consortium to develop, publish and promote open standard, royalty-free media authoring and acceleration standards for desktop and handheld devices, combined with conformance qualification programs for platform and device interoperability.

Re: Problems with multiple commandQueues

Could there be something else at play? It should not be necessary to try to force the ordering of commands to be different: since you have two command queues they should run independently of each other and progress more or less at the same time.

Are you absolutely certain that the two command queues are associated with different devices? Can you double and triple check? clGetCommandQueueInfo() with CL_QUEUE_DEVICE should give us the answer.

The other thing I can think of would be some blocking call like clWaitForEvents(), a blocking read, clFinish() or similar after step 4 or 5. Can you verify through your profiler that commands 1-10 are CL_QUEUED one right after another, without having to wait for any of the previous commands to be CL_COMPLETE?
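If the profiler does not show this directly, the same check can be done by hand from event profiling timestamps. The following is only a minimal sketch: it assumes the timestamps have already been read with clGetEventProfilingInfo() (CL_PROFILING_COMMAND_QUEUED and CL_PROFILING_COMMAND_END) on queues created with CL_QUEUE_PROFILING_ENABLE, and it leaves those OpenCL calls out so that it compiles on its own. The names cmd_times and first_serialized_command are hypothetical. I would also not assume that timestamps from different devices share a clock, so it is probably safest to apply it per queue.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of two profiling timestamps (nanoseconds) for one
 * command, as they might be read with clGetEventProfilingInfo(). */
typedef struct {
    uint64_t queued; /* value of CL_PROFILING_COMMAND_QUEUED */
    uint64_t end;    /* value of CL_PROFILING_COMMAND_END    */
} cmd_times;

/* Returns the index of the first command that was queued only after the
 * previous command had already ended (a hint that the host blocked in
 * between), or -1 if every command was queued before its predecessor ended.
 * The commands must be given in enqueue order. */
static int first_serialized_command(const cmd_times *cmds, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        if (cmds[i].queued >= cmds[i - 1].end)
            return (int)i;
    }
    return -1;
}
```

A non-negative result is only a hint: the host may simply have been slow between two enqueue calls rather than blocked in clFinish() or a blocking read.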

Also, I'm a bit confused by descriptions like "enqueueWriteBuffer for input data 1". Do you mean that "data 1" is one buffer object and "data 2" is a different buffer object?

Disclaimer: Employee of Qualcomm Canada. Any opinions expressed here are personal and do not necessarily reflect the views of my employer.

Re: Problems with multiple commandQueues

Are you absolutely certain that the two command queues are associated with different devices? Can you double and triple check? clGetCommandQueueInfo() with CL_QUEUE_DEVICE should give us the answer.

I checked this with clGetCommandQueueInfo(), and I am certain that each command queue is associated with a different device.

The other thing I can think of would be some blocking call like clWaitForEvents(), a blocking read, clFinish() or similar after step 4 or 5. Can you verify through your profiler that commands 1-10 are CL_QUEUED one right after another, without having to wait for any of the previous commands to be CL_COMPLETE?

After each download (enqueueReadBuffer) there is a clFinish() for each command queue. This clFinish() is necessary in order to merge the data: when using multiple devices, the input data is split up, and after the calculation the results have to be put back together. Do you think the problem could be here?

I could not verify that "commands 1-10 are CL_QUEUED one right after another without having to wait for any of the previous commands to be CL_COMPLETE", because I could not find such a fine-grained summary in NVIDIA Visual Profiler.

Also, I'm a bit confused by descriptions like "enqueueWriteBuffer for input data 1". Do you mean that "data 1" is one buffer object and "data 2" is a different buffer object?

Data 1 to 3 are simple arrays created on the host, each of which is copied to the same buffer on the device. I hope this makes it clearer.

Re: Problems with multiple commandQueues

After each download (enqueueReadBuffer) there is a clFinish() for each command queue. This clFinish() is necessary in order to merge the data: when using multiple devices, the input data is split up, and after the calculation the results have to be put back together. Do you think the problem could be here?

In your version, it is clear to me that the 0th queue is completed before any command of the 1st queue is enqueued. What puzzles me is that my code above behaves the same way...

Re: Problems with multiple commandQueues

If the queue execution is lazy, then there might be nothing before the clFinish() to trigger anything to be executed at all. If you do a read and then a blocking flush on the first queue, there is nothing to trigger the second queue's execution until the first has completed.
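To make the reasoning concrete, here is a toy model in plain C. It is not OpenCL code and it assumes the implementation really is lazy, which I have not verified for any particular driver. Each queue is reduced to the total run time of its commands, and both function names are hypothetical. In OpenCL terms, the second variant would correspond to calling clFlush() on every queue before the first clFinish().

```c
#include <stddef.h>

/* Toy model: work[i] is the total run time of the commands in queue i.
 * A lazy queue does not start running until it is flushed or finished. */

/* Host calls finish(q0), then finish(q1), and so on: queue i only starts
 * once the blocking finish on queue i-1 has returned, so run times add up. */
static double finish_one_by_one(const double *work, size_t n)
{
    double t = 0.0;
    for (size_t i = 0; i < n; ++i)
        t += work[i];
    return t;
}

/* Host flushes every queue first and then finishes them: all queues start
 * at time 0, and the host waits only for the slowest one. */
static double flush_all_then_finish(const double *work, size_t n)
{
    double t = 0.0;
    for (size_t i = 0; i < n; ++i)
        if (work[i] > t)
            t = work[i];
    return t;
}
```

With two queues that need 4 and 6 time units, the first variant takes 10 units and the second takes 6, which would match the serialized behaviour described in this thread.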