The Khronos Group - a non-profit industry consortium to develop, publish and promote open standard, royalty-free media authoring and acceleration standards for desktop and handheld devices, combined with conformance qualification programs for platform and device interoperability.


Reuse device buffer across kernel batches

I'm seeking some advice on an implementation where I batch kernel runs on a device and there are particular read-only buffers I would like to reuse across these runs.

Each batch consists of 1,000,000 kernel threads. I need to batch them because I pass an array of structs in which each struct holds the values that one kernel thread writes to. Without batching, it would require 39 GB of device memory.

So in my host loop I build an array of structs, "clmodels", for 1,000,000 items and send it to the kernel like this:

The additional read-only buffer "d_clvar" is the one I want to reuse. It contains a struct of variables that are read in once by the main host program and never changed again.

So my question is: how can I create the d_clvar buffer so that I can reuse it across my batched host loops without calling enqueueWriteBuffer, and hence making an expensive copy to device memory, every time? Basically, I want to write it into device memory once and use it for each new kernel run.

Re: Reuse device buffer across kernel batches

Re: Reuse device buffer across kernel batches

Thanks for the response.

The code actually runs. That's not the issue.

The issue is that I want to reduce unnecessary memory copies to hopefully speed things up. There is actually more than d_clvar being passed; what I pasted is just a sample. I also have a read-only 2D array (which I access in a 1D fashion) that needs to be pushed to device memory but never changes. It doesn't make sense to copy it for each batch, but I'm doing it this way at the moment because I'm having issues doing otherwise.

If I try to move the buffer creation and copying outside of the loop and pass the buffer to the kernel in each batch iteration, I get memory errors.

Can anyone provide any advice on buffering "objects" in device memory so that they can be accessed by multiple calls of the same kernel (in a loop), i.e. write-once-read-many?

Re: Reuse device buffer across kernel batches

As I said before: this must work. Once you have your copy on the device you can re-use the memory object for as many kernel calls as you like.

It doesn't make sense to copy it for each batch, but I'm doing it this way at the moment because I'm having issues doing otherwise. […] If I try to move the buffer creation and copying outside of the loop and pass the buffer to the kernel in each batch iteration, I get memory errors.