The Khronos Group - a non-profit industry consortium to develop, publish and promote open standard, royalty-free media authoring and acceleration standards for desktop and handheld devices, combined with conformance qualification programs for platform and device interoperability.

Question about OpenCL samplers

Hello,

I am very new to OpenCL and I have a question about how samplers are dealt with.

I am dealing mostly with 3D images and as I understand it, I have to do the following:

- Create an image with createImage3D. Then, say all I want to do is interpolate the image under some transformation: I create a sampler object, associate it with this 3D image, and get continuous indexing.

I am guessing that on the GPU the sampler object binds the image to a texture and can use the hardware-accelerated interpolation available there. But what does OpenCL do when the underlying hardware is a multicore CPU? Can a sampler even be used with an image in that case? I cannot test it at the moment, as my OpenCL code crashes on my MacBook Air.

Another question: does the sampler object have a lot of memory overhead (does it replicate the data)? I am trying to design an abstract image class where, once the user creates an image, a sampler is automatically associated with it so that resampling can be done. However, I wonder if I should instead create the sampler as needed and then release it.

These questions might be quite n00b, and I am sorry for that. However, I would be really grateful if someone could help me with these doubts.

Re: Question about OpenCL samplers

I don't know the internal details, but from seeing the way they work, it seems sampler objects are like a macro or a bit-field that tells the read_image*() functions how to read the data.
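To make the bit-field idea concrete, here is a minimal sketch in plain C of what such a read function could look like on a CPU. Everything in it is hypothetical: the `sk_` names, the flag layout and the 1D float image are made up for illustration and are not how any real OpenCL runtime encodes a sampler. It only assumes the usual convention that texel centers sit at integer + 0.5 for linear filtering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sampler: three flag bits packed into one 32-bit word. */
typedef uint32_t sk_sampler;

enum {
    SK_NORMALIZED_COORDS = 1u << 0,  /* coordinates in [0, 1) instead of texels */
    SK_ADDRESS_REPEAT    = 1u << 1,  /* unset means clamp to edge */
    SK_FILTER_LINEAR     = 1u << 2   /* unset means nearest texel */
};

/* Floor without libm, valid for values that fit in an int. */
static int sk_floor(float x)
{
    int i = (int)x;
    return ((float)i > x) ? i - 1 : i;
}

/* Map a possibly out-of-range index into [0, n) per the address mode. */
static int sk_address(sk_sampler s, int i, int n)
{
    if (s & SK_ADDRESS_REPEAT) {
        int m = i % n;
        return m < 0 ? m + n : m;
    }
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Read a 1D float image; the sampler bits select the code path. */
float sk_read_1d(const float *texels, int n, sk_sampler s, float x)
{
    assert(texels != NULL && n > 0);
    if (s & SK_NORMALIZED_COORDS)
        x *= (float)n;
    if (!(s & SK_FILTER_LINEAR))
        return texels[sk_address(s, sk_floor(x), n)];

    float u = x - 0.5f;          /* shift so texel centers land on integers */
    int base = sk_floor(u);
    float t = u - (float)base;   /* blend weight between the two neighbours */
    float a = texels[sk_address(s, base, n)];
    float b = texels[sk_address(s, base + 1, n)];
    return (1.0f - t) * a + t * b;
}
```

With texels {10, 20, 30, 40}, calling `sk_read_1d(texels, 4, SK_FILTER_LINEAR, 2.0f)` blends texels 1 and 2 equally, while a sampler value of 0 at the same coordinate just returns texel 2. The sampler itself stays a single word that steers branches.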

I guess on a GPU they equate to bound textures, but on a CPU they just equate to a macro or a case in a switch statement as everything is just normal code there. On a CPU you might be better off doing your own interpolation as you can hard-code the data format, whereas the sampler doesn't contain this information so it still needs to query the image storage type and dimensions at run-time.
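As a rough idea of what "doing your own interpolation" with a hard-coded format could look like, here is a sketch of trilinear interpolation over a float volume in plain C. The `volume_f32` struct and function names are made up for this example, and the conventions are assumptions: voxel centers sit at integer coordinates (unlike the half-texel offset that image reads use), x varies fastest in memory, and out-of-range coordinates clamp to the edge.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical volume type: float voxels, x fastest, then y, then z. */
typedef struct {
    const float *data;
    int nx, ny, nz;
} volume_f32;

static int vol_clamp(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Floor without libm, valid for values that fit in an int. */
static int vol_floor(float x)
{
    int i = (int)x;
    return ((float)i > x) ? i - 1 : i;
}

/* Fetch one voxel with clamp-to-edge addressing. */
static float vol_voxel(const volume_f32 *v, int x, int y, int z)
{
    x = vol_clamp(x, v->nx);
    y = vol_clamp(y, v->ny);
    z = vol_clamp(z, v->nz);
    return v->data[((size_t)z * (size_t)v->ny + (size_t)y) * (size_t)v->nx + (size_t)x];
}

/* Trilinear sample at continuous voxel coordinates (x, y, z). */
float vol_sample_trilinear(const volume_f32 *v, float x, float y, float z)
{
    assert(v != NULL && v->data != NULL);
    assert(v->nx > 0 && v->ny > 0 && v->nz > 0);

    int x0 = vol_floor(x), y0 = vol_floor(y), z0 = vol_floor(z);
    float tx = x - (float)x0, ty = y - (float)y0, tz = z - (float)z0;

    /* Interpolate along x on the four edges of the cell... */
    float c00 = vol_voxel(v, x0, y0,     z0)     * (1.0f - tx) + vol_voxel(v, x0 + 1, y0,     z0)     * tx;
    float c10 = vol_voxel(v, x0, y0 + 1, z0)     * (1.0f - tx) + vol_voxel(v, x0 + 1, y0 + 1, z0)     * tx;
    float c01 = vol_voxel(v, x0, y0,     z0 + 1) * (1.0f - tx) + vol_voxel(v, x0 + 1, y0,     z0 + 1) * tx;
    float c11 = vol_voxel(v, x0, y0 + 1, z0 + 1) * (1.0f - tx) + vol_voxel(v, x0 + 1, y0 + 1, z0 + 1) * tx;

    /* ...then along y, then along z. */
    float c0 = c00 * (1.0f - ty) + c10 * ty;
    float c1 = c01 * (1.0f - ty) + c11 * ty;
    return c0 * (1.0f - tz) + c1 * tz;
}
```

Because the element type and layout are fixed at compile time, there is no per-read format lookup. Whether this actually beats a CPU implementation's read_image path would need measuring.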

One of the main benefits on the GPU, apart from it doing the address calculations for you, is the non-linear memory layout images normally use, which improves cache locality for certain algorithms.
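Actual image layouts are vendor-specific and not documented, so take this only as an illustration of the idea: a Z-order (Morton) index is one well-known non-linear layout, sketched here in plain C next to an ordinary row-major index. The function names are made up for the example, and the Morton version assumes each coordinate fits in 10 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Spread the low 10 bits of v so two zero bits separate each original bit. */
static uint32_t spread_bits_3d(uint32_t v)
{
    v &= 0x000003FFu;
    v = (v | (v << 16)) & 0xFF0000FFu;
    v = (v | (v << 8))  & 0x0300F00Fu;
    v = (v | (v << 4))  & 0x030C30C3u;
    v = (v | (v << 2))  & 0x09249249u;
    return v;
}

/* Z-order index: interleave the bits of x, y and z (each below 1024). */
uint32_t morton_index_3d(uint32_t x, uint32_t y, uint32_t z)
{
    assert(x < 1024u && y < 1024u && z < 1024u);
    return spread_bits_3d(x) | (spread_bits_3d(y) << 1) | (spread_bits_3d(z) << 2);
}

/* Ordinary row-major index for comparison: x fastest, then y, then z. */
size_t linear_index_3d(size_t x, size_t y, size_t z, size_t nx, size_t ny)
{
    return (z * ny + y) * nx + x;
}
```

In a 256x256x256 row-major volume, the voxels (0,0,0) and (0,0,1) are 65536 elements apart, while in Z-order they are 4 apart. Z-order still has large jumps at some cell boundaries, but spatial neighbours tend to be closer in memory on average, which is what helps interpolation-style access patterns.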

I don't see why they would take up any memory; they are just a code construct, not a data one.

Re: Question about OpenCL samplers

Many thanks for the reply! I will test it on the CPU as soon as I have the code working and see what I get out of it.

I guess doing my own implementation might be much more efficient on the CPU as well, like you say. In my case, the data is always floating-point values, so I can hard-code some of the stuff and the compiler might optimize for it.