The Khronos Group - a non-profit industry consortium to develop, publish and promote open standard, royalty-free media authoring and acceleration standards for desktop and handheld devices, combined with conformance qualification programs for platform and device interoperability.

Re: OpenCL C++ Bindings

Originally Posted by coleb

Originally Posted by Tanek

You can also write your own very simplified shared_ptr and give users the option to use the implementation they want (like you did for vector and string). I am questioning the design (using standard, non-intrusive reference counting) more than the implementation.

That's essentially what detail::Wrapper is, an implementation of shared_ptr.

Also note the design of OpenCL C++ layer does not preclude you from using shared_ptr. The following should work:

Code :

std::shared_ptr<cl::Image2D> imagePtr;

It may even be preferable from a performance point of view, since std::shared_ptr is implemented using atomics and cl::detail::Wrapper is implemented using cl(Retain/Release)*. If the above doesn't work, let us know and we can make sure the OpenCL C++ layer can accommodate it.

If cl::Wrapper were an implementation of shared_ptr, it would not be the parent of all the OpenCL classes, or is there something I don't get? Also, the constructor and destructor of Image2D use the retain/release functions (indirectly, by calling the constructor/destructor of Wrapper), so I don't see how shared_ptr could be faster than using cl::Wrapper, since the functions of cl::Wrapper will be called by shared_ptr anyway.

Originally Posted by coleb

Premature optimization is the root of all evil. I have a production multi-threaded database server application that makes heavy use of the getInfo methods and have never seen a performance issue. If there is one, the vendor should be notified.

There are good reasons to keep the objects as lightweight as possible, i.e., sizeof(cl::Context) == sizeof(cl_context). When passing the objects to an argument handler, they are very easy to translate into what OpenCL C needs. Also, the interface is very easy to update when new properties are added to the various OpenCL C objects (it's a single table within the header file).

You convinced me on this one.
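To make the "lightweight object" point concrete, here is a minimal sketch that uses mock types instead of the real OpenCL headers. MockContextObject, mock_cl_context, MockContext, mockGetReferenceCount and queryThroughWrapper are all hypothetical names invented for this illustration; only the idea (a wrapper whose sole data member is the raw handle) comes from the post above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the C API: the real cl_context is an opaque
// pointer type defined by the OpenCL headers.
struct MockContextObject { int ref_count; };
typedef MockContextObject* mock_cl_context;

// A wrapper whose only data member is the raw handle, so it costs nothing
// extra to store or copy compared with the handle itself.
class MockContext {
public:
    explicit MockContext(mock_cl_context handle) : object_(handle) {}

    // Hands back the raw handle so it can be passed to a C-style function.
    mock_cl_context operator()() const { return object_; }

private:
    mock_cl_context object_;
};

// Mirrors the sizeof(cl::Context) == sizeof(cl_context) property for the mock.
static_assert(sizeof(MockContext) == sizeof(mock_cl_context),
              "wrapper must add no storage beyond the handle");

// Stand-in for a C entry point that expects the raw handle.
int mockGetReferenceCount(mock_cl_context context) { return context->ref_count; }

// Translating the wrapper into what the C-style function needs is one call.
int queryThroughWrapper(const MockContext& context) {
    assert(context() != nullptr);
    return mockGetReferenceCount(context());
}
```

Because the wrapper stores nothing but the handle, any state a getInfo-style query needs has to be fetched from the runtime on each call rather than cached in the object, which is the trade-off being discussed here.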

Originally Posted by coleb

Thanks for the feedback, I hope you find the bindings useful enough to suit your needs.

Re: OpenCL C++ Bindings

Originally Posted by Tanek

If cl::Wrapper were an implementation of shared_ptr, it would not be the parent of all the OpenCL classes, or is there something I don't get?

Very likely there's something I'm not getting about shared_ptr as well. We're not allowed to use it here yet, since we have to support compilers as old as GCC 3.2, so I don't have much experience with it. It seems super cool though, especially in a multi-threaded environment.

The boost library states the following about the best practice for shared_ptr:

A simple guideline that nearly eliminates the possibility of memory leaks is: always use a named smart pointer variable to hold the result of new. Every occurrence of the new keyword in the code should have the form:
shared_ptr<T> p(new Y);
It is, of course, acceptable to use another smart pointer in place of shared_ptr above

To me, the only difference between detail::Wrapper and shared_ptr is that detail::Wrapper uses the reference counting inherent in the OpenCL C API, rather than doing its own reference counting.

Originally Posted by Tanek

Also, the constructor and destructor of Image2D use the retain/release functions (indirectly, by calling the constructor/destructor of Wrapper), so I don't see how shared_ptr could be faster than using cl::Wrapper, since the functions of cl::Wrapper will be called by shared_ptr anyway.

Following that guideline, so that a single cl::Context is created with new and held in a shared_ptr named 'p', should mean that there is only one cl::Context retain call (actually, it would be implicit in the clCreateContext). The rest of the copying and reference counting when 'p' is returned is done by the shared_ptr object, which uses the cl::Context as a pointer and does not have to call the cl::Context ctor or dtor at all.

By contrast, when a cl::Context is returned by value, there can be several calls to cl::Context retain/release (depending on how smart your compiler is at optimizing return by value). Theoretically, shared_ptr and the OpenCL retain/release calls should perform equivalently, though it is possible that an implementation uses heavyweight mutexes instead of the cheaper atomics that shared_ptr uses.

I'm curious whether anyone has seen this, because the same C++ pass-by-value annoyance shows up in the OpenCL C++ wrappers when cl::Kernel::setArg is called on a cl::Memory object. Since cl::Kernel::setArg is templatized to pass that argument by value, a few extra retain/release calls are incurred that would not be seen with the straight C API. There are ways to optimize this away; I'm just curious whether anyone has seen this performance hit.
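The difference in retain/release traffic can be sketched with a mock that counts calls. Nothing below is taken from the bindings: MockHandle, mockRetain, mockRelease, MockWrapper and the helper functions are hypothetical names, and MockWrapper only imitates the copy-retains, destroy-releases behaviour described in this thread. Passing by const reference is shown as one possible way to avoid the extra calls; it is an assumption, not necessarily the optimization meant above.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an OpenCL object whose reference count lives
// on the API side.
struct MockHandle {
    int ref_count;      // count kept by the mock runtime
    int retain_calls;   // number of mockRetain calls observed
    int release_calls;  // number of mockRelease calls observed
};

void mockRetain(MockHandle* h)  { ++h->ref_count; ++h->retain_calls; }
void mockRelease(MockHandle* h) { --h->ref_count; ++h->release_calls; }

// Simplified wrapper: copying retains, destruction releases.
class MockWrapper {
public:
    // Adopts the reference implied by creation; no retain call here.
    explicit MockWrapper(MockHandle* h) : object_(h) {}
    MockWrapper(const MockWrapper& other) : object_(other.object_) {
        if (object_) mockRetain(object_);
    }
    MockWrapper& operator=(const MockWrapper& other) {
        if (this != &other) {
            if (other.object_) mockRetain(other.object_);
            if (object_) mockRelease(object_);
            object_ = other.object_;
        }
        return *this;
    }
    ~MockWrapper() { if (object_) mockRelease(object_); }

private:
    MockHandle* object_;
};

// By-value parameter, as with the templated setArg described above.
void takeByValue(MockWrapper) {}
// By-reference parameter: no copy is made, so no retain/release pair.
void takeByReference(const MockWrapper&) {}

// Copying the shared_ptr only updates shared_ptr's own count.
int retainCallsWithSharedPtr(int copies) {
    MockHandle h = {1, 0, 0};  // creation implies one reference
    {
        std::shared_ptr<MockWrapper> p(new MockWrapper(&h));
        for (int i = 0; i < copies; ++i) {
            std::shared_ptr<MockWrapper> q = p;  // MockWrapper is not copied
        }
    }
    return h.retain_calls;
}

// Each by-value call copies the wrapper and so retains and releases once.
int retainCallsPassingByValue(int calls) {
    MockHandle h = {1, 0, 0};
    {
        MockWrapper w(&h);
        for (int i = 0; i < calls; ++i) takeByValue(w);
    }
    return h.retain_calls;
}

int retainCallsPassingByReference(int calls) {
    MockHandle h = {1, 0, 0};
    {
        MockWrapper w(&h);
        for (int i = 0; i < calls; ++i) takeByReference(w);
    }
    return h.retain_calls;
}
```

In this mock the shared_ptr path never calls mockRetain after construction, while every by-value call adds one retain/release pair. Whether that matters in practice depends on how expensive the vendor's clRetain*/clRelease* implementation is, which is exactly the open question in the post.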

Re: OpenCL C++ Bindings

I'm having some difficulties with the C++ bindings. It's the same issue I was having with the Ruby bindings: whenever I try to create a new context, I get back a -32 (CL_INVALID_PLATFORM) error. I'm using an NVIDIA GeForce 9400 GPU with the latest drivers for it on a Fedora 10 box. The Ruby bindings author had to fix his code to work with the new CUDA 3.0 version of the NVIDIA drivers. Does the C++ binding also need this fix? Context creation works with older drivers, just not the newer ones. I have tried both the 195.36.15 and 195.36.24 drivers, and they both have this same issue. I have also tried both rev 48 and the previous version of the C++ bindings, with no luck.

Re: OpenCL C++ Bindings

I'm following the C++ OpenCL bindings 1.0 examples and using the following:

cl::Context context(CL_DEVICE_TYPE_GPU, 0, NULL, NULL, &err);

As far as I know, creating the platform first and passing it in is not necessary, and I don't have to do so with the Ruby bindings. Once the Ruby bindings were fixed to support the new driver, they worked great. I'm getting the same error with the C++ bindings, so I suspect that it's the same problem.

...
At every OpenCL function call, the ICD Loader infers the Vendor ICD function to call from the arguments to the function.
...
Functions which do not have an argument from which the vendor implementation may be inferred are ignored, with the exception of clGetExtensionFunctionAddress
...
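One plausible reading of the excerpt above, sketched as a toy model: a loader that picks the vendor from a handle argument has nothing to dispatch on when no platform is supplied. This is not the real ICD loader. MockVendor, MockPlatform, vendorCreateContext and loaderCreateContext are hypothetical names, and returning an error for the missing-platform case is an assumption made for the illustration; only the numeric value -32 for CL_INVALID_PLATFORM comes from the thread.

```cpp
#include <cassert>

// Toy model only: names and layout are illustrative, not the real loader.
const int MOCK_SUCCESS = 0;
const int MOCK_INVALID_PLATFORM = -32;  // same value as CL_INVALID_PLATFORM

// One vendor implementation, reduced to a single entry point.
struct MockVendor {
    int (*createContext)();
};

// A platform handle carries enough information to find its vendor.
struct MockPlatform {
    const MockVendor* vendor;
};

// Vendor-side function that would do the real work.
int vendorCreateContext() { return MOCK_SUCCESS; }

// Loader-side entry point: infers the vendor from the platform argument.
int loaderCreateContext(const MockPlatform* platform) {
    if (platform == nullptr || platform->vendor == nullptr) {
        // Nothing to infer the vendor from, so the call cannot be forwarded.
        return MOCK_INVALID_PLATFORM;
    }
    return platform->vendor->createContext();
}
```

Under this model, a context request that names a platform is forwarded to that platform's vendor, while a request with no platform fails before it reaches any driver, which would look like the -32 error reported above.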

Re: OpenCL C++ Bindings

Thanks Matrem, but I don't know OpenCL well enough to diagnose a problem at that level. I did go ahead and file a bug report about this and another issue I'm having with the bindings. I also confirmed that this is an issue with the bindings, and not just with my code, by running the hello test program that is in the binding header file. Here is the code I ran to test it, if you are interested.