Hi all,
I'd like to ask one thing. Is there a "nice" way to do row-wise and
column-wise reduction or scan in pyopencl? What I mean is that, given
the matrix
1 1 1
1 1 1
1 1 1
then a row-wise scan would return
1 2 3
1 2 3
1 2 3
and a column-wise scan would return
1 1 1
2 2 2
3 3 3
Is there a way to do this? I can of course implement it with my own
kernel (though I have a feeling that my implementation would be far from
efficient), but I need this for a project that I want to use as a
showcase of pyopencl's abilities, so I want as simple and "canonical" a
way to do this as possible.
Thanks,
Jake
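
For reference, the two scans described above can be pinned down with a
small pure-Python sketch. This is not a pyopencl solution and says
nothing about performance on a device; it only fixes the expected
semantics (an inclusive prefix sum along each row or down each column),
so that the output of a kernel can be checked against it. The function
names row_scan and column_scan are made up for this illustration.

```python
from itertools import accumulate


def row_scan(matrix):
    """Inclusive prefix sum along each row (hypothetical helper)."""
    return [list(accumulate(row)) for row in matrix]


def column_scan(matrix):
    """Inclusive prefix sum down each column (hypothetical helper)."""
    # Transpose, scan what are now rows, then transpose back.
    scanned_columns = row_scan(zip(*matrix))
    return [list(row) for row in zip(*scanned_columns)]


# The 3x3 matrix of ones from the message above.
ones = [[1, 1, 1] for _ in range(3)]

for row in row_scan(ones):
    print(*row)
print()
for row in column_scan(ones):
    print(*row)
```

Comparing a device result, copied back to the host and converted to
nested lists, against these helpers is one simple way to validate a
hand-written kernel on small inputs.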

On Tue, 27 Mar 2012 12:06:31 -0600, Ryan Haynes <rhaynesak(a)gmail.com> wrote:
> I have four 54-megabyte buffers on which I want to perform byte-by-byte
> analysis. I can copy the data in roughly 100 msec, which seems like a
> decent transfer time (~2 GB/s). However, when I go to execute my
> kernel, the overhead of passing in my device pointers is huge: something
> like 500 msec, even for a no-op kernel.
What implementation are you using?
Andreas

I found a frustrating problem in pyopencl: after each kernel execution,
host memory consumption increases by approximately 1.5 MB. Given the
program's workflow (run a number of model iterations on the card, finish
the kernel, read the data back via enqueue_copy, write it to a file, then
start the kernel again), I run out of my 4 GB of RAM after a few thousand
such calls.
In a previous project, written in C++, I solved this problem as follows:
an event object was dynamically created and passed to
enqueueNDRangeKernel(), and after reading the data back from the GPU the
event object was deleted. Obviously, this method can't be used in Python.
It's also worth noting that the memory leak occurs on both the CPU and
the GPU. I'm using an Intel CPU and an NVIDIA GPU.
What can be done to fix the problem?

Hi Vincent,
On Mon, 19 Mar 2012 09:02:35 +0100 (CET), Vincent Favre-Nicolin <vincent.favre-nicolin(a)cea.fr> wrote:
> I replaced this path in siteconfig.py by
> "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.7.sdk",
> and it seems to be working !
I've tried to fix PyOpenCL based on your feedback. Could you (and Karsten,
Steve, or Lewis) please try to build current git and report back whether
it works out of the box (and if not, what needs to be fixed or what the
error is)?
Thanks!
Andreas