Re: basic question regarding get_global_id

Hopefully this has been cleared up in the other thread.

get_global_id returns the index of the current work item within the global work size. Each work item gets a different ID, and every work item gets executed. So if you have a 1D array to process, pass its size as the global size and write your kernel to use get_global_id to figure out which element to process. Study the example code; these are critical concepts to understand fully.
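To make the 1D case concrete, here is a minimal sketch in plain C rather than OpenCL C. It assumes a squaring kernel like the one discussed in this thread; `square_item` and `run_1d` are hypothetical names, and the loop only stands in for what the OpenCL runtime does when it hands each work item its own ID.

```c
#include <stddef.h>

/* Stand-in for the kernel body: gid plays the role of get_global_id(0). */
void square_item(size_t gid, const float *input, float *output)
{
    output[gid] = input[gid] * input[gid];
}

/* Hypothetical serial stand-in for the runtime: exactly one call per
 * work item, with IDs 0 .. global_size - 1. A real device may run these
 * in any order and many at once. */
void run_1d(size_t global_size, const float *input, float *output)
{
    for (size_t gid = 0; gid < global_size; ++gid)
        square_item(gid, input, output);
}
```

Calling `run_1d(5, in, out)` touches elements 0 through 4 exactly once each, which is the guarantee the global size gives you.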

You only need get_global_id(1) for 2D kernels (and likewise get_global_id(2) for 3D kernels). You can't do more than 3D kernels; higher dimensional work has to be subdivided down into 1D, 2D, or 3D.
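One way to subdivide higher-dimensional work is to flatten it into a 1D index space and recover the coordinates inside the kernel. The sketch below is plain C and only illustrates the arithmetic; `Index4`, `unflatten4` and `flatten4` are hypothetical names, and the choice of which dimension varies fastest is an assumption.

```c
#include <stddef.h>

typedef struct { size_t a, b, c, d; } Index4;

/* Recover a 4D coordinate from a flat 1D ID (what get_global_id(0)
 * would return), with d varying fastest. nb, nc and nd are the extents
 * of the three inner dimensions. */
Index4 unflatten4(size_t gid, size_t nb, size_t nc, size_t nd)
{
    Index4 idx;
    idx.d = gid % nd;  gid /= nd;
    idx.c = gid % nc;  gid /= nc;
    idx.b = gid % nb;  gid /= nb;
    idx.a = gid;
    return idx;
}

/* Inverse mapping: 4D coordinate back to the flat ID. */
size_t flatten4(Index4 idx, size_t nb, size_t nc, size_t nd)
{
    return ((idx.a * nb + idx.b) * nc + idx.c) * nd + idx.d;
}
```

For a 2 x 3 x 4 x 5 problem you would launch a 1D kernel with a global size of 120 and call the equivalent of `unflatten4` on the ID at the top of the kernel.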

Re: basic question regarding get_global_id

size_t id=get_global_id(0);
This means id = 0 for the first core of the GPU, so the kernel executes
output[0]=input[0]*input[0];

That is, the first element of the input array is squared and stored in the output array.
But suppose the first 5 cores are busy and the statement
size_t id=get_global_id(0);
gives id = 6, so the kernel executes
output[6]=input[6]*input[6];
What happens, then, to the first 5 elements of the array I declared?

Re: basic question regarding get_global_id

How is it that initial 5 cores are "busy"?

It doesn't matter how many cores there are or what other work they are doing. If you specify a certain global work size, all units of work will get done. If they can't all run at the same time (running them all at once is only possible when the global work size is no larger than the core count), they run in groups.

Example:
4 core machine.
Global size = 10.
First work group will process 0 to 3.
Second work group will process 4 to 7.
Third work group will process 8 and 9.

Now all work items have been processed. The OpenCL runtime handles breaking up the global work size into work groups and assigning each work item a unique ID.
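The arithmetic behind that example can be sketched in plain C. This mirrors the simplified picture above and is not how any particular runtime is implemented; `GroupRange`, `group_count` and `group_range` are hypothetical names, and a real OpenCL implementation may place restrictions on how the global size divides into work groups.

```c
#include <stddef.h>

typedef struct { size_t first, last; } GroupRange;

/* Number of groups needed to cover global_size items, rounding up. */
size_t group_count(size_t global_size, size_t group_size)
{
    return (global_size + group_size - 1) / group_size;
}

/* Inclusive range of global IDs handled by a given group; the last
 * group is clipped so it never runs past global_size - 1. */
GroupRange group_range(size_t group, size_t global_size, size_t group_size)
{
    GroupRange r;
    r.first = group * group_size;
    r.last = r.first + group_size - 1;
    if (r.last > global_size - 1)
        r.last = global_size - 1;
    return r;
}
```

With a global size of 10 and groups of 4, `group_count` gives 3 and the three ranges are 0 to 3, 4 to 7, and 8 to 9, so every ID from 0 to 9 is covered exactly once.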

With a GPU, work groups are larger: often 32 or 64 work items, but they can be bigger. To get the best performance you should design your work so the global work size is in the thousands or tens of thousands. If your work is smaller than that, the device can be underutilized and performance will suffer.

Re: basic question regarding get_global_id

get_global_id returns the ID of the current thread (work item). Its parameter selects which dimension of the thread grid to query. When you enqueue a kernel, one of the parameters is a size_t array, global_work_size. If global_work_size has 2 elements (a 2-dimensional grid of threads), then each thread has a 2-dimensional identifier. Let's say global_work_size[0] = 3 and global_work_size[1] = 3; then there are 3 * 3 = 9 threads in total, arranged in a 3 by 3 grid whose identifiers run from (0, 0) to (2, 2).
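The 3 by 3 case can be enumerated with a short plain C sketch. Here x stands for get_global_id(0) and y for get_global_id(1); `Id2`, `enumerate_2d` and `print_grid` are hypothetical helpers, and the row-by-row order is only for display, since the runtime does not promise any execution order.

```c
#include <stddef.h>
#include <stdio.h>

typedef struct { size_t x, y; } Id2;

/* Fill ids[] with every (x, y) pair in the 2D index space and return
 * the number of work items. ids must have room for
 * global_work_size[0] * global_work_size[1] entries. */
size_t enumerate_2d(const size_t global_work_size[2], Id2 *ids)
{
    size_t n = 0;
    for (size_t y = 0; y < global_work_size[1]; ++y) {
        for (size_t x = 0; x < global_work_size[0]; ++x) {
            ids[n].x = x;
            ids[n].y = y;
            ++n;
        }
    }
    return n;
}

/* Print the identifiers as a grid, one row per value of y. */
void print_grid(const size_t global_work_size[2])
{
    for (size_t y = 0; y < global_work_size[1]; ++y) {
        for (size_t x = 0; x < global_work_size[0]; ++x)
            printf("(%zu,%zu) ", x, y);
        printf("\n");
    }
}
```

For a global_work_size of {3, 3}, `enumerate_2d` returns 9 and `print_grid` prints three rows of three pairs, starting at (0,0) and ending at (2,2).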

Re: basic question regarding get_global_id

So if one kernel uses a global work size of 10 and another kernel also uses a global work size of 10, get_global_id will return 0 in both. That is what I don't understand: in the link you provided, you said the global ID is unique across the GPU, as the image you posted shows. So if two kernels are using the same GPU, will both get get_global_id = 0, or something else?

Re: basic question regarding get_global_id

But if you have two kernels it's OK for get_global_id to return the same value, just like it's OK for two houses to have the same house number, because they are on different streets. It's just like having two C arrays; they each have an index 0, but they refer to different elements because there are two arrays.

In your array_sum example, get_global_id(0) will return 0 to the global size minus one. If you have another kernel (perhaps called array_difference) then get_global_id(0) will again return 0 to global size minus 1.
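A plain C sketch of the same point: two kernels launched with the same global size each see IDs 0 to global size minus 1, and the IDs do not clash because each launch has its own index space. The kernel bodies here are assumptions (an element-wise sum and difference of two arrays), and `launch` is a hypothetical serial stand-in for enqueueing a kernel.

```c
#include <stddef.h>

typedef void (*KernelItem)(size_t gid, const int *a, const int *b, int *out);

/* Assumed body for array_sum: gid plays the role of get_global_id(0). */
void array_sum_item(size_t gid, const int *a, const int *b, int *out)
{
    out[gid] = a[gid] + b[gid];
}

/* Assumed body for array_difference. */
void array_difference_item(size_t gid, const int *a, const int *b, int *out)
{
    out[gid] = a[gid] - b[gid];
}

/* Each launch walks its own ID range 0 .. global_size - 1. */
void launch(KernelItem kernel, size_t global_size,
            const int *a, const int *b, int *out)
{
    for (size_t gid = 0; gid < global_size; ++gid)
        kernel(gid, a, b, out);
}
```

Launching both with a global size of 3 means ID 0 is used twice, once per kernel, and each use refers to that kernel's own element 0.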

I wonder if you are confusing "kernel" with "work item"? A kernel is a piece of code that gets executed across a number of work items in parallel. Inside each work item, you can find out which item of work you are supposed to do by using get_global_id.

An analogy would be if I had a team of interns to grade papers. I say "start" and the first thing they each need to do is grab a paper, but which one? So they each call get_global_id and I give them each a unique paper number (index) and they grab that paper and grade it. In your example kernel (array_sum) each work item calls get_global_id in order to figure out which array element to sum.