Fine-Grained and Coarse-Grained actions - Help me understand

I have a question about fine-grained and coarse-grained atomic actions.

Could someone please give me a very basic description of what each one is?

As far as I understand, a coarse-grained action is built up from multiple fine-grained actions, something like a synchronized method.

For fine-grained, though, I am unsure exactly what it means. I know it is an instruction that has either happened or hasn't, in that it has either done everything or nothing at all. As far as my research goes, it has led me to believe that the following is an example of a fine-grained atomic action:

int x = 0;

whereas:

int x = y+3;

isn't, because it must read and process data before doing the write to a memory location.
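
To make the distinction concrete, here is a minimal sketch of how a read-then-write such as y = y + 3 can be turned into a coarse-grained atomic action in Java. The class and method names (AtomicActionsDemo, addThree, runOnThreads and so on) are made up for illustration. The first variant uses a synchronized method, as mentioned in the question; the second uses java.util.concurrent.atomic.AtomicInteger, which performs the whole read-add-write as one indivisible step.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicActionsDemo {
    private int y = 0;
    private final AtomicInteger counter = new AtomicInteger();

    // Coarse-grained atomic action: the read of y, the addition and the
    // write back are separate steps, but the lock makes them indivisible.
    synchronized void addThree() {
        y = y + 3;
    }

    synchronized int getY() {
        return y;
    }

    // Hypothetical helper: runs task callsPerThread times on each thread
    // and waits for all threads to finish.
    static void runOnThreads(int threads, int callsPerThread, Runnable task) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int c = 0; c < callsPerThread; c++) {
                    task.run();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    // Variant 1: every update goes through the synchronized method.
    static int runSynchronized(int threads, int callsPerThread) {
        AtomicActionsDemo demo = new AtomicActionsDemo();
        runOnThreads(threads, callsPerThread, demo::addThree);
        return demo.getY();
    }

    // Variant 2: AtomicInteger does the read-add-write as a single step.
    static int runAtomic(int threads, int callsPerThread) {
        AtomicActionsDemo demo = new AtomicActionsDemo();
        runOnThreads(threads, callsPerThread, () -> demo.counter.addAndGet(3));
        return demo.counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runSynchronized(4, 10_000));
        System.out.println(runAtomic(4, 10_000));
    }
}
```

With 4 threads each adding 3 ten thousand times, both variants give 120000. If addThree were not synchronized, two threads could read the same old value of y and one of the updates could be lost.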

And it's possible to have various grain sizes between these (for example, adding up 10 elements per task, 100 elements per task, and so on).

CPUs are designed to have "heavyweight" threads, i.e. each thread is capable of doing a lot of things, but that comes at a cost in performance (speed) when creating and managing it. In the above example code, it would be silly to start a thread and have it add up two numbers and then save the result. The amount of time it takes to start the thread and manage it exceeds the time it takes to add up a single element. The overhead is simply too great. However, if you end up with a trillion elements per job, the overhead becomes extremely small by comparison. The catch is that very large grains can leave some processing units with nothing to do, and if there are any idle processing units, then your program isn't utilizing the full computational potential of your computer and will run slower. The trick is to pick a granularity at which the overhead isn't unbearable, but which still lets you use as much of your computer's resources as possible (there is also the issue of load balancing, but that's a slightly different topic).
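
As a rough illustration of the trade-off, here is a minimal sketch that sums an array using tasks of a configurable grain size on a fixed thread pool. GrainSizeDemo, sum, grainSize and threads are names I've made up for this example, and it is only one of several ways to split the work. A grain size of 1 creates one task per element (lots of overhead), while a grain size equal to the array length creates a single task (no parallelism).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class GrainSizeDemo {
    // Sums data by splitting it into tasks of at most grainSize elements each.
    static long sum(int[] data, int grainSize, int threads) {
        if (grainSize <= 0) {
            throw new IllegalArgumentException("grainSize must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int start = 0; start < data.length; start += grainSize) {
                final int from = start;
                final int to = Math.min(start + grainSize, data.length);
                // One task per grain: add up the elements in [from, to).
                parts.add(pool.submit(() -> {
                    long partial = 0;
                    for (int i = from; i < to; i++) {
                        partial += data[i];
                    }
                    return partial;
                }));
            }
            // Combine the partial sums once every task has finished.
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        int threads = Runtime.getRuntime().availableProcessors();
        for (int grainSize : new int[] {10, 1_000, 100_000}) {
            long begin = System.nanoTime();
            long total = sum(data, grainSize, threads);
            long micros = (System.nanoTime() - begin) / 1_000;
            System.out.println("grain " + grainSize + ": sum=" + total + ", " + micros + " us");
        }
    }
}
```

The sum is the same for every grain size; only the timing changes, and the timings depend heavily on the machine and on JVM warm-up, so treat any numbers you see as a rough indication only.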

To contrast this, a GPU has lots and lots of very lightweight threads. These threads aren't able to do as much as a CPU thread, but their primary advantage is that it's very easy to create lots of them very quickly. It's not unheard of to have a GPGPU kernel create and run thousands or tens of thousands of threads. Additionally, each core on the GPU is slower than a CPU core. This means GPU programs benefit the most from having lots of very fine-grained tasks (there are other reasons for this too, but I won't go into them here). On the GPU, adding up just one element or a handful of elements per grain/task is quite common.