OpenMP® Forum

Discussion on the OpenMP specification run by the OpenMP ARB. OpenMP and the OpenMP logo are registered trademarks of the OpenMP Architecture Review Board in the United States and other countries. All rights reserved.

Basically you can use any number of threads you like. The easiest way is to set this through the environment variable OMP_NUM_THREADS. You may also need to set OMP_DYNAMIC to FALSE, since some run-time systems will by default not let you use more threads than the hardware supports.
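To check that the setting actually took effect, you can ask the runtime how many threads it hands out. This is only a minimal sketch; the helper names count_team_threads and report_team_threads are made up for illustration, and the code falls back to a single thread when compiled without OpenMP support.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns the number of threads in the team that
   executes a parallel region. Returns 1 when the code is compiled
   without OpenMP support. */
int count_team_threads(void)
{
    int nthreads = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* Only one thread writes the shared variable. */
        #pragma omp single
        nthreads = omp_get_num_threads();
    }
#endif
    return nthreads;
}

/* Hypothetical helper: prints the team size. */
void report_team_threads(void)
{
    printf("Running with %d thread(s)\n", count_team_threads());
}
```

Call report_team_threads() from your main program, compile with your compiler's OpenMP option (-fopenmp for gcc), and run with something like OMP_NUM_THREADS=4 OMP_DYNAMIC=FALSE set in the environment.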

Having said that, it is good practice to find out where the optimum for an application is. You mention an 8 core system. As far as I know, an AMD core does not support additional hardware threads, so 8 OpenMP threads are most likely your upper limit regarding performance.

But, and that is a big "but", unless you have carefully parallelized the application, you may not get much benefit from using 8 cores. This is because of Amdahl's law, which shows that any non-parallel part in your code will dominate performance sooner rather than later.
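To put a number on that, here is a small sketch of Amdahl's law as a C function. The name amdahl_speedup is made up for illustration, and f is assumed to be the fraction of the single-thread run time that can be parallelized.

```c
/* Amdahl's law: expected speedup on p threads when a fraction f of the
   single-thread run time can be parallelized:
       S(p) = 1 / ((1 - f) + f / p)
   Hypothetical helper, not part of OpenMP. */
double amdahl_speedup(double f, int p)
{
    return 1.0 / ((1.0 - f) + f / (double)p);
}
```

For example, with 90% of the run time parallelized (f = 0.9), 8 threads give a speedup of roughly 4.7, and no number of threads can get you past 10.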

So again, I would do some trial runs with 1, 2, 4, ... threads and see what number gives you the most satisfactory performance.
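Such a trial can be as simple as the sketch below. The workload function is just a stand-in I made up (a harmonic sum); replace it with your own code. The names workload, wall_seconds and scaling_trial are hypothetical. When compiled without OpenMP, every run uses one thread.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Stand-in workload (an assumption): sum of 1/(i+1) for i < n. */
double workload(long n)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += 1.0 / (double)(i + 1);
    return sum;
}

/* Wall-clock time with OpenMP; CPU time as a fallback without it. */
double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Runs the workload with 1, 2, 4, ... threads up to max_threads and
   prints the elapsed time of each run. Returns the number of runs. */
int scaling_trial(long n, int max_threads)
{
    int runs = 0;
    for (int t = 1; t <= max_threads; t *= 2) {
#ifdef _OPENMP
        omp_set_num_threads(t);  /* overrides OMP_NUM_THREADS for this run */
#endif
        double start = wall_seconds();
        double result = workload(n);
        double elapsed = wall_seconds() - start;
        printf("%d thread(s): %.3f s (result %.6f)\n", t, elapsed, result);
        runs++;
    }
    return runs;
}
```

Calling scaling_trial(100000000L, 8) from main would time the loop with 1, 2, 4 and 8 threads. Pick a problem size large enough that each run takes at least a few seconds, otherwise the timings are mostly noise.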

A whole different discussion is how performance analysis tools can guide you in improving the parallel performance in case the program doesn't scale well.

Ruud

PS Since you're new to parallel programming, you may find my very extensive technical white paper on the basics of parallel programming useful. You can find it on my blog (http://blogs.oracle.com/ruud).

Cool.. I tell ya, there are many things to read and watch. I get why you suggest the performance analysis with the 1,2,4 and see what happens!

Okay.. Now what the heck are the RedhatEL6.x tools for that!?? LOL

--- Adding on to this rather than adding more posts.

Another great point you made was about cpu getting up to speed on processing a data set!!! Cool! The AMD 8150 has 8 MB of L3.

Just watching the first video for a second time.

----------------------------------

Second video explains a lot! I was so unsure what was what; now I have a better understanding of cores, cache and threads. I am looking at a graphic of the 8150 and I see L3 and L2 @ 2MB each so that seems favourable to accessing "a dataset."

Lots to learn..

On to Video 3

-----------------------------------

So now it starts to get node-y lol.. Talking about groups of units working together over distance. I'll watch more but I don't know when I will be doing that.

Good stuff tho.. I'll probably not comment more since I am probably seeing these things for the first time.

So far tho, the first three are really helpful to me. Clears up some of the glazing over while reading the "Using OpenMP" book because of unfamiliar terms and concepts. I see that it helps to know the architecture I would be using, and I will accept that it is true that a 5 line program can make or break any architecture. Not that I know how but I believe you Ruud.

Okay casual viewing from here on out unless there is something I think can add to this "thread." lol

It looks like you're making good progress getting started. I forgot to mention I have more tutorial material online. It is easiest if you go to http://www.iwomp.org and select "Previous IWOMPs". I gave a tutorial at IWOMP 2011, so if you follow that link you'll find the slides by going to the IWOMP 2011 page, selecting Program and then Tutorial. Somewhere there you'll find the PDF files with my talks.

I also wanted to answer 2 of your questions.

There are various profiling tools available, also under Linux. For one, you can use our Performance Analyzer. It is part of our Oracle Solaris Studio compilers and tools product that you can get and use for free. Don't be confused by the name though. Several major Linux distributions are supported too. I don't want to turn this into some sort of a product promo so will give you the link only: http://www.oracle.com/technetwork/server-storage/solarisstudio/overview/index.html

But if I were you, I would first focus on getting it right. Once you have the code working correctly it is time to use a profiling tool to find out where the time is spent and to see if you can improve it.

That brings me to your other question. Yes, you can very easily call functions in parallel in OpenMP. There are several ways to do this. For example, if you have a limited set of functions to call, you can use parallel sections. Like this:
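A minimal sketch of the parallel sections approach, with three hypothetical functions func_a, func_b and func_c standing in for the real ones:

```c
/* Hypothetical independent functions; replace with your own. */
int func_a(void) { return 1; }
int func_b(void) { return 2; }
int func_c(void) { return 3; }

/* Calls the three functions in separate sections, so that different
   threads can execute them at the same time. Returns the sum of the
   three results. */
int call_in_sections(void)
{
    int a = 0, b = 0, c = 0;

    #pragma omp parallel sections
    {
        #pragma omp section
        a = func_a();

        #pragma omp section
        b = func_b();

        #pragma omp section
        c = func_c();
    }   /* implicit barrier: all sections are done here */

    return a + b + c;
}
```

Each section writes to its own variable, so no extra synchronization is needed. Note that the number of sections is fixed at compile time, which is why this fits best when the set of functions is small and known in advance.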

Last, but certainly not least, you can use tasks. Whether this provides the easiest solution depends on the details of what you're trying to do, but based on your description I can imagine you can assign tasks to specific data sets to be processed. In the IWOMP 2011 tutorial you can find examples of tasking.
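As a rough sketch of the tasking idea, assuming one task per data set: the function process_data_set, the sizes NUM_SETS and SET_SIZE, and the summing "work" are all made up for illustration.

```c
#define NUM_SETS 4
#define SET_SIZE 5

/* Hypothetical per-data-set work: here simply the sum of the elements. */
long process_data_set(const int *data, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* One thread creates a task per data set; any thread in the team may
   execute it. Returns the total over all per-set results. */
long process_all_with_tasks(int sets[NUM_SETS][SET_SIZE],
                            long results[NUM_SETS])
{
    #pragma omp parallel
    {
        #pragma omp single
        {
            for (int s = 0; s < NUM_SETS; s++) {
                /* Each task gets its own copy of s and writes to its
                   own slot in results. */
                #pragma omp task firstprivate(s) shared(sets, results)
                results[s] = process_data_set(sets[s], SET_SIZE);
            }
        }
    }   /* all tasks are guaranteed to be complete at this barrier */

    long total = 0;
    for (int s = 0; s < NUM_SETS; s++)
        total += results[s];
    return total;
}
```

The nice thing about tasks is that the number of data sets does not have to be known at compile time; the loop that generates the tasks could just as well walk a list or read file names.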

One more, but important, comment. In all cases, your function needs to be thread safe. That means that if it modifies a shared variable, for example, you need to make sure this happens in the right way. A critical section is one way to do that, but here too it is hard to generalize. Just be careful modifying shared data in a function that is executed in parallel.
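To illustrate, here is a small sketch of a function that updates a shared variable and is made thread safe with a critical section. The names shared_total, add_to_total and sum_first_n are hypothetical.

```c
/* Hypothetical shared state, modified by all threads. */
static long shared_total = 0;

/* Thread safe: only one thread at a time can execute the update. */
void add_to_total(long value)
{
    #pragma omp critical
    shared_total += value;
}

/* Calls add_to_total() from a parallel loop and returns 1 + 2 + ... + n. */
long sum_first_n(int n)
{
    shared_total = 0;

    #pragma omp parallel for
    for (int i = 1; i <= n; i++)
        add_to_total(i);

    return shared_total;
}
```

Without the critical section, two threads could read and write shared_total at the same time and updates would get lost. For a simple sum like this one, an atomic update or a reduction clause is usually cheaper, but the critical section shows the general idea.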

You're right. Parallel programming is a challenge. We're so used to thinking in a serial way and I maintain my claim that "thinking parallel" is the hardest aspect of it.

I like OpenMP because it is simple in the sense that I can fairly easily test or implement an idea without having to worry about tons of low level details. Still, it is on me to find the parallelism and to make sure my parallel program is correct. That is true for any parallel programming model.

There is one advantage OpenMP has. It is possible for compilers to detect some errors and issue a warning. That is really nice and can save time, but they can't catch everything and sophisticated tools are needed. I definitely encourage you to use those. It takes time to get to know these tools, but it is time well spent.