Intel Parallel Studio 2011 Helps Out with the Hard Work

When I reviewed the first version of Intel's Parallel Studio
last year, it was already a solid tool for enhancing applications to take
advantage of parallel processing across multiple CPU cores. The latest version
of the product, which became available following last month's Intel Developer
Forum, has grown better still at enabling developers to make the most of
today's hardware.
Parallel Studio is actually a composite of four different
products: Parallel Composer, Parallel Advisor, Parallel Inspector and Parallel
Amplifier. As with the previous version, it installs as a large plug-in for Visual Studio
(versions 2005, 2008 or 2010). The two main areas of improvement over last
year's iteration of the product are in Parallel Composer, which adds
support for new language extensions in the form of Cilk Plus, and in the
enhanced Parallel Advisor, which serves as a sort of automated parallel
programming tutor.

At this year's IDF, I spent a bit of time meeting
with the folks at Intel and speaking with them one on one, and one of the
things they told me was that they're hoping people will be able to use this
tool to "play with" parallelism and learn about it. Parallel Advisor certainly
does just that. But it's not just for learning; it's a professional-grade tool
that even parallel pros can use.

Intel Parallel Studio 2011 is priced at $799 for the full
product, with individual components (Parallel Composer, Parallel Advisor,
Parallel Inspector, Parallel Amplifier) available for $399 apiece. Parallel
Studio is also available in a free 30-day trial version. The product works with
every edition of Visual Studio 2005, 2008 or 2010 except the Express
editions.
Parallel Composer
Parallel Composer is the coding aspect of Parallel Studio,
and consists of extensions to the C++ language and a set of libraries that
simplify writing parallel code. The improvements to Parallel Composer are the
addition of Cilk Plus, the new version 3.0 of Threading Building Blocks and the
new (but still beta) Array Building Blocks. Together these libraries and
extensions make it considerably easier to write parallel code.

Cilk was originally a C-based language created at MIT, with
constructs designed for parallel programming. In July
of 2009, Intel purchased Cilk Arts, the main company researching and
furthering Cilk on a commercial basis. Intel then began
working Cilk into its C++ compiler, and the result is Cilk Plus, a set of
extensions to C++. So now, in addition to its existing support for the
OpenMP extensions, the compiler accepts Cilk Plus code. And Cilk Plus
code is actually very easy to write. Here's an example line of code from the
samples:
cilk_for(int i=0; i<size; i++) {
This is for a loop that runs in parallel, using the multiple
cores when possible.
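To show that loop header in context, here is a minimal sketch of a complete function built around it. The function name square_all is hypothetical and not taken from Intel's samples. The sketch assumes that compilers with Cilk Plus support define the __cilk macro and provide the <cilk/cilk.h> header; on any other compiler it falls back to an ordinary serial for loop, which computes the same result.

```cpp
#include <vector>

// Assumption: a compiler with Cilk Plus support defines __cilk and
// ships <cilk/cilk.h>, which provides the cilk_for keyword.
#ifdef __cilk
#  include <cilk/cilk.h>
#else
#  define cilk_for for  // serial fallback: same loop, one core
#endif

// Hypothetical helper: square every element in place.
// Each iteration touches only data[i], so the iterations are
// independent and can safely run on different cores.
void square_all(std::vector<double>& data) {
    const int size = static_cast<int>(data.size());
    cilk_for (int i = 0; i < size; i++) {
        data[i] = data[i] * data[i];
    }
}
```

The loop body must not depend on the order of iterations. Adding every element into one shared variable, for example, would be a data race in the parallel version.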
At its core, Cilk Plus adds just three
keywords to the C++ language: cilk_for, cilk_spawn and cilk_sync. The
cilk_spawn keyword marks a function call as one that may run
in parallel with the code that follows it in the caller; the runtime decides
whether it actually runs on another core. That's pretty easy. And cilk_sync waits for
the functions spawned by the current function to complete.
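The other two keywords are usually seen together. The following is a minimal sketch using the classic recursive Fibonacci function; it is an illustration, not code from Intel's samples. As an assumption, it relies on the __cilk macro and the <cilk/cilk.h> header being present when Cilk Plus is supported, and it defines both keywords away otherwise, which leaves a valid serial program.

```cpp
// Assumption: a compiler with Cilk Plus support defines __cilk and
// ships <cilk/cilk.h>, which provides cilk_spawn and cilk_sync.
#ifdef __cilk
#  include <cilk/cilk.h>
#else
#  define cilk_spawn    // serial fallback: a plain function call
#  define cilk_sync     // serial fallback: nothing to wait for
#endif

// Naive Fibonacci, chosen only because it has two independent calls.
long fib(int n) {
    if (n < 2) return n;
    long a = cilk_spawn fib(n - 1);  // may run in parallel with the next line
    long b = fib(n - 2);             // the caller keeps working meanwhile
    cilk_sync;                       // wait for the spawned call before reading a
    return a + b;
}
```

Reading a before the cilk_sync would be a bug, because the spawned call might not have finished yet.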
Of course, since these keywords are extensions to C++
that the Intel C++ compiler recognizes, code that uses them won't port
directly to other compilers. That may or may not be a problem for you, depending on your
needs.
The Intel C++ Compiler that ships with Parallel Studio is
considered part of Parallel Composer. In addition to the compiler, Parallel
Composer also includes Parallel Building Blocks, which is a set of two template
libraries that aid in writing parallel code. The main reason for this inclusion
is that the standard C++ library (which includes all the usual classes like std::map
and so on) isn't thread-safe. You can, technically, carefully write code with
the standard library that is thread-safe, but it's a lot of work. The advantage
to the Parallel Building Blocks, however, is that you don't need to work so
hard. The entire library is automatically thread-safe and includes a great
amount of code that takes out the headaches of worrying about who is doing what
and when.
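To give a sense of the manual work involved, here is a minimal sketch of making a std::map safe to share by hand. It uses std::thread and std::mutex from the later C++11 standard rather than anything in Parallel Studio, and the names LockedCounter and count_in_parallel are hypothetical. Every access has to go through the same lock, and forgetting it in even one place reintroduces the race.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper: all access to the std::map goes through one mutex.
class LockedCounter {
public:
    void increment(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        ++counts_[key];
    }

    int get(const std::string& key) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = counts_.find(key);
        return it == counts_.end() ? 0 : it->second;
    }

private:
    std::mutex mutex_;
    std::map<std::string, int> counts_;
};

// Start `threads` workers that each bump the same key `per_thread` times,
// then return the final count once every worker has finished.
int count_in_parallel(int threads, int per_thread) {
    LockedCounter counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                counter.increment("hits");
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter.get("hits");
}
```

Threading Building Blocks ships concurrent containers such as tbb::concurrent_hash_map that take over this kind of bookkeeping.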
Parallel Building Blocks is actually two distinct libraries:
Threading Building Blocks and Array Building Blocks. Threading Building Blocks
isn't new, but the version that ships with Parallel Studio 2011 is new (version
3.0). Array Building Blocks is new; it's an array library that greatly
simplifies threading with data structures. (And at the time of this writing,
the version of Array Building Blocks shipping with Parallel Studio 2011 is
technically still a beta version, although it's already quite stable.)
Parallel Advisor 2011
Writing parallel code isn't always easy, and if you already
have a large amount of code that isn't parallelized, it can be a real headache
trying to figure out how you can parallelize your code. That's where Parallel
Advisor 2011 comes in: It analyzes your program and advises you on where to add
parallelization.
When I reviewed the first version of Parallel Studio, I
mentioned that it included a product called "Parallel Advisor Lite." At the
time, it wasn't exactly clear why the word "Lite" was included in the title.
I'm still not sure of the exact answer, but I'm guessing it was because the
company was anticipating the release of the full product with the next
version, which is where we are now. The full version is called Parallel Advisor
2011, and it's a huge improvement over the Lite version of last year.
Parallel Advisor adds a window to Visual Studio that walks you through a
workflow for analyzing your code and getting advice on where to parallelize it.
The process begins with the Survey Target stage, in which Parallel Advisor
runs your program and analyzes it while it's running to determine where
parallelization would fit.
Once the analysis is complete, you can use the tool to add
annotations to your program in proposed places where parallelization would
work. This doesn't actually parallelize the code; instead, it tells Parallel
Advisor to monitor the following code in the next step to check for possible
parallelization.
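As a rough sketch of what annotated code can look like: the header name advisor-annotate.h and the ANNOTATE_SITE_BEGIN, ANNOTATE_TASK_BEGIN, ANNOTATE_TASK_END and ANNOTATE_SITE_END macros below are assumptions based on Intel's Advisor documentation, and the exact spelling or arguments may differ in the 2011 release. The function sum_of_squares and the site and task names are hypothetical. When the header isn't available, the sketch defines the macros as no-ops so the code still builds and runs serially.

```cpp
#include <cstddef>
#include <vector>

// Assumption: Parallel Advisor supplies "advisor-annotate.h" with these macros.
#if defined(__has_include)
#  if __has_include("advisor-annotate.h")
#    include "advisor-annotate.h"
#    define HAVE_ADVISOR_ANNOTATIONS 1
#  endif
#endif
#ifndef HAVE_ADVISOR_ANNOTATIONS
// No-op stand-ins so the sketch compiles without the Advisor header.
#  define ANNOTATE_SITE_BEGIN(name)
#  define ANNOTATE_SITE_END(...)
#  define ANNOTATE_TASK_BEGIN(name)
#  define ANNOTATE_TASK_END(...)
#endif

// Hypothetical hot spot: the loop is marked as a candidate parallel site
// and each iteration as a candidate task. The code still runs serially.
double sum_of_squares(const std::vector<double>& values) {
    std::vector<double> squares(values.size());

    ANNOTATE_SITE_BEGIN(square_site);
    for (std::size_t i = 0; i < values.size(); ++i) {
        ANNOTATE_TASK_BEGIN(square_task);
        squares[i] = values[i] * values[i];
        ANNOTATE_TASK_END();
    }
    ANNOTATE_SITE_END();

    // The final reduction stays outside the annotated site.
    double total = 0.0;
    for (double s : squares) {
        total += s;
    }
    return total;
}
```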
After the annotations are added, you rebuild your program
and then run it again. The code still runs serially, not in parallel, but Parallel
Advisor now monitors those specific places to estimate what
kind of performance increase you'll get if a given section is parallelized.
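For a back-of-the-envelope feel for such estimates, Amdahl's law gives an upper bound on the speedup from parallelizing part of a program. This is not necessarily the model Parallel Advisor uses, and the function name amdahl_speedup is hypothetical. The sketch assumes parallel_fraction is the share of serial run time spent in the section you plan to parallelize, between 0 and 1, and ignores threading overhead.

```cpp
// Upper bound on overall speedup under Amdahl's law:
//   speedup = 1 / ((1 - p) + p / n)
// where p is the fraction of run time that can be parallelized
// and n is the number of cores. Assumes 0 <= p <= 1 and n >= 1.
double amdahl_speedup(double parallel_fraction, int cores) {
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores);
}
```

If half of a program's run time sits in a parallelizable loop, a dual-core machine can deliver at most about a 1.33x overall speedup, however well that loop scales.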
Once the advising is complete, you know where you should add
parallelization. You can then add the actual parallel code using Cilk Plus and
the Parallel Building Blocks. The product runs a correctness check to help suss
out potential data-sharing issues introduced by the code changes.
I tried out all these steps on one of the several sample
programs that come with Parallel Studio and did see a definite increase in
performance. The computer I was using had only a dual-core processor, but I
could see the load from the application spread across both cores in
Windows Task Manager.
While it isn't true "artificial intelligence," Parallel Advisor
is certainly a step toward exactly that. You don't have to be a total
parallelization guru to be able to analyze your code and learn how to add
parallelization to it. Parallel Advisor does a huge amount of the hard work for
you; it's almost like having one of Intel's parallel gurus sitting right there
next to you.

Jeff Cogswell is the author of Designing Highly Useable Software (http://www.amazon.com/dp/0782143016) among other books and is the owner/operator of CogsMedia Training and Consulting. Currently Jeff is a senior editor with Ziff Davis Enterprise. Prior to joining Ziff, he spent about 15 years as a software engineer, working on Windows and Unix systems and mastering C++, PHP, and ASP.NET development. He has written over a dozen books.