I was recently faced with a problem that often comes up in scientific computing: the tsunami model I was working with was too slow. I needed to run a job for 24780 iterations (time steps). Not realizing this was about 10 times longer than any of the previous test runs, I started the job on one p655 node on Iceberg (the code is serial) and waited for the results.

What I got was very discouraging. In the eight hours allowed in the standard queue on Iceberg, the code only completed 4050 iterations. This worked out to about 8.4 iterations per minute. A quick calculation showed me that I was most certainly doomed, since the full run would take roughly 49 hours to complete and, while the "single" queues at ARSC would permit such a long run, it would be impractical for the desired test and production work.

ACT II: Depression Sets In

Since I wanted to get these runs done in a timely fashion, I ruled out any significant code changes. For instance, trying to modify the code to write a restart file would be too time consuming. Writing a parallel version of the code using MPI would be a serious time sink, not to mention that these types of codes (many many iterations on relatively small grids) are not the best candidates for message passing algorithms.

I thus turned to the compiler to help with my dilemma. At this point, my compiler flags already had the code tuned for the architecture at the highest level of optimization provided, with FFLAGS applied to the .f files and FFLAGS_1 to the .f90 files.
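As a rough sketch of such settings (these specific xlf options are my assumption, not necessarily the author's actual flags), the build variables might have looked like:

```shell
# Sketch only: plausible IBM XL Fortran settings for a POWER4 p655 node.
# -O5, -qarch, -qtune, and -qsuffix are real xlf options, but these
# particular values are assumptions for illustration.
FFLAGS="-O5 -qarch=pwr4 -qtune=pwr4"
FFLAGS_1="$FFLAGS -qsuffix=f=f90"   # same tuning, applied to .f90 sources
```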

This left my next alternative as trying to use the IBM compiler's built-in auto-parallelization. Not having had much luck with this in the past, I was not optimistic. Sure enough, simply adding the -qsmp=auto switch to my compiler flags and setting the environment variable OMP_NUM_THREADS=8 in my loadleveler script bought me nothing. I was still getting roughly 8.3 iterations per minute.

To facilitate testing, I reduced the run to only 1000 iterations, or approximately 2 hours of run time.

ACT III: Relying on Moore

Next I had what I thought was a brilliant idea! We've got these new POWER5 nodes on Iceflyer. Maybe that'll work: make Moore's law work for me by using a bigger/better/faster machine. So, that's what I did. I moved the code to Iceflyer and compiled it with slightly modified flags for the new architecture.

Well, I got what Moore said I should expect. A bit less than a 2 times speedup, bringing the time down to 66 minutes for 1,000 iterations. Taking advantage of the 16 hour queue time in the p5 queue, I re-ran the entire simulation only to have the job time out after completing 15000 iterations. Close, but no cigar.

What followed were many failed attempts at slight variations. I tried auto-parallelization using 4 or 8 threads; iterations per minute improved only from 15.15 (serial) to 15.38 (4 threads) to 15.63 (8 threads). Once again, virtually no gains were realized from auto-parallelization.

I also tried AIX 5.3. Since it has support for simultaneous multi-threading, I could use up to 16 threads on a single 8-processor node. Alas, the times were pretty much exactly the same as on the AIX 5.2 nodes.

ACT IV: Take Your Own Advice

Finally, I took the advice that I give in all of my classes: start by profiling your code and see what kinds of optimization are possible. I changed my compilation flags a bit, adding -pg -g to turn on profiling with symbol tables and adding -qreport -qsource -qlist to get full compilation reports for the code.

The profile showed that 90.9% of the execution time was being spent in the routine momt_s, which calculates momentum in spherical coordinates. I next looked at the .lst file created during compilation. Since this file had over 98,000 lines in it, I searched for momt_s and found that the compiler had not parallelized the routine's loops.

So 90.9% of my run time is spent in a routine that the compiler will not automatically parallelize for me. What to do...

...don't miss the thrilling conclusion in the next newsletter:

ACT V

Directives Save The Day

ACT VI

Conclusions of the Super-Linear Kind

Java for Fortran Programmers: Part I

[[ Thanks to Lee Higbie of ARSC for this tutorial. ]]

This is the first in a series of articles, presented as a tutorial, for scientists and engineers. Some knowledge of C is useful, but I will not assume that you know C++ or any other object oriented language.

My planned tutorial outline is:

Java's Uses for the Scientific and Engineering Community

Object Oriented Programming (OOP)

How the OOP mindset differs from that usual for Fortran programmers

How the OOP syntax differs from that of Fortran and C

Interfacing Java and Fortran programs

Creating Java programs

Example

How far and deep I go will depend on feedback. If this topic interests you, let me or one of the editors know!

This initial part of the tutorial is expected to interest new scientific and engineering programmers or programming managers, those considering a new project and wondering if Java might be a good choice. After this initial background, the material will become more technical and should interest programmers who are starting to learn Java or have picked up a little in the past.

Java's Uses for the Scientific and Engineering Community

Java is easy to use, but it has a steep learning curve if you've never used an object oriented programming language. OOP languages require a different mindset from that of imperative languages (like Fortran and C). Unlike C++, where it is easy to write a conventional (imperative) program by using only the C subset, Java is more aggressively object oriented--even HelloWorld uses an object.
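Even the canonical first program illustrates this; here is a sketch in which the greeting lives in a method and main() creates an object to call it:

```java
public class HelloWorld {
    // Even "Hello, world" lives inside a class; greeting() is an
    // ordinary method on a HelloWorld object.
    String greeting() {
        return "Hello, world!";
    }

    public static void main(String[] args) {
        System.out.println(new HelloWorld().greeting());
    }
}
```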

In our world, Java is especially suited for GUIs and support programs, and I doubt I'll see it used for a major, computation-intensive application. Though unsuited for heavy computational work, Java is a well designed OO language with many good features. Some are:

It includes an automatic documentation system. Stylized comments can be used to describe parts of a code and the documentation is automatically generated from them and the code.

There are several large libraries of GUI widgets that allow control programs to interact visually with users.

It is highly portable. With minimal care applications can be written that will run on most platforms.

It was designed from the beginning for applets, programs that run in web browsers. An applet allows the user to safely run a program from a workstation.

It has a built-in structure for creating and handling exceptional conditions. A method can throw an exception and force its callers to deal with it.

It has built-in functionality and syntax to eliminate many of the problems that crop up in C++ programs (memory leaks, wandering pointers, weak typing, implicit type conversions, ...).

(I've measured a 4:1 slowdown when doing simple array computations in Java instead of Fortran. Carbon-based systems[*] react slowly, so Java works well for interacting with them.)
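As an example of the first feature above, the automatic documentation system works from stylized /** ... */ comments; the javadoc tool turns them into HTML pages. The class and method below are invented purely for illustration:

```java
/** A hypothetical helper class, shown only to illustrate javadoc comments. */
public class GridStats {
    /**
     * Returns the arithmetic mean of the values in a.
     *
     * @param a the array to average; must be non-empty
     * @return the mean of the elements of a
     */
    public static double mean(double[] a) {
        double sum = 0.0;
        for (double v : a) {
            sum += v;
        }
        return sum / a.length;
    }
}
```

Running "javadoc GridStats.java" then generates the documentation from these comments and the code itself.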

Object Oriented Programming (OOP)

So what is an object oriented programming language? The defining characteristics of OOPs are:

Encapsulation
A single block of code, called a class, defines a data structure and the procedures, called methods, for operating on it. Classes often include methods and variables that are hidden from users, which makes it possible to change algorithms or code without users of the class knowing about it.

Inheritance
A class can inherit the data and methods of a parent class. This is especially useful for libraries and is an important concept to understand. For example, PopupMenu extends Menu extends MenuItem extends MenuComponent extends Object. This means that the methods for adding items to a PopupMenu are not recoded but are taken exactly from Menu, the event handling methods of MenuItem are directly inherited by any PopupMenu, and so on.
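A minimal sketch of both ideas (the class names are invented for illustration):

```java
// Encapsulation: Counter hides its data behind methods.
class Counter {
    private int count = 0;          // hidden from users of the class

    public void increment() { count++; }
    public int value() { return count; }
}

// Inheritance: StepCounter gets increment() and value() from
// Counter without recoding them.
class StepCounter extends Counter {
    public void incrementBy(int n) {
        for (int i = 0; i < n; i++) {
            increment();            // inherited from Counter
        }
    }
}
```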

Polymorphism
Methods (functions) can be called with a variety of arguments; the number and types of the arguments are not fixed. In object-oriented languages it is common for:

A method to set some default parameters and then call the general version of the method

Inherited methods (methods from a class being extended) to provide variants that accept different arguments

A method to convert the argument types and call the general version of the method
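All three patterns amount to method overloading; a sketch (the Plotter class and its methods are hypothetical):

```java
public class Plotter {
    // The general version of the method.
    public String plot(double xMax, int nPoints) {
        return "plot to x=" + xMax + " with " + nPoints + " points";
    }

    // Variant that fills in a default point count, then calls
    // the general version.
    public String plot(double xMax) {
        return plot(xMax, 100);
    }

    // Variant that converts its argument type, then calls
    // the general version.
    public String plot(int xMax, int nPoints) {
        return plot((double) xMax, nPoints);
    }
}
```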

The basic unit of code is a class, which encapsulates a data structure and the methods for working with it. For example, the String class includes almost two score methods of its own; it inherits another half dozen from Object, the ultimate parent of all classes; and it has polymorphic variants on many of these methods. The emphasis of an OOP is on the classes and their data, not on the flow of logic or control.

There is one more bit of basic OOP terminology needed to discuss OOP programs. As mentioned, a class is the code that describes a data structure and includes the methods (functions) for operating on it. The actual data structure is called an object, but don't confuse this with the class Object (upper case "O"), which is the ultimate parent of all Java classes. Just as you might have dozens of strings in a Fortran program, a Java application may have dozens of String objects, each an instance of the String class.
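In code, the class/object distinction looks like this (a minimal sketch):

```java
public class InstanceDemo {
    public static void main(String[] args) {
        // Two objects, each an instance of the String class.
        String a = "momentum";
        String b = "spherical";

        // The methods are defined once, in the String class;
        // every String object can use them.
        System.out.println(a.length());        // prints 8
        System.out.println(b.toUpperCase());   // prints SPHERICAL
    }
}
```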

So, how does Java measure up? It has all of these characteristics, but it also has basic, non-object data: logical, various sizes of integer, floating point, and character types are available, facilitating basic imperative programming. In Part II, I will provide an example to illustrate the basic parts of the code.
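As a quick taste, Java's primitive declarations let you write plain imperative code much as in Fortran (the Fortran analogies in the comments are approximate):

```java
public class Primitives {
    public static void main(String[] args) {
        boolean done = false;   // roughly Fortran LOGICAL
        int n = 1000;           // roughly INTEGER
        double dt = 0.5;        // roughly DOUBLE PRECISION
        char c = 'q';           // roughly CHARACTER
        double t = n * dt;      // ordinary arithmetic, no objects involved
        System.out.println(t);
    }
}
```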

This article has described some of the places where scientific and engineering programmers might apply Java in their work. I have introduced the top level of OOP terminology. I'll recap with a dictionary translating the Fortran terminology used here to Java.

Fortran term            Java term               Explanation
------------            ---------               -----------
function                method                  parameters passed by value; polymorphism rampant
structure declaration   class                   a class includes code, usually one to a file
structured variable     object (small "o")      an object also owns all its class's methods
subroutine              method with void type   no returned value
type conversion         cast                    use the type in parens: x = (float) i;

This article has covered the first two tutorial topics. We'll pick it up again with:

How the OOP mindset differs from that usual for Fortran programmers

--

[*]
Footnote:
"Carbon-based systems": a euphemism for people. Those unfamiliar with this term are referred to Star Trek, where, I think, the Borg referred to the astronauts as a carbon-based infestation.

Quick-Tip Q & A

A:[[ I am writing a script which looks at the extension of a file.
[[ So far I'm not too committed to a particular scripting language.
[[ Is there an easy way to get the extension of a file without
[[ using sed?
#
# Lorin Hochstein
#
In tcsh, the ":e" variable modifier will extract the extension of a
file. Also useful: the ":r" modifier will extract the name without the
extension.
$ set x="filename.txt"
$ echo $x:e
txt
$ echo $x:r
filename
#
# Harper Simmons
#
using csh/tcsh (I know, I know, uncool)
set a = roo.dat
set ext = $a:e
echo $ext
produces "dat"
#
# Ryan Czerwiec
#
For csh/tcsh this will work (there will be a similar answer for
sh/bash/ksh):
If your filename is stored in the variable "file,"
then the extension "ext" can be obtained with:
set ext = `echo $file | tr "." " "`
This will create an array where the extension is the last element,
or ext[$#ext]. This can also be useful if you need to reassemble the
filename with a different extension, for example.
This version uses less memory (it doesn't create an array), but it's
a little slower:
set ext = `echo $file | tr "." "\n" | tail -1`
You can do it a little more simply if you happen to know that all of
your filenames will have the same number of "." characters in them:
set ext = `echo $file | cut -d'.' -f2`
where the example of -f2 is for a file with one "." character. Use a
number one higher than the number of dots as long as that number is
fixed (you can use a variable for it, too, as in -f$num).
#
# One Editor:
#
You can use "expr" regular expressions. E.g.:
$ expr this.is.a.test : ".*\.\(.*\)"
test
#
# Other Editor:
#
I would use one of the bash pattern matching operators to do this.
${val##pattern}
This operator does the following: If pattern matches the beginning
of the variable $val it deletes the longest part that matches then
returns the rest of the string.
So the following pattern will return the extension as long as there
is at least one dot in the filename.
${val##*.}
If the filename might not have a dot in it, we can check for that
using grep:
for f in *; do
    if [ ! -z "$(echo $f | grep "\." )" ]; then
        echo ${f##*.};
    fi
done
Alternately, we can eliminate the grep by ensuring there is a dot
in the filename. E.g.:
for f in *.*; do
    echo ${f##*.};
done
Q: Here's a conditional statement grabbed from the (/bin/sh)
configure script for mysql. There are many like this:
if test X"$mysql_cv_compress" != Xyes; then
# ...do stuff...
fi
For my scripts, the following style has always worked:
if [[ $mysql_cv_compress != yes ]]; then
# ...do stuff...
fi
So, two questions:
1) Why would the experts use "test" rather than the square bracket
syntax?
2) Why bother with that "X" ???

The University of Alaska Fairbanks is an affirmative action/equal
opportunity employer and educational institution and is a part of the University
of Alaska system.
Arctic Region Supercomputing Center (ARSC) |PO Box 756020, Fairbanks, AK 99775 | voice: 907-450-8602 | fax: 907-450-8601 | Supporting high performance computational research in science and engineering with emphasis on high latitudes and the arctic.
For questions or comments regarding this website, contact info@arsc.edu