ARSC T3D Users' Newsletter 44, July 14, 1995

More on Fixed and Plastic Executables

In last week's newsletter, I described one Craft Fortran program that did not show much of a
speedup when compiled as a fixed executable as opposed to a "plastic" executable. Of course, a
possible speed improvement is only one benefit from such a switch. Dr. Ming Jiang, a member of
the UAF faculty and the ARSC staff, sends in this further note:

I also thought that the -X npes switch would make more efficient use of memory, but from the
test case below, it seems that the plastic executable is implemented as efficiently as the fixed
executable. To check out how much of a size advantage the fixed executable has over the plastic
executable, I devised this small program:

This program has one large two-dimensional array of size NMAX*NMAX which contains the
integers 1, 2, 3, ... NMAX*NMAX. The program uses as many processors as it can to sum the elements
of the two-dimensional array and then checks the result against the identity:

1 + 2 + 3 + ... + n = ( n + 1 ) * n / 2
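The check can be sketched in a few lines. This is only an illustration in plain Python, not the Craft Fortran test program: the simulated PEs, the simple block distribution, the values of NMAX and NPES, and the function names are all assumptions made for the example.

```python
# Illustrative sketch only: plain Python standing in for the Craft Fortran test.
# NMAX, NPES, and the block distribution below are assumptions for the example.

NMAX = 64   # leading dimension; a power of 2, as Craft requires for shared arrays
NPES = 8    # number of simulated processing elements


def partial_sum(pe, npes, n_total):
    """Sum the values 'owned' by one simulated PE under a block distribution."""
    chunk = n_total // npes
    first = pe * chunk + 1
    # The last PE also picks up any remainder left by the integer division.
    last = n_total if pe == npes - 1 else first + chunk - 1
    return sum(range(first, last + 1))


def check_sum(nmax, npes):
    """Return (computed total, value predicted by the identity)."""
    n = nmax * nmax
    total = sum(partial_sum(pe, npes, n) for pe in range(npes))
    expected = (n + 1) * n // 2
    return total, expected


if __name__ == "__main__":
    total, expected = check_sum(NMAX, NPES)
    print("sum =", total, "identity =", expected, "match =", total == expected)
```

Each simulated PE sums only its own block, and the partial sums are combined at the end, which is the same shape as a shared summation loop followed by a reduction.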

By varying the value of NMAX, we can see how large a two-dimensional array is allowed when the
program is compiled for a fixed executable. Finding the maximum value for NMAX doesn't require
much searching because Craft Fortran requires that the shared array "a" have a leading dimension
that is a power of 2. Just for fun, I also computed the speed of the shared initialization loop
and summation loop. The table below summarizes the results:

Because each doubling of NMAX causes a quadrupling of the memory required by array "a", the
pattern of largest problems that fit seems to make sense. The use of the -X npes flag shows that
the largest problem that physically fits in memory can indeed be solved. And for this problem,
the current implementation of compiling a "plastic" executable is as efficient in memory use as
compiling for a fixed executable. But, as in the last newsletter, there is a slight speed
improvement for executables targeted for a fixed number of PEs.
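The quadrupling argument can be worked through with simple arithmetic. The sketch below is an upper bound under stated assumptions, not a reproduction of the measured results: it assumes array "a" is spread evenly over the PEs, that each PE has 8 MW (the node size mentioned in this newsletter), and that the array must use strictly less than a full node because the program and system need some room too. The function names are made up for the example.

```python
# Back-of-the-envelope sketch; the even distribution and the "strictly less than
# a full node" rule are assumptions, not measured behavior of the T3D.

WORDS_PER_PE = 8 * 2**20   # 8 MW per node, as mentioned in this newsletter


def words_per_pe(nmax, npes):
    """Words of array 'a' each PE holds if NMAX*NMAX words are spread evenly."""
    return -(-(nmax * nmax) // npes)   # ceiling division


def largest_power_of_two_nmax(npes, capacity=WORDS_PER_PE):
    """Largest power-of-2 NMAX whose per-PE share stays below the node size."""
    nmax = 1
    while words_per_pe(2 * nmax, npes) < capacity:
        nmax *= 2
    return nmax


if __name__ == "__main__":
    for npes in (1, 2, 4, 8, 16, 32, 64, 128):
        nmax = largest_power_of_two_nmax(npes)
        print(npes, nmax, words_per_pe(nmax, npes))
```

Under these assumptions the bound on NMAX doubles only when the number of PEs quadruples, which is the pattern the paragraph above describes.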

There is a significant advantage for the fixed executable in that the size of its a.out is
much smaller than that of the plastic executable. For this program, the executable sizes are:

and the size of the fixed executable was the same whatever the value of npes in the
compilation command cf77 -X npes ...

On the way to producing the above tables I ran into several interesting error messages:

Operand Range Error

When compiling a problem larger than what will physically fit, either as a fixed or plastic
executable, there is no error message (although CRI sells only 8MW nodes, so there is a fixed
limit for the case of the fixed executables). Only at execution time does the user get the
catch-all "Operand Range Error". Increasing NMAX until this error message appears is how I
determined the largest problem that fits.

The fixed limit on PE_PRIVATE arrays

If the array "a" is not declared SHARED, then it is a PE_PRIVATE array and will be contained
in the memory of each PE. In compiling both the plastic and fixed executables, the compiler
gives a good error message for arrays too large for a single PE:

So the size of a shared array seems to have no real limit except for maybe artificial
programs like the "linpeak" benchmark.

Fixed objects take precedence over plastic objects

When linking a mix of object files, some compiled as fixed and some as plastic, the
executable that mppldr produces is always fixed. And mppldr believes the user knows what
he is doing, because there is no warning message.

In the case when the "fixedness" (the number of PEs compiled in) varies among objects,
mppldr knows the user needs help and gives an appropriate error message:

/mpp/bin/cf77 -X1 second.f
/mpp/bin/cf77 -X128 suma.f second.o -o suma
mppldr-302 cf77: WARNING
The number of PEs compiled into module 'SECOND' (1) differs
from the number of PEs compiled into a prior module (128).
mppldr-112 cf77: WARNING
Because of previous errors, file 'suma' is not executable.
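The two rules just described (fixed wins over plastic, and fixed objects must agree on the number of PEs) can be summarized as a small model. This is only a sketch of the behavior observed above, not how mppldr is implemented: the function name, the use of None to mark a plastic object, and the exception are all hypothetical.

```python
# Hypothetical model of the observed linking rules; not mppldr's actual logic.


def resolve_pe_count(modules):
    """modules: list of (name, npes), with npes=None for a plastic object.

    Returns the PE count of the resulting executable, or None if every
    object is plastic. Raises ValueError if two fixed objects disagree.
    """
    fixed = None
    for name, npes in modules:
        if npes is None:
            continue              # plastic objects adapt to whatever is fixed
        if fixed is None:
            fixed = npes          # the first fixed object sets the PE count
        elif npes != fixed:
            raise ValueError(
                "module %r (%d PEs) differs from a prior module (%d PEs)"
                % (name, npes, fixed)
            )
    return fixed


if __name__ == "__main__":
    print(resolve_pe_count([("SUMA", None), ("SECOND", 128)]))
    try:
        resolve_pe_count([("SUMA", 128), ("SECOND", 1)])
    except ValueError as err:
        print("not executable:", err)
```

The second call mirrors the session shown above, where suma.f was compiled with -X128 and second.f with -X1.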

Release 1.2.2 of CrayLibs

ARSC is planning to make available the 1.2.2 release of
CrayLibs as soon as it is released. Watch this newsletter for further details.

List of Differences Between T3D and Y-MP

The current list of differences between the
T3D and the Y-MP is:

Data type sizes are not the same (Newsletter #5)

Uninitialized variables are different (Newsletter #6)

The effect of the -a static compiler switch (Newsletter #7)

There is no GETENV on the T3D (Newsletter #8)

Missing routine SMACH on T3D (Newsletter #9)

Different Arithmetics (Newsletter #9)

Different clock granularities for gettimeofday (Newsletter #11)

Restrictions on record length for direct I/O files (Newsletter #19)

Implied DO loop is not "vectorized" on the T3D (Newsletter #20)

Missing Linpack and Eispack routines in libsci (Newsletter #25)

F90 manual for Y-MP, no manual for T3D (Newsletter #31)

RANF() and its manpage differ between machines (Newsletter #37)

CRAY2IEG is available only on the Y-MP (Newsletter #40)

Missing sort routines on the T3D (Newsletter #41)

I encourage users to e-mail in differences that they have found, so we all can benefit
from each other's experience.
