In Case of Problems

You should have few or no problems with the single-machine tests. As more
machines become involved in the tests, there is more room for configuration
errors to arise. If a test does not run, check the "test_name.log" file in
the log directory (the default is bps-logs).
In the case of the NAS tests, the results are named in
the form npb.COMPILER.MPI.CLASS.PROCESSORS.
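The result-name pattern above can be pulled apart with standard shell tools. The example below is a sketch: the values (gnu, openmpi, class A, 4 processors) are placeholders, not output from an actual run.

```shell
# Hypothetical result name following the npb.COMPILER.MPI.CLASS.PROCESSORS
# pattern; the actual fields depend on your compiler, MPI, and run options.
result="npb.gnu.openmpi.A.4"

# Split the name on "." into its components (bash here-string).
IFS=. read -r prefix compiler mpi class procs <<< "$result"
echo "compiler=$compiler mpi=$mpi class=$class procs=$procs"
```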

In general, if you have problems with a test, the best place to start is the
log file. In the case of the NAS suite, the
"-k" option keeps the npb tests directory in the log directory so you
can run the tests directly when troubleshooting. A script called
run_suite in
the npb directory will run the tests (run_suite -h lists its options). The README.bps file in the npb
directory also provides more information on how the tests are run
and how to resolve possible problems.

Working with the NAS Benchmark Suite

The NAS suite will probably produce the most problems for end users.
The main script, run_suite, is designed to "wrap" around
and hide the differences between the various MPIs and compilers.
While run_suite does an adequate job, it certainly cannot
predict every software environment found on a cluster. The NAS
suite is run from the command line, and several options are
required; run_suite -h lists them. If you are using run_suite directly, you need to list the machine names in the npb/cluster/machines
file (one per line). This file is used by MPI to start your programs.
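The machines file mentioned above might be created like this. The hostnames are placeholders; substitute your cluster's actual node names, one per line.

```shell
# Sketch: populate a machines file for MPI startup.
# node01..node04 are example hostnames, not real cluster nodes.
cat > machines <<EOF
node01
node02
node03
node04
EOF

# One machine per line, so the line count equals the machine count.
wc -l < machines
```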

If you have problems producing the binary files, consult the make.log
file for a complete listing of the make process. Often a single error will cause all
the builds to fail, so fix one problem at a time and re-run the build.
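When scanning make.log for that first failure, grep can jump straight to it. The sample log written below is invented so the example is self-contained; on a real run, make.log is produced by the build itself.

```shell
# Write a fabricated two-line build log (stand-in for a real make.log).
printf 'mpif77 -c bt.f\nbt.f: Error: unknown flag -fast\n' > make.log

# Show the first line containing "error" (case-insensitive), with its
# line number, then stop (-m 1) so later cascade errors are ignored.
grep -n -i -m 1 "error" make.log
```

Fixing the first reported error before looking at the rest usually clears most of the cascade failures in one pass.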

Future Plans

The BPS suite is showing its age. Virtually all the functionality of the BPS suite is being converted into the CMBP (Cluster Monkey Benchmarking Project).

Acknowledgments

I wish to thank and acknowledge all the authors of the test suites used in
this package. See Sidebar Two (below) for more information on each test.

BT is a simulated CFD application that uses an implicit algorithm to solve the three-dimensional (3D) compressible Navier-Stokes equations. The finite-difference solution to the problem is based on an Alternating Direction Implicit (ADI) approximate factorization that decouples the x, y, and z dimensions. The resulting systems are block-tridiagonal with 5x5 blocks and are solved sequentially along each dimension.

SP is a simulated CFD application that has a structure similar to BT. The finite-difference solution to the problem is based on a Beam-Warming approximate factorization that decouples the x, y, and z dimensions. The resulting system has scalar pentadiagonal bands of linear equations that are solved sequentially along each dimension.

LU is a simulated CFD application that uses the symmetric successive over-relaxation (SSOR) method to solve a seven-block-diagonal system, arising from a finite-difference discretization of the Navier-Stokes equations in 3D, by splitting it into block lower and upper triangular systems.

FT contains the computational kernel of a 3D fast Fourier Transform (FFT)-based spectral method. FT performs three one-dimensional (1D) FFTs, one for each dimension.

MG uses a V-cycle multigrid method to compute the solution of the 3D scalar Poisson equation. The algorithm cycles through a set of grids ranging from coarse to fine, testing both short- and long-distance data movement.

CG uses a Conjugate Gradient method to compute an approximation to the smallest eigenvalue of a large, sparse, unstructured matrix. This kernel tests unstructured grid computations and communications by using a matrix with randomly generated locations of entries.

EP is an Embarrassingly Parallel benchmark. It generates pairs of Gaussian random deviates according to a specific scheme. The goal is to establish the reference point for peak performance of a given platform. EP is almost independent of the interconnect as communication is minimal.

IS is a parallel integer sort algorithm that is very sensitive to the latency of the interconnect.