Includes the manufacturer/processor type, processor speed, number of processors, threads, and number of processes.
Move the mouse over this column in any row to display additional information,
including manufacturer, system name, interconnect, MPI, affiliation, and submission date.

Run Type

Run Type indicates whether the benchmark was a base run or an optimized run.

Processors

Processors is the number of processors used in the benchmark, as entered in the submission form by the benchmark submitter.

PP-HPL ( per processor )

HPL solves a randomly generated dense linear system of equations in
double-precision (IEEE 64-bit) floating-point arithmetic
using MPI. The linear system matrix is stored in a two-dimensional block-cyclic
fashion, and multiple variants of code
are provided for computational kernels and communication patterns.
The solution method is LU factorization through Gaussian elimination with
partial row pivoting, followed by backward substitution.
Unit: Tera Flops per Second
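
The solution method named above can be illustrated with a minimal serial sketch. It is not the HPL code, which is distributed over MPI and uses blocked kernels; the function name solve_lu_partial_pivot is hypothetical and the matrix is assumed to be a small dense list of rows.

```python
def solve_lu_partial_pivot(a, b):
    """Solve a x = b by Gaussian elimination with partial row pivoting.

    Serial illustration only; `a` is a list of rows, `b` a list of values.
    """
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        # Partial pivoting: choose the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # Eliminate the entries below the pivot in column k.
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    # Backward substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

The second call in a quick check such as solve_lu_partial_pivot([[0.0, 1.0], [1.0, 0.0]], [2.0, 3.0]) only succeeds because of the row swap, which is the reason pivoting is part of the method.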

PP-PTRANS (A=A+B^T, MPI) ( per processor )

PTRANS (A=A+B^T, MPI) implements a parallel matrix transpose for two-dimensional block-cyclic
storage. It is an important benchmark
because it exercises the communications of the computer heavily on a realistic
problem where pairs of processors communicate
with each other simultaneously. It is a useful test of the total communications
capacity of the network.
Unit: Giga Bytes per Second
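
To see why pairs of processors end up communicating, the sketch below maps a global matrix element to its owner under a two-dimensional block-cyclic layout. It assumes square blocks of size nb, a p-by-q process grid, and distribution starting at grid position (0, 0); the function names are hypothetical and are not taken from the PTRANS source.

```python
def owner(i, j, nb, p, q):
    """Grid coordinates of the process owning global element (i, j).

    Assumes nb-by-nb blocks dealt out cyclically over a p-by-q grid,
    starting at grid position (0, 0).
    """
    return ((i // nb) % p, (j // nb) % q)


def transpose_pair(i, j, nb, p, q):
    """Owners of A[i][j] and B[j][i], which must exchange data for A = A + B^T."""
    return owner(i, j, nb, p, q), owner(j, i, nb, p, q)
```

When the two owners returned by transpose_pair differ, the element of B has to cross the network before it can be added to A, which is the traffic this test measures.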

PP-RandomAccess ( per processor )

Global RandomAccess, also called GUPS, measures the rate at which the computer can update
pseudo-random locations
of its memory. This rate is expressed in billions (giga) of updates per second
(GUP/s).
Unit: Giga Updates per Second
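
A minimal sketch of the update pattern and of the rate calculation follows. It assumes a table whose length is a power of two and uses Python's random module as a stand-in for the benchmark's own pseudo-random sequence; apply_updates and gups are hypothetical names.

```python
import random


def apply_updates(table, n_updates, seed=0):
    """XOR pseudo-random values into pseudo-random table locations.

    Assumes len(table) is a power of two. The generator here is Python's
    own, not the sequence used by the benchmark.
    """
    mask = len(table) - 1
    rng = random.Random(seed)
    for _ in range(n_updates):
        r = rng.getrandbits(64)
        table[r & mask] ^= r  # low bits pick the location, XOR updates it
    return table


def gups(n_updates, seconds):
    """Update rate in billions (giga) of updates per second."""
    return n_updates / seconds / 1e9
```

Because XOR is its own inverse, applying the same update stream a second time restores the original table, which gives a simple way to check a run.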

PT-SN-STREAM ( per thread )

The Single Process STREAM benchmark is a simple synthetic benchmark program that measures
sustainable memory bandwidth and the corresponding computation rate for simple numerical
vector kernels. It is run on a single computational process chosen at random.
Unit: Giga Bytes per Second
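
As an illustration of a "simple numerical vector kernel", the sketch below shows the Triad operation, one of the STREAM kernels, together with the usual bandwidth accounting of two reads and one write per element. It assumes 8-byte words; the function names are hypothetical, and interpreted Python will not approach the bandwidth a compiled kernel reaches.

```python
from array import array


def triad(a, b, c, q):
    """STREAM-style Triad kernel: a[i] = b[i] + q * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def triad_bandwidth_gbs(n, seconds, bytes_per_word=8):
    """Bandwidth in GB/s, counting two reads and one write per element."""
    return 3 * bytes_per_word * n / seconds / 1e9


# Example data: typed arrays of doubles, so each element is 8 bytes.
a = array("d", [0.0] * 4)
b = array("d", [1.0, 2.0, 3.0, 4.0])
c = array("d", [10.0, 20.0, 30.0, 40.0])
triad(a, b, c, 0.5)
```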

PT-SN-DGEMM ( per thread )

The Single Process DGEMM benchmark measures the floating-point execution rate of double precision
real matrix-matrix multiply performed by the DGEMM subroutine from the BLAS (Basic Linear Algebra
Subprograms). It is run on a single computational process chosen at random.
Unit: Giga Flops per Second
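
The operation DGEMM performs, C = alpha*A*B + beta*C, can be written as a naive reference sketch. This is not the BLAS routine, which is heavily optimized; the flop rate below assumes the conventional count of 2*m*n*k operations, and both function names are hypothetical.

```python
def dgemm(alpha, a, b, beta, c):
    """Return alpha * A * B + beta * C for matrices given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


def dgemm_gflops(m, n, k, seconds):
    """Execution rate in Gflop/s, assuming 2*m*n*k floating-point operations."""
    return 2.0 * m * n * k / seconds / 1e9
```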

PP-FFT ( per processor )

Global FFT performs the same test as the single-process FFT, a double-precision complex one-dimensional Discrete Fourier Transform, but across the entire system by
distributing the input vector in block fashion across all the processes.
Unit: Giga Flops per Second
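
Distributing the input vector "in block fashion" can be sketched as giving each process one contiguous slice. The sketch assumes the vector length n divides evenly by the number of processes p, and the flop rate assumes the customary 5*n*log2(n) operation count for a complex FFT of length n; both function names are hypothetical.

```python
import math


def block_range(rank, n, p):
    """Half-open index range [start, stop) of the slice held by `rank`.

    Assumes n is divisible by p, so every process holds n // p elements.
    """
    if n % p:
        raise ValueError("n must be divisible by p in this sketch")
    size = n // p
    return rank * size, (rank + 1) * size


def fft_gflops(n, seconds):
    """Execution rate in Gflop/s, assuming 5 * n * log2(n) operations."""
    return 5.0 * n * math.log2(n) / seconds / 1e9
```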

Randomly Ordered Ring Bandwidth ( per process )

Randomly Ordered Ring Bandwidth reports the bandwidth achieved in the ring
communication pattern. The communicating processes are ordered randomly in the
ring (with respect to the natural ordering of the MPI default communicator).
The result is averaged over various random assignments of processes in the
ring.
Unit: Giga Bytes per Second
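
One random assignment of processes to the ring can be sketched as a shuffle of the natural rank order, after which each process talks to its left and right neighbors in the shuffled order. The function name and the seeding are assumptions made for illustration; the benchmark itself does this with MPI ranks.

```python
import random


def random_ring_neighbors(n_procs, seed=0):
    """Map each rank to its (left, right) neighbors in a randomly ordered ring."""
    order = list(range(n_procs))  # natural ordering of the ranks
    random.Random(seed).shuffle(order)  # one random assignment
    neighbors = {}
    for pos, rank in enumerate(order):
        left = order[(pos - 1) % n_procs]
        right = order[(pos + 1) % n_procs]
        neighbors[rank] = (left, right)
    return neighbors
```

Repeating this with different seeds and averaging the measured results corresponds to the averaging over random assignments described above.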

Randomly Ordered Ring Latency ( per process )

Randomly Ordered Ring Latency reports the latency in the ring communication
pattern. The communicating processes are ordered randomly in the ring (with respect to the
natural ordering of the MPI default communicator). The result
is averaged over various random assignments of processes in the ring.
Unit: Microseconds

The values plotted for HPL, PTRANS, RandomAccess, and FFT are per processor.
The values plotted for SN-DGEMM and SN-STREAM are per thread. The value plotted
for Randomly Ordered Ring Latency is normalized by taking its reciprocal. Only systems
that have values for all the plotted tests are available for the diagram. Use the left-hand
column to select up to six systems to plot in the Kiviat diagram.
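
The plotting rules above can be sketched as follows. The dictionary keys and function names are hypothetical and do not reflect the site's actual data format, and the per-thread handling of SN-DGEMM and SN-STREAM is left out because the text does not say how it is computed.

```python
# Hypothetical field names for one submitted system.
PER_PROCESSOR = ("hpl", "ptrans", "random_access", "fft")
REQUIRED = PER_PROCESSOR + (
    "sn_dgemm", "sn_stream", "ring_bandwidth", "ring_latency", "processors",
)


def plottable(system):
    """A system qualifies only if it has a value for every plotted test."""
    return all(system.get(key) is not None for key in REQUIRED)


def kiviat_values(system):
    """Per-processor values plus the reciprocal of the ring latency."""
    procs = system["processors"]
    values = {key: system[key] / procs for key in PER_PROCESSOR}
    values["ring_latency_reciprocal"] = 1.0 / system["ring_latency"]
    return values
```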