System Benchmarking

By Dr. T.S. Kelso

March/April 1996

Throughout the development of this column, we have been faced with two
inescapable conclusions: satellite tracking can be computationally intensive and
requires careful attention to ensure accurate results. These conclusions are
inescapable regardless of whether you are developing your own satellite tracking
software or using someone else's.

As a result of these conclusions, it becomes important at some point to
consider the issue of benchmarks to assess certain performance characteristics
of any satellite tracking system. As we shall see, benchmarks can be used for a
variety of purposes: evaluating the performance of a particular hardware
configuration, operating system, programming language, or specific satellite
tracking application. Which consideration—speed or accuracy—is most
important to you will depend upon your needs, but we will address several
approaches to assess each.

Which benchmark is best to use depends heavily on the application. In the
area of satellite tracking, we would expect the best benchmarks to be
floating-point intensive, that is, to make heavy use of real calculations as opposed to
integer calculations. We would also expect them to make heavy use of
trigonometric and other mathematical functions. While we could use existing
standard benchmarks, the drawback to this approach is that often it is not clear
what the basis of the benchmark calculation is and, hence, how appropriate it is
to assessing the performance of a particular class of applications.

This column (and the next) will endeavor to provide you with a suite of
benchmarks that are not only simple but will also allow you to assess
performance across the spectrum of operations. We will begin by developing a set
of benchmarks with known solutions. The easiest of these is something known as
the Savage benchmark.

The Savage benchmark is particularly well suited to the fields of
satellite tracking and astronomy because it relies entirely on the repeated
use of mathematical functions which yield a known (and easily calculated)
numerical result. The benchmark is built around matching pairs of inverse
mathematical functions. Sample code, written for Borland Pascal 7.0, is
provided in figure 1 below. Note that units from the SGP4 Pascal Library have
been used to time the calculation.
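
Since figure 1 is not reproduced here, the short sketch below shows the
shape of the calculation. It is a minimal stand-in for the actual listing
and uses the standard Dos unit's GetTime for timing in place of the SGP4
Pascal Library units:

program SavageBM;
{ Minimal sketch of the Savage benchmark, not the figure 1 listing. }
{$N+,E+}                         { enable the 80x87 double type }
uses Dos;

function Tan(x : double) : double;
begin                            { Borland Pascal has no built-in tangent }
  Tan := Sin(x) / Cos(x);
end;

function Seconds : double;       { time of day, in seconds }
var
  h, m, s, s100 : word;
begin
  GetTime(h, m, s, s100);
  Seconds := 3600.0*h + 60.0*m + s + s100/100.0;
end;

var
  a      : double;
  i      : integer;
  t0, t1 : double;
begin
  t0 := Seconds;
  a  := 1.0;
  for i := 1 to 2499 do          { three pairs of inverse functions }
    a := Tan(ArcTan(Exp(Ln(Sqrt(a*a))))) + 1.0;
  t1 := Seconds;
  WriteLn('Result = ', a:22:15); { should be exactly 2,500 }
  WriteLn('Error  = ', a - 2500.0);
  WriteLn('Time   = ', t1 - t0:8:2, ' seconds');
end.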

The heart of the calculation consists of taking the variable a,
starting with a value of one, and incrementing it by one 2,499 times until
a = 2,500. To make things more interesting, though, we evaluate
a at each step using three pairs of matching inverse functions: square and
square root, exponential and logarithm, and tangent and arc tangent. The result
of each pair of matching inverse functions should be the original value
of a, and the result of all the calculations should be exactly
2,500. Of course, due to limitations of the hardware, operating system, and
programming language, the calculation will not yield exactly the expected
result. How close we come to the known solution and how quickly we can
calculate it are the two dimensions of this benchmark. Table 1 shows the
results for a range of systems in current use for differing levels of
numerical precision.

The results are actually rather illuminating. Even the slowest machine beats
the time of a Cray X-MP/24 from a decade ago [1].
And while the accuracy isn't quite as high as with the Cray, the adoption of
standards for numerical processing has resulted in consistent results for
standard data types. Finally, the use of numeric coprocessors allows high
precision (certainly compared to the single-precision results) for very little
additional time.

How might this benchmark be used? Well, because of its simplicity (relying
entirely on number crunching), it can be used to demonstrate anything from
the 18-fold performance improvement in going from a 386DX-33 to a Pentium
133, to differences between programming languages (e.g., Pascal vs. C) or
between operating systems (e.g., Windows 95 vs. Unix). The Savage benchmark
is a straightforward way of assessing the kind of computational speed and
accuracy required for satellite tracking precisely because it avoids
measuring things like disk throughput or video performance.

Let's put aside the issue of speed for a moment and address the issue of
accuracy. Of course, we've been assessing accuracy throughout the history of
this column. Each time we've presented the theory behind a particular aspect of
satellite tracking, we've followed up with a specific numerical example to
ensure you can implement the theory from start to finish. These examples are
usually fairly simple because we're only looking at a small piece of the larger
picture. However, as we pull these smaller pieces together, it becomes
increasingly important to be able to assess the accuracy of the resulting
complex procedures. We do this through the use of standard test cases.

An example of a standard test case would be to provide element sets for a
particular orbital model (e.g., NORAD two-line element sets for the SGP4 orbital
model) and the output from a known correct implementation of the orbital model.
An example of such test cases is included in the appendix of
Spacetrack Report Number 3 [2]. The sample test cases in this report
include an element set for one near-earth and one deep-space satellite and the
resulting SGP4 state vectors (ECI position and velocity) at points over a
specific time interval. These test cases can be used to verify the proper
implementation of the SGP4 (near-earth) and SDP4 (deep-space) portions of the
current NORAD orbital model for a particular satellite tracking application.

For example, we can verify the implementation of the SGP4 model used by
TrakStar by running the
code in figure 2 with the data in
Spacetrack Report Number
3 (included in the code).

The results show agreement at the meter level in position and the
millimeter-per-second level in velocity. Most of the disparity comes from
refinements to the constants used in the model together with modifications to
the code since its initial release. Obviously, it is important to have good
current test cases to use to verify your software. Perhaps not so obvious is the
need to have a more diverse set of orbital elements to test against. Such a set
would go further toward testing all aspects of the complicated SGP4 model and
provide better confidence in a particular implementation.
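
To make such a comparison concrete, the sketch below differences a
computed state vector against its reference values and checks the result
against meter and millimeter-per-second tolerances. The Vector type, the
tolerance values, and the zeroed placeholders are assumptions made for
illustration; they are not definitions from TrakStar or the SGP4 Pascal
Library:

program SGP4Check;
{ Sketch of a test-case comparison against the reference state vectors
  tabulated in Spacetrack Report Number 3. All declarations here are
  illustrative placeholders. }
{$N+,E+}
type
  Vector = array [1..3] of double;

const
  rTol = 1.0E-3;                 { 1 m position tolerance, in km }
  vTol = 1.0E-6;                 { 1 mm/s velocity tolerance, in km/s }
  { placeholders: substitute the computed vectors and the reference
    vectors from the report's appendix }
  r    : Vector = (0.0, 0.0, 0.0);
  v    : Vector = (0.0, 0.0, 0.0);
  rRef : Vector = (0.0, 0.0, 0.0);
  vRef : Vector = (0.0, 0.0, 0.0);

function VecDiff(const a, b : Vector) : double;
begin                            { magnitude of the difference vector }
  VecDiff := Sqrt(Sqr(a[1]-b[1]) + Sqr(a[2]-b[2]) + Sqr(a[3]-b[3]));
end;

begin
  WriteLn('Position error = ', VecDiff(r, rRef):12:9, ' km');
  WriteLn('Velocity error = ', VecDiff(v, vRef):12:9, ' km/s');
  if (VecDiff(r, rRef) > rTol) or (VecDiff(v, vRef) > vTol) then
    WriteLn('*** outside meter/millimeter-per-second agreement ***');
end.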

Now that we have at least a basic means of assessing the overall accuracy of
a particular implementation of an orbital model, let's return to the question of
speed. Our ultimate benchmark would be to run a standard set of orbital
elements—for satellites in various orbits—over a specific interval
and count how many state vectors can be computed per unit time. That is exactly
what SGP4-BM does. The code in figure 3 is run with the NORAD two-line orbital
elements in figure 4 to calculate the state vectors for each satellite at
one-minute intervals for the entire day of 1993 March 11. By measuring how
long it takes to do these 14,400 calculations (1,440 minutes per day for each
of the ten satellites), we arrive at a figure for the number of SGP4
calculations computed per minute.
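
Stripped to its essentials, the driver is just a pair of nested loops
over the satellites and the minutes of the day. Since figure 3 is not
reproduced here, the sketch below stubs out the SGP4 call; the procedure
interface shown is hypothetical:

program SGP4BM;
{ Sketch of the SGP4-BM driver loop. The SGP4 procedure below is a
  hypothetical stub standing in for the implementation being timed;
  loading the element sets from figure 4 is omitted. }
{$N+,E+}
uses Dos;

const
  NumSats   = 10;                { element sets in figure 4 }
  MinPerDay = 1440;              { one-minute steps over 1993 March 11 }

type
  Vector = array [1..3] of double;

function Seconds : double;       { time of day, in seconds }
var
  h, m, s, s100 : word;
begin
  GetTime(h, m, s, s100);
  Seconds := 3600.0*h + 60.0*m + s + s100/100.0;
end;

procedure SGP4(satNum : integer; tsince : double; var r, v : Vector);
begin
  { stub: a real run would evaluate the SGP4/SDP4 model here }
end;

var
  sat, minute : integer;
  t0, elapsed : double;
  r, v        : Vector;
begin
  t0 := Seconds;
  for sat := 1 to NumSats do
    for minute := 0 to MinPerDay - 1 do
      SGP4(sat, minute, r, v);   { 14,400 state vectors in all }
  elapsed := (Seconds - t0) / 60.0;          { elapsed minutes }
  if elapsed > 0.0 then
    WriteLn('SGP4 calculations per minute: ',
            NumSats * MinPerDay / elapsed:10:1)
  else
    WriteLn('Interval too short to time.');
end.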

For the four systems we looked at earlier, the results are compiled in table 2 below.

Table 2. SGP4 Benchmark Results

                            386DX-33   486DX2-66   Pentium 90   Pentium 133
Calculations per minute      9,285.3    43,221.6    113,089.1     167,441.2

It should be obvious that this benchmark is best suited to determining the
number-crunching ability of any satellite tracking system since that is exactly
what it tests. Interestingly enough, though, we see the same 18-fold improvement
between the 386DX-33 and the Pentium 133 with SGP4-BM as we did with the much
simpler Savage benchmark.

If you think we haven't quite finished our discussion of satellite tracking
benchmarks yet, you're absolutely right. While we've looked at system
benchmarks, we haven't even addressed the need for benchmarking against
real-world data. System benchmarks can only verify that our application is
consistent with existing models but cannot validate that the application
actually works. Next time we'll look at some real-world data sets and see how to
test an application against that data using TrakStar as an example. We
will also discuss various types of real-world data which may be used for this
purpose, depending on your requirements.

As always, if you have questions or comments on this column, feel free to send me
e-mail at TS.Kelso@celestrak.com or write care
of Satellite Times. Until next time, keep looking up!