It is possible to cost-effectively design performance into new software systems. Software performance engineering (SPE) provides a systematic, quantitative approach to managing performance throughout the development process.

Knowledge of what is possible is the beginning of happiness.
George Santayana

In This Chapter:

Performance failures and their consequences

Managing performance

Performance successes

What is software performance engineering?

SPE models and modeling strategies

1.1 Software and Performance

This book is about developing software systems that meet performance
objectives. Performance is an indicator of how well a software system or
component meets its requirements for timeliness. Timeliness is measured in terms
of response time or throughput. The response time is the time required to
respond to a request. It may be the time required for a single transaction, or
the end-to-end time for a user task. For example, we may require that an online
system provide a result within one-half second after the user presses the
"enter" key. For embedded systems, it is the time required to respond
to events, or the number of events processed in a time interval. The
throughput of a system is the number of requests that can be processed in
some specified time interval. For example, a telephony switch may be required to
process 100,000 calls per hour.
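
These two measures can be made concrete with a short sketch. The figures (100,000 calls per hour; a one-half-second response objective) come from the examples above; the function and variable names are ours:

```python
# Timeliness metrics: throughput and a response time objective.

def throughput(requests: int, interval_seconds: float) -> float:
    """Requests completed per second over a measurement interval."""
    return requests / interval_seconds

# The telephony switch example: 100,000 calls per hour.
calls_per_second = throughput(100_000, 3600.0)
print(f"{calls_per_second:.1f} calls/sec")  # about 27.8

# The online-system example: a result within one-half second.
RESPONSE_TIME_OBJECTIVE = 0.5  # seconds
measured = 0.42                 # an illustrative measurement
print("objective met" if measured <= RESPONSE_TIME_OBJECTIVE
      else "objective missed")
```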

NOTE

Performance is the degree to which a software system or component meets
its objectives for timeliness.

Thus, performance is any characteristic of a software product that you could,
in principle, measure by sitting at the computer with a stopwatch in your
hand.

NOTE

Other definitions of performance include additional characteristics such as
footprint or memory usage. In this book, however, we are concerned primarily
with issues of timeliness.

There are two important dimensions of software performance timeliness:
responsiveness and scalability.

1.1.1 Responsiveness

Responsiveness is the ability of a system to meet its objectives for
response time or throughput. In end-user systems, responsiveness is typically
defined from a user perspective. For example, responsiveness might refer to the
amount of time it takes to complete a user task, or the number of transactions
that can be processed in a given amount of time. In real-time systems,
responsiveness is a measure of how fast the system responds to an event, or the
number of events that can be processed in a given time.

In end-user applications, responsiveness has both an objective and a
subjective component. For example, we may require that the end-to-end time for a
withdrawal transaction at an ATM be one minute. However, that minute may feel
very different to different users. For a user in Santa Fe in the summer, it may
seem quite reasonable. To a user in Minneapolis in January, a minute may seem
excessively long. Both objective and user-perceived (subjective) responsiveness
must be addressed when performance objectives are specified. For example, you
can improve the perceived responsiveness of a Web application by
presenting user-writable fields first. Then, build the rest of the page (e.g.,
the fancy graphics) while the user is filling in those fields.
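
The field-first technique can be sketched as a page renderer that streams the user-writable fields to the browser before building the slower content. This is only an illustration (the generator approach and all names are ours; real Web frameworks differ):

```python
# Sketch: improve perceived responsiveness by streaming the user-writable
# fields first, then building the slower content while the user types.
import time

def render_page():
    # Send the form immediately so the user can start filling it in.
    yield "<form><input name='account'><input name='amount'></form>"
    # Build the expensive parts (e.g., the fancy graphics) afterward.
    time.sleep(0.1)  # stands in for slow graphics generation
    yield "<img src='fancy-banner.png'>"

for chunk in render_page():
    print(chunk)
```

The total work is unchanged, but the user perceives the page as responsive because the part they interact with arrives first.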

NOTE

Responsiveness is the ability of a system to meet its objectives for
response time or throughput.

1.1.2 Scalability

Scalability is the ability of a system to continue to meet its response
time or throughput objectives as the demand for the software functions increases.
The graph in Figure 1-1
illustrates how increasing use of a system affects its response time.

In Figure 1-1, we've
plotted response time against the load on the system, as measured by the number
of requests per unit time. As you can see from the curve, as long as you are
below a certain threshold, increasing the load does not have a great effect
on response time. In this region, the response time increases linearly with
the load. At some point, however, a small increase in load begins to have a
great effect on response time. In this region (at the right of the curve), the
response time increases exponentially with the load. This change from a linear
to an exponential increase in response time is usually due to some resource
in the system (e.g., the CPU, a disk, the network, sockets, or threads) nearing
one hundred percent utilization. This resource is known as the "bottleneck"
resource. The region where the curve changes from linear to exponential is known
as the "knee" because of its resemblance to a bent knee.
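
The shape of this curve can be approximated with a simple single-queue model in which the response time at the bottleneck is R = S / (1 - U), where S is the service time per request and U is the bottleneck's utilization. The model and the numbers below are our illustration, not figures from the text:

```python
# Approximate the Figure 1-1 curve with a single-queue model:
# R = S / (1 - U), where U = arrival_rate * service_time.

def response_time(arrival_rate: float, service_time: float) -> float:
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("bottleneck saturated: utilization >= 100%")
    return service_time / (1.0 - utilization)

S = 0.01  # assume 10 ms per request at the bottleneck resource
for rate in (10, 50, 90, 99):  # requests per second
    print(f"{rate:3d} req/s -> {response_time(rate, S) * 1000:6.1f} ms")
```

At low load the response time barely moves, but as utilization nears 100 percent it grows sharply, which is exactly the "knee" behavior described above.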

Web applications are discussed in Chapters 5, 7, and
13.

NOTE

Scalability is the ability of a system to continue to meet its response
time or throughput objectives as the demand for the software functions
increases.

Scalability is an increasingly important aspect of today's software
systems. Web applications are a case in point. It is important to maintain the
responsiveness of a Web application as more and more users converge on a site.
In today's competitive environment, users will go elsewhere rather than
endure slow response times.

To build scalability into your system, you must know where the
"knee" of the scalability curve falls for your hardware/software
environment. If the "knee" occurs before you reach your target load,
you must either reduce the utilization of the bottleneck resource
by streamlining the processing, or add hardware (e.g., a faster CPU
or an extra disk) to remove the bottleneck.
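
A minimal sketch of this check: estimate the bottleneck's utilization at the target load and compare it to a knee threshold. The 70 percent figure is a common capacity-planning rule of thumb, not a value from the text, and the function names are ours:

```python
# Sketch: does the target load push the bottleneck past the "knee"?
# The 70% utilization threshold is an assumed rule of thumb.

KNEE_UTILIZATION = 0.70

def bottleneck_utilization(target_rate: float, service_time: float) -> float:
    """Utilization of the bottleneck resource at the target request rate."""
    return target_rate * service_time

def scalability_check(target_rate: float, service_time: float) -> str:
    u = bottleneck_utilization(target_rate, service_time)
    if u < KNEE_UTILIZATION:
        return f"OK: utilization {u:.0%} is below the knee"
    return (f"Action needed: utilization {u:.0%} is at or past the knee; "
            "streamline processing or add hardware")

print(scalability_check(target_rate=60, service_time=0.01))
print(scalability_check(target_rate=120, service_time=0.01))
```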

This book presents an integrated set of solutions that you can use to build
responsiveness and scalability into your software systems. These solutions
include a combination of modeling, measurement, and other techniques, as well as
a systematic process for applying them. They also include principles, patterns,
and antipatterns that help you design responsiveness and scalability into your
software. These techniques focus primarily on early life cycle phases to
maximize your ability to economically build performance into your software.
However, we also present solutions for systems that already exhibit performance
problems.