Performance

Summary

From a user's perspective, performance is defined by application response
time; from a system perspective, it is measured through key application
metrics such as transaction throughput and resource utilization.
Hardware-related resources such as network throughput and disk access are
common performance bottlenecks.

Note that performance and scalability are not interchangeable terms. They
are two distinct issues: performance relates to how fast the application
responds to user requests, while scalability relates to the system's ability
to add resources to compensate for an increased workload.

Disregarding performance will result in an application that performs poorly.
Responsibility for performance occurs both at design time and at run time:

Design-Time
Developers should not introduce code that hinders application performance
and should follow accepted programming practices. Third-party tools such as
DevPartner can profile code and identify bottlenecks.

Run-Time
The application should undergo mandatory performance testing to identify any
performance bottlenecks, such as contention for resources or slow-running
code. Of course, this should be done after all functional testing has been
completed. In other words, an application must work before it can work well.
Nonetheless, performance testing should begin as early as possible to
identify problem areas as soon as they are introduced into the application.

The following suggestions help identify the expected application load:

A common way to determine the load on the application is to estimate the
number of users.

A related measure is think-time, which is the elapsed time between the
receipt of a reply to a request and the submission of the next request. For
example, if it takes a new user on average 45 seconds to fill in and submit
a Web-based registration form, then the think-time is 45 seconds.
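
The relationship between users, think-time, and expected load can be sketched
as follows; the function name and the sample figures are illustrative, not
from the original text:

```python
def estimated_request_rate(concurrent_users, think_time_s, response_time_s=0.0):
    """Approximate steady-state request rate of a closed system: each
    user receives a reply, thinks, then submits the next request."""
    cycle = think_time_s + response_time_s
    if cycle <= 0:
        raise ValueError("cycle time must be positive")
    return concurrent_users / cycle

# 300 concurrent users with a 45-second think-time and a 5-second
# response time generate about 6 requests per second.
rate = estimated_request_rate(300, 45.0, 5.0)
```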

Another way to estimate load is to evaluate how it varies over time. For
some applications the load remains constant, while for others it varies.
For example, if a Web-based payment processing application sets the deadline
for receiving payments at the end of each month, then it is natural to
expect a heavy load toward the end of each month. Information about how the
load varies over time can be used to determine peak and average system
loads, on which you can then base performance requirements.
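
Deriving peak and average loads from a series of load samples can be sketched
like this (the sample figures are hypothetical):

```python
def load_profile(request_counts):
    """Derive peak and average load from a series of request counts."""
    peak = max(request_counts)
    average = sum(request_counts) / len(request_counts)
    return peak, average

# Hypothetical daily samples for a payment application: the last two
# days of the month carry several times the normal traffic.
samples = [120, 110, 130, 125, 480, 520]
peak, average = load_profile(samples)
```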

Application features typically correspond to use cases and usage scenarios. Here
you need to precisely define the semantics of each feature (use case) that is
performance-sensitive. You need to fully examine how the feature processes
the use case, including verifications, business processing, database access,
caching, and so on. It is these definitions that drive the tests that measure
performance.

Also note that accurate estimates of the usage of various application
services can help create tests that mimic real-life usage of the system.

Changing some aspects of a project to improve performance may not be an
option. For example, if an application has to be delivered by a specified date,
then re-designing it for performance may not be an option. Hardware constraints
may also be a factor, especially for user workstations. All such constraints
must be documented because they remain constant during performance tuning.

Aspects of the project that are not constrained may be changed during
performance tuning. For example, examine whether transactions (when used)
are really needed, or whether new servers can be added to the application
topology. Addressing such issues can help remove bottlenecks from the
system.

Capacity planning is determining the most efficient way of increasing a
system's performance and scalability, while at the same time predicting the
point at which a resource will cause a bottleneck in the system. The starting
point for capacity planning is determining the application's capacity, which
you can gauge by:

The number of users the application can handle before performance falls off.

The server's ability to handle increased load, whether due to a larger
number of users or to increased request complexity.

The complexity of the application.

Capacity is indirectly influenced by performance. A well-tuned application
can increase capacity by using available resources efficiently and by
releasing resources that are not used by an active process. At some point,
however, the application can no longer handle additional requests without
degrading performance. This is the point at which you either scale up by
upgrading or replacing existing servers, or scale out by adding more
servers.

Ideally, you should do some capacity planning that establishes acceptable
performance benchmarks and resource usage limits. You should also develop a
plan to scale your system as soon as performance degrades.

Performance testing assumes that the application is functioning, stable, and
robust. The application must pass its functional tests before you can test
for performance; otherwise, bugs in the code can mask performance problems
or, even worse, give the false impression of a performance problem.

To tune performance, you must be able to maintain records of each performance
test pass. These records should include:

Exact system configuration, including changes from the previous test pass.

Raw data.

Calculated performance results.
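
A minimal sketch of such a record, with illustrative field names (the
original text does not prescribe a format):

```python
from dataclasses import dataclass

@dataclass
class PerformanceTestPass:
    """One record in the performance-test log (field names illustrative)."""
    test_id: int
    configuration: dict          # exact system configuration used
    changes_from_previous: list  # what differs from the prior pass
    raw_data: list               # e.g. individual response times in ms

    def calculated_results(self):
        """Derive summary metrics from the raw data."""
        return {"mean_ms": sum(self.raw_data) / len(self.raw_data),
                "max_ms": max(self.raw_data)}

record = PerformanceTestPass(1, {"cpus": 4}, ["added index"], [10, 20, 30])
```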

Automate as much of the performance testing as possible to eliminate
operator differences. Also, run the exact same set of performance tests
during each test pass; otherwise, it will not be possible to tell whether
different performance results are due to code changes or to test changes.

Testing should be as realistic as possible. For example, test the application
to determine how it performs when many clients are accessing it
simultaneously (a multi-threaded test harness can simulate multiple clients
in a reproducible manner). If the application accesses a database, the
database should contain a
realistic number of records. If the database is too small, performance results
will be inaccurate.
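
A minimal multi-threaded test harness along these lines might look like the
following sketch; send_request is a stand-in for a call to the application
under test:

```python
import threading

def run_clients(client_count, requests_per_client, send_request):
    """Minimal multi-threaded harness: each thread plays one client and
    the collected replies are reproducible in content (not in order)."""
    results = []
    lock = threading.Lock()

    def client(client_id):
        for i in range(requests_per_client):
            reply = send_request(client_id, i)
            with lock:
                results.append(reply)

    threads = [threading.Thread(target=client, args=(c,))
               for c in range(client_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# A real harness would call the application and record response times.
replies = run_clients(5, 10, lambda client_id, i: (client_id, i))
```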

Make sure you document how to set up a database for running a performance
test. The instructions should specify that the database must not include
changes made by a previous test pass.

After defining performance goals and developing the performance tests, run
the tests once to establish a baseline. The baseline results, together with
documentation of the initial test environment, provide a solid foundation
for the tuning effort.

Stress testing is a specialized form of performance testing. The goal of
stress testing is to crash the application by increasing the processing load
until the application begins to fail due to saturation of resources or
occurrence of errors. Stress testing often reveals subtle bugs that go unnoticed
until the application is deployed. While some of these bugs may be logical
(for example, an array limit is exceeded), most often they are the result of
design flaws. Therefore, stress testing should begin early in the
development phase of each part of the application, and such bugs should be
fixed at their source rather than patched wherever the symptoms of the
source bug appear.
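
The load-ramping idea behind stress testing can be sketched as follows;
apply_load is a hypothetical probe that reports whether the application still
copes at a given load level:

```python
def stress_until_failure(apply_load, start=10, step=10, limit=1000):
    """Raise the load step by step until the application fails.
    apply_load(n) returns True while the application still copes with
    n concurrent requests; the first failing level is returned."""
    load = start
    while load <= limit:
        if not apply_load(load):
            return load    # saturation point found
        load += step
    return None            # no failure within the tested range

# Toy system under test that saturates at 250 concurrent requests.
breaking_point = stress_until_failure(lambda n: n < 250)
```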

Finding the cause of poor performance is often like conducting a scientific
experiment, and you can most often solve performance problems by following
the same process used for scientific experimentation. This process consists
of six steps:

Observation.

Preliminary hypothesis.

Prediction.

Controls.

Tests.

Theory.

The output of the experiment is a theory which consists of a
hypothesis supported by a collection of evidence accumulated by the process.

For example, you observe poor performance in a distributed application that
uses thread pooling for server-side objects. Using the performance monitor,
PerfMon, you (1) observe that the number of threads per CPU never exceeds
10. You (2) hypothesize that the maximum number of threads per thread pool
is set to a low value and needs to be increased. You (3) predict that
increasing the ThreadCountPerThreadPool property will improve performance.
The ThreadCountPerThreadPool property has now become the (4) control, and
you start (5) testing various values of this property to see how each
affects performance. If more satisfactory performance is achieved after
several adjustments to this property, you establish a (6) theory that
certain property settings can provide enhanced performance in combination
with all current variables.

Performance tuning is the main activity associated with performance
management. Reduced to its most basic level, performance tuning is about
finding and eliminating performance bottlenecks. Bottlenecks usually appear
when a piece of hardware or software approaches the limit of its capacity.

Tuning the performance of an application follows the tuning cycle shown
below:

However, before starting the performance tuning cycle, you need to establish
the framework for ongoing performance tuning activities. For example:

Identify Constraints
Constraints such as manageability requirements and budget limits are factors
that cannot be altered in the search for higher performance. Focus
performance work on the factors that are not constrained.

Specify the Workload
The most common metrics for specifying the workload are the number of users,
user think-time, and load distribution.

Set Performance Goals
Performance goals must be explicit. Total system throughput and response
time are two common metrics used to measure performance.

After establishing boundaries and expectations for performance tuning,
you can begin the tuning cycle. As shown in the figure above, the tuning
cycle is an iterative series of controlled performance experiments
consisting of four steps:

This is the starting point of any tuning exercise. During this phase you
simply collect performance data using the set of performance counters that
you have chosen for a specific part of the system. These counters (often
from PerfMon) might cover the CPU, threads, network I/O, back-end database
connections, and so on.

Regardless of what part you are tuning, you require a baseline measurement
against which you should compare performance changes. You can use your first
data-gathering pass to establish a baseline set of values for the system's
behavior.

After collecting performance data, you analyze it to determine performance
bottlenecks. Keep in mind that a performance counter is only an indicator;
it does not necessarily identify the bottleneck, because a bottleneck can
often be traced back to multiple sources. One of the best examples is a CPU
performance counter, which can be directly affected by low disk space. It is
also common for problems in one system component to result from problems in
another.

The following points provide guidelines for interpreting counter values and
eliminating false or misleading data:

Monitoring processes with the same name
Track processes using process ID rather than name. The system monitor may
represent data for separate instances having the same name by reporting the
combined values of these instances as the value of a single instance.

Monitoring several threads
Trace threads by including identifiers of the process's thread. When
monitoring several threads and one of them stops, the data for one thread
might appear to be reported by another thread. This is because of the way
threads are numbered.

Intermittent data spikes
Do not give them too much weight. Counters that report averages can cause
the effects of spikes to linger in the reported value.

Monitoring over an extended period
Use graphs rather than reports or histograms.

Excluding start-up events
Unless you have a reason for including start-up events, exclude them because
the temporarily high values they produce tend to skew overall results.

Zero values or missing data
Investigate all occurrences of zero values or missing data, as this has
negative effects on establishing a meaningful baseline.
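
Two of these guidelines, excluding start-up events and damping intermittent
spikes, can be sketched as follows (the counter samples are invented):

```python
from statistics import mean, median

def summarize_counter(samples, startup_skip=0):
    """Summarize counter samples, optionally excluding start-up events.
    The median is far less sensitive to intermittent spikes than the
    mean, so reporting both helps spot lingering spike effects."""
    steady = samples[startup_skip:]
    return {"mean": mean(steady), "median": median(steady)}

# Invented CPU samples: two high start-up readings, then one spike.
cpu = [95, 90, 20, 22, 21, 98, 23, 22]
summary = summarize_counter(cpu, startup_skip=2)
```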

After collecting and analyzing data, you can determine which part of the
system is a candidate for a configuration change, and then implement this
change. When tuning performance, the cardinal rule is to implement only one
configuration change at a time before repeating the tuning cycle.

After completing a single configuration change, determine the impact of this
single change on system performance: did it improve performance, degrade it,
or have no effect? If performance has improved to a satisfactory level, you
can stop; otherwise, you must step through the tuning cycle again.
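
The one-change-at-a-time rule can be sketched as a loop; the metric and the
pool_size parameter below are toy stand-ins, not from the original text:

```python
def tuning_cycle(measure, candidate_values, config, key):
    """Try candidate settings for one parameter, one change per pass,
    keeping a value only when the measured metric improves (higher is
    better in this sketch)."""
    best_score = measure(config)
    for value in candidate_values:
        previous = config[key]
        config[key] = value            # exactly one change
        score = measure(config)        # re-run the performance tests
        if score > best_score:
            best_score = score         # keep the improvement
        else:
            config[key] = previous     # revert and try the next value
    return config, best_score

# Toy metric: throughput peaks when pool_size is 30.
metric = lambda cfg: -abs(cfg["pool_size"] - 30)
cfg, score = tuning_cycle(metric, [10, 20, 30, 40],
                          {"pool_size": 5}, "pool_size")
```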

Reuse work by caching
One of the best ways to improve performance is not to do the same work again
and again. For example, static data (that is, data that does not change)
should not be fetched repeatedly from the database; it should be cached so
that it is available immediately.
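
A minimal caching sketch, using Python's functools.lru_cache and a
hypothetical lookup in place of a real database query:

```python
import functools

calls = []   # counts how often the "database" is actually hit

@functools.lru_cache(maxsize=None)
def fetch_country_codes():
    """Hypothetical static lookup; imagine a database SELECT here."""
    calls.append(1)
    return {"US": 1, "UK": 44}

fetch_country_codes()   # first call hits the database
fetch_country_codes()   # second call is served from the cache
```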

Warn the user
Warn the user ahead of any potentially long-running operations. Long-running
operations should be performed asynchronously.
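
Running a long operation asynchronously can be sketched with a worker
thread; start_long_operation is an illustrative helper, not an API from the
text:

```python
import threading

def start_long_operation(work, on_done):
    """Run a potentially long operation off the calling thread so the
    interface stays responsive; on_done receives the result."""
    def runner():
        on_done(work())
    worker = threading.Thread(target=runner)
    worker.start()
    return worker   # the caller can show a progress indicator meanwhile

results = []
worker = start_long_operation(lambda: sum(range(1000)), results.append)
worker.join()       # in real code the caller would not block like this
```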

Tune the database
Using a database can introduce bottlenecks when reading/writing data. There
are numerous steps to optimize data access:

Identify potential indexes and use them.

If using SQL Server, use its Profiler and Index
Tuning Wizard.

If using SQL Server, analyze query plans using Query
Analyzer.

Monitor processor usage.

Use stored procedures to maximize performance.

Write less data by normalizing what you write frequently.

Read less data by de-normalizing what you read frequently.
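
The effect of an index on a query plan can be demonstrated with SQLite
standing in for SQL Server (the schema and data are illustrative):

```python
import sqlite3

# SQLite stands in for SQL Server here; the indexing principle is the same.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customers (id INTEGER, last_name TEXT)")
con.executemany("INSERT INTO Customers VALUES (?, ?)",
                [(i, "name%d" % i) for i in range(1000)])

def plan(sql):
    """Return the textual query plan for a statement."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql)
    return " ".join(row[3] for row in rows)

query = "SELECT * FROM Customers WHERE last_name = 'name500'"
before = plan(query)    # without an index: a full table scan
con.execute("CREATE INDEX ix_last ON Customers(last_name)")
after = plan(query)     # with the index: an index search
```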

Partition large data tables
When accessing large data tables, increase processing speed by partitioning
them horizontally or vertically. Horizontal partitioning divides a table
containing a large number of rows into multiple tables with the same
columns, each containing a subset of the rows. For example, a [Customers]
table can be horizontally partitioned into two tables: table 1 contains all
customers with last names beginning with the letters A to M, while table 2
contains all customers with last names beginning with the letters N to Z.
Vertical partitioning segments a table containing a large number of columns
into multiple tables with the same rows, each containing a subset of the
columns.
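
The A-to-M / N-to-Z horizontal partitioning example can be sketched as a
routing function (the table names are illustrative):

```python
def partition_for(last_name):
    """Route a customer row to its horizontal partition by last name:
    A-M go to Customers_1, N-Z go to Customers_2 (illustrative names)."""
    first = last_name[0].upper()
    return "Customers_1" if "A" <= first <= "M" else "Customers_2"

table = partition_for("Miller")   # routed to Customers_1
```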

Stress test your application
You cannot fully determine where bottlenecks exist in your application
unless you test it under load. The Application Stress Tool can be helpful in
simulating stress for your application.

Use transactions wisely
Transactions should be short-lived and only encapsulate what is required.
Distributed transactions involve significant overhead that can adversely
impact application performance. Distributed transactions should only be used
when absolutely necessary.

Reduce network communications
Communication across application or process boundaries affects performance,
so try to reduce network roundtrips when calling remote methods or
procedures. For example, if you have a remote object, rather than setting
five different properties on it and then calling a remote method that uses
those values (six network roundtrips), modify the method to accept five
arguments and call it once (one network roundtrip). This design approach
indirectly encourages a stateless business object layer.
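
The six-roundtrips-versus-one example can be sketched by counting calls on a
fake remote proxy (all names below are illustrative):

```python
class RemoteOrderProxy:
    """Stand-in for a remote object; every method call counts as one
    network roundtrip (all names here are illustrative)."""
    def __init__(self):
        self.roundtrips = 0
        self._props = {}

    def set_property(self, name, value):    # chatty: one trip per property
        self.roundtrips += 1
        self._props[name] = value

    def submit(self):                       # uses the properties set above
        self.roundtrips += 1
        return dict(self._props)

    def submit_with(self, **props):         # chunky: one trip carries it all
        self.roundtrips += 1
        return dict(props)

chatty = RemoteOrderProxy()
for name in ("item", "qty", "price", "ship_to", "priority"):
    chatty.set_property(name, None)
chatty.submit()              # six roundtrips in total

chunky = RemoteOrderProxy()
chunky.submit_with(item=None, qty=None, price=None,
                   ship_to=None, priority=None)    # one roundtrip
```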

Use security wisely
Limit the use of security to the parts of the application that truly need
it. For example, accessing Web pages over SSL/TLS incurs significant
overhead because all communication between the server and the client is
encrypted.