Scatter charts are powerful because they allow us to apply heuristic pattern-matching. Instead of looking at a data table with thousands of rows and applying statistical methods to simplify the problem, we can use the power of visual patterns to analyze those thousands of points of data at the same time. So why is it that scatter charts often leave us feeling, well, scattered? Michael Kelly provides a clue.

I got hooked on scatter charts a couple of years ago after seeing Scott
Barber give a talk based on his article
"Beyond Performance Testing Part 6: Interpreting Scatter Charts."
I thought, "Wow, that’s cool. I want to be able to look at a big glob
of data and be able to recognize patterns." The problem was that my scatter
charts never looked like his. Of course he could read scatter
charts—his applications failed in ways that were easy to identify
in a scatter chart!

After a few frustrated email messages to Scott asking for help, I learned
that his scatter charts didn’t typically start off looking like that. He
had to manipulate them to find the information he wanted. So what’s the
missing piece? How can you get your scatter charts to tell you a story? This
article looks at some techniques for manipulating scatter charts until the
patterns in the data become visible. It’s intended for experienced performance
testers who want to identify patterns in performance test data faster.

Why We Need Scatter Charts

Performance problems are difficult to solve for many reasons:

The tools we use and the applications we test are buggy.

The results presented by performance tools are often difficult to
interpret.

The number of variables in any performance test is beyond a mere
mortal’s ability to keep in his or her head all at the same
time—network settings, hardware configurations, application settings,
script settings, etc.

When we encounter a failure or a slowdown, it’s very difficult to
determine where to start tuning.

Enter scatter charts. A scatter chart plots each transaction by the time in
the run at which it occurred (x axis) and its response time (y axis). That is,
if a transaction takes place 60 seconds into a run and takes 5 seconds to
complete, it will be plotted at point (60, 5) on a standard graph. Figure 1
shows an example.
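
To make the plotting rule concrete, here is a minimal Python sketch. It assumes each transaction is recorded as a time offset into the run plus a response time, both in seconds. The Transaction record, its field names, and the optional plot_scatter helper (which assumes matplotlib is installed) are illustrative and not taken from any particular load-testing tool.

```python
from typing import List, NamedTuple, Tuple


class Transaction(NamedTuple):
    """Hypothetical record for one measured transaction."""
    name: str
    offset_s: float    # seconds into the run when the transaction took place
    response_s: float  # response time in seconds


def to_scatter_points(transactions: List[Transaction]) -> List[Tuple[float, float]]:
    """Map each transaction to an (x, y) point: x = time in run, y = response time."""
    return [(t.offset_s, t.response_s) for t in transactions]


def plot_scatter(points: List[Tuple[float, float]], path: str = "scatter.png") -> None:
    """Optional: draw the points with matplotlib (assumed to be installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, s=4)  # small markers keep thousands of points readable
    ax.set_xlabel("Time in run (seconds)")
    ax.set_ylabel("Response time (seconds)")
    fig.savefig(path)
    plt.close(fig)


# Made-up sample data; the second entry is the (60, 5) example from the text.
sample = [
    Transaction("login", 12.0, 1.5),
    Transaction("search", 60.0, 5.0),
    Transaction("checkout", 95.0, 8.2),
]
points = to_scatter_points(sample)
print(points[1])  # (60.0, 5.0)
```

Calling plot_scatter(points) would write the chart to scatter.png; the conversion step itself needs only the standard library.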

The chart in Figure 1 shows thousands of transactions for a single test. By
looking at the x axis, you can see that the test ran for about 7,000 seconds.
The y axis tells us that the slowest response time was 1,000 seconds. For this
test, the acceptable response time was under 6 seconds. Clearly, we have a
problem, but where do we start?
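
One way to get your bearings is to put numbers on what the chart shows before you start manipulating it. The sketch below assumes the points are (time in run, response time) pairs in seconds. The summarize function and the sample points are hypothetical; only the 6-second threshold comes from the test described above.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (time in run, response time), both in seconds


def summarize(points: List[Point], threshold_s: float = 6.0) -> Dict[str, Optional[float]]:
    """Report the run length, the slowest response, and the unacceptable points.

    A response is treated as unacceptable when it is NOT under threshold_s.
    """
    if not points:
        raise ValueError("no points to summarize")
    slow = [(x, y) for x, y in points if y >= threshold_s]
    return {
        "run_length_s": max(x for x, _ in points),
        "slowest_s": max(y for _, y in points),
        "over_threshold": len(slow),
        # When the first unacceptable response appears, or None if there is none.
        "first_slow_at_s": min((x for x, _ in slow), default=None),
    }


# Made-up points, loosely shaped like the run described in the text.
sample_points = [
    (10.0, 2.0),
    (60.0, 5.0),
    (120.0, 6.0),
    (3500.0, 40.0),
    (7000.0, 1000.0),
]
summary = summarize(sample_points)
print(summary)
```

Knowing when the first unacceptable response appears gives you a place on the x axis to zoom in on, which is often more useful than the worst-case number alone.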

I once heard a performance testing expert refer to scatter charts in the
following way: "With an order of magnitude fewer variables it could be a
science, but for now there is a heavy reliance on the human brain to draw
relationships based on past experience." That sums up scatter chart
analysis nicely. So how do you take that chart and turn it into something
useful? You start by developing an understanding of what you’re looking at
and then manipulating the data until it starts to make sense based on your
understanding of the context.

Scatter charts are good for identifying patterns in response times over a
whole run. You can display response time graphically to highlight instances of
poor performance, and you can identify correlations between response times and
resource usage over time. The charts are great for technical stakeholders, but
not so great for non-technical stakeholders. (If you’re a non-technical
stakeholder, you may want to bail now.) They also tend to be less useful for
comparing results over multiple runs.