Application I/O Pattern Analysis
A Case Study in Performance Analysis

The 307 Corporation is involved in the industrial production of very
high-grade ethanol by means of a novel mechanism. Their operating
apparatus consists of about a hundred large production vessels. Because
of the nature of their process it is necessary to monitor several
parameters -- temperature, for example -- at various points throughout
the vessel. This data is then used to optimize the ethanol production
process. They are considering scaling up their production process by
a factor of four, and want to be sure they will be able to record
data fast enough.

In general, one of the top areas to look for performance improvements is
disk I/O. "Spinning rust," as one of my friends refers to it, is significantly
slower than main memory. And unfortunately, in exchange for capacity and
relative permanence, our applications tend to use it. Before we can start
tuning our disk subsystems for maximum performance, however, we need to be
able to look at how our applications exercise the disk.

As far as our work for 307 Corp. goes, we must first figure out what
sort of patterns their data collection mechanism is generating on the
disk: only then can we approach the problem of how to configure their
I/O subsystem.

They use a Sun Ultra Enterprise 3500 with six 400 MHz processors
and six gigabytes of memory running Solaris 8 to record all data
concerning the operating parameters of these production vessels, which
comes in via very fast serial interconnects. At present, all of their
data is being recorded to a single 7200rpm FC-AL disk.

In order to figure out what's going on, we'll use the TNF, or "trace
normal form," kernel tracing facility in Solaris, along with some Perl
to extract the relevant fields from the (voluminous) TNF output. Then
we'll visualize the data to see what sort of patterns present themselves.
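
Collecting that trace is a matter of arming the kernel probes and then
dumping the in-kernel buffer. The transcript below is a rough sketch from
memory; the prex probe-selection syntax in particular varies by Solaris
release, so consult prex(1), tnfxtract(1), and tnfdump(1) before relying
on it.

```
# prex -k                     # attach to the kernel's TNF probes (as root)
prex> buffer alloc 4m         # reserve an in-kernel trace buffer
prex> trace io                # select the block-I/O probes (syntax varies)
prex> enable io
prex> ktrace on               # start kernel tracing
    (let the workload run for a while)
prex> ktrace off
# tnfxtract /tmp/trace.tnf    # copy the kernel buffer out to a file
# tnfdump /tmp/trace.tnf      # convert binary TNF to readable text
```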

Now we've got a shiny new I/O subsystem trace file. The problem of parsing
it, however, still lies before us. To separate the wheat of understanding
from the chaff of despair, we will turn to every system engineer's best
friend, Perl. A little bit of Perl will let us pull out relevant fields:

Now that we have all our data and we can get at it in a reasonable way,
the first question we should ask is, "How much data are we writing to disk
per I/O operation?" Computing the mean I/O size is pretty straightforward:

% iotx trace.tnf | cut -f6 -d' ' | mean
mean of 1164 samples is 512

(Here "mean" is a little piece of Perl that computes the mean of numbers
given to it on standard input.)

However, we should also check that the mean is a good way of
representing the average I/O size. For example, here we have an
observed mean size of 512 bytes. The 'average' might actually be 512 bytes,
or we might have a pair of peaks: perhaps one at 602 bytes and one at
422 bytes (such a distribution would be called 'bimodal'). So to check
ourselves, we should take a look at the data, and make sure it looks
like what we think it does. In this case, the only writes being
done to the disk are 512 bytes -- but it was good to check.

Two more interesting questions to ask are: How many I/O operations are
being issued per second? and, Where on the disk are these I/Os located?
Answering the first question is pretty straightforward: our TNF data
shows that 1,163 I/Os were completed, and the trace has a lifetime of 17,530
milliseconds, or about 17.5 seconds, so we are doing roughly 66
transactions per second.

Answering the second question is a bit more difficult, but after
some thought, drawing a scatter plot of time (on the horizontal axis)
and disk seek position (on the vertical axis) looks promising.

[Figure: I/O time vs. location]

That looks pretty random to me.

After all this data collection and analysis, what have we learned? We
can summarize our improved understanding in one sentence:

307 Corporation's production vessel monitoring system produces
randomly distributed disk writes of, on average, 512 bytes, and
requires approximately 66 I/Os per second in the present environment.

It seems like a lot of work for one sentence, but this sentence is
absolutely vital to understanding the environment we're working in.
That understanding is what is going to enable us to tune the disk
subsystem in order to produce higher performance, which we'll cover
in next month's column.

Gian-Paolo D. Musumeci
is a research engineer in the Performance and Availability Engineering group at Sun Microsystems, where he focuses on network performance.