Jeff Taylor's Weblog

Tuesday Nov 27, 2012

I've been troubled by drop-outs in CPU usage on my application server,
characterized by the CPUs suddenly going from close to 90% busy to
almost completely idle for a few seconds. Here is an example of a
drop-out, as shown by a snippet of vmstat data taken while the
application server was under a heavy workload.
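For anyone who wants to watch for the same symptom, the data can be collected with plain vmstat; the 5-second interval below is an assumption, not necessarily the interval I used:

# vmstat 5

A drop-out appears as the "id" (CPU idle) column jumping from single digits to nearly 100 for a few samples before returning to normal.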

This behavior occurred consistently while the application server was
processing synthetic transactions: HTTP requests from JMeter running on
an external machine.
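For reference, a load test like this is typically launched from the JMeter command line in non-GUI mode; the test plan and results file names here are hypothetical:

# jmeter -n -t dbapp-load.jmx -l results.jtl

The -n flag runs without the GUI, -t names the test plan, and -l records the sample results.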

I explored many theories trying to explain the drop-outs, including:

Unexpected JMeter behavior

Network contention

Java Garbage Collection

Application Server thread pool problems

Connection pool problems

Database transaction processing

Database I/O contention

Graphing the CPU %idle led to a breakthrough: several of the drop-outs
were 30 seconds apart. With that insight, I went digging through the
data again, looking for other outliers that were 30 seconds apart. In
the database server statistics, I found spikes in the iostat "asvc_t"
metric (average service time of active transactions, in milliseconds)
for the disk drive that was being used for the database log files.
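Those disk statistics come from extended iostat output; a collection command along these lines (interval assumed) reports asvc_t for every active device:

# iostat -xnz 5

The -x flag selects extended statistics, -n displays descriptive device names, and -z suppresses rows for devices with no activity.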

Some details about the configuration:

The database and application server were running on two different SPARC servers.

Storage for the database was on a storage array connected via 8-gigabit Fibre Channel.

Data storage and log files were on different physical disk drives.

Reliable low-latency I/O was provided by battery-backed NVRAM.

Highly available:

    Two Fibre Channel links accessed via MPxIO

    Two mirrored cache controllers

The log file physical disks were mirrored in the storage device.

Database log files were on a ZFS filesystem with cutting-edge technologies, such as copy-on-write and end-to-end checksumming.
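For a pool laid out this way, the vdev arrangement can be confirmed with zpool status (the pool name is the one used later in this post):

# zpool status ZFS-db-41

At this point the pool had no separate log device, so ZFS intent log blocks were being allocated from the same disks as the rest of the pool.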

Why would I be getting service time spikes in my high-end storage?
First, I wanted to verify that the database log disk service time spikes
aligned with the application server CPU drop-outs, and they did.
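Lining up samples from two servers is easier when each tool emits its own timestamps; Solaris vmstat and iostat both accept -T d to prefix every sample with a date stamp (intervals assumed):

# vmstat -T d 5          (application server)
# iostat -xnz -T d 5     (database server)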

At first, I guessed that the disk service time spikes might be
related to flushing the write cache on the storage device, but I
was unable to validate that theory.
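Had I pursued it further, one way to test that theory would have been the zfs_nocacheflush tunable from the ZFS Evil Tuning Guide, which stops ZFS from issuing cache flush commands to the storage; it is only safe when the array cache is nonvolatile, as it is here. The setting goes in /etc/system and takes effect after a reboot:

set zfs:zfs_nocacheflush = 1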

After searching the web for a while, I decided to try using a separate ZFS intent log (ZIL) device, so that the latency-sensitive synchronous log writes would no longer share disks with the periodic transaction group flushes:

# zpool add ZFS-db-41 log c3t60080E500017D55C000015C150A9F8A7d0

The ZFS log device is configured in the same manner as described above: two physical disks mirrored in the storage array.
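A quick zpool status run confirms that the pool picked up the change:

# zpool status ZFS-db-41

The output should now include a separate "logs" section listing the new LUN.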

We can see the steady flow of 4k writes to the ZIL device from O_SYNC database log file writes. The spikes are from flushing the transaction group.
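The split between the log device and the main pool disks is easy to watch with per-vdev statistics (1-second interval assumed):

# zpool iostat -v ZFS-db-41 1

The steady 4k writes land on the log device, while the transaction group flushes show up as periodic bursts on the data disks.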

As with almost all problems that I run into, once I thoroughly
understood this one, I found that other people had documented similar
experiences. Thanks to all of you who have documented alternative approaches.

Saved for another day: now that the problem is understood, I should try
the "zfs:zfs_immediate_write_sz" tunable, as recommended in the ZFS Evil
Tuning Guide.
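For reference, that tunable is also set in /etc/system; the value below is purely illustrative:

set zfs:zfs_immediate_write_sz = 0x10000

As I understand the guide, this sets the size above which a synchronous write is written directly to the pool, with the intent log only recording a pointer to it rather than a copy of the data, so the right value depends on the size of the database log writes.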