October 2014

November 16, 2009

i Can … Tell You Why You’re Waiting

Wait accounting is the patented technology built into IBM
i that tells you what a thread or task is doing when it isn't doing anything.
This is an IBM i exclusive that’s possible because the Rochester (Minn.)
development lab develops all the layers of i.

Wait accounting is a very powerful capability for
detailed performance analysis. This entry focuses on waiting, why threads wait
and how you can use wait accounting to troubleshoot performance problems or
simply improve the performance of your applications.

Let's start by reviewing some terminology. Most people
are familiar with the term “job.” Every job has at least one thread and may
have multiple threads. Every thread is represented by a licensed internal code
(LIC) task, but tasks also exist without the IBM i thread-level structures. LIC
tasks are generally not visible externally except through the IBM i performance
or service tools. Wait accounting concepts apply to both threads and tasks;
thus, the terms “thread” and “task” are used when referring to an executable
piece of work.

A thread or task has two basic states. It can be
executing on the processor (this is the running state) or it can be waiting to
run on the processor. There are three key wait conditions:

Ready to run, waiting for the processor. This is a
special wait state and is generally referred to as CPU queueing, which means
the thread or task is queued and waiting to run on the CPU. One reason CPU
queueing can occur is if the partition is overloaded and there’s more work than
it can accommodate. Logical partitioning and simultaneous multithreading can
also result in CPU queueing; however, these are quite complex topics and are
covered in the Job Watcher whitepaper.

Idle waits. Idle waits are normal and expected wait
conditions. Idle waits occur when the thread is waiting for external input from
a user, the network or another application. Until that input is received,
there’s no work to be done.

Blocked waits. Blocked waits are a result of
serialization mechanisms to synchronize access to shared resources. Blocked
waits may be normal and expected -- for example, serialized access to updating
a row in a table, disk I/O operations or communications I/O operations.
However, blocked waits may be abnormal, and it’s these unexpected block points
where wait accounting can be helpful.

You can think of the lifetime of a thread or a task in a
graphical manner, breaking out the time spent running or waiting. This
high-level graphical depiction is called the run-wait time signature.
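
As a rough text illustration of the idea, here is a minimal Python sketch that adds up the time a thread spent in each state and draws it as a single bar. The state names follow the categories described above; the interval data, the function names and the bar rendering are all made up for this example and don't come from any IBM i tool.

```python
from collections import defaultdict

# Hypothetical sample: (state, seconds) intervals observed for one thread.
# The state names mirror the article's categories; the numbers are invented.
INTERVALS = [
    ("running", 2.0),
    ("cpu_queueing", 0.5),
    ("blocked", 1.5),
    ("running", 1.0),
    ("idle", 5.0),
]


def run_wait_signature(intervals):
    """Sum the time spent in each state across all intervals."""
    totals = defaultdict(float)
    for state, seconds in intervals:
        totals[state] += seconds
    return dict(totals)


def render_bar(signature, width=40):
    """Draw the signature as one text bar, using the first letter of each state."""
    total = sum(signature.values())
    if total == 0:
        return ""
    bar = ""
    for state, seconds in signature.items():
        bar += state[0].upper() * round(width * seconds / total)
    return bar


signature = run_wait_signature(INTERVALS)
print(signature)
print(render_bar(signature))
```

In the printed bar, R is running time, C is CPU queueing, B is blocked time and I is idle time. The wider the non-R segments, the more of the thread's lifetime went to waiting.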

Traditionally, the focus for improving an application’s
performance was to have it use the CPU as efficiently as possible. On IBM i
with wait accounting, we can examine the time spent waiting and understand what
contributed to that wait time. If elements of waiting can be reduced or
eliminated, the overall performance can also be improved.

Nearly all of the wait conditions in IBM i have been
identified and enumerated – that is, each unique wait point is assigned a
numerical value. The 6.1 release includes 268 unique wait conditions! Keeping
track of so many unique wait conditions for every thread and task would consume
too much storage, so IBM uses a grouping approach. Each unique wait condition
is assigned to one of 32 groups, or buckets. As threads or tasks go into and
out of wait conditions, the task dispatcher maps the wait condition to the
appropriate group.
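
To see why grouping saves storage, consider this small Python sketch. It keeps a fixed set of 32 counters per thread and folds each wait condition into its bucket as the wait ends. The enum values, the bucket numbers and the catch-all bucket are hypothetical placeholders; the actual IBM i assignments are different and aren't reproduced here.

```python
NUM_BUCKETS = 32

# Hypothetical mapping from wait-condition enum to bucket number.
# These values are illustrative only, not the real IBM i assignments.
ENUM_TO_BUCKET = {101: 5, 102: 5, 210: 11, 305: 15}
OTHER_BUCKET = 31  # assumed catch-all for enums missing from the sketch's map


class WaitBuckets:
    """Per-thread wait totals: 32 fixed slots, however many enums exist."""

    def __init__(self):
        self.time_us = [0] * NUM_BUCKETS
        self.count = [0] * NUM_BUCKETS

    def record(self, wait_enum, duration_us):
        """Fold one completed wait into the bucket its enum maps to."""
        bucket = ENUM_TO_BUCKET.get(wait_enum, OTHER_BUCKET)
        self.time_us[bucket] += duration_us
        self.count[bucket] += 1


buckets = WaitBuckets()
buckets.record(101, 100)  # two different enums...
buckets.record(102, 50)   # ...land in the same bucket
buckets.record(210, 700)
print(buckets.time_us[5], buckets.count[5])
```

The trade-off is that detail within a bucket is lost: the two enums above can no longer be told apart once they've been added together.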

If we take the run-wait time signature using wait
accounting, we can now identify the components that make up the time the thread
or task was waiting. If the thread's wait time was due to reading and writing
data to disk, locking records for serialized access, and journaling the data,
we’d see the waits broken out into those three components.
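
A breakdown like that can be sketched in a few lines of Python. The group names and the totals below are invented for illustration; real data would come from a performance collection.

```python
# Hypothetical per-group wait totals, in seconds, for one thread.
WAIT_TOTALS = {"disk_io": 4.0, "record_lock": 1.0, "journal": 3.0}


def wait_breakdown(totals):
    """Return each group's share of total wait time in percent, largest first."""
    total = sum(totals.values())
    if total == 0:
        return {}
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return {group: 100 * seconds / total for group, seconds in ranked}


breakdown = wait_breakdown(WAIT_TOTALS)
for group, percent in breakdown.items():
    print(f"{group:12s} {percent:5.1f}%")
```

Sorting largest first puts the group most worth investigating at the top of the list.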

When you understand the types of waits that are involved, you
can start to ask yourself some questions. For this situation, you could ask:

What files are being journaled? Are all the journals
required and optimally configured?

You’ll see many of these wait groups surface if you do
wait analysis on your application. Understanding what your application is doing
and why it’s waiting in those situations can possibly help you reduce or
eliminate unnecessary waits.

Holders and Waiters

Not only does IBM i keep track of what resource a thread
or task is waiting on, it also keeps track of the thread or task that has the
resource allocated to it. This is a very powerful feature. A holder is the
thread or task that’s using the serialized resource. A waiter is the thread or
task that wants access to that serialized resource.
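
One way to picture how holder and waiter information helps: if each waiter points to its holder, you can follow the chain to the thread that's ultimately holding everyone up. The Python sketch below assumes a simple snapshot in which each waiting thread waits on exactly one holder. The thread names and the data structure are hypothetical and don't reflect how Job Watcher stores this information.

```python
# Hypothetical snapshot: each waiting thread mapped to the thread holding
# the resource it wants. Threads that aren't waiting don't appear as keys.
WAITS_ON = {"T3": "T2", "T2": "T1", "T4": "T1"}


def root_holder(thread, waits_on):
    """Follow waiter -> holder links until reaching a thread that isn't waiting.

    Returns None if the chain loops back on itself (a deadlock).
    """
    seen = set()
    while thread in waits_on:
        if thread in seen:
            return None
        seen.add(thread)
        thread = waits_on[thread]
    return thread


def waiters_by_root(waits_on):
    """Group every waiting thread under the root holder blocking it."""
    groups = {}
    for waiter in waits_on:
        groups.setdefault(root_holder(waiter, waits_on), []).append(waiter)
    return groups


print(root_holder("T3", WAITS_ON))
print(waiters_by_root(WAITS_ON))
```

In this made-up snapshot, T3 waits on T2, but T2 is itself waiting on T1, so T1 is the thread to look at first.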

Call Stacks

IBM i also manages call stacks for
every thread or task. This is independent of the wait accounting information.
The call stack shows the programs and procedures that have been invoked and can
be very useful in understanding the wait condition since the call stack gives
an outline of the logic that led up to either holding a resource or wanting to
get access to it. The combination of holder, waiter and call stacks provides a
very powerful capability to analyze wait conditions. No other operating system
provides such rich function.

Collecting and Analyzing the Data

Collection Services and Job Watcher are two performance
data collection mechanisms that collect the wait accounting information. Job Watcher
also collects holder and waiter information, as well as call stacks. Once the
performance data has been collected, you can analyze it graphically. In IBM i
6.1, the IBM Systems Director Navigator Web console includes Performance tasks,
and its Investigate Data feature lets you view wait data graphically through a
browser interface. Alternatively, you can use the iDoctor GUI to view wait data.

IBM Systems Magazine is a trademark of International Business Machines Corporation. The editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia under license from International Business Machines Corporation.