Thursday, August 8, 2013

Save cpu by including extra columns to an unique index

Indexes are mainly used to
improve the performance of SELECT statements. They however need to be used with
caution, because indexes consume a lot of disk space. One could argue that
nowadays DASD is relatively cheap, so that’s not a big deal. In fact, there is
another big reason why one should be cautious about adding indexes: an index
adds 30% of CPU on INSERTs.

IBM DB2 v10 introduces a new
feature, the ability to include non-key columns to an unique index [more
info on IBM site].

What is it good for
?

The new INCLUDE feature in DB2
v10 allows you to reduce the number of indexes, which results in cost savings in
DASD and in CPU.

Let’s see how, through a very
simple scenario: a table composed of 2 columns, COL1 is a unique ID, and COL2 is a
non-unique column. You obviously need to define an unique index IX1 on COL1 to
insure the uniqueness of the rows in COL1. But because a lot of queries gather
data from COL1 and COL2 with ORDER BY, you decided (prior DB2 v10), to create
another index IX2 on COL1 and COL2, to improve performance of the
applications (SORT avoidance).

Now, in DB2 v10, you can simply
INCLUDE COL2 as part as IX1 as a non-key column, meaning that X1 can both
assure the uniqueness requirement in COL1, and keys for COL2. As a result, X2 became
obsolete and can be deleted (saving DASD space and improving INSERTs
performance).

Obviously, the unique key of an
index can be composed of several columns, but the idea is the same.

In the scenario above, you will
want to include COL3, COL4, and COL5 into the unique index IX1, and get rid of
IX2.

Note that you can INCLUDE
several non-key columns to an unique index, but you can only INCLUDE them one
at a time … Meaning that in the above scenario, you will need to execute 3 SQL
statements to INCLUDE to 3 non-key columns :

ALTER
INDEX IX1 ADD INCLUDE (COL3);

ALTER
INDEX IX1 ADD INCLUDE (COL4);

ALTER
INDEX IX1 ADD INCLUDE (COL5);

From Theory to
Practice …

The above theory is nice, but
this scenario might be very difficult to locate manually, in a production DB2
environment composed of thousands of tables and indexes.

The following SQL statement will
scan the DB2 catalog to locate some of these cases, namely locating non-unique
indexes which contain columns that include the key of an unique index (when the unique index
is composed of only 1 column).

SELECT
IX.CREATOR,IX.NAME,

IX.TBCREATOR,IX.TBNAME,

IX.UNIQUERULE,IX.COLCOUNT,

KEYS.IXCREATOR,KEYS.IXNAME,

KEYS.COLNAME,KEYS.COLNO,

IX2.CREATOR,IX2.NAME,

IX2.TBCREATOR,IX2.TBNAME,

IX2.UNIQUERULE,IX2.COLCOUNT,

KEYS2.IXCREATOR,KEYS2.IXNAME,

KEYS2.COLNAME,KEYS2.COLNO

FROM
SYSIBM.SYSINDEXES IX,

SYSIBM.SYSKEYS
KEYS,

SYSIBM.SYSINDEXES
IX2,

SYSIBM.SYSKEYS
KEYS2

WHERE
IX.UNIQUERULE = 'U'

AND
IX.CREATOR = KEYS.IXCREATOR

AND
IX.NAME = KEYS.IXNAME

AND
IX.COLCOUNT =
1

AND
IX.TBCREATOR = IX2.TBCREATOR

AND IX.TBNAME = IX2.TBNAME

AND
IX2.CREATOR = KEYS2.IXCREATOR

AND
IX2.NAME = KEYS2.IXNAME

AND
IX2.UNIQUERULE = 'D'

AND
IX.TBCREATOR = IX2.TBCREATOR

AND
IX.TBNAME = IX2.TBNAME

AND
KEYS.COLNAME = KEYS2.COLNAME ;

With the result of this SQL
statement, you can prepare your ALTER INDEX to INCLUDE non-key columns to the
unique indexes listed, and get rid of the corresponding now-superfluous
indexes.

If the unique index is composed
of several columns, getting such simple output will be much more complex in pure
SQL, because of the necessity of some kind of "table transpose"
involving 2 UNIONs, and several INNER JOINs - due to the structure of the
catalog. That said, if one gets rid (or changes the value) of the AND IX.COLCOUNT = 1 WHERE clause in the above statement, and looks carefully
at the output (IX.COLCOUNT and number of IX2 rows matching), locating the other
cases should be pretty straight forward.