News, views, and items of interest on IBM's Db2 database management system and mainframes.

Thursday, July 16, 2015

Influencing the DB2 Optimizer: Part 3 - Tweaking SQL Statements

In Part 2 of this series we took a look at standard methods of influencing the DB2 optimizer, essentially boiling down to running RUNSTATS, reorganizing, encouraging parallelism and indexing.

But sometimes you may not want to change the DB2 Catalog statistics (or use hints, which we will cover later in this series). In such situations, you might consider tweaking the offending SQL statement. Tweaking is the
process of changing a statement in a non-intuitive fashion, without altering
its functionality. Let's take a look at some of these methods...

At times, you may need to disable a specific index from being
considered by the optimizer. One method of achieving this is to append OR
0 = 1 to the predicate. For example, consider a query against the
EMP
table on which two indexes exist: one on EMPNO
and one on WORKDEPT. Appending OR 0 = 1
(as shown next) to the WORKDEPT predicate will
cause DB2 to avoid using an index on WORKDEPT.

SELECT EMPNO, WORKDEPT,
EDLEVEL, SALARY

FROM EMP

WHERE EMPNO BETWEEN ‘000020’ AND ‘000350’

AND
(WORKDEPT > ‘A01’ OR 0 = 1);

TheOR 0 = 1clause does not change the results
of the query, but it can change the access path chosen.

There are other tweaks that can be used to dissuade the DB2 optimizer from choosing a specific index, such as multiplying or dividing by 1, adding or subtracting 0, or appending an empty string to a character column.

Another method of tweaking SQL to influence DB2’s access path
selection is to code redundant predicates. When DB2
calculates the filter factor for a SQL statement, it multiplies the filter
factors for all predicates connected with AND.

You can lower the filter factor of a query by adding redundant predicates. For example, consider the following statement:

SELECT LASTNAME

FROM DSN81010.EMP

WHERE WORKDEPT = :VAR;

You can attempt to lower the filter factor for the predicate by adding redundant predicates, for example:

SELECT LASTNAME

FROM DSN81010.EMP

WHERE WORKDEPT = :VAR

AND WORKDEPT = :VAR

AND WORKDEPT = :VAR;

The two predicates added to the end are redundant and do not affect SQL statement functionally. However, DB2 calculates a lower filter factor, which increases the possibility that an index on the WORKDEPT column will be chosen. The lower filter factor also increases the possibility that the table will be chosen as the outer table, if the redundant predicates are used for a join.

When redundant predicates are added to enhance performance, be sure to document the reasons for the extra predicates. Failure to do so may cause a maintenance programmer to assume that the redundant predicates are an error and thus remove them.

Another option for getting a small amount of performance out of an SQL statement is to change the physical order of the predicates in your SQL code. DB2 evaluates predicates first by predicate type, then according to the order in which it encounters the predicates. The four types of SQL predicates are listed in the order that DB2 processes them:

Equality, in which a column is tested for equivalence to another column, a variable, or a literal

Ranges, in which a column is tested against a range of values (for example, greater than, less than, or BETWEEN)

IN, where a column is tested for equivalence against a list of values

Stage 2 predicates

Understanding this hierarchy, you can place the most restrictive predicates at the beginning of your predicate list. For example, consider the following query:

SELECT LASTNAME

FROM DSN8810.EMP

WHERE WORKDEPT = ‘A00’

AND SEX = ‘M’;

The first predicate has a lower filter factor than the second because there are fewer workers in department A00 than there are males in the entire company. This probably will not increase performance by much, but it can shave a little off a query’s processing time.

Before deciding to tweak SQL statements to achieve different
access paths, remember that you are changing SQL code in a non-intuitive fashion.
For each modification you make to increase performance, document the reasons in
the program, the data dictionary or repository (if one exists), and the system documentation. Otherwise, the
tweaked SQL could be maintained after it is no longer required, or modified
away when it still is required for performance.

Also remember that the changes could enhance performance for one
release of DB2 but result in no gain or decreased efficiency in subsequent
releases. Re-examine your SQL for each new version and release of DB2.