Contents

Introduction

The purpose of this document is to describe different ways of improving the performance of SQL Server queries. Most of this document will describe index optimization, with occasional references to particular code snippets. In other words, this document will describe how to achieve the best performance, given the tables and queries to run against.

Database design issues and entity-relationship modeling techniques are out of scope of this document, even though flawed design can severely impact the performance in many ways.

Data and Index Organization

Index and data organization will be discussed together for two reasons. First, there is rarely a table without an index. Even a small lookup table has a primary key that allows other table to refer to it, and a primary key is always based on the index. Second, data organization depends on the indexes that exist for the table; a table with a clustered index is organized differently than a table without one.

Non-Clustered Indexes/Heap-Based Tables

Figure 1 shows the organization of a non-clustered index defined on a heap-based table.

Figure 1. Non-clustered index/heap-based table

A heap table, by definition, is a table that doesn't have any clustered indexes. Different pages of the heap-based table occupy different non-contiguous areas on a disk, and they are not linked together in any way. Each non-clustered index defined on the table will have a corresponding entry in a sysindexes table with an indid between 2 and 254, pointing to the first IAM (Index Allocation Map) page. IAM pages represent the linked list of all pages used by the database object (table or index) and used by SQL Server for allocating and deallocating storage space. It is important to remember that IAM pages are not used to search through data or index pages, but only to allocate and deallocate them.

An observant reader might ask at this point: What if there is no index defined for a table at all? Well, the address of the first IAM page of the heap table itself is stored in the sysindexes table with indid = 0 (see Figure 2). In that respect, the abbreviation IAM is a little misleading; it would be better called SAM (Storage Allocation Map or Space Allocation Map).

Figure 2. Table with no index defined

All the indexes (clustered and non-clustered) are B-Trees, where B stands for "balanced" (not "binary," as is sometimes thought), which basically means that it will take about the same number of jumps to get to the leaf node for any value of the key. The number of jumps required to get to the leaf node has a direct impact on the total time required to read the data using that index, so let's explore this a little bit more.

All pages, including non-leaf index pages, have a size of 8192 bytes, including 96 bytes occupied by the header—this leaves 8096 bytes for the index data. Let's calculate how many jumps will be required for the index built on the single field of the integer type. Each pointer to the index page of the next level or to the data page in the heap takes 6 bytes, so the total size of one index entry is 10 bytes. This means that each index page can contain up to 800 index entries. Let's assume that the average number of entries on each index page is 400. This means that for a table holding up to 160,000 rows of data, it will take only three jumps to get to the data page. The first jump is to the root index page, the second jump is to the leaf-node-level index page, and the third jump is to the data page itself. For a table holding 160,000 to 64,000,000 rows of data, it will take four jumps.

This calculation demonstrates that the width of the key directly affects the number of required jumps. If we have the key defined on the char (200) column, it will take only three jumps for a table holding 400 rows or less; four jumps for a table holding up to 8,000 rows; and five jumps for a table holding up to 160,000 rows. This leads to the conclusion that indexes built on wider keys perform searches slower (in addition to taking more space).

Covering Indexes

The term covering index does not mean a separate kind of index having a different internal structure. Rather, this term is used to describe a certain technique that is used to improve performance. It is discussed here because, as will be shown later, for most practical purposes, a clustering index can be thought of as a covering index.

Let's consider a simple but typical example. Suppose that we need to join two tables—Master, defined as (MasterId int primary key, Data int), and Detail, defined as (DetailId int primary key, Data int, MasterId int)—and select columns (MasterID, Detail.Data) where Master.Data satisfies a certain condition. The condition can be anything that is built directly on the value of Master.Data—for example, Data = @data, Data between @start and @end, or Data in (1, 2, 3). Let's also assume that, besides primary key indexes, the Master table has an index idxMaster built on (Data, MasterID), and the Detail table has an index idxDetail built on (MasterID). The query plan will look as shown in Figure 3.

Figure 3. Query plan

What does that mean, exactly? Let's analyze it step by step:

The query engine searches idxMaster, to find the entries that satisfy our condition for Master.Data, and it builds a list of corresponding Master.MasterID values. This is fast, because:

Index pages are sorted by Master.Data, so the engine can jump directly to leaf index pages that satisfy our condition.

Those leaf index pages already contain values of Master.MasterId, so there is no need to jump somewhere else to read them.

The engine selects entries from idxDetail having values of Detail.MasterID that match the list of Master.MasterID built during Step 1. This can be done by simply looping through the list of Master.MasterID and looking up each corresponding entry in idxDetail, or by using more complicated hash or mergealgorithms. The query engine determines which algorithm to use in each particular case, based on the relative cost of each algorithm. This, in turn, depends on the following criteria:

Selectivity of the index

Estimated number of lookups

Estimated number of rows to find.

For the low relative values of the second and third criteria, the loop algorithm is preferred, whereas for higher values, the hash/merge algorithms become more efficient. The output of this step is a list of (MasterID, Detail:FID:PN:RN) values (see Figure 1).

A list of heap pointers (FID:PN:RN), also known as bookmarks, is used to read randomly located heap data pages of the Detail table, and to retrieve values of the Data field in order to compile the requested result set (MasterID, Detail.Data). This step is called bookmark lookup, and, in our case, it is the most expensive one. The cost of bookmark lookup can be anywhere from 90 to 99 percent of the total query cost. This means that eliminating this step can improve the performance ten-fold or even one-hundred-fold. The cost of bookmark lookup depends on the fragmentation of data pages, the number of data pages to read, that availability of read-ahead optimization, the IO-to-CPU cost ratio, and other factors.

Please note that, in this example, bookmark lookup has only been performed on the Detail table, not on the Master table. This is because the required output field MasterID was a part of the index. This leads us to the definition of covering index: A covering index is the index that contains all output fields required by the operation performed on that index. Please note that if the Master table contained more fields, and at least one of these fields was present in the SELECT clause, the query engine would have to perform bookmark lookup on the Master table, and therefore idxMaster could not serve as a covering index anymore. That is why it is not possible to tell whether the index is covering without taking into consideration the query it is used in, and even the particular execution plan.

It is very easy now to devise a way to improve the performance of our query. Adding Detail.Data as a second field to idxDetail eliminates the need for bookmark lookup and considerably improves the performance.

Good candidates for covering indexes are junction tables (tables that exist to resolve many-to-many relationships). In the simplest case of a many-to-many relationship between T1 and T2, it is enough to have two indexes, (T1_Id, T2_Id) and (T2_Id, T1_Id), in order to save the query engine from ever doing bookmark lookups on this table.

It is rarely feasible to use covering indexes for the fact tables, since there are usually many fields requested from these. Putting all required fields into the index, in order to make it covering, would in a way make it a very ineffective clustered index, which brings us to the next topic.

Clustered Indexes/Index-Based Tables

Figure 4 shows the organization of an index-based table (a table that has a clustered index).

Figure 4. Clustered index/index-based table

As you can see, this diagram is very similar to the diagram of a non-clustered index (Figure 1), except that leaf nodes contain data, instead of pointers to data. Therefore, the easiest way to understand a clustered index is to treat it as a covering index consisting of all the fields in the table, with three additional advantages:

Data stored in leaf nodes is not duplicated. On the other hand, in the case of a covering index on a heap table, data will be stored both in the index and in the heap.

Table scan (called clustered index scan in the case of an index-based table) is faster than table scan of a heap table, because leaf nodes of the clustered index are stored together. The internal implementation of modern hard drives allows storing all the data read from one logical cylinder of the hard drive in its internal cache; in the case of an index-based table, it is more likely that this data will contain more data pages of the table being scanned.

If hash or merge algorithms are selected by the query engine for the join, data retrieved from the table should be sorted by the join key. If a clustered index is built on the join key, retrieved data is already sorted.

Several special considerations have to be kept in mind about index-based tables. The most important one is that all other (non-clustered) indexes of the index table contain values of the clustering key, instead of heap pointers. In the available sources, it is never mentioned what exactly was the reason for this design decision, so we can only guess that the additional cost of searching through the clustering index is outweighed by the cost saved during the reallocation of the clustered index pages. Indeed, nothing would prevent non-clustered indexes of the index-based table from pointing directly to the row in the leaf node that contains the data, but this pointer should be updated each time this row is moved to another location. Also, obviously, only one clustered index can exist for the table. So out of all possible indexes that could be clustered, it is necessary to select one that will give maximum performance gain.

Good candidates for clustered indexes are:

Primary keys of the lookup/reference/dimension/master tables

Foreign keys of the fact/detail tables

Datetime fields of the tables queried by the date range

Optimization Rules of Thumb

Always look at the query plan first. It will show you the optimal current execution plan from the query engine's point of view. Find the most expensive part of the execution plan and start optimizing from there. However, even before that, make sure that the statistics on all tables in your query are up to date, by running the update statistics <TableName> command on all tables in your query.

If you see table scan, optimize. Table scan is the slowest possible way of execution. Table scan means not only that no index is used, but that there is no clustered index for this table at all. Even if you can only replace table scan with clustered index scan, it is still worth it.

If you see clustered index scan, find out whether it can be replaced with index seek. For that, find what conditions are applied to this table. Usually, conditions exist for two or three fields of the table. Find out the most selective condition (that is, the condition that would produce the smallest number of records if applied alone), and see whether an index on this field exists. Any index that lists this field first will qualify. If there is no such index, create it and see whether the query engine picks it up.

If the query engine is not picking up the existing index (that is, if it is still doing a clustered index scan), check the output list. It is possible that seek on your index is faster than clustered index scan, but involves bookmark lookup that makes the combined cost greater than use of a clustered index. Clustered index operations (scan or seek) never need bookmark lookup, since a clustered index already contains all the data. If the output list is not big, add those fields to the index, and see whether the query engine picks it up. Please remember that the combined size is more important than the number of fields. Adding three integer fields to the index is less expensive than adding one varchar field with an average data length of 20.

Summarizing this rule, try to make your index covering, and see whether it works better than clustered index scan. Please note that it is not always possible to make the query engine pick up your index automatically. A small table or a low-selectivity index will produce clustered index scan, even if your index is covering.

If you see bookmark lookup, it means that your index is not covering. Try to make it covering if it makes sense (see the preceding guidelines).

The execution plan selected by the query engine may be not the best one. The query engine makes certain assumptions about disk subsystem and CPU cost versus IO cost. These assumptions sometimes can be incorrect. If you don't believe that the query engine's selection is the best one, run a query in the loop for 10 to 15 minutes with automatic selection, change the query to use your index (you will have to use index hint to force it), and then run it for 10 to 15 minutes again. Compare the results to see which one works better.

Avoid any operations on the fields, where possible. Some operations will prevent the use of the index on this field even if it exists—for example, the infamous ltrim(rtrim(FieldName)); other operations will degrade the performance. For example, instead of using the condition cast(DateField as varchar(20)) = @dateString, try to convert @dateString to an expression of datetime type first, and then compare it to DateField.

Please note that the query engine cost estimate does not include the cost of embedded procedure or function calls. If you compare between plain join and select from table-value functions, the latter would seem to have smaller cost, but it usually does not. In such a situation, use your own metrics to find out which query performs better.

When it is not possible to avoid operation on the field, use an index built on that expression. This can be done in two ways:

Create a calculated field based on your expression.

Create a view, and build an index on it.

Note SQL Server requires certain conditions to be met in order to allow the use of calculated fields and indexed views (set quoted_identifier on, set arithabort on, and so on).

Indexed views are a good way to further speed up the query if you are not satisfied with the results. Indexed view is a clustered index built over the view's select list. You can also define additional indexes for the indexed view, just as you can for any regular table. Indexed views take disk space and involve some maintenance overhead (every time underlying tables change, the indexed view also has to change), but they usually provide a good boost in performance, even after all other optimization techniques are exhausted.

Special thanks to Andrei Volkov for many interesting discussions about the SQL Server internals.