SQL Tuning Tutorial - Understanding a Database Execution Plan (1)

If you want to tune your SQL statements, you need to understand how the server executes them. This article explains that.

Introduction

My everyday job is to develop back-office applications for a mobile telecom operator. When a customer orders a service through the web or voice front-end, our applications have to provide very quick feedback. Although we are required to answer in less than one second, we have to perform complex SQL statements on databases of dozens of gigabytes.

In this environment, a single inefficient query can have disastrous effects. A bad statement may overload all database processors, so that they are no longer available to serve other customers' orders. Of course, such problems typically occur shortly after the launch of new offers... that is, precisely under heavy marketing fire. Can you imagine the mood of our senior management if such a disaster happened?

Unfortunately, suboptimal statements are difficult to avoid. Applications are generally tested against a much smaller amount of data than in the production environment, so performance problems are unlikely to be detected empirically.

That's why every database developer (and every application developer coping with databases) should understand the basic concepts of database performance tuning. The objective of this article is to give a theoretical introduction to the problem. At the end of this article, you should be able to answer the question: is this execution plan reasonable given the concrete amount of data I have?

I have to warn you: this is about theory. I know everyone dislikes it, but there is no serious way to get around it. So, expect to find here a lot of logarithms and probabilities... Not afraid? Then let's continue.

In this article, I will try to cover the following topics:

What are we optimizing?

Looking for records in tables.

Joining tables with nested loops.

In the second part (to be written), I plan to cover the following:

Applying predicates.

Joining tables with hash.

Sorting.

Merging.

Joining tables with merge joins.

Scenario

I need a sample database for the examples of this article. Let's set up the scene.

The CUSTOMERS table contains general information about all customers. Say the company has about a million customers. This table has a primary key CUSTOMER_ID, indexed by PK_CUSTOMERS. The LAST_NAME column is indexed by IX_CUSTOMERS_LAST_NAME. There are 100,000 unique last names. Records in this table have an average size of 100 bytes.

The REGION_ID column of the CUSTOMERS table references the REGIONS table, which contains all the geographical regions of the country. There are approximately 50 regions. This table has a primary key REGION_ID indexed by PK_REGIONS.

I will use the notations RECORDS(CUSTOMERS) and PAGES(CUSTOMERS) to denote respectively the number of records and the number of pages in the CUSTOMERS table, and similarly for other tables and even for indexes. Prob[CUSTOMERS.LAST_NAME = @LastName] will denote the probability that a customer is named @LastName when we have no other information about him.

What is an execution plan

An SQL statement expresses what you want but does not tell the server how to do it. Using an SQL statement, you may for instance ask the server to retrieve all customers living in the region of Prague. When the server receives the statement, the first thing it does is parse it. If the statement does not contain any syntax error, the server can go on. It then decides the best way to compute the results: it chooses whether it is better to read the table of customers completely, or whether using an index would be faster, comparing the cost of all possible approaches. The way a statement can be physically executed is called an execution plan or a query plan.

An execution plan is composed of primitive operations. Examples of primitive operations are: reading a table completely, using an index, performing a nested loop or a hash join,... We will detail them in this series of articles. All primitive operations have an output: their result set. Some, like the nested loop, have one input. Others, like the hash join, have two inputs. Each input should be connected to the output of another primitive operation. That's why an execution plan can be sketched as a tree: information flows from leaves to the root. There are plenty of examples below in this article.

The component of the database server that is responsible for computing the optimal execution plan is called the optimizer. The optimizer bases its decision on its knowledge of the database content.

How to inspect an execution plan

If you are using Microsoft SQL Server 2000, you can use the Query Analyzer to see which execution plan is chosen by the optimizer. Simply type an SQL statement in the Query window and press Ctrl+L. The execution plan is displayed graphically.

As an alternative, you can get a text representation. This is especially useful if you have to print the execution plan. From a Command Prompt, open the isql program (type isql -? to display the possible command line parameters), then follow these steps:

Type set showplan_text on, and press Enter.

Type go, and press Enter.

Paste your SQL statement at the command prompt, and press Enter.

Type go, and press Enter.

The program will display the execution plan.

The top operation of this execution plan is a hash join, whose inputs are an index scan of UNC_Dep_DepartmentName and a clustered index scan of PK_USERS. The objective of this series of articles is to learn how to understand such execution plans.

What are we optimizing?

Application developers usually have to minimize processor use and sometimes memory use. However, when developing database applications, the bottleneck is elsewhere. The main concern is to minimize disk access.

The main disk allocation unit of database engines is called a page. The size of a page is typically some kilobytes. A page usually contains between dozens and hundreds of records. This is important to remember: sometimes you may think a query is optimal from the point of view of the record accesses, while it is not if you look at page accesses.
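To get a feeling for the numbers, here is a back-of-the-envelope sketch. The 8 KB page size is an assumption for illustration; the one-million-record count and 100-byte average record size come from the scenario:

```python
# Back-of-the-envelope page arithmetic for the CUSTOMERS table.
# Assumptions: 8 KB pages; 100-byte average records (from the scenario).
PAGE_SIZE = 8 * 1024        # bytes per page (assumption)
RECORD_SIZE = 100           # average bytes per customer record
CUSTOMER_RECORDS = 1_000_000

records_per_page = PAGE_SIZE // RECORD_SIZE        # records fitting in one page
pages = -(-CUSTOMER_RECORDS // records_per_page)   # ceiling division

print(records_per_page)  # 81
print(pages)             # 12346
```

With dozens of records per page, counting record accesses instead of page accesses can be off by more than an order of magnitude.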

Looking for records in tables

Full table scan

Say we are looking for a few records in a single table -- for instance we are looking for the customers whose last name is @LastName.

sql1 ::= SELECT * FROM CUSTOMERS WHERE LAST_NAME = @LastName

The first strategy is to read records from the table of customers and select the ones fulfilling the condition LAST_NAME = @LastName. Since the records are not sorted, we have to read absolutely all the records from the beginning to the end of the table. This operation is called a full table scan. It has linear complexity, which means that the execution time is proportional to the number of rows in the table. If it takes 500 ms to look for a record in a table of 1,000 records, it may take 8 minutes in a table of one million records and almost 6 days in a table of one billion records...
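The linear extrapolation above can be checked in a few lines. The 500 ms baseline is the figure from the text; everything else follows from proportionality:

```python
# Full table scan: cost grows proportionally to the number of rows.
# Baseline from the text: 500 ms to scan 1,000 records.
BASE_ROWS, BASE_SECONDS = 1_000, 0.5

def scan_time(rows):
    """Extrapolated scan time in seconds, assuming strictly linear cost."""
    return BASE_SECONDS * rows / BASE_ROWS

print(scan_time(1_000_000) / 60)         # ≈ 8.3 minutes for a million rows
print(scan_time(1_000_000_000) / 86400)  # ≈ 5.8 days for a billion rows
```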

To compute the cost of sql1, we set up a table with primitive operations. For each operation, we specify the cost of one occurrence and the number of occurrences. The total cost of the query is obviously the sum of the products of operation unit cost and number of repetitions.

Operation | Unit Cost | Number
--------- | --------- | ------
Full table scan of CUSTOMERS | PAGES(CUSTOMERS) | 1

Let's take a metaphor: a full table scan is like finding all occurrences of a word in a novel.

Index seek and index range scan

Now what if the book is not a novel but a technical manual with an exhaustive index at the end? For sure, the search would be much faster. But what precisely is an index?

An index is a collection of pairs of key and location. The key is the word by which we are looking. In the case of a book, the location is the page number. In the case of a database, it is the physical row identifier. Looking for a record in a table by physical row identifier has constant complexity, that is, it does not depend on the number of rows in the table.

Keys are sorted, so we don't have to read all keys to find the right one. Indeed, searching in an index has logarithmic complexity. If looking for a record in an index of 1,000 records takes 100 ms, it may take 200 ms in an index of a million rows and 300 ms in an index of a billion rows. (Here I'm talking about B-tree indexes. There are other types of indexes, but they are less relevant for application development.)
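The contrast with the linear full scan can be made concrete. The 100 ms baseline comes from the text; the log10 model is a simplification of a real B-tree, whose logarithm base depends on the page fan-out:

```python
import math

# Logarithmic index search: multiplying the row count by 1,000
# only *adds* a constant amount of work instead of multiplying it.
UNIT_MS = 100 / math.log10(1_000)  # calibrated so 1,000 rows cost 100 ms

def seek_time(rows):
    """Hypothetical B-tree seek time in ms, proportional to log10(rows)."""
    return UNIT_MS * math.log10(rows)

print(round(seek_time(1_000)))          # 100
print(round(seek_time(1_000_000)))      # 200
print(round(seek_time(1_000_000_000)))  # 300
```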

If we are looking for customers by name, we can perform the following physical operations:

Seek the first entry in IX_CUSTOMERS_LAST_NAME where LAST_NAME=@LastName. This operation is named an index seek.

Read the index from this entry to the last one where LAST_NAME = @LastName still holds. This costs reading PAGES(IX_CUSTOMERS_LAST_NAME)*Prob[LAST_NAME = @LastName] pages from disk. This operation (always coupled with an index seek) is called an index range scan.

Each index entry found by the previous steps gives us the physical location of a record in the CUSTOMERS table. We still have to fetch this record from the table, which implies RECORDS(CUSTOMERS)*Prob[LAST_NAME = @LastName] page fetches. This operation is called a table seek.

The detailed cost analysis of sql1 using an index range scan is the following.

Operation | Unit Cost | Number
--------- | --------- | ------
Index Seek of IX_CUSTOMERS_LAST_NAME | Log( PAGES(IX_CUSTOMERS_LAST_NAME) ) | 1
Index Range Scan of IX_CUSTOMERS_LAST_NAME | PAGES(IX_CUSTOMERS_LAST_NAME) * Prob[LAST_NAME = @LastName] | 1
Table Seek of CUSTOMERS | 1 | RECORDS(CUSTOMERS) * Prob[LAST_NAME = @LastName]

The bad news is that the query complexity is still linear, so the query time is still proportional to the table size. The good news is that we cannot really do better: the complexity of a query cannot be smaller than the size of its result set.

In the next sections of this article, we will accept a simplification: we will assume that an index look-up has unit cost. This estimation is not so rough, because a logarithmic cost can always be neglected when it is added to a linear cost. The simplification is not valid when the logarithmic cost is multiplied by another cost.
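Putting numbers on the two plans for sql1 shows how lopsided the comparison is. The page counts below are assumptions (about 80 customer records per 8 KB page, and a smaller index because index entries are much shorter than full records); the record and key counts come from the scenario:

```python
import math

# Cost of both plans for sql1, measured in page reads.
CUSTOMER_RECORDS = 1_000_000
CUSTOMER_PAGES = 12_500   # assumption: ~80 records per 8 KB page
INDEX_PAGES = 2_500       # assumption: index entries are much smaller
KEYS = 100_000            # unique last names (from the scenario)

prob = 1 / KEYS           # Prob[LAST_NAME = @LastName]

full_scan_cost = CUSTOMER_PAGES

index_cost = (math.log2(INDEX_PAGES)      # index seek (logarithmic)
              + INDEX_PAGES * prob        # index range scan
              + CUSTOMER_RECORDS * prob)  # one table seek per matching record

print(full_scan_cost)        # 12500
print(round(index_cost, 1))  # ≈ 21.3: several hundred times cheaper
```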

Index selectivity

Comparing the cost of the full table scan approach and the index range scan approach introduces a crucial concept in database tuning. The conclusion of the previous section is that the index range scan approach will be faster if, in terms of order of magnitude, the following condition is true:

[1] RECORDS(CUSTOMERS) * Prob[LAST_NAME = @LastName] < PAGES(CUSTOMERS)

The probability that a customer has a given name is simply the number of customers having this name divided by the total number of customers. Let KEYS(IX_CUSTOMERS_LAST_NAME) denote the number of unique keys in the index IX_CUSTOMERS_LAST_NAME. The number of customers named @LastName is statistically RECORDS(CUSTOMERS)/KEYS(IX_CUSTOMERS_LAST_NAME), so Prob[LAST_NAME = @LastName] = 1/KEYS(IX_CUSTOMERS_LAST_NAME).

The selectivity of an index is the ratio KEYS(index)/RECORDS(table): the closer it is to 1, the fewer records share the same key. The selectivity of a unique index is always 1. The more selective an index is (the larger its selectivity coefficient), the more efficient it is. Corollary: indexes with poor selectivity can be counter-productive.
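Condition [1] can be turned into a handy rule of thumb: with Prob[LAST_NAME = @LastName] = 1/KEYS, the index wins when the number of distinct keys exceeds the number of records per page. A quick check with the scenario's figures (the page count is an assumption based on 8 KB pages and 100-byte records):

```python
# Condition [1]: RECORDS * Prob < PAGES.  With Prob = 1/KEYS this becomes
# RECORDS / KEYS < PAGES, i.e. KEYS must exceed the records-per-page ratio.
RECORDS = 1_000_000
PAGES = 12_500            # assumption: 8 KB pages, 100-byte records

def index_pays_off(keys):
    """True when the index range scan beats the full table scan."""
    return RECORDS / keys < PAGES

print(index_pays_off(100_000))  # True: IX_CUSTOMERS_LAST_NAME is selective enough
print(index_pays_off(50))       # False: an index on REGION_ID would not help
```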

Joining tables with nested loops

Things become much more difficult when you need to retrieve information from more than one table.

Suppose we want to display the name of the region beside the name of each customer, which means joining CUSTOMERS to REGIONS on the REGION_ID column.

Among the possible strategies, I will present in this article the most natural: choosing a table, reading it from the beginning to the end and, for each record, searching for the corresponding record in the second table. The first table is called the outer table or leading table, and the second one the inner table. The dilemma is of course to decide which table should be leading.

So let's first try to start with the table of regions. We learnt before that an index on CUSTOMERS.REGION_ID would have too low a selectivity to be efficient, so our first candidate execution plan is to read the table of regions and, for each region, perform a full table scan of CUSTOMERS.

Operation | Unit Cost | Number
--------- | --------- | ------
Full table scan of REGIONS | PAGES(REGIONS) | 1
Full table scan of CUSTOMERS | PAGES(CUSTOMERS) | RECORDS(REGIONS)

The leading cost is clearly PAGES(CUSTOMERS)*RECORDS(REGIONS). If we give numeric values, we have approximately 50*PAGES(CUSTOMERS).

Now what if we did the opposite? Since the table of regions is so small that it has a single page, it is useless to have an index, so we choose again two nested full table scans.

Operation | Unit Cost | Number
--------- | --------- | ------
Full table scan of CUSTOMERS | PAGES(CUSTOMERS) | 1
Full table scan of REGIONS | PAGES(REGIONS) | RECORDS(CUSTOMERS)

At first sight, the leading cost is PAGES(REGIONS)*RECORDS(CUSTOMERS). Since the table of regions is so small that it fits in one page, and since we have approximately 80 customer records per page (with, say, 8 KB pages), we can write that the leading cost is 80*PAGES(CUSTOMERS), which seems a little worse than the first approach. However, this second join order will be much faster than the first one. To see this, we have to take into account a factor that we have neglected up to now: the memory cache.

Since we are interested only in minimizing disk access, we can consider that the cost of reading a page from memory is zero.

The REGIONS table and its primary key can both be stored in cache memory. It follows that the cost matrix can be rewritten as follows:

Operation | Unit Cost | Number
--------- | --------- | ------
Full table scan of CUSTOMERS | PAGES(CUSTOMERS) | 1
First full table scan of REGIONS | PAGES(REGIONS) | 1
Next full table scans of REGIONS | 0 | RECORDS(CUSTOMERS)

So, finally, the leading cost term is PAGES(CUSTOMERS), which is around 50 times better than the first join order.
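The whole comparison can be condensed into a few lines. The page counts are the same assumptions as before (about 80 records per 8 KB page, REGIONS on a single page); the region count comes from the scenario:

```python
# Page reads for both join orders of the nested loop.
CUSTOMER_PAGES = 12_500   # assumption: ~80 records per 8 KB page
REGION_RECORDS = 50
REGION_PAGES = 1          # the whole REGIONS table fits on one page

# Outer = REGIONS: one full scan of CUSTOMERS per region; CUSTOMERS is far
# too big to stay in the cache, so every scan hits the disk again.
regions_leading = REGION_PAGES + REGION_RECORDS * CUSTOMER_PAGES

# Outer = CUSTOMERS: REGIONS is read from disk once, then served from cache.
customers_leading = CUSTOMER_PAGES + REGION_PAGES

print(regions_leading)    # 625001
print(customers_leading)  # 12501
print(round(regions_leading / customers_leading))  # ≈ 50
```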

Summary

Let's summarize what I have tried to introduce in this article:

We are optimizing page access from disk, not record access.

An execution plan describes how data flows between primitive operations. To understand an execution plan, it is necessary to know the complexity of each primitive operation and the number of records of their input flows.

Pay attention to myths. Index scans are not always better than full table scans.

A nested loop is similar to a foreach iteration.

Caching can have a tremendous effect.

The order of execution has a direct influence on how efficiently the memory cache is used.

There is however still a lot of work before us. I still haven't talked about the most common type of join: the hash join. Worse: I haven't discussed how predicates (non-joining parts of the WHERE clause) can totally mislead the optimizer. If you want to be better than the built-in optimizer, you still have to read the second article.

References

A good source of information, which helped me a lot to write this article, is the Oracle Performance Tuning Guide and Reference. Most of the concepts explained there are also relevant for Microsoft SQL Server. Microsoft users can also read the Query Tuning section of the on-line help.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Gael Fraiteur is a software engineer specializing in application development for large and medium accounts. He has 5 years of professional experience and more than 15 years of hobby experience in software development. Gael Fraiteur has an advanced knowledge of the .NET Framework, Oracle Server, Microsoft SQL Server and TIBCO.

Gael Fraiteur is the lead developer of PostSharp, a post-compiler for .NET that makes AOP on .NET easy both for business and system developers.

First of all, this is a great tutorial on database tuning, I learned a lot from it.

However there are some statements of yours in it which are rather unclear to me, please elaborate on them:

Statement a./There are approximately 50 regions.

Under Joining tables with nested loops:

Statement b./The leading cost is clearly PAGES(CUSTOMERS)*RECORDS(REGIONS). If we give numeric values, we have approximately 10*PAGES(CUSTOMERS).
That is RECORDS(REGIONS) must be 10, which seems to contradict Statement a./

Statement c./At first sight, the leading cost is PAGES(REGIONS)*RECORDS(CUSTOMERS). With numeric values, we have 20*PAGES(CUSTOMERS) which seems a little worse than the first approach.

How did you calculate PAGES(REGIONS)*RECORDS(CUSTOMERS) = 20*PAGES(CUSTOMERS)?

Thanks for sharing this article, I am eagerly waiting for the second part!

It is true that Statement a conflicts with Statement b, so I will correct Statement b to: The leading cost is clearly PAGES(CUSTOMERS)*RECORDS(REGIONS). If we give numeric values, we have approximately 50*PAGES(CUSTOMERS).

To understand Statement b, I have to add a new statement about the size of a page and take into account the average size of a record (stated as 100 bytes in the scenario description). If I have pages of 2K, I have 20 records per page, so RECORDS(CUSTOMERS) = 20*PAGES(CUSTOMERS). Since the table of regions is small, it will fit on one page, so PAGES(REGIONS) = 1. And there we have it. However, it is more probable that pages will have a size of 8K, so there will be on average 80 records per page.

Don't use Ctrl+L to bring up the execution plan. This is actually the estimated execution plan, and it can and will differ from the actual execution plan. You can use Ctrl+K and then execute the query to see exactly what happened. This does not always work if you're using the FOR XML clauses.

The other option is to use SQL Profiler using the Show Statistics event. The Binary data will contain the execution tree plus it will contain the rows affected and the number of executions and other information shown in the Query Analyzer.

Great start on your articles; however, I don't agree with your opinion of Query Analyzer being "terribly ineffective" because "it contains only little information." What it's really doing is presenting even more information than the command line version, in a format that's manageable for large queries.

For example, put the mouse over any icon in QA's execution plan and it gives you the list of primitive operations and many other statistics (including disk I/O operations, a key point in your article). Putting the mouse over any arrow gives you an estimated row count at that point in the plan (or actual count, if displaying the actual plan after the query is run).

I'd much rather view a statement with dozens of primitive operations graphically than having to "scroll endlessly" in the command line version to find the bottleneck join or scan.

I have to admit that I have been a little offensive towards the Query Analyzer and, since it is rather a matter of opinion, I will revise the text.

I thought about what I had in mind when I wrote this. I do indeed have a more objective reason to prefer the text version, though it is linked to my current customer.

According to this customer's policy, execution plans are a part of the delivery. This allows QA and production people to check that SQL statements will not damage server performance. Concretely, execution plans are pasted into a Word document, which is why the text version is better for this purpose.