Optimizing Query Performance

If you want to end up with a poorly performing application or a complete project failure, you should wait until the end of the project to deal with performance concerns. If, however, you want your application to be the best it can be, you must consider performance throughout the development cycle. In fact, you must consider performance before you even write your first line of code.

Every chapter of this book has included some information on performance. If you've turned directly to this chapter in hopes of finding the secret to improving your lackluster application, you'll be disappointed. We can offer only guidelines for you to keep in mind, along with some pointers that refer back to earlier information or to other helpful materials.

Microsoft SQL Server systems can be brilliantly fast with well-designed, well-implemented applications. It can support workloads of the type and size that no one dreamed possible back in 1988 when SQL Server first became available. But with a poorly planned or poorly implemented system, SQL Server can perform horribly. Statements like "SQL Server is slow" are not uncommon. (Nor are such statements about other database products.) Anytime you hear this from someone who has deployed a production application, your first thought should be that the person has dropped the ball somewhere along the line—or that SQL Server is unsuitable for the task at hand. (SQL Server can handle most systems, but some are still beyond its reach.)

If this is the first chapter you've turned to, please stop and go back at least to Chapter 3. All of the chapters from Chapter 3 to this one are relevant to performance issues. You might also revisit Chapter 11, which is about cursors, and Chapter 13, on locking. A thorough understanding of cursors and locking is a prerequisite for understanding the material in this chapter.

The Development Team

A software project's success depends on the experience and skill of the staff developing it. Your second SQL Server development project will be better than your first, no matter how smart you are, so don't make your first project one that will have thousands of concurrent users, manage tens of gigabytes of data, and replicate data to 10 other servers. If you tackle such an extensive project your first time out, you'll probably fail (or at least finish late and over budget). And although it's useful to have experience working with other systems and environments, you can't expect successful SQL Server development on the first try.

The smartest companies start with a small, less-than-mission-critical system before moving many applications over to SQL Server. If management will not allow you the luxury of working on a "practice" system, you should at least try to augment your team with an experienced consultant or two. Look for someone who, at a minimum, is a Microsoft Certified Systems Engineer (MCSE) who has taken both SQL Server exams for electives. In 1999, Microsoft is introducing a Microsoft Certified DBA (MCDBA) certification, which would indicate a similar level of qualification. This book, while it provides much useful information, can't replace other training. And skills such as database design, which transcend SQL Server specifically, are essential for a project's success.

See Also The Microsoft course called "Performance Tuning and Optimization of Microsoft SQL Server" is worthwhile. Microsoft develops this course with input from the SQL Server development group. The course is available at Microsoft Certified Partner(s) for Learning Solutions. For more information, go to https://partner.microsoft.com/global/40011646.

Application and Database Design

The biggest performance gains come from changes to the application and database design. You might change your configuration settings and add heftier hardware and be thrilled when performance doubles, but changes to the application can often result in even larger performance increases. There are as many approaches to software development as there are pages in this book. No single approach is the right approach—yours must be tailored to the size of the project, your team, and the skill level of the team members.

Take a look at the list of suggestions on the facing page for planning and implementing good performance in your system. We'll explain these items in detail in this chapter and in Chapter 15.

Develop expertise on your development team.

Understand that there is no substitute for solid application and database design.

State performance requirements for peak, not average, use.

Consider perceived response time for interactive systems.

Prototype, benchmark, and test throughout the development cycle.

Create useful indexes.

Choose appropriate hardware.

Use cursors judiciously.

Use stored procedures almost always.

Minimize network round-trips.

Understand concurrency and consistency tradeoffs.

Analyze and resolve locking (blocking) problems.

Analyze and resolve deadlock problems.

Monitor and tune queries using SQL Server Profiler.

Monitor system using Performance Monitor.

Review and adjust Windows NT settings.

Review and adjust SQL Server configuration settings.

Make only one change at a time, and measure its effect.

Do periodic database maintenance.

Normalize Your Database

We'll assume that if you're reading this book, you understand the concept of normalization and terms such as third normal form. (If you don't, see Candace Fleming and Barbara Vonhalle's Handbook of Relational Database Design and Michael Hernandez's Database Design for Mere Mortals. A plethora of other books about database design and normalization are also available.)

A normalized database eliminates functional dependencies in the data so that updating the database is easy and efficient. But querying from that database might require a lot of joins between tables, so common sense comes into play. If the most important and time-critical function your system must perform is fast querying, it often makes sense to back off from a normalized design in favor of one that has some functional dependencies. (That is, the design is not in third normal form or higher.) Think of normalization as typically being good for updating but potentially bad for querying. Start with a normalized design and then look at all the demands that will be placed on the system.

Note: There really isn't a binary concept of being "normalized" or "not normalized." There are only degrees of normalization. It is common to refer to a database that is at least in third normal form as "normalized" and to refer to a database at a lower level of normalization as "unnormalized" or "denormalized." To keep the discussion simple, we'll use the terms in that way, as imprecise as that might be. (The terms "unnormalized" and "denormalized" have slightly different meanings. An unnormalized database is one that has never been normalized. A denormalized database is one that was normalized at some point, but for specific performance-related reasons the design was backed down from the normalized version. Our apologies to those who make a living doing entity-relationship diagrams and are horrified by the loose use of these terms.)

If you understand the data elements you need to record and you understand data modeling, producing a normalized design is not difficult. But it might take some time to learn about the way a business operates. If you already understand the underlying processes to be modeled and know how to do data modeling, the mechanics of producing a normalized database design are quite straightforward.

Once you produce a normalized design, which you can also think of as the logical design, you must decide if you can implement the design nearly "as is" or if you need to modify it to fit your performance characteristics. A lot of people have trouble with this. Rather than try to articulate specific performance characteristics, they generalize and strive for "as fast as possible" or "as many users as we can handle." Although goals can be difficult to articulate precisely, you should at least set relative goals. You should understand the tradeoffs between update and query performance, for example. If a salesperson must call up all of a customer's records while the customer is waiting on the phone, that action should be completed within a few seconds. Or if you want to run a bunch of batch processes and reports for your manufacturing operation each night and you have a window of four hours in which to do it, you have a pretty clear objective that must be met.

Evaluate Your Critical Transactions

One thing that you should do immediately is look at your critical transactions—that is, transactions whose performance will make or break the system. (In this context, we use the term "transaction" loosely; it means any operation on the database.) Which tables and joins will be required for your critical transactions? Will data access be straightforward or complicated?

For example, if it is imperative that a given query have less than a 2-second response time but your normalized design would require a seven-way join, you should look at what denormalizing would cost. If tables are properly indexed, the query is well qualified, the search parameters are quite selective, and not a lot of data needs to be returned, the quick response might be possible. But you should note any seven-way joins and consider other alternatives. (You should probably look for alternatives any time you get beyond a four-way join.)

In our example, you might decide to carry a little redundant information in a couple of tables to make it just a three-way join. You'll incur some extra overhead to correctly update the redundant data in multiple places, but if update activity is infrequent or less important and the query performance is essential, altering your design is probably worth the cost. Or you might decide that rather than compute a customer's balance by retrieving a large amount of data, you can simply maintain summary values. You can use triggers to update the values incrementally when a customer's records change. (For example, you can take the old value and add to or average it but not compute the whole thing from scratch each time.) When you need the customer balance, it is available, already computed. You incur extra update overhead for the trigger to keep the value up-to-date, and you need a small amount of additional storage.

Proper indexes are extremely important for getting the query performance you need. But you must face query-vs.-update tradeoffs similar to those described earlier because indexes speed up retrieval but slow down updating. Chapter 8 explains the extra work required when your updates require index maintenance. (Because of the way that nonclustered indexes are stored and updated, the overhead of index maintenance is not nearly as severe in SQL Server 7 as in previous versions of the product.) You might want to lay out your critical transactions and look for the likely problems early on. If you can keep joins on critical transactions to four tables or less and make them simple equijoins on indexed columns, you'll be in good shape.

None of these considerations are new, nor are they specific to SQL Server. Back in the mainframe days, there was a technique known as "completing a CRUD chart." CRUD stands for Create-Retrieve-Update-Delete. In SQL, this would translate as ISUD—Insert-Select-Update-Delete. Conceptually, CRUD is pretty simple. You draw a matrix with critical transactions on the vertical axis and tables with their fields on the horizontal axis. The matrix gets very big very quickly, so creating it in Microsoft Excel or in your favorite spreadsheet program can be helpful. For each transaction, you note which fields must be accessed and how they will be accessed, and you note the access as any combination of I, S, U, or D, as appropriate. You make the granularity at the field level so you can gain insight into what information you want in each table. This is the information you need if you decide to carry some fields redundantly in other tables to reduce the number of joins required. Of course, some transactions require many tables to be accessed, so be sure to note whether the tables are accessed sequentially or via a join. You should also indicate the frequency and time of day that a transaction runs, its expected performance, and how critical it is that the transaction meet the performance metric.

How far you carry this exercise is up to you. You should at least go far enough to see where the potential hot spots are for your critical transactions. Some people try to think of every transaction in the system, but that's nearly impossible. And what's more, it doesn't matter: only a few critical transactions need special care so they don't lead to problems. (You shouldn't worry much about such things as noncritical reports that run only during off-hours.) For example, if you have to do frequent select operations simultaneously on the tables that are being updated the most, you might be concerned about locking conflicts. You need to consider what transaction isolation level to use and whether your query can live with Read Uncommitted and not conflict with the update activity.

If you are doing complex joins or expensive aggregate functions—SUM(), AVG(), and so on—for common or critical queries, you should explore techniques such as the following and you should understand the tradeoffs between query performance improvement and the cost to your update processes:

Add logically redundant columns to reduce the number of tables to be joined.

Use triggers to maintain aggregate summary data, such as customer balances, the highest value, and so forth. Such aggregates can usually be incrementally computed quickly. The update performance impact can be slight, but the query performance improvement can be dramatic.

Keep Table Row Lengths and Keys Compact

When you create tables, you must understand the tradeoffs of using variable-length columns. (See Chapter 6.) As a general rule, data with substantial variance in the actual storage length is appropriate for variable-length columns. Also remember that the more compact the row length, the more rows will fit on a given page. Hence, a single I/O operation with compact rows is more efficient than an I/O operation with longer row lengths—it returns more rows and the data cache allows more rows to fit into a given amount of memory.

As with tables, when you create keys you should try to make the primary key field compact because it frequently occurs as a foreign key in other tables. If no naturally compact primary key exists, you might consider using an identity or uniqueidentifier column as a surrogate. And recall that if the primary key is a composite of multiple columns, the columns are indexed in the order that they are declared to the key. The order of the columns in the key can greatly affect how selective, and hence how useful, the index is.

Your clustered key should also be as compact as possible. If your clustered key is also your primary key (which is the default when you declare a PRIMARY KEY constraint), you might already have made it compact for the reason mentioned above. There are additional considerations for your clustered key because SQL Server automatically keeps the clustered key in all nonclustered indexes, along with the corresponding nonclustered key. For example, if your clustered index is on zipcode and you have a nonclustered index on employee_id, every row in the nonclustered index stores the corresponding zipcode value along with the employee_id value. We discussed the structure of indexes in Chapters 3 and 6, and we'll look at it again later in this chapter when we look at how to choose the best indexes.

Occasionally, a table will have some columns that are infrequently used or modified and some that are very hot. In such cases, it can make sense to break the single table into two tables; you can join them back together later. This is kind of the reverse of denormalization as you commonly think of it. In this case, you do not carry redundant information to reduce the number of tables; instead, you increase the number of tables to more than are logically called for to put the hot columns into a separate, narrower table. With the more compact row length, you get more rows per page and potentially a higher cache-hit ratio. As with any deviation from the normalized model, however, you should do this only if you have good reason. After you complete a CRUD chart analysis, you might see that while your customer table is frequently accessed, 99 percent of the time this access occurs just to find out a customer's credit balance. You might decide to maintain this balance via a trigger rather than by recomputing it each time the query occurs. Information such as customer addresses, phone numbers, e-mail addresses, and so on are large fields that make the table have a wide row length. But not all that information is needed for critical transactions—only the customer balance is needed. In this case, splitting the table into two might result in the difference between fitting, say, 150 rows on a page instead of only 2 or 3 rows. A more narrow table means a greater likelihood that the customer balance can be read from cache rather than by requiring physical I/O.

Planning for Peak Usage

People often ask questions like, "Can SQL Server handle our system? We do a million transactions a day." To answer this question, you have to know exactly what transactions they have in mind. You also need to know the system's peak usage. If a million transactions a day are nicely spread out over 24 hours, that's less than 12 transactions per second. In general, a 12-TPS (transactions-per-second) system would be adequate. But if 90 percent of the million transactions come between 2 p.m. and 3 p.m., it's a very different situation. You'll have rates of about 275 TPS during that hour and probably peaks of more than 350 TPS. You can use benchmarks, such as Debit-Credit, in which SQL Server performs over 1500 TPS, but all transactions are different. It is meaningless to refer to transactions per second in SQL Server or any other system without also talking about the types of transactions. This is precisely the reason that standardized tests are available from the Transaction Processing Council for comparing database performance.

Regardless of the specific transactions, though, you must design and build your system to handle peak usage. In the case above, the important consideration is peak usage, which determines whether you should target your system to a volume of 350 TPS for unevenly distributed usage or to only 12 TPS with usage spread out evenly. (Most systems experience peaks, and daily usage is not so nicely spread out.)

Perceived Response Time for Interactive Systems

Systems are often built and measured without the appropriate performance goals in mind. When measuring query performance, for example, most designers tend to measure the time that it takes for the query to complete. By default, this is how SQL Server decides to cost query performance. But this might not be the way your users perceive system performance. To users, performance is often measured by the amount of time that passes between pressing the Enter key and getting some data. As a program designer, you can use this to your advantage. For example, you can make your application begin displaying results as soon as the first few rows are returned; if many rows will appear in the result set, you don't have to wait until they are all processed. You can use such approaches to dramatically improve the user's perception of the system's responsiveness. Even though the time required to get the last row might be about the same with both approaches, the time it takes to get the first row can be different—and the perceived difference can translate into the success or failure of the project.

By default, SQL Server optimizes a query based on the total estimated cost to process the query to completion. Recall from Chapter 3 that if a significant percentage of the rows in a table must be retrieved, it is better to scan the entire table than to use a nonclustered index to drive the retrieval. (A clustered index, of course, would be ideal because the data would be physically ordered already. The discussion here pertains only to the performance tradeoff of scan-and-sort vs. using a nonclustered index.)

Retrieving a page using the nonclustered index requires traversing the B-tree to get the address of a data page or the clustering key and then retrieving that page (by using the RID to directly access it or traversing the clustered index to find the data page) and then traversing the nonclustered B-tree again to get the location information for the next data page and retrieving it…and so on. Many data pages are read many times each, so the total number of page accesses can be more than the total number of pages in the table. If your data and the corresponding nonclustered index are not highly selective, SQL Server usually decides not to use that nonclustered index. That is, if the index is not expected to eliminate more than about 90 percent of the pages from consideration, it is typically more efficient to simply scan the table than to do all the extra I/O of reading B-trees for both the nonclustered and clustered indexes as well as the data pages. And by following the index, each data page must frequently be accessed multiple times (once for every row pointed to by the index). Subsequent reads are likely to be from cache, not from physical I/O, but this is still much more costly than simply reading the page once for all the rows it contains (as happens in a scan).

Scanning the table is the strategy SQL Server chooses in many cases, even if a nonclustered index is available that could be used to drive the query, and even if it would eliminate a sort to return the rows based on an ORDER BY clause. A scan strategy can be much less costly in terms of total I/O and time. However, the choice of a full scan is based on SQL Server's estimate of how long it would take the query to complete in its entirety, not how long it would take for the first row to be returned. If an index exists with a key that matches the ORDER BY clause of the query and the index is used to drive the query execution, there is no need to sort the data to match the ORDER BY clause (because it's already ordered that way). The first row is returned faster by SQL Server chasing the index even though the last row returned might take much longer than if the table were simply scanned and the chosen rows sorted.

In more concrete terms, let's say that a query that returns many rows takes 1 minute to complete using the scan-and-sort strategy and 2 minutes using a nonclustered index. With the scan-and-sort strategy, the user doesn't see the first row until all the processing is almost done—for this example, about 1 minute. But with the index strategy, the user sees the first row within a subsecond—the time it takes to do, say, five I/O operations (read two levels of the nonclustered index, two levels of the clustered index, and then the data page). Scan-and-sort is faster in total time, but the nonclustered index is faster in returning the first row.

SQL Server provides a query hint called FAST that lets SQL Server know that having the first n rows returned quickly is more important than the total time, which would be the normal way query plans get costed. Later in this chapter, we'll look at some techniques for speeding up slow queries and discuss when it is appropriate to use the query hint. For now, you should understand the issues of response time (the time needed to get the first row) vs. throughput (the time needed to get all rows) when you think about your performance goals. Typically, highly interactive systems should be designed for best response time, and batch-oriented systems should be designed for best throughput.

Prototyping, Benchmarking, and Testing

As you make changes to your application design or hardware configuration, you should measure the effects of these changes. A simple benchmark test to measure differences is a tremendous asset. The benchmark system should correlate well with the expected performance of the real system, but it should be relatively easy to run. If you have a development "acceptance test suite" that you run before checking in any significant changes, you should add the benchmark to that test suite.

Tip You should measure performance with at least a proxy test; otherwise, you're setting yourself up for failure. Optimism without data to back it up is usually misguided.

Your benchmark doesn't have to be sophisticated initially. You can first create your database and populate it with a nontrivial amount of data—thousands of rows at a minimum. The data can be randomly generated, although the more representative you can make the data the better. Ideally, you should use data from the existing system, if there is one. For example, if a particular part represents 80 percent of your orders, you shouldn't make all your test data randomly dispersed. Any differences in the selectivity of indexes between your real data and the test data will probably cause significant differences in the execution plans you choose. You should also be sure that you have data in related tables if you use FOREIGN KEY constraints. As we explained earlier in this book, the enforcement of FOREIGN KEY constraints requires that those related tables (either referenced or referencing) be accessed if you are modifying data in a column that is participating in the constraint. So the execution plan is sometimes considerably more complicated due to the constraints than might be apparent, and a plethora of constraints can result in a system that has no simple operations.

As a rule of thumb, you should start with at least enough data so the difference between selecting a single row based on its primary key by using an index is dramatically faster than selecting such a row using a table scan. (This assumes that the table in production will be large enough to reflect that difference.) Remember that the system will perform much differently depending on whether I/O operations are physical or from the cache. So don't base your conclusions on a system that is getting high cache-hit ratios unless you have enough data to be confident that this behavior also will be true for your production system. In addition, keep in mind that running the same test multiple times might yield increasingly short response times. If you are testing on a dedicated machine, no other processes will be using SQL Server's memory, and the data you read in from the disk the first time will already be in cache for subsequent tests.

Tip If you want to run your tests repeatedly under the same conditions, use the command DBCC DROPCLEANBUFFERS after each test run to remove all data from memory.

Early in the development process, you should identify areas of lock contention between transactions and any specific queries or transactions that take a long time to run. The SQL Server Profiler, discussed in Chapter 15, can be a wonderful tool for tracking down your long-running queries. And if table scans will be a drain on the production system, you should have enough data early on so that the drain is apparent when you scan. If you can run with several thousand rows of data without lock contention problems and with good response time on queries, you're in a good position to proceed with a successful development cycle. Of course, you must continue to monitor and make adjustments as you ramp up to the actual system and add more realistic amounts of data and simultaneous users. And, of course, your system test should take place on an ongoing basis before you deploy your application. It's not a one-time thing that you do the night before you go live with a new system.

Tip Before you roll out your production system, you should be able to conduct system tests with the same volumes of data and usage that the real system will have when it goes live. Crossing your fingers and hoping is not good enough. Two tools in the Microsoft BackOffice Resource Kit might prove useful for this purpose. The filltabl utility populates a specified table with any number of rows of random data. A load simulator (sqlls) lets you run one or more SQL scripts with up to 64 operating system threads executing each script. These and other tools from the BackOffice Resource Kit are written for the previous version of SQL Server, but they will work with SQL Server 7.

Obviously, if your smallish prototype is exhibiting lock contention problems or the queries do not perform well within your desired goals, it's unlikely that your real system will perform as desired. Run the stored procedures that constitute your critical transactions. You can use a simple tool like OSQL.EXE to dispatch them. First run each query or transaction alone—time it in isolation and check the execution plans (SET SHOWPLAN_TEXT ON). Then run multiple sessions to simulate multiple users, either by manually firing off multiple OSQL commands or by using the SQL Load Simulator mentioned above. (A bit later, we'll look at how to analyze and improve a slow-running query.)

Also, based on your CRUD chart analysis (or on any other analysis of critical transactions), you should identify tasks that will run at the same time as your critical transactions. Add these tasks to your testing routine to determine whether they will lead to contention when they run simultaneously with your critical transactions. For example, suppose proc_take_orders is your critical transaction. When it runs, some reports and customer status inquiries will also run. You should run some mixture of these types of processes when you analyze proc_take_orders. This will help you identify potential lock contention issues or other resource issues, such as high CPU usage or low cache.

You might also want to use the SQL Server benchmark kit, which is available on the companion CD as well as at the Microsoft Web site (http://www.microsoft.com/sql). Although the transaction in the kit will probably not be directly applicable to your situation, the benchmark framework provides the infrastructure you need to launch and coordinate multiple client tasks simulta neously to time their work for throughput and response time. The kit also provides some sample code for populating the test database. Other custom benchmarking tools are available from other companies (such as Dynameasure from Bluecurve, Inc. at http://www.bluecurve.com. )

A Rant on Development Methodologies

How much you spec, how you spec, when you start coding, and the role of prototyping are all matters on which there is no consensus; there is no one right answer. What's right for one team or project might be wrong for another. You must remember that the point of development is to ship products or deploy applications. Your ultimate purpose is not to produce specs and design documents.

The spec exists to clearly articulate and document how a module or system should work. It ensures that the developer thinks through the approach before writing code. It is also vitally important to others who come in later and are new to the system. While writing the design document, the developer might have to write some quick prototype code to help think through an issue. Or the developer should at least write some pseudocode and include it as part of the document.

Any development task that will take more than a couple of days to implement deserves a simple design document. Ideally, such a document should be at most 15 pages, and often about 3 succinct pages are ideal for even a complex component. The best design documents can be read in one sitting, and the reader should come away understanding clearly how the system will be built. The document should be reviewed by other developers before coding begins. This is best done in a positive, informal atmosphere in which others can contribute ideas. And, of course, in a healthy environment, developers are continually bouncing ideas off their peers.

The document should assume that the reader already knows why a module or component is needed. The document provides the how. Too many specs have 40 pages describing the market need for something or why a potential product would be really cool, and then they have 1 page that says, in essence, "We'll figure it out when we code it up."

No one can write a perfect spec up front. Prototyping the spec in an iterative fashion works far better. Areas that are clear don't need a prototype; use prototypes for areas fraught with risk, for your critical transactions, and for critical areas of the system in which multiple approaches are possible or reasonable. For the tricky stuff that you can't describe or predict a best performance, prototypes provide enormous benefits.

A useful prototype can be "quick and dirty." If you don't worry about all the failure cases and every conceivable state in which something can exist, useful prototype code can often be produced in 5 percent of the time it takes to create production-caliber code (in which you must worry about those things). However, the production code might be 10 times better because you have recognized and corrected deficiencies early on. You'll either junk the prototype code or use it for the skeleton of the real system. Prototyping lets you learn and prove or disprove ideas, and then you can update the spec based on what you learn. And if you're selling your ideas to management or customers, your prototype can be useful to demonstrate proof of your concept.

As we mentioned before, every nontrivial development task deserves a brief design document. You need enough documentation to lay out the framework and system architecture and to detail how pieces fit together. But at the detailed levels, it's better to simply comment the source code liberally while you write and modify it. External design documentation rarely gets updated in a timely manner, so it quickly becomes useless. No one has much use for worthless comments (like adding incrementing i before the statement i++ in C). Rather, comments should describe the approach and intent of each routine, the expected inputs and outputs, and any side effects that might occur by changing the code. Explain operations when their purpose is not obvious. Don't assume that your readers will immediately grasp all the subtleties. And when you change something after the fact, add a comment about what you changed, when you changed it, and why. A module should be commented well enough so that testers or technical writers can gain a good understanding of what is going on just from reading the comments, even if they are not skilled programmers.

Creating Useful Indexes

Creating useful indexes is one of the most important tasks you can do to achieve good performance. Indexes can dramatically speed up data retrieval and selection, but they are a drag on data modification because along with changes to the data, the index entries must also be maintained and those changes must be logged. The key to creating useful indexes is understanding the uses of the data, the types and frequencies of queries performed, and how queries can use indexes to help SQL Server find your data quickly. A CRUD chart or similar analysis technique can be invaluable in this effort. You might want to quickly review the difference between clustered and nonclustered indexes because the difference is crucial in deciding what kind of index to create.

Clustered and nonclustered indexes are similar at the upper (node) levels—both are organized as B-trees. Index rows above the leaf level contain index key values and pointers to pages the next level down. Each row keeps track of the first key value on the page it points to. Figure 14-1 shows an abstract view of an index node for an index on a customer's last name. The entry Johnson indicates page 1:200 (file 1, page 200), which is at the next level of the index. Since Johnson and Jones are consecutive entries, all the entries on page 1:200 have values between Johnson (inclusive) and Jones (exclusive).

Figure 14-1: . An index node page.

The leaf, or bottom, level of the index is where clustered and nonclustered indexes differ. For both kinds of indexes, the leaf level contains every key value in the table on which the index is built, and those keys are in sorted order. In a clustered index, the leaf level is the data level, so of course every key value is present. This means that the data in a table is sorted in order of the clustered index. In a nonclustered index, the leaf level is separate from the data. In addition to the key values, the index rows contain a bookmark indicating where to find the actual data. If the table has a clustered index, the bookmark is the clustered index key that corresponds to the nonclustered key in the row. (If the clustered key is composite, all parts of the key are included.)

Remember that clustered indexes are guaranteed to be unique in SQL Server 7; if you don't declare them as unique, SQL Server adds a uniqueifier to every duplicate key to turn the index into a unique composite index. If our index on last name is a nonclustered index and the clustered index on the table is the zip code, a leaf-level index page might look something like Figure 14-2 on the following page. The number in parentheses after the zip code is the uniqueifier and appears only when there are duplicate zip codes.

Figure 14-2: . A leaf-level index page.

Choose the Clustered Index Carefully

Clustered indexes are extremely useful for range queries (for example, WHERE sales_quantity BETWEEN 500 and 1000) and for queries in which the data must be ordered to match the clustering key. Only one clustered index can exist per table, since it defines the physical ordering of the data for that table. Since you can have only one clustered index per table, you should choose it carefully based on the most critical retrieval operations. Because of the clustered index's role in managing space within the table, nearly every table should have one. And if a table has only one index, it should probably be clustered.

If a table is declared with a primary key (which is advisable), by default the primary key columns form the clustered index. Again, this is because almost every table should have a clustered index, and if the table has only one index, it should probably be clustered. But if your table has several indexes, some other index might better serve as the clustered index. This is often true when you do single-row retrieval by primary key. A nonclustered, unique index works nearly as well in this case and still enforces the primary key's uniqueness. So save your clustered index for something that will benefit more from it by adding the keyword NONCLUSTERED when you declare the PRIMARY KEY constraint.

Make Nonclustered Indexes Highly Selective

A query using an index on a large table is often dramatically faster than a query doing a table scan. But this is not always true, and table scans are not all inherently evil. Nonclustered index retrieval means reading B-tree entries to determine the data page that is pointed to and then retrieving the page, going back to the B-tree, retrieving another data page, and so on until many data pages are read over and over. (Subsequent retrievals can be from cache.) With a table scan, the pages are read only once. If the index does not disqualify a large percentage of the rows, it is cheaper to simply scan the data pages, reading every page exactly once.

The query optimizer greatly favors clustered indexes over nonclustered indexes, because in scanning a clustered index the system is already scanning the data pages. Once it is at the leaf of the index, the system has gotten the data as well. So there is no need to read the B-tree, read the data page, and so on. This is why nonclustered indexes must be able to eliminate a large percentage of rows to be useful (that is, they must be highly selective), whereas clustered indexes are useful even with less selectivity.

Indexing on columns used in the WHERE clause of frequent or critical queries is often a big win, but this usually depends on how selective the index is likely to be. For example, if a query has the clause WHERE last_name ='Stankowski', an index on last_name is likely to be very useful; it can probably eliminate 99.9 percent of the rows from consideration. On the other hand, a nonclustered index will probably not be useful on a clause of WHERE sex = 'M' because it eliminates only about half of the rows from consideration; the repeated steps needed to read the B-tree entries just to read the data require far more I/O operations than simply making one single scan through all the data. So nonclustered indexes are typically not useful on columns that do not have a wide dispersion of values.

Think of selectivity as the percentage of qualifying rows in the table (qualifying rows/total rows). If the ratio of qualifying rows to total rows is low, the index is highly selective and is most useful. If the index is used, it can eliminate most of the rows in the table from consideration and greatly reduce the work that must be performed. If the ratio of qualifying rows to total rows is high, the index has poor selectivity and will not be useful. A nonclustered index is most useful when the ratio is around 5 percent or less—that is, if the index can eliminate 95 percent of the rows from consideration. If the index has less than 5 percent selectivity, it probably will not be used; either a different index will be chosen or the table will be scanned. Recall that each index has a histogram of sampled data values for the index key, which the optimizer uses to estimate whether the index is selective enough to be useful to the query.

Tailor Indexes to Critical Transactions

Indexes speed data retrieval at the cost of additional work for data modification. To determine a reasonable number of indexes, you must consider the frequency of updates vs. retrievals and the relative importance of the competing types of work. If your system is almost purely a decision-support system (DSS) with little update activity, it makes sense to have as many indexes as will be useful to the queries being issued. A DSS might reasonably have a dozen or more indexes on a single table. If you have a predominantly online transaction processing (OLTP) application, you need relatively few indexes on a table—probably just a couple carefully chosen ones.

Look for opportunities to achieve index coverage in queries, but don't get carried away. An index "covers" the query if it has all the data values needed as part of the index key. For example, if you have a query such as SELECT emp_name, emp_sexFROM employee WHERE emp_name LIKE 'Sm%' and you have a nonclustered index on emp_name, it might make sense to append the emp_sex column to the index key as well. Then the index will still be useful for the selection, but it will already have the value for emp_sex. The optimizer won't need to read the data page for the row to get the emp_sex value; the optimizer is smart enough to simply get the value from the B-tree key. The emp_sex column is probably a char(1), so the column doesn't add greatly to the key length, and this is good.

Every nonclustered index is a covering index if all you are interested in is the key column of the index. For example, if you have a nonclustered index on first name, it covers all these queries:

Select all the first names that begin with K.

Find the first name that occurs most often.

Determine whether the table contains the name Melissa.

In addition, if the table also has a clustered index, every nonclustered index includes the clustering key. So it can also cover any queries that need the clustered key value in addition to the nonclustered key. For example, if our nonclustered index is on the first name and the table has a clustered index on the last name, the following queries can all be satisfied by accessing only leaf pages of the B-tree:

Select Tibor's last name.

Determine whether any duplicate first and last name combinations exist.

Find the most common first name for people with the last name Wong.

You can go too far and add all types of fields to the index. The net effect is that the index becomes a virtual copy of the table, just organized differently. Far fewer index entries fit on a page, I/O increases, cache efficiency is reduced, and much more disk space is required. The covered queries technique can improve performance in some cases, but you should use it with discretion.

A unique index (whether nonclustered or clustered) offers the greatest selectivity (that is, only one row can match), so it is most useful for queries that are intended to return exactly one row. Nonclustered indexes are great for single-row accesses via the PRIMARY KEY or UNIQUE constraint values in the WHERE clause.

Indexes are also important for data modifications, not just for queries. They can speed data retrieval for selecting rows, and they can speed data retrieval needed to find the rows that must be modified. In fact, if no useful index for such operations exists, the only alternative is for SQL Server to scan the table to look for qualifying rows. Update or delete operations on only one row are common; you should do these operations using the primary key (or other UNIQUE constraint index) values to be assured that there is a useful index to that row and no others.

A need to update indexed columns can affect the update strategy chosen. For example, to update a column that is part of the key of the clustered index on a table, you must process the update as a delete followed by an insert rather than as an update-in-place. When you decide which columns to index, especially which columns to make part of the clustered index, consider the effects the index will have on the update method used. (Review the discussion of updates in Chapter 8.)

Pay Attention to Column Order

At the risk of stating the obvious, an index can be useful to a query only if the criteria of the query match the columns that are leftmost in the index key. For example, if an index has a composite key of last_name,first_name, that index is useful for a query such as WHERE last_name = 'Smith' or WHERE last_name = 'Smith' AND first_name = 'John'. But it is not useful for a query such as WHERE first_name = 'John'. Think of using the index like a phone book. You use a phone book as an index on last name to find the corresponding phone number. But the standard phone book is useless if you know only a person's first name because the first name might be located on any page.

Put the most selective columns leftmost in the key of nonclustered indexes. For example, an index on emp_name,emp_sex is useful for a clause such as WHERE emp_name = 'Smith' AND emp_sex = 'M'. But if the index is defined as emp_sex,emp_name, it isn't useful for most retrievals. The leftmost key, emp_sex, cannot rule out enough rows to make the index useful. Be especially aware of this when it comes to indexes that are built to enforce a PRIMARY KEY or UNIQUE constraint defined on multiple columns. The index is built in the order that the columns are defined for the constraint. So you should adjust the order of the columns in the constraint to make the index most useful to queries; doing so will not affect its role in enforcing uniqueness.

Index Columns Used in Joins

Index columns are frequently used to join tables. When you create a PRIMARY KEY or UNIQUE constraint, an index is automatically created for you. But no index is automatically created for the referencing columns in a FOREIGN KEY constraint. Such columns are frequently used to join tables, so they are almost always among the most likely ones on which to create an index. If your primary key and foreign key columns are not naturally compact, consider creating a surrogate key using an identity column (or a similar technique). As with row length for tables, if you can keep your index keys compact, you can fit many more keys on a given page, which results in less physical I/O and better cache efficiency. And if you can join tables based on integer values such as an identity, you avoid having to do relatively expensive character-by-character comparisons. Ideally, columns used to join tables are integer columns—fast and compact.

Join density is the average number of rows in one table that match a row in the table it is being joined to. You can also think of density as the average number of duplicates for an index key. A column with a unique index has the lowest possible density (there can be no duplicates) and is therefore extremely selective for the join. If a column being joined has a large number of duplicates, it has a high density and is not very selective for joins.

Joins are frequently processed as nested loops. For example, if while joining the orders table with order_items the system starts with the orders table (the outer table) and then for each qualifying order row, the inner table is searched for corresponding rows. Think of the join being processed as, "Given a specific row in the outer table, go find all corresponding rows in the inner table." If you think of joins in this way, you'll realize that it is important to have a useful index on the inner table, which is the one being searched for a specific value. For the most common type of join, an equijoin that looks for equal values in columns of two tables, the optimizer automatically decides which is the inner table and which is the outer table of a join. The table order that you specify for the join doesn't matter in the equijoin case. However, the order for outer joins must match the semantics of the query, so the resulting order is dependent on the order specified. We'll talk about join strategies later in this chapter.

Create or Drop Indexes as Needed

If you create indexes but find that they aren't used, you should drop them. Unused indexes slow data modification without helping retrieval. You can determine whether indexes are used by watching the plans produced via the SHOWPLAN options; this is easy if you are analyzing a large system with many tables and indexes. There might be thousands of queries that can be run and no way to run and analyze the SHOWPLAN output for all of them. An alternative is to use the Index Tuning Wizard to generate a report of current usage patterns. The wizard is designed to determine which new indexes to build, but you can use it simply as a reporting tool to find out what is happening in your current system. We'll look at the wizard later in this chapter.

Some batch-oriented processes that are query intensive can benefit from certain indexes. Such processes as complex reports or end-of-quarter financial closings often run infrequently. If this is the case, remember that creating and dropping indexes is simple. Consider a strategy of creating certain indexes in advance of your batch processes and then dropping them when those batch processes are done. In this way, the batch processes benefit from the indexes but do not add overhead to your OLTP usage.

Using Stored Procedures and Caching Mechanisms

You should try using stored procedures whenever possible, rather than passing ad hoc SQL statements from your applications. Chapters 3 and 10 discussed the efficiencies that stored procedures bring in terms of not requiring compilation of an execution plan for each execution. Plans can be reused and stay cached and available for subsequent use. Saving recompile time was something to consider with previous versions of SQL Server, although it was not the most important reason to use stored procedures. In many queries, especially complex joins on large tables, the time spent on compilation was frequently insignificant compared to the time needed for execution. So it was much more crucial that the best plan be chosen, even if that meant a recompilation. The picture has changed somewhat in SQL Server 7. The query optimizer now has many more processing options, and compilation and optimization time can end up being a more significant percentage of total execution time. It is even more crucial now that you know when recompilation will take place and try to avoid it, if at all possible.

Choosing Appropriate Hardware

We don't advise trying to solve performance problems simply by "killing them with hardware." A powerful system cannot compensate for basic inefficiencies in the application or database design. Nevertheless, appropriate hardware is extremely important. (We discussed some hardware issues in Chapter 4.) Remember that SQL Server performance, from the hardware perspective, is a function of the integer processing (CPU) power, the amount of memory in the system, and the I/O capacity of the system. A system must be well matched. Even a system with tremendous CPU capacity might run SQL Server slower than a system with less CPU power if the first system has too little memory or I/O capacity. And when you put together your system, you should think carefully about not just the I/O capacity but also the fault tolerance capabilities. RAID solutions provide varying capabilities of increased I/O performance and fault tolerance. You should decide up front on the appropriate level of RAID for your system. (Chapter 4 also discusses RAID levels.)

SQL Server 7 can save on recompilation using four new mechanisms to make plan caching accessible in a wider set of scenarios:

Ad hoc caching

Autoparameterization

The sp_executesql procedure

The prepare and execute method

Note: The information on caching query plans was adapted from a preliminary copy of a whitepaper by Peter Carlin. We are indebted to him for his assistance.

Ad Hoc Caching

SQL Server caches the plans from ad hoc queries, and if a subsequent batch matches exactly, the cached plan is used. This feature requires no extra work to use, but it is limited to exact textual matches. For example, if the three queries shown on the following page are submitted, the first and third queries will use the same plan but the second one will need to generate a new plan.

Using Cursors Judiciously

If you intend to use cursors heavily in your application, make sure you've closely read Chapter 11. Used properly, cursors provide valuable features not found in any other mainstream database product. But, as discussed in Chapter 11, cursors are misused in many systems, turning SQL Server into a network ISAM instead of a relational database. This problem is common if a system is being ported from a mainframe using VSAM (or a similar method) or upsized from a data store such as Btrieve or Microsoft FoxPro. In such cases, the cursor model of one-row-at-a-time processing seems familiar to developers with an ISAM background. Converting the ISAM calls to cursor calls looks easy, and it is. But you can easily create a bad application. You should approach your application in terms of set operations and nonprocedural code, and you should avoid the temptation of simply doing a quick port from an ISAM and using cursors extensively.

Autoparameterization

For simple queries, SQL Server guesses which constants might really be parameters and attempts to treat them as parameters. If this is successful, subsequent queries that follow the same basic template can use the same plan. Four templates can be used for autoparameterization. (Note that key-expression is an expression involving only column names, constants, AND operators, and comparison operators (<, >, =, <=, >=, and <>).)

SQL Server can allow other queries of the same template to use the same plan only if the template is safe. A template is safe if the plan selected will not change even if the actual parameters change. This ensures that autoparameterization won't degrade a query's performance.

The SQL Server query processor is much more conservative about deciding whether a template is safe than an application can be. SQL Server guesses which values are really parameters, whereas your application should actually know. Rather than rely on autoparameterization, you should use one of the following two mechanisms to mark parameters when they are known.

The sp_executesql Procedure

The stored procedure sp_executesql is halfway between ad hoc caching and stored procedures. Using sp_executesql requires that you identify the parameters but doesn't require all the persistent object management needed for stored procedures.

ODBC and OLE DB expose this functionality via SQLExecDirect and ICommandWithParameters. (The ODBC and OLE DB documentation provide more details.)

The Prepare and Execute Method

This last mechanism is like sp_executesql in that parameters to the batch are identified by the application, but there are some key differences. The prepare and execute method does not require the full text of the batch to be sent at each execution. Rather, the full text is sent once at prepare time; a handle that can be used to invoke the batch at execute time is returned. ODBC and OLE DB expose this functionality via SQLPrepare/SQLExecute and ICommandPrepare. You can also use this mechanism via ODBC and OLE DB when cursors are involved. When you use these functions, SQL Server is informed that this batch is meant to be used repeatedly.

Sharing Cached Plans

To allow users to share plans and thus maximize the effectiveness of the caching mechanisms, all users should execute in the same environment. Don't change SET options or database settings in the middle of an application or connection.

For all of the caching mechanisms, reusing a cached plan avoids recompilation and optimization. This saves compilation time, but it means that the same plan is used regardless of the particular parameter values passed in. If the optimal plan for a given parameter value is not the same as the cached plan, the optimal execution time will not be achieved. For this reason, SQL Server is very conservative about autoparameterization. When an application uses sp_executesql, prepare and execute, or stored procedures, the application developer is responsible for determining what should be parameterized. You should parameterize only constants whose range of values does not drastically affect the optimization choices.

The SQL Server Performance Monitor includes a counter called SQL Server:SQL Statistics that has several counters dealing with autoparameterization. You can monitor these counters to determine whether there are many unsafe or failed autoparameterization attempts. If these numbers are high, you can inspect your applications for situations in which the application can take responsibility for explicitly marking the parameters.

When to Use Stored Procedures and Other Caching Mechanisms

Keep the following in mind when you are deciding whether to use stored procedures or one of the other mechanisms:

Stored procedures Use when multiple applications are executing batches in which the parameters are known.

Autoparameterization Use for applications that cannot be easily modified. Don't design an application to use this.

The sp_executesql procedure Use when a single user might use the same batch multiple times and when the parameters are known.

The prepare and execute method Use when multiple users are executing batches in which the parameters are known, or when a single user will definitely use the same batch multiple times.

Recompiling Stored Procedures

In spite of the benefits of the mechanisms just discussed, you should still use stored procedures whenever possible. Besides giving you the benefits of precompilation, they minimize network traffic by reducing the text that needs to be sent from your application to SQL Server, and they provide a security mechanism to control who can access data and under what conditions. Stored procedures can be recompiled automatically or manually under various circumstances. (Chapter 10 discussed some of these issues.)

You can actually watch automatic recompilation occurring by using SQL Server Profiler: Choose to trace the event in the Stored Procedures category called SP:Recompile. If you also trace the event called SP:StmtStarting, you can see at what point in the procedure it is being recompiled. In particular, you might notice that stored procedures can be recompiled multiple times during their execution. For example, if you build a temporary table and later create an index on that table, and then add data to the table, your stored procedure will be recompiled numerous times. One way to avoid this is to include all data definition statements dealing with your temporary tables, as well as the insertion of rows into the temporary tables, right at the beginning of the procedure. So if the procedure must be recompiled, it won't happen more than once. Another way to prevent recompilations is to include the query hint KEEPPLAN in your statements that access the temporary tables. We'll discuss query hints in detail later in this chapter.

Although we've been assuming that recompilation is something you will usually want to avoid, this is not always the case. If you know that updated statistics can improve the query plan, or if you know that you have wildly different possible parameter values, recompilation can be a good thing.

Tip If you want to run your tests repeatedly under the same conditions so that you can measure performance when you start with no cached plans, you can use the command DBCC FREEPROCCACHE after each test run to remove all cached plans from memory.

Limiting the Number of Plans in Cache

SQL Server will try to limit the number of plans for any particular stored procedure. Since plans are reentrant, this is much easier to do in SQL Server 7 than in previous versions. Although the online documentation states that there can be at most two compiled plans for any procedure (at most one for parallel plans—that is, those that will be executed on multiple processors—and one for nonparallel plans), this is not completely true. Other situations will cause multiple plans for the same procedure to be stored. The most likely situation is a difference in certain SET options, database options, or configuration options. For example, a stored procedure that concatenates strings might compile the concatenation differently depending on whether the option CONCAT_NULL_YIELDS_NULL is on or off, or whether the corresponding database option is true or false. If a user executes the procedure with the option on, that person will use a different plan than if the option is off.

The system table syscacheobjects keeps track of the compiled objects in cache at any time. This table is accessible only by system administrators, but you can write your own stored procedures to provide information for your own tuning and testing purposes. You can use a procedure called sp_procs_in_cache (which is on this book's companion CD) to return a list of all procedure plans in cache and the number of times each plan occurs. The list is organized by database. If you execute the procedure without passing a parameter, you see procedures in your current database only. Alternatively, you can provide a specific database name or use the parameter all to indicate that you want to see procedure plans in cache from all databases. If you want to pull different information out of syscacheobjects, you can customize the procedure in any way you like. Remember to create the procedure in the master database so that it can be called from anywhere. Also remember to grant appropriate permissions if anyone other than a system administrator will be running it.

Running the sp_procs_in_cache procedure shows the two different types of stored procedure plans: compiled plans and executable plans. A compiled plan is the part of a plan that is reentrant and can be shared by multiple users. You can think of an executable plan as an instance of the compiled plan that contains information describing a particular process that is executing the procedure. In most cases, both compiled and executable plans remain in the memory cache, subject of course to memory pressure from other processes or applications. In other cases, such as when a compiled plan contains a sort operation, execution plans do not remain in cache at all after execution.

Other Benefits of Stored Procedures

Beyond the significant performance and security advantages, stored procedures can provide a valuable level of indirection between your applications and the database design. Suppose your application calls a procedure such as get_customer_balance and expects a result set to be returned. If the underlying database is then changed, the application can be totally unaffected and unaware of the change as long as the procedure also changes to return the result set as expected. For example, if you decide to denormalize your database design to provide faster query performance, you can change the stored procedure to respecify the query. If many applications call the procedure, you can simply change the stored procedure once and never touch the application. In fact, a running application doesn't even need to be restarted—it executes the new version of the stored procedure the next time it is called.

Another good reason to use stored procedures is to minimize round-trips on the network (that is, the conversational TDS traffic between the client application and SQL Server for every batch and result set). If you will take different actions based on data values, try to make those decisions directly in the procedure. (Strictly speaking, you don't need to use a stored procedure to do this—a batch also provides this benefit.) Issue as many commands as possible in a batch. You can try a simple test by inserting 100 rows first as a single batch and then as every insert in its own batch (that is, 100 batches). You'll see a performance improvement of an order of magnitude even on a LAN, and on a slow network the improvement will be stunning. While LAN speeds of 10 megabits per second (Mbps) are fast enough that the network is generally not a significant bottleneck, speeds can be tremendously different on a slow network. A modem operating at 28.8 kilobits per second (Kbps) is roughly 300 times slower than a LAN. Although WAN speeds vary greatly, they typically can be 100 to 200 times slower than a LAN. The fast LAN performance might hide inefficient network use. You can often see a prime example of this phenomenon when you use cursors. Fetching each row individually might be tolerable on a LAN, but it is intolerable when you use dial-up lines.

If a stored procedure has multiple statements, by default SQL Server sends a message to the client application at the completion of each statement to indicate the number of rows affected for each statement. This is known as a DONE_IN_PROC message in TDS-speak. However, most applications do not need DONE_IN_PROC messages. So if you are confident that your applications do not need these messages, you can disable them, which can greatly improve performance on a slow network when there is otherwise little network traffic. (One particular application deployed on a WAN had an order-of-magnitude performance improvement after suppressing the DONE_IN_PROC messages. This made the difference between a successful deployment and a fiasco.)

You can use the connection-specific option SET NOCOUNT ON to disable these messages for the application. While they are disabled, you cannot use the ODBC SQLRowCount() function or its OLE DB equivalent. However, you can toggle NOCOUNT on and off as needed, and you can use SELECT @@ROWCOUNT even when NOCOUNT is on. You can also suppress the sending of DONE_IN_PROC messages by starting the server with trace flag 3640, which lets you suppress what might be needless overhead without touching your application. However, some ODBC applications depend on the DONE_IN_PROC message, so you should test your application before using trace flag 3640 in production—it can break applications that are written implicitly to expect that token.

You should keep an eye on the traffic between your client applications and the server. An application designed and tuned for slow networks works great on a fast network, but the reverse is not true. If you use a higher-level development tool that generates the SQL statements and issues commands on your behalf, it is especially important to keep an eye on what's moving across the network. You can use the SQL Server Profiler to watch all traffic into the server. If you want to watch both commands in and responses sent, you can start the server from the command line with trace flags 4032 and 4031, which respectively display the server's receive and send buffers (for example, sqlservr –c –T4031–T4032). Enabling these two flags dumps all conversation to and from the server to standard output for the sqlservr process, which is the monitor if you start it from a console window. (Note that the output can get huge.) Whether you use SQL Server Profiler or these trace flags, you can get all the exact commands and data being sent. Just to monitor network traffic, though, this might be overkill; a network sniffer such as Network Monitor (available with Microsoft Windows NT Server and Microsoft Systems Management Server) can work better if you want only to monitor network traffic. Network Monitor is easy to use, and even a networking neophyte can set it up quickly and with a few mouse clicks watch the traffic between machine pairs.

Concurrency and Consistency Tradeoffs

If you are designing a multiple-user application that will both query and modify the same data, you need a good understanding of concurrency and consistency. Concurrency is the ability to have many simultaneous users operating at once. The more users who can work well simultaneously, the higher your concurrency. Consistency is the level at which the data in multiple-user scenarios exhibits the same behavior that it would if only one user were operating at a time. Consistency is expressed in terms of isolation levels. At the highest isolation level, Serializable, the multiple-user system behaves identically to what would exist if users submitted their requests serially—that is, as if the system made every user queue up and run operations one at a time (serially). At the Serializable level you have a greater need to protect resources, which you do by locking and which reduces concurrency. (Locking is discussed in Chapter 13. You should also understand transactional concepts, which are presented in Chapter 10.)

In many typical environments, an ongoing struggle occurs between the OLTP demands on the database and the DSS demands. OLTP is characterized by relatively high volumes of transactions that modify data. The transactions tend to be relatively short and usually don't query large amounts of data. DSS is read-intensive, often with complex queries that can take a long time to complete and thus hold locks for a long time. The exclusive locks needed for data modification with OLTP applications block the shared locks used for DSS. And DSS tends to use many shared locks and often holds them for long periods, stalling the OLTP applications, which then must wait to acquire the exclusive locks required for updating. DSS also tends to benefit from many indexes and often from a denormalized database design that reduces the number of tables to be joined. Large numbers of indexes are a drag on OLTP because of the additional work to keep them updated. And denormalization means redundancy—a tax on the update procedures.

You need to understand the isolation level required by your application. Certainly, you do not want to request Serializable (or, equivalently, use the HOLDLOCK or SERIALIZABLE hint) if you need only Read Committed. And it might become clear that some queries in your application require only Read Uncommitted (dirty read), since they look for trends and don't need guaranteed precision. If that's the case, you potentially have some flexibility in terms of queries not requiring shared locks, which keeps them both from being blocked by processes modifying the database and from blocking those modifying processes. But even if you can live with a dirty-read level, you should use it only if necessary. This isolation level means that you might read data that logically never existed; this can create some weird conditions in the application. (For example, a row that's just been read will seem to vanish because it gets rolled back.) If your application can live with a dirty-read isolation level, it might be comforting to have that as a fallback. However, you should probably first try to make your application work with the standard isolation level of Read Committed and fall back to dirty read only if necessary.

Resolving Blocking Problems

Many applications suffer poor performance because processes are backed up waiting to acquire locks or because of deadlocks. A smooth, fast application should minimize the time spent waiting for locks, and it should avoid deadlocks. The most important step you can take is to first understand how locking works.

A process blocks when it stalls while waiting to acquire a lock that is incompatible with a lock held by some other process. This condition is often, but erroneously, referred to as a "deadlock." As long as the process being stalled is not, in turn, stalling the offending process—which results in a circular chain that will never work itself out without intervention—you have a blocking problem, not a deadlock. If the blocking process is simply holding on to locks or is itself blocked by some other process, this is not a deadlock either. The process requesting the lock must wait for the other process to release the incompatible lock; when it does, all will be fine. Of course, if the process holds that lock excessively, performance still grinds to a halt and you must deal with the blocking problem. Your process will suffer from bad performance and might also hold locks that stall other processes; every system on the network will appear to hang.

The following guidelines will help you avoid or resolve blocking problems:

Keep transactions as short as possible. Ideally, a BEGIN TRAN…COMMIT TRAN block will include only the actual DML statements that must be executed. To the extent possible, do conditional logic, variable assignment, and other "setup" work before the BEGIN TRAN. The shorter the transaction lasts, the shorter the time that locks will be held. Keep the entire transaction within one batch if possible.

Never add a pause within a transaction for user input. This rule is basically a part of the previous one, but it is especially important. Humans are slow and unreliable compared to computers. Do not add a pause in the middle of the transaction to ask a user for input or to confirm some action. The person might decide to get up to take a coffee break, stalling the transaction and making it hold locks that cause blocking problems for other processes. If some locks must be held until the user provides more information, you should set timers in your application so that even if the user decides to go to lunch, the transaction will be aborted and the locks will be released. Similarly, give your applications a way to cancel out of a query if such an action is necessary. Alternatively, you can use the LOCK_TIMEOUT session option to control when SQL Server automatically cancels out of a locking situation. We'll look at this option later in this chapter.

When you process a result set, process all rows as quickly as possible. Recall from Chapter 3 that an application that stops processing results can prevent the server from sending more results and stall the scanning process, which requires locks to be held much longer.

For browsing applications, consider using cursors with optimistic concurrency control. An address book application is a good example of a browsing application: users scroll around to look at data and occasionally update it. But the update activity is relatively infrequent compared to the time spent perusing the data. Using scrollable cursors with the OPTIMISTIC locking mode is a good solution for such applications. Instead of locking, the cursor's optimistic concurrency logic determines whether the row has changed from the copy that your cursor read. If the row has not changed, the update is made without holding locks during the lengthy period in which the user is perusing the data. If the row has changed, the UPDATE statement produces an error and the application can decide how to respond. Although you were strenuously cautioned in Chapter 11 about the misuse of cursors, they are ideally suited to browsing applications.

You can also easily implement your own optimistic locking mechanism without using cursors. Save the values of the data you selected and add a WHERE clause to your update that checks whether the values in the current data are the same as those you retrieved. Or, rather than use the values of the data, use a SQL Server timestamp column—an ever-increasing number that's updated whenever the row is touched, unrelated to the system time. If the values or timestamp are not identical, your update will not find any qualifying row and will not affect anything. You can also detect changes with @@ROWCOUNT and decide to simply abort, or more typically, you can indicate to the user that the values have changed and then ask whether the update should still be performed. But between the time the data was initially retrieved and the time the update request was issued, shared locks are not held, so the likelihood of blocking and deadlocks is significantly reduced.

Indexes and Blocking

We've already recommended that you choose your indexes wisely, strictly for performance reasons. However, concurrency concerns are also a reason to make sure you have good indexes on your tables. (Of course, better concurrency can also lead to better performance.) We saw in Chapter 13 that SQL Server acquires row (or key) locks whenever possible. However, this does not always mean that no other rows are affected if you are updating only one row in a table. Remember that to find the row to update, SQL Server must first do a search and acquire UPDATE locks on the resources it inspects. If SQL Server does not have a useful index to help find the desired row, it uses a table scan. This means every row in the table acquires an update lock, and the row actually being updated acquires an exclusive lock, which is not released until the end of the transaction. The following page shows a small example that nevertheless illustrates the crucial relationship between indexes and concurrency.

USE pubs
go
DROP TABLE t1
go
/* First create and populate a small table. */
CREATE TABLE t1 (a int)
go
INSERT INTO t1 VALUES (1)
INSERT INTO t1 VALUES (3)
INSERT INTO t1 VALUES (5)
go
BEGIN tran
UPDATE t1
SET a = 7 WHERE a = 1
EXEC sp_lock @@spid
/* The output here should show you one X lock, on a RID; that is the
row that has been updated. */
/* In another query window, run this batch before rollback is
issued: */
USE pubs
UPDATE t1
SET a = 10
WHERE a = 3
/* Execute the rollback in the first window. */
ROLLBACK TRAN

You should have noticed that the second connection was unable to proceed. To find the row where a = 3, it tries to scan the table, first acquiring update locks. However, it cannot obtain an update lock on the first row, which now has a value of 7 for a, because that row is exclusively locked. Since SQL Server has no way to know whether that is a row it is looking for without being able to even look at it, this second connection blocks. When the rollback occurs in the first connection, the locks are released and the second connection can finish.

Let's see what happens if we put an index on the table. We'll run the same script again, except we'll build a nonclustered index on column a.

USE pubs
go
DROP TABLE t1
go
/* First create and populate a small table. */
CREATE TABLE t1 ( a int)
Go
CREATE INDEX idx1 ON t1(a)
go
INSERT INTO t1 VALUES (1)
INSERT INTO t1 VALUES (3)
INSERT INTO t1 VALUES (5)
go
BEGIN tran
UPDATE t1
SET a = 7 WHERE a = 1
EXEC sp_lock @@spid
/* In this case, the output should show you three X locks (one again on
a RID) and two KEY locks. When the key column a is changed, the leaf
level of the nonclustered index is adjusted. Since the leaf level of
the index keeps all the keys in sorted order, the old key with the
value 1 is moved from the beginning of the leaf level to the end,
because now its value is 7. However, until the transaction is over, a
ghost entry is left in the original position and the key lock is
maintained. So there are two key locks: one for the old value and one
for the new. */
/* In another query window, run this batch before rollback is issued: */
USE pubs
UPDATE t1
SET a = 10
WHERE a = 3
/* Execute the rollback in the first window. */
ROLLBACK TRAN

This time the second query succeeded, even though the first query held X locks on the keys in the leaf level of the index. The second connection was able to generate the lock resource string for the keys it needed to lock, and then only request locks on those particular keys. Since the keys the second connection requested were not the same as the keys that the first connection locked, there was no conflict over locking resources and the second connection could proceed.

Resolving Deadlock Problems

Deadlocks befuddle many programmers and are the bane of many applications. A deadlock occurs when, without some intervening action, processes cannot get the locks they need no matter how long they wait. Simply waiting for locks is not a deadlock condition. SQL Server automatically detects the deadlock condition and terminates one of the processes to resolve the situation. The process gets the infamous error message 1205 indicating that it was selected as the "victim" and that its batch and the current transaction have been canceled. The other process can then get the locks it needs and proceed. The two general forms of deadlocks are cycle deadlocks and conversion deadlocks. (See Chapter 13 for more information about these two kinds of deadlocks.)

Cycle Deadlock Example

If you repeatedly run the following two transactions simultaneously from different OSQL.EXE sessions, you are nearly assured of encountering a "deadly embrace" (cycle deadlock) almost immediately from one of the two processes. It should be clear why: one of the processes gets an exclusive lock on a row in the authors table and needs an exclusive lock on a row in the employee table. The other process gets an exclusive lock on the row in employee, but it needs an exclusive lock for the same row in authors that the first process has locked.

If you simply rewrite one of the batches so that both batches first update authors and then update employee, the two connections will run forever without falling into a cycle deadlock. Or you can first update employee and then update authors from both connections. Which table you update first doesn't matter, but the updates must be consistent and you must follow a known protocol. If your application design specifies a standard protocol for accessing the tables consistently, one of the connections gets the exclusive page lock on the first table and the other process must wait for the lock to be released. Simply waiting momentarily for a lock is normal and usually fast, and it happens frequently without your realizing it. It is not a deadlock.

Conversion Deadlock Example

Now run the following transaction simultaneously from two different OSQL.EXE sessions. You're running the same script, so it's obvious that the two processes follow a consistent order for accessing tables. But this example quickly produces a deadlock. A delay has been added so that you encounter the race condition more quickly, but the condition is lurking there even without the WAITFOR DELAY—the delay just widens the window.

You can correct this example in a couple of ways. Does the isolation level need to be Repeatable Read? If Read Committed is sufficient, simply change the isolation level to get rid of the deadlock and to provide better concurrency. If you use Read Committed isolation, the shared locks can be released after the SELECT, and then one of the two processes can acquire the exclusive lock it needs. The other process waits for the exclusive lock and acquires it as soon as the first process finishes its update. All operations progress smoothly, and the queuing and waiting happen invisibly and so quickly that it is not noticeable to the processes.

But suppose that you need Repeatable Read (or Serializable) isolation. In this case, you should serialize access by using an update lock, which you request using the UPDLOCK hint. Recall from Chapter 13 that an update lock is compatible with a shared lock for the same resource. But two update locks for the same resource are not compatible, and update and exclusive locks are also not compatible. Acquiring an update lock does not prevent others from reading the same data, but it does ensure that you are first in line to upgrade your lock to an exclusive lock if you subsequently decide to modify the locked data. (The important issue here is that the exclusive lock can be acquired, so this technique works equally well even if that second statement is a DELETE and not an UPDATE, as in this example.)

By serializing access to the exclusive lock, you prevent the deadlock situation. The serialization also reduces concurrency, but that's the price you must pay to achieve the high level of transaction isolation. A modified example follows; multiple simultaneous instances of this example will not deadlock. Try running about 20 simultaneous instances and see for yourself. None of the instances deadlock and all complete, but they run serially. With the built-in 5-second sleep, it takes about 100 seconds for all 20 connections to complete because one connection completes about every 5 seconds. This illustrates the lower concurrency that results from the need for higher levels of consistency (in this case, Repeatable Read).

As a general strategy, add the UPDLOCK hint (or some other serialization) if you discover during testing that conversion deadlocks are occurring. Or add UPDLOCK from the outset because you detect from your CRUD analysis that deadlocks are likely, and then try backing it off during multiple-user testing to see if it is absolutely necessary. Either approach is viable. If you know that in most cases the read-for-update (discussed later) will follow with an actual update, you might opt for the UPDLOCK hint from the outset and see if you can back it off later. But if the transaction is short and you will later update the data in most cases, there isn't much point in backing off the update lock. The shared lock will need to upgrade to an exclusive lock, so getting the update lock in the first place makes good sense.

Preventing Deadlocks

Deadlocks can usually be avoided, although you might have to do some detailed analysis to solve the problems that cause them. Sometimes the cure is worse than the ailment, and you're better off handling deadlocks rather than totally preventing them, as we'll discuss in the next section. Preventing deadlocks (especially conversion deadlocks) requires a thorough understanding of lock compatibility. These are the main techniques you can use to prevent deadlocks:

To prevent cycle deadlocks, make all processes access resources in a consistent order.

Reduce the transaction isolation level if it's suitable for the application.

To prevent conversion deadlocks, explicitly serialize access to a resource.

Deadlock prevention is a good reason to use stored procedures. By encapsulating the data access logic in stored procedures, it's easier to impose consistent protocols for the order in which resources (for example, tables) are accessed, which can help you avoid cycle deadlocks. But in so doing, you do not reduce the likelihood of conversion deadlocks. As noted above, conversion deadlocks are best dealt with by serializing access to a resource or by lowering the transaction isolation level if appropriate. The most common scenario for conversion deadlocks is the read-for-update situation. If a resource will be read within a transaction requiring Repeatable Read or Serializable isolation and will be updated later, you should request an update lock on the read using the UPDLOCK hint. The update lock will let other users read the data but will prevent them from updating the data or doing a read-for-update. The net effect is that once the update lock is acquired, the process will be next in line for the exclusive lock needed to actually modify it.

Handling Deadlocks

The cost of serializing access is that other users wanting to read-for-update (or actually update or delete) must wait. If you are not experiencing deadlocks, serializing access might needlessly reduce concurrency. You might have a case in which a transaction often does a read-for-update but only infrequently does the update. In this case, deadlocks might be infrequent. The best course of action might be to not prevent deadlocks and simply deal with them when they occur.

Preventing deadlocks can significantly reduce concurrency because the read-for-update would be blocked. Instead, you can simply write your applications to handle deadlocking. Check for deadlock message 1205, and retry the transaction. With retry logic, you can live with moderate deadlock activity without adding a lot of serialization to the application. It's still a good idea to keep count of how often you experience deadlocks; if the incidence is high, the wasted effort and constant retrying are likely to be worse than the cost of preventing the deadlock in the first place. How you write the deadlock handler will depend on the language or tool you use to build your application. But an application that is prone to deadlocks should have a deadlock handler that retries. Such a handler must be written in the host language. There is no way to do a retry directly within a stored procedure or a batch, since deadlock error 1205 terminates the batch.

Volunteering to Be the Deadlock Victim

Recall from Chapter 13 that the LOCK_MONITOR process in SQL Server typically chooses as the deadlock victim the process that made the final lock request that closed the loop and created a circular chain. But a process can also offer to "sacrifice itself" as the victim for deadlock resolution. You can make this happen by using the SET DEADLOCK_PRIORITY LOW | NORMAL statement. If a process has a deadlock priority of LOW and the other participating process is NORMAL (the default), the LOW process is chosen as the victim even if it was not the process that closed the loop.

In the deadlock examples shown earlier, you saw that the default victim is the process that you started second, since it closes the loop. However, adding SET DEADLOCK_PRIORITY LOW to one of the connections (and not the other) indicates that it will be selected as the victim, even if it was started first. You might find this useful if, for example, you are doing reporting and OLTP on the same database and you occasionally have deadlocks, and you know that one process is more important than the other. You set the less important process to LOW. It might also be useful if one application was written with a good deadlock handler and the other was not. Until the application without the handler can be fixed, the "good" application can make the simple change of volunteering itself and then handle a deadlock with retry logic when the deadlock occurs later.

Watching Locking Activity

Locking problems often result from locks that you don't even realize are being taken. You might be updating only table A, but blocking issues arise on table B because of relationships that you don't realize exist. If, for example, a foreign key relationship exists between table A and table B and the update on A is causing some query activity to B, some shared locks must exist. Or a trigger or a nested call to another procedure might cause some locking that isn't obvious to you.

In cases like these, you must be able to watch locks to look for locking operations that you weren't aware of. The graphical lock display of SQL Server Enterprise Manager is the most convenient way to watch locking in many cases. However, SQL Server Enterprise Manager provides only a snapshot of the current state of locking; you can miss a lot of locking activity that occurs in the time it takes you to refresh the display.

Identifying the Culprit

As with most debugging situations, the hardest part of solving a problem with locking is understanding the problem. If you get complaints that "the system is hung," it's a good bet that you have a blocking problem. Most blocking problems happen because a single process holds locks for an extended period of time. A classic case (as we discussed earlier) is an interactive application that holds locks until the user at the other end takes an action, such as clicking a button to commit the transaction or scrolling to the end of the output, which causes all the results to be processed. If the user goes to lunch and doesn't take that action, everyone else might as well grab lunch too, because they're not going to get much work done. Locks pile up, and the system seems to hang. The locks in the system are reported in the pseudo–system table, syslockinfo in the master database. (We'll discuss a related pseudo–system table, sysprocesses, later in the chapter.)

Note: The pseudo–system tables are not maintained as on-disk structures. Locking and process data are by definition relevant only at runtime. So syslockinfo and sysprocesses are presented as system tables, and although they can be queried just like other system tables, they do not have on-disk storage as normal tables do.

A big challenge in trying to identify the cause of blocking is that when you experience such problems, you probably have hundreds or thousands of locks piled up. At these times, it's hard to see the trees through the forest. Usually, you just want to see which process is holding up everyone else. Once you identify it, of course, you need to immediately get it out of the way, either by forcing it to complete or, if necessary, by issuing the KILL command on the connection. The longer-term solution is to rework the application so that it will not hold locks indefinitely.

SQL Server Enterprise Manager, shown in Figure 14-3, provides a graphical way to watch locking activity; this is the best method most of the time. From the Management folder, choose Current Activity. You can look at locks either by process or by object. If you select the Locks / Process ID option, the graphic shows which processes are blocked and which processes are blocking. When you double-click on the process in the right pane, you see the last command the process issued. This is what Figure 14-3 shows. You can then send a message to the offending user to complete the transaction (although this is a short-term solution at best). Or you can kill the process from SQL Server Enterprise Manager—which isn't ideal, but sometimes it's the best short-term course of action. You can select a particular process from the left pane, under the Locks / Process ID option, and the right pane will show all the locks held by that process.

Alternatively, you can select the Locks/Objects option under Current Activity. The left pane displays a graphic of each locked object. If you then select one of these objects, the right pane shows which processes have that object locked. Again, when you double-click on the process in the right pane, you see the last command the process issued.

But sometimes even SQL Server Enterprise Manager can get blocked by locking activity in tempdb, so it's useful to know how to monitor-lock the old-fashioned way, by using system stored procedures or querying directly from the syslockinfo table. You start by running sp_who2 and sp_lock. Most people are familiar with sp_who but not sp_who2, which works in almost the same way but formats the output in a more readable way and contains more of the columns from the sysprocesses table. However, sysprocesses contains additional information that neither sp_who nor sp_who2 reports, so you might even want to write your own sp_who3! We'll look at the sysprocesses table in more detail later in this chapter.

The BlkBy column of the sp_who2 output shows the ID (spid) of a blocking process. The procedure sp_lock provides a formatted and sorted listing of syslockinfo that decodes the lock types into mnemonic forms (such as update_page instead of type 7). In Chapter 13, we looked at output from sp_lock to watch the various types and modes of locking. If your users say that the system is hung, try to log on and execute sp_who2 and sp_lock. If you can log on and execute these procedures, you immediately know that the system is not hung. If you see a nonzero spid in the BlkBy column of the sp_who2 output or if the Status value is WAIT in the sp_lock output, blocking is occurring. It's normal for some blocking to occur—it simply indicates that one process is waiting for a resource held by another. If such contention didn't exist, you wouldn't even need to lock. But if you reissue the query for lock activity a moment later, you expect to find that same blockage cleared up. In a smoothly running system, the duration that locks are held is short, so long pileups of locks don't occur.

When a major lock pileup occurs, you can get a lengthy chain of processes blocking other processes. It can get pretty cumbersome to try to track this manually. You can create a procedure like the following to look for the process at the head of a blocking chain:

Once you identify the connection (spid) causing the problem, check to see the specific locks that it is holding. You can query from syslockinfo for this, but simply running sp_lockspid will probably give you exactly what you need. There might be multiple separate lock chains, in which case the batch will return more than one spid. You can follow the same procedure for each one.

The syslockinfo Table

If a lot of blocking is going on, you might want to take a look at all locks held that are blocking other users. You can use the procedure sp_blockinglocks (which is not a part of the installed SQL Server product but is included on the companion CD) to print out a list of all locks held that are blocking other processes. The procedure examines the syslockinfo table for resources that have more than one lock listed and are held by different processes with different status values. For example, if process 12 has an X lock on the sales table with a status of GRANT and process 13 is trying to read from that table, the output of sp_blockinglocks looks like this:

It can also be useful to see the last command issued by a blocking process; you can do this by double-clicking on a process ID in the Current Activity panes of SQL Server Enterprise Manager. Use DBCC INPUTBUFFER (spid) for this—which is the same thing SQL Server Enterprise Manager does when you double-click on a process. DBCC INPUTBUFFER reads the memory from within the server that was used for the last command. So DBCC INPUTBUFFER can access whatever command is still in the buffer for a given connection, even if it has already been executed.

Usually, by this point you have enough of a handle on the problem to turn your attention to the application, which is where the resolution will ultimately lie. But if you need or want to go a bit further, you can find out the depth at which the connection is nested within a transaction. Blocking problems often happen because the connection doesn't realize the nested depth and hasn't applied the correct pairing of COMMIT commands. (Chapter 10 discusses the scoping and nesting levels of transactions.) A connection can determine its own nesting depth by querying @@TRANCOUNT. Starting with SQL Server 7, you can also determine the nesting level of any current connection, not just your own. This involves inspecting the sysprocesses table directly because the information is not included in either sp_who or sp_who2.

The sysprocesses Table

As we mentioned, some columns in the sysprocesses table don't show up in the output of either sp_who or sp_who2. Table 14-1 depicts these columns.

Column Name

Data Type

Description

waittype

binary(2)

The waittypes of 1 through 15 correspond to the mode of lock being waited on. (See list on the following page.) Hex values 0x40 through 0x4f correspond to miscellaneous waits. 0x81 is a wait to write to the log. 0x400 through 0x40f are latch waits. 0x800 is a wait on a network write.

waittime

int

Indicates current wait time in milliseconds. Value is 0 when the process is not waiting.

lastwaittype

nchar(32)

A string indicating the name of the last or current wait type.

waitresource

nchar(32)

A textual representation of a lock resource.

memusage

int

Indicates number of pages in the memory cache that are currently allocated to this process. A negative number means that the process is freeing memory allocated by another process.

login_time

datetime

Indicates time when a client process logged on to the server. For system processes, indicates time when SQL Server startup occurred.

Columns in the sysprocesses table that aren't available via sp_who or sp_who2.

Column Name

Data Type

Description

ecid

smallint

An execution context ID that uniquely identifies the subthreads operating on behalf of a single process.

open_tran

smallint

Indicates current transaction nesting depth for the process.

sid

binary(85)

A globally unique identifier (GUID) for the user.

hostprocess

nchar(8)

The workstation process ID number.

nt_domain

nchar(128)

The Windows NT domain for the client.

nt_username

nchar(128)

The Windows NT username for the process.

net_address

nchar(12)

An assigned unique identifier for the network interface card on each user's workstation.

net_library

nchar(12)

The name of the client's network library DLL.

The procedure sp_who2 translates the dbid into a database name and translates the numeric value for its status into a string. The most common status values you'll see are BACKGROUND, ROLLBACK, DORMANT (used when a connection is being retained while waiting for Remote Procedure Call [RPC] processing), SLEEPING, and RUNNABLE.

The values that the sysprocesses table holds in the waittype column are the same ones you'll see in the req_mode column of syslockinfo. They can be translated as follows:

Trace flags are useful for analyzing deadlock situations. When a process is part of a deadlock, the victim process realizes that the deadlock occurred when it gets error message 1205. Any other process participating in the deadlock is unaware of the situation. To resolve the deadlock, you probably want to see both processes; trace flag 1204 provides this information.

The following is a fragment of the output from SQLSERVR.EXE, which was started from the command line using –T1204. This example uses the conversion deadlock script that we used previously—issuing the same batch from two connections—to illustrate the output of trace flag 1204. To capture deadlock information, SQL Server must be started from a command prompt. You can specify a special location for the error log, which will contain all the same information that is displayed on the command screen while SQL Server is running.

This output shows the spid for both processes affected, shows a fragment of their input buffer (but not the entire command), and notes that neither process can upgrade its locks because of the circular relationship. Process 8 requests an update lock on a key that process 7 already has an update lock on. At the same time, process 7 requests an exclusive lock on a key that process 8 already has a shared lock on. Thus the two processes are in a circular relationship. The output can help you solve the problem—you know the processes involved and the locks that could not be acquired, and you have an idea of the commands being executed.

You might also notice, if you scrutinize the trace flag output in the error log carefully, that trace flag 1206 is turned on. Used in conjunction with trace flag 1204, this flag produces some of the actual object name information that you see in the output above. SQL Server automatically enables this flag when 1204 is turned on.

Note: The number assigned to trace flag 1204 is intended to be easy to remember. Recall that error 1205 is the well-known error message an application receives when it is chosen as the deadlock victim.

Segregating OLTP and DSS Applications

Sometimes it makes sense to split up your OLTP and DSS applications. This can be an excellent strategy if your DSS applications don't need immediate access to information. This is a reason for the recent popularity of data warehousing, data marts, and other data management systems (although these concepts have been around for years).

You can use a separate database (on the same server or on different servers) for DSS and OLTP, and the DSS database can be much more heavily indexed than the OLTP database. The DSS database will not have exclusive locks holding up queries, and the OLTP database's transactions will not get held up by the shared locks of the DSS. SQL Server's built-in replication capabilities make it relatively easy to publish data from the OLTP server and subscribe to it from your reporting server. SQL Server's replication capabilities can propagate data in near real time, with latency between the servers of just a few seconds. However, for maintaining a DSS server, it is best to propagate the changes during off-peak hours. Otherwise, the locks acquired during propagation at the subscribing site will affect the DSS users, just as any other update activity would. In addition, if for most of the day the DSS server is only executing SELECT queries, you can enable the database option read only. This means that SQL Server will not acquire or check locks for any queries because they will all be shared and therefore compatible. Completely bypassing lock maintenance can lead to noticeable performance improvements for the SELECT queries.

Optimizing Queries

Monitoring and tuning of queries are essential to optimizing performance. Knowing how the query optimizer works can be helpful as you think about how to write a good query or what indexes to define. However, you should guard against outsmarting yourself and trying to predict what the optimizer will do. You might miss some good options this way. Try to write your queries in the most intuitive way you can and then try to tune them only if their performance doesn't seem good enough.

The big gains in query performance usually do not come from syntactic changes in the query but rather from a change in the database design or in indexing—or from taking a completely different approach to the query. For example, you might have to choose among the following approaches: writing a pretty complex query using a self-join or multilevel correlated subquery, using a cursor, or creating a solution that involves temporary tables. (We saw some of these techniques in Chapter 12.) Invariably, the only way you can determine which solution is best is to try all the queries. You might be surprised by the results.

Rather than trying to learn a bunch of tricks to do up front, you should be much more interested in doing the basics right. That is, you need to be sure that you've set up a good database structure—including perhaps some denormalization based on your CRUD analysis—and that you've created in advance what appear to be useful indexes. From there, you can test your queries and study the query plans generated for any queries that seem problematic. (This is why we've looked at general database design strategies and indexing before discussing how the query optimizer works.)

Nevertheless, insight into how the optimizer works is certainly useful. It can help you understand the guidelines on which indexes to create, as well as the query plans you will examine in the SQL Server Query Analyzer. For each table involved in the query, the query optimizer evaluates the search arguments and considers which indexes are available to narrow the scan of a table. That is, the optimizer evaluates to what extent the index can exclude rows from consideration. The more rows that can be excluded, the better, since that leaves fewer rows to process.

Joins can be processed using one of three methods: nested iteration, hashing, or merging. For any of these methods, the optimizer decides on the order in which the tables should be accessed. Because a nested iteration is a loop, order is important. The fewer the iterations through the loops, the less processing will be required. So it is useful to start with the table (or tables) that can exclude the most rows as the outer loops. The general strategy is to make the outer table limit the search the most, which results in the fewest total iterations (scans). With hash or merge joins, SQL Server builds an in-memory structure from one of the tables. To conserve memory resources, it tries to determine which is the smaller table or which will have the smallest number of qualifying rows after the WHERE conditions are applied. This table is typically chosen as the first table to be processed. (We'll see more on each of these join strategies later in the chapter.)

For each combination of join strategy, join order, and indexes, the query optimizer estimates a cost, taking into account the number of logical reads and the memory resources that are required and available. Then the optimizer compares the cost of each plan and chooses the plan with the lowest estimate. You can think of query optimization as happening in three main phases: query analysis, index selection, and join selection (although there is a lot of overlap between the index selection and join selection phases). The following sections discuss each phase.

Query Analysis

During the first phase of query optimization, query analysis, the query optimizer looks at each clause of the query and determines whether it can be useful in limiting how much data must be scanned—that is, whether the clause is useful as a search argument (SARG) or as part of the join criteria. A clause that can be used as a search argument is referred to as sargable, or optimizable, and can make use of an index for faster retrieval.

A SARG limits a search because it specifies an exact match, a range of values, or a conjunction of two or more items joined by AND. A SARG contains a constant expression (or a variable that is resolved to a constant) that acts on a column by using an operator. It has the form

column inclusive_operator <constant or variable>

or

<constant or variable> inclusive_operator column

The column name can appear on one side of the operator, and the constant or variable can appear on the other side. If a column appears on both sides of the operator, the clause is not sargable. Sargable operators include =, >, <, =>, <=, BETWEEN, and sometimes LIKE. LIKE is sargable depending on the type of wildcards (regular expression) used. For example, LIKE 'Jon%' is sargable but LIKE 'Jon%' is not because the wildcard (%) at the beginning prevents the use of an index. Here are some SARG examples:

A single SARG can include many conditions if they are AND'ed together. That is, one index might be able to operate on all the conditions that are AND'ed together. In the example above, there might be an index on (name,salary), so the entire clause name = 'jones' AND salary > 100000 can be considered one SARG and can be evaluated for qualifying rows using one index. If OR is used instead of AND in this example, a single index scan cannot be used to qualify both terms. The reason should be clear—if the lead field in the index key is name, the index is useful for finding just the 'jones' entries. If the criteria is AND salary, the second field of the index also qualifies those rows. But if the criteria is OR salary, the index is not useful because all the rows would need to be examined, not just the 'jones' entries.

The phone book analogy also applies. If you want to find people with the name "jones" AND that live on 5th Avenue, using the phone book can help greatly reduce the size of your search. But if you want to find people with the name "Jones" OR that live on 5th Avenue, you have to scan every entry in the book. (This assumes that you have only one phone book that is sorted alphabetically by name. If you have two phone books, and one is sorted by name and one is sorted by street, that's another story, which we'll discuss a bit later.)

An expression that is not sargable cannot limit the search. (That is, every row must be evaluated.) So an index is not useful to nonsargable expressions. Typical nonsargable expressions include negation operators such as NOT, !=, <>, !>, !<, NOT EXISTS, NOT IN, and NOT LIKE. Don't extrapolate too far and think that this means using a nonsargable clause always results in a table scan. An index is not useful to the nonsargable clause, but there might be indexes useful to other SARGs in the query. Queries often have multiple clauses, so a great index for one or more of the other clauses might be available. Here are some examples of nonsargable clauses:

Note: The last example above is not a single search argument, but each expression on either side of the OR is individually sargable. So a single index won't be used to evaluate both expressions, as it might if the operator is AND. A separate index can still be useful to each expression.

Unlike previous versions of SQL Server, expressions involving computations are not always nonsargable. SQL Server 7 can do scalar simplification in some cases. The following table shows examples of expressions that the SQL Server optimizer will simplify, along with the resultant simplified expression that is used to determine the usefulness of an index:

Original Expression

Simplified Expression

WHERE price * 12 = sales/costs

WHERE price = sales/costs/12

WHERE salary * 1 > 40000

WHERE salary > 40000

The simplified expression is not guaranteed to be exactly equivalent to the original, particularly if you have a mixture of datatypes and implicit conversions are occurring. However, the simplified expression is used only during optimization, to detect and measure the usefulness of a particular index. When determining whether a particular row actually qualifies for the result set, SQL Server always uses the original expression.

Expressions that apply a function to a column are not sargable, and in SQL Server 7, the optimizer does not attempt to internally convert them to something that is sargable. In some cases, however, you might be able to write an equivalent expression. For example, suppose you have an index on the lname column of the newemployee table. The following two queries return the same results, but the first one does not use an index and the second one does:

SELECT * FROM newemployee WHERE substring(lname,1,1) = 'K'
SELECT * FROM newemployee WHERE lname LIKE 'K%'

Future versions of SQL Server will address more issues of nonsargable clauses.

Index Selection

During the second phase of query optimization, index selection, the query optimizer determines whether an index exists for a sargable clause, assesses the index's usefulness by determining the selectivity of the clause (that is, how many rows will be returned), and estimates the cost to find the qualifying rows. An index is potentially useful if its first column is used in the search argument and the search argument establishes a lower bound, upper bound, or both to limit the search. In addition, if an index contains every column referenced in a query, even if none of those columns is the first column of the index, the index is considered useful.

An index that contains all the referenced columns is called a covering index and is one of the fastest ways to access data. The data pages do not have to be accessed at all, which can mean a substantial savings in physical I/O. For example, suppose we have an index on last name, first name, and date of hire. The query at the top of the following page is covered by this index.

SELECT lname, hire_date
FROM employee
WHERE fname = 'Sven'

You might have more covering indexes than you are aware of. In SQL Server 7, a nonclustered index contains the clustered index key as part of the row locator. So if the employee table has a clustered index on the employee ID (emp_id) column, the nonclustered index on lname contains both the last name and the ID for every data row, stored in the leaf level of the index. Thus, the following is also a covered query:

SELECT emp_id, fname, lname
FROM employee
WHERE lname LIKE 'B%'

Index Statistics

After the query optimizer finds a potentially useful index that matches the clause, it evaluates the index based on the selectivity of the clause. The optimizer checks the index's statistics—the histogram of values accessible through the index's row in the sysindexes table. The histogram is created when the index is created on existing data, and the values are refreshed each time UPDATE STATISTICS runs. If the index is created before data exists in the table, no statistics appear. The statistics will be misleading if they were generated when the dispersion of data values was significantly different from what appears in the current data in the table. However, SQL Server detects if statistics are not up-to-date, and by default it automatically updates them during query optimization.

Statistics are a histogram consisting of an even sampling of values for the index key (or the first column of the key for a composite index) based on the current data. The histogram is stored in the statblob field of the sysindexes table, which is of type image. (As you know, image data is actually stored in structures separate from the data row itself. The data row merely contains a pointer to the image data. For simplicity's sake, we'll talk about the index statistics as being stored in the image field called statblob.) To fully estimate the usefulness of an index, the optimizer also needs to know the number of pages in the table or index; this information is stored in the dpages column of sysindexes.

The statistics information also includes details about the uniqueness of the data values encountered, referred to as the density, which provides a measure of how selective the index is. Recall that the more selective an index is, the more useful it is, because higher selectivity means that more rows can be eliminated from consideration. A unique index, of course, is the most selective—by definition, each index entry can point to only one row. A unique index has a density value of 1/number of rows in the table.

Density values range from 0 through 1. Highly selective indexes have density values of 0.10 or lower. For example, a unique index on a table with 8345 rows has a density of 0.00012 (1/8345). If there is a nonunique nonclustered index with a density of 0.2165 on the same table, each index key can be expected to point to about 1807 rows (0.2165 * 8345). This is probably not selective enough to be more efficient than just scanning the table, so this index is probably not useful. Because driving the query from a nonclustered index means that the pages must be retrieved in index order, an estimated 1807 data page accesses (or logical reads) are needed if there is no clustered index on the table and the leaf level of the index contains the actual RID of the desired data row. The only time a data page doesn't need to be reaccessed is for the occasional coincidence that can occur when two adjacent index entries happen to point to the same data page.

Assume in this example that about 40 rows fit on a page, so there are about 209 total pages. The chance of two adjacent index entries pointing to the same page is only about 0.5 percent (1/209). The number of logical reads for just the data pages, not even counting the index I/O, is likely to be close to 1807. If there is also a clustered index on the table, a couple of additional page accesses will be needed for each nonclustered key accessed. In contrast, the entire table can be scanned with just 209 logical reads—the number of pages in the table. In this case, the optimizer looks for better indexes or decides that a table scan is the best it can do.

The statistics histogram records steps (samples) for only the lead column of the index. This optimization takes into account the fact that an index is useful only if its lead column is specified in a WHERE clause. The density information stored in the statblob field consists of two types of information, called density values and all_density values. The values are computed based on the number of rows in the table, the cardinality of the indexed columns (the number of distinct values), the number of nonfrequent (NF) values (values that appear no more than once in the histogram step boundaries), and the cardinality of NF values. This will become clearer when we see some actual statistics information.

SQL Server defines density and all_density as follows:

density = (NFcount/distinctNFcount)/(numberofrows)

all_density = 1/cardinality of index keys

Statistics for a single column consist of one histogram, one all_density value, and one density value. The multicolumn statistics for one set of columns in a composite index consist of one histogram for the first column in the index, one density value for the first column, and all_density values for each prefix combination of columns (including the first column alone).The fact that density information is kept for all columns helps you decide how useful the index is for joins.

Suppose, for example, that an index is composed of three key fields. The density on the first column might be 0.50, which is not too useful. But as you look at more columns in the key, the number of rows pointed to is fewer (or in the worst case, the same) as the first column, so the density value goes down. If you're looking at both the first and second columns, the density might be 0.25, which is somewhat better. And if you examine three columns, the density might be 0.03, which is highly selective. (It doesn't make sense to refer to the density of only the second column. The lead column density is always needed.)

The histogram contains up to 300 values of a given key column. In addition to the histogram, the statblob field contains the following information:

The time of the last statistics collection

The number of rows used to produce the histogram and density information

The average row length

Densities for other combinations of columns

The following example uses the authors table in the pubs database. Assume that we have built a nonclustered composite index (idx3) on state and au_lname in addition to an existing clustered index on au_id. We can use DBCC SHOW_STATISTICS to display the statistics information for this index:

This output indicates that the statistics for this index were last updated on January 20, 1999. It also indicates that the table currently has 23 rows. There are eight distinct values for state, but two of them occur in multiple steps, so only six of them are nonfrequent. In this data set, all the NF key values actually occur only once in the data, so the NF count is the same as the unique NF count, and density is computed as 6/6/23, or 4.3478262E-2. The all_density value for the state column is 1/8, or 0.125. For the combination of state and au_lname, there is only one duplicate, so there are 22 unique values, and 1/22 is 4.5454547E--2. Notice that there is also an all_density value for the combination of state, au_lname, and au_id. Since this table has a clustered index on au_id, that key appears along with every nonclustered index key and can be used in determining density information. Since the addition of au_id makes the three-valued key unique, all_density is 1/23, or 4.3478262E--2. (In this case, this is the same as the density for first column, but this won't always be true.)

In addition to statistics on indexes, SQL Server 7 can also keep track of statistics on columns with no indexes. Knowing the density, or the likelihood of a particular value occurring, can help the optimizer determine an optimum processing strategy, even if SQL Server can't use an index to actually locate the values. We'll discuss column statistics in more detail when we look at statistics management later in this section.

Query optimization is probability-based, which means that its conclusions can be wrong. It can sometimes make decisions that are not optimal, even if those decisions make sense from a probability standpoint. (As an analogy, you can think of national election polls, which show how accurate relatively small samples can be for predicting broader results. But the famous headline "Dewey Defeats Truman!" proves that predictions based on sample populations can also be wrong.)

Note: The information about the meaning of the SHOW_STATISTICS values, as well as information about statistics on columns, was adapted from a preliminary copy of a whitepaper by Lubor Kollar. We are indebted to him for his assistance.

Index Cost

The second part of determining the selectivity of a clause is calculating the estimated cost of the access methods that can be used. Even if a useful index is present, it might not be used if the optimizer determines that it is not the cheapest access method. One of the main components of the cost calculation, especially for queries involving only a single table, is the amount of logical I/O, which refers to the number of page accesses that are needed. Logical I/O is counted whether the pages are already in the memory cache or must be read from disk and brought into the cache. The term is sometimes used to refer to I/O from cache, as if a read were either logical or physical, but this is a misnomer. As discussed in Chapter 3, pages are always retrieved from the cache via a request to the Buffer Manager, so all reads are logical. If the pages are not already in the cache, they must be brought in first by the Buffer Manager. In those cases, the read is also physical. Only the Buffer Manager, which serves up the pages, knows whether a page is already in cache or must be accessed from disk and brought into cache. The ratio of how much I/O is already in the cache and does not require a physical read is referred to as the cache-hit ratio, an important metric to watch.

The optimizer evaluates indexes to estimate the number of likely "hits" based on the density and step values in the statistics. Based on these values, the optimizer estimates how many rows qualify for the given SARG and how many logical reads would retrieve those qualifying rows. It might find multiple indexes that can find qualifying rows, or it might determine that just scanning the table and checking all the rows is best. It chooses the access method that it predicts will require the fewest logical reads.

Whether the access method uses a clustered index, one or more nonclustered indexes, a table scan, or another option determines the estimated number of logical reads. This number can be very different for each method. Using an index to find qualifying rows is frequently referred to as an index driving the scan. When a nonclustered index drives the scan, the query optimizer assumes that the page containing each identified qualifying row is probably not the same page accessed the last time. Because the pages must be retrieved according to the order of the index entries, retrieving the next row with a query driven by a nonclustered index will likely require that a different page than the one that contained the previous row be fetched because the two rows probably don't reside on the same page. They might reside on the same page by coincidence, but the optimizer correctly assumes that typically this will not be the case. (With 5000 pages, the chance of this occurring is 1/5000.) If the index is not clustered, data order is random with respect to the index key.

If the scan is driven from a clustered index, the next row is probably located on the same page as the previous row, since the index leaf is, in fact, the data. The only time this is not the case when you use a clustered index is when the last row on a page has been retrieved; the next page must be accessed, and it will contain the next bunch of rows. But moving back and forth between the index and the data pages is not required.

There is, of course, a chance that a specific page might already be in cache when you use a nonclustered index. But a logical read will still occur, even if the scan does not require physical I/O (a read from disk). Physical I/O is an order of magnitude more costly than I/O from cache. But don't assume that only the cost of physical I/O matters—reads from the cache are still far from free. In some of the advanced examples in Chapter 12, different solutions for the small pubs database each use about the same number of physical reads because the tables are small enough to all be cached. But the solutions have widely different needs for logical I/O, and the performance differences are quite dramatic.

To minimize logical I/O, the query optimizer tries to produce a plan that will result in the fewest number of page operations. In doing so, the optimizer will probably minimize physical I/O as well. For example, suppose we need to resolve a query with the following SARG:

WHERE Emp_Name BETWEEN 'Smith' AND 'Snow'

The relevant index entries for a nonclustered index on Emp_Name appear at the top of the following page.

If this table is a heap, when we use this nonclustered index, six logical reads are required to retrieve the data pages in addition to the I/O necessary to read the index. If none of the pages is already cached, the data access results in three physical I/O reads, assuming that the pages remain in the cache long enough to be reused for the subsequent rows (which is a good bet).

Suppose that all the index entries are on one leaf page and the index has one intermediate level. In this case, three I/O reads (logical and probably physical) are necessary to read the index (one root level, one intermediate level, and one leaf level). So chances are that driving this query via the nonclustered index requires about nine logical I/O reads, six of which are probably physical if the cache started out empty. The data pages are retrieved in the following order:

For each of the six nonclustered keys, we have to traverse the clustered index. You typically see about three to four times as many logical reads for traversing a nonclustered index for a table that also has a clustered index. However, if the indexes for this table are re-created so that a clustered index existed on the Emp_Name column, all six of the qualifying rows are probably located on the same page. The number of logical reads is probably only three (the index root, the intermediate index page, and the leaf index page, which is the data page), plus the final read that will retrieve all the qualifying rows. This scenario should make it clear to you why a clustered index can be so important to a range query.

With no clustered index, you have a choice between doing a table scan and using one or more nonclustered indexes. The number of logical I/O operations required for a scan of the table is equal to the number of pages in the table. A table scan starts at the first page and uses the Index Allocation Map (IAM) to access all the pages in the table. As the pages are read, the rows are evaluated to see whether they qualify based on the search criteria. Clearly, if the table from the previous example had fewer than nine total data pages, fewer logical reads would be necessary to scan the whole thing than to drive the scan off the nonclustered index (which was estimated to take nine logical reads).

The estimated logical reads for scanning qualifying rows is summarized in Table 14-2. The access method with the least estimated cost is chosen based on this information.

Table 14-2. The cost of data access for tables with different index structures.

Access Method

Estimated Cost (in Logical Reads)

Table scan

The total number of data pages in the table.

Clustered index

The number of levels in the index plus the number of data pages to scan. (Data pages to be scanned = number of qualifying rows / rows per data page.)

Nonclustered index on a heap

The number of levels in the index plus the number of leaf pages plus the number of qualifying rows (a logical read for the data page of each qualifying row).The same data pages are often retrieved (from cache) many times, so the number of logical reads can be much higher than the number of pages in the table.

Nonclustered index on a table with a clustered index

The number of levels in the index plus the number of leaf pages plus the number of qualifying rows times the cost of searching for a clustered index key.

Covering nonclustered index

The number of levels in the index plus the number of leaf index pages (qualifying rows / rows per leaf page). The data page need not be accessed because all necessary information is in the index key.

Using Multiple Indexes

In SQL Server 7, the optimizer can decide to use two or more nonclustered indexes to satisfy a single query. When more than one index is considered useful because of two or more SARGs, the cost estimate changes a bit from the formulas given in Table 14-1. In addition to the I/O cost of finding the qualifying rows in each index, there is an additional cost of finding the intersection of the indexes so that SQL Server can determine which data rows satisfy all search conditions. Since each nonclustered index key contains a locator for the actual data, the results of the two index searches can be treated as worktables and joined together on the locator information.

Suppose we have the following query and we have indexes on both the lastname and firstname columns. In this case, we have a clustered index on ID number, which is a unique value.

SELECT firstname, lastname, ID_num FROM employees
WHERE lastname BETWEEN 'Smith' AND 'Snow'
AND firstname BETWEEN 'Daniel' and 'David'

If the query optimizer decides to use both indexes, it builds two worktables with the results of each index search:

These worktables are joined on the row locator information, which is unique for each row. (Even if you don't declare a clustered index as unique, SQL Server adds a special uniqueifier where needed so there is always a unique row locator in every nonclustered index.) From the information above, we can see that only two rows have both their first name and last name values in the requested range. The result set is:

In this example, the data in the two indexes is enough to satisfy the query. If we also want to see information such as address and phone number, the query optimizer uses the locator to find the actual data row by traversing the clustered index. It can also use multiple indexes if a single table has multiple SARGs that are OR'ed together. In this case, SQL Server finds the UNION of the rows returned by each index to determine all the qualifying rows. If we take the same query and data and write an OR query instead of an AND, we can demonstrate the main differences:

SELECT firstname, lastname, ID_num FROM employees
WHERE lastname BETWEEN 'Smith' AND 'Snow'
OR firstname BETWEEN 'Daniel' and 'David'

SQL Server can use the two indexes in the same way and build the same two worktables. However, it must UNION the two worktables together and remove duplicates. Once it removes the duplicates, it uses the locators to find the rows in the table. All the result rows will have either a first name or a last name in the requested range, but not necessarily both. For example, we can return a row for Daniel Delaney or Bridgette Smith. If we don't remove duplicates, we end up with Daniel Smythe and Danielle Smith showing up twice because they meet both conditions of the SARGs.

Unusable Statistics

In two cases, the query optimizer can't use the statistics to estimate how many rows will satisfy a given SARG. First, if there are no statistics available, SQL Server obviously can't come up with any cost estimate. However, this situation is rare in SQL Server 7 because by default every database has the two options auto create statistics and auto update statistics set to TRUE. As we'll see later, you can turn this behavior off at the table level or the database level, but it is usually not recommended. The second situation in which there are no usable statistics is if the values in a SARG are variables. Consider this example:

When the preceding query is optimized, the SET statement has not been executed and the value of @name is unknown. No specific value can be used to compare the steps in the index's statistics information, so the query optimizer must guess.

Note: Don't confuse variables with parameters, even though their syntax is nearly identical. A variable's value is never known until the statement is actually executed; it is never known at compile time. Stored procedure parameters are known when the procedure is compiled because compilation and optimization don't even take place until the procedure is actually called with specific values for the parameters.

If the query optimizer can't use the statistics for either of the reasons mentioned, the server uses fixed percentages, shown below, depending on the operator. These fixed percentages can be grossly inaccurate for your specific data, however, so make sure that your statistics are up-to-date.

Operator

Percentage of Rows

=

10

>

30

<

30

BETWEEN

10

When the SARG involves an equality, the situation is usually handled in a special way. The 10 percent estimate shown above is actually used only in the unusual case in which there are no statistics at all. If there are statistics, they can be used. Even if we don't have a particular value to compare against the steps in the statistics histogram, the density information can be used. That information basically tells the query optimizer the expected number of duplicates, so when the SARG involves looking for one specific value, it assumes that the value occurs the average number of times. An even more special case occurs when the query optimizer recognizes an equality in the WHERE clause and the index is unique. Because this combination yields an exact match and always returns at most one row, the query optimizer doesn't have to use statistics. For queries of one row that use an exact match such as a lookup by primary key, a unique nonclustered index is highly efficient. In fact, many environments probably shouldn't "waste" their clustered index on the primary key. If access via the complete primary key is common, it might be beneficial to specify NONCLUSTERED when you declare the primary key and save the clustered index for another type of access (such as a range query) that can benefit more from the clustered index.

Join Selection

Join selection is the third major step in query optimization. If the query is a multiple-table query or a self-join, the query optimizer evaluates join selection and selects the join strategy with the lowest cost. It determines the cost using a number of factors, including the expected number of reads and the amount of memory required. It can use three basic strategies for processing joins: nested loop joins, merge joins, and hash joins. In each method, the tables to be joined (or subsets of tables that have already been restricted by applying SARGs) are called the join inputs.

Nested Loop Joins

Joins are usually processed as nested iterations—a set of loops that take a row from the first table and use that row to scan the inner table, and so on, until the result that matches is used to scan the last table. The number of iterations through any of the loops equals the number of scans that must be done. (This is not a table scan, since it is usually done using an index. In fact, if no useful index is available to be used for an inner table, nested iteration probably isn't used and a hash join strategy is used instead.) The result set is narrowed down as it progresses from table to table within each iteration in the loop. If you've done programming with an ISAM-type product, this should look familiar in terms of opening one "file" and then seeking into another in a loop. To process this join

WHERE dept.deptno=empl.deptno

you use pseudocode that looks something like this:

DO (until no more dept rows);
GET NEXT dept row;
{
begin
// Scan empl, hopefully using an index on empl.deptno
GET NEXT empl row for given dept
end
}

Merge Joins

Merge joins were introduced in SQL Server 7. You can use a merge join when the two join inputs are both sorted on the join column. Consider the query we looked at in the previous section:

WHERE dept.deptno=empl.deptno

If both the department table and the employee table have clustered indexes on the deptno column, the query is a prime candidate for a merge join. You can think of merge joins as taking two sorted lists of values and merging them. Suppose you have two piles of contractor information. One pile contains the master contract that each contractor has signed, and the second contains descriptions of each project that the contractor has worked on. Both piles are sorted by contractor name. You need to merge the two piles of paperwork so each contractor's master contract is placed with all the project descriptions for that contractor. You basically need to make just one pass through each stack.

Going back to the employee and department query, our pseudocode would look something like this:

GET one dept row and one empl row
DO (until one input is empty);
IF dept_no values are equal
Return values from both rows
ELSE IF empl.dept_no < dept.dept_no
GET new employee row
ELSE GET new department row
GET NEXT dept row;

The query optimizer usually chooses the merge join strategy when both inputs are already sorted on the join column. In some cases, even if a sort must be done on one of the inputs, the query optimizer might decide that it is faster to sort one input and then do a merge join than to use any of the other available strategies. If the inputs are both already sorted, little I/O is required to process a merge join if the join is one to many. A many-to-many merge join uses a temporary table to store rows instead of discarding them. If there are duplicate values from each input, one of the inputs must rewind to the start of the duplicates as each duplicate from the other input is processed.

Hash Joins

Hash joins, also introduced in SQL Server 7, can be used when no useful index exists on the join column in either input. Hashing is commonly used in searching algorithms (as discussed in detail in Donald Knuth's classic The Art of Computer Programming: Sorting and Searching). We'll look at a simplified definition here. Hashing lets you determine whether a particular data item matches an already existing value by dividing the existing data into groups based on some property. You put all the data with the same value in what we can call a hash bucket. Then, to see if a new value has a match in existing data, you simply look in the bucket for the right value.

For example, suppose you need to keep all your wrenches organized in your workshop so that you can find the right size and type when you need it. You can organize them by size property and put all half-inch wrenches (of any type) together, all 10-mm wrenches together, and so on. When you need to find a 15-mm box wrench, you simply go to the 15-mm drawer to find one (or see whether the last person who borrowed it has returned it). You don't have to look through all the wrenches, only the 15-mm ones, which greatly reduces your searching time. Alternatively, you can organize your wrenches by type and have all the box wrenches in one drawer, all the socket wrenches in another drawer, and so on.

Hashing data in a table is not quite so straightforward. You need a property by which to divide the data into sets of manageable size with a relatively equal membership. For integers, the hashing property might be something as simple as applying the modulo function to each existing value. If you decide to hash on the operation modulo 13, every value is put in a hash bucket based on its remainder after dividing by 13. This means you need 13 buckets because the possible remainders are the integers 0 through 12. (In real systems, the actual hash functions are substantially more complex; coming up with a good hash function is an art as well as a science.)

When you use hashing to join two inputs, SQL Server uses one input, called the build input, to actually build the hash buckets. Then, one row at a time, it inspects the other input and tries to find a match in the existing buckets. The second input is called the probe input. Suppose we use modulo 3 as the hash function and join the two inputs shown in Figure 14-4 on the following page. (The rows not matching the SARGs are not shown, so the two inputs are reduced in size quite a bit.) Here is the query:

When processing hash joins, SQL Server tries to use the smaller input as the build input, so in this case we'll assume that the hash buckets have been built based on the department table values. The buckets are actually stored as linked lists, in which each entry contains only the columns from the build input that are needed. In this case, all we have to keep track of is the department number and manager name. The collection of these linked lists is called the hash table. Our hash table is shown in Figure 14-5. Our pseudocode looks something like this:

Allocate an Empty Hash Table
For Each Row in the Build Input
Determine the hash bucket by applying the hash function
Insert the relevant fields of the record into the bucket
For Each Record in the Probe Input
Determine the hash bucket
Scan the hash bucket for matches
Output matches
Deallocate the Hash Table

Although you can use hashing effectively when there are no useful indexes on your tables, the query optimizer might not always choose it as the join strategy because of its high cost in terms of memory required. Ideally, the entire hash table should fit into memory; if it doesn't, SQL Server must split both inputs into partitions, each containing a set of buckets, and write those partitions out to disk. As each partition (with a particular set of hash buckets) is needed, it is brought into memory. This increases the amount of work required in terms of I/O and general processing time. In the academic literature, you might see this type of hashing operation referred to as a Grace hash; the name comes from the project for which this technique was developed. SQL Server has another alternative, somewhere in between keeping everything in memory and writing everything out to disk. Partitions are only written out as needed, so you might have some partitions in memory and some out on disk.

To effectively use hashing, SQL Server must be able to make a good estimate of the size of the two inputs and choose the smaller one as the build input. Sometimes, during execution SQL Server will discover that the build input is actually the larger of the two. At that point, it can actually switch the roles of the build and probe input midstream, in a process called role reversal. Determining which input is smaller isn't too hard if there are no restrictive SARGs, as in this query:

SQL Server knows the sizes of the inputs because sysindexes keeps track of the size of each table. However, consider the case in which additional conditions are added to the query, as in this example:

WHERE empl.phone like '425%'

It would help the query optimizer to know what percentage of the employees have phone numbers starting with 425. This is a case in which column statistics can be helpful. Statistics can tell the query optimizer the estimated number of rows that will meet a given condition, even if there is no actual index structure to be used.

Note: Hash and merge join can be used only if the join is an equijoin—that is, if the join predicate compares columns from the two inputs for an equality match. If the join is not based on an equality, as in the example below, a nested loops join is the only possible strategy:

WHERE table1.col1 BETWEEN table2.col2 AND table2.col3,

The hashing and merging strategies can be much more memory intensive than the nested loops strategy, and since the query optimizer takes into account all available resources, you might notice that a system with little available memory might not use the hash and merge joins as often as systems with plenty of memory. Desktop installations are particularly vulnerable to memory pressure because you are more likely to have many competing applications running simultaneously. Although the online documentation states that the desktop edition of SQL Server does not support hash and merge joins at any time, this is not true. Because in most situations there is limited memory for a desktop installation, the query optimizer is much more pessimistic when it decides to use one of these two strategies.

Multiple-Table Joins

According to some folklore, SQL Server "does not optimize" joins of more than four tables. There was some truth to this in earlier versions of SQL Server, but in SQL Server 7 optimization doesn't happen in the same way at all. There are so many different possible permutations of join strategies and index usage that the query optimizer goes through multiple optimization passes, each considering a larger set of the possible plans. On each pass, it keeps track of the best plan so far and uses a cost estimate to reduce the search in succeeding passes. If your query contains many tables, the query optimizer is likely to interrupt the last pass before completion because the last pass is quite exhaustive. The cheapest plan found will then be used.

Other Processing Strategies

Although our discussion of the query optimizer has dealt mainly with the choice of indexes and join strategies, it makes other decisions as well. It must decide how to process queries with GROUP BY, DISTINCT, and UNION, and it must decide how updates should be handled—either row at a time or table at a time. We discussed update strategies in Chapter 8. Here we'll briefly discuss GROUP BY, DISTINCT, and UNION.

GROUP BY Operations

In prior versions of SQL Server, a GROUP BY operation was processed in only one way. SQL Server sorted the data (after applying any SARGs) and then formed the groups from the sorted data. SQL programmers noticed that queries involving grouping returned results in sorted order, although this was merely a byproduct of the internal grouping mechanism. In SQL Server 7, you can use hashing to organize your data into groups and compute aggregates. The pseudocode looks like this:

For Each Record in the Input
Determine the hash bucket
Scan the hash bucket for matches
If a match exists
Aggregate the new record into the old
Drop the new record
Otherwise
Insert the record into the bucket
Scan and Output the Hash Table

If the query optimizer chooses to use hashing to process the GROUP BY, the data does not come back in any predictable order. (The order actually depends on the order in which records appear in the hash table, but without knowing the specific hash function, you can't predict this order.) If you use a lot of GROUP BY queries, you might be surprised that sometimes the results come back sorted and sometimes they don't. If you absolutely need your data in a particular order, you should use ORDER BY in your query.

Note: If a database has its compatibility level flag set to 60 or 65, the query processor automatically includes an ORDER BY in the query. This allows you to mimic the behavior of older SQL Server versions.

DISTINCT Operations

As with GROUP BY queries, the only technique for removing duplicates in previous versions of SQL Server was to first sort the data. In SQL Server 7, you can use hashing to return only the distinct result rows. The algorithm is much like the algorithm for hashing with GROUP BY, except that an aggregate does not need to be maintained.

For Each Record in the Input
Determine the hash bucket
Scan the hash bucket for matches
If a match exists
Drop the new record
Otherwise
Insert the record into the bucket
and return the record as output

UNION Operations

The UNION operator in SQL Server has two variations for combining the result sets of two separate queries into a single result set. If you specify UNION ALL, SQL Server returns all rows from both as the final result. If you don't specify ALL, SQL Server removes any duplicates before returning the results. For performance reasons, you should carefully evaluate your data, and if you know that the inputs are not overlapping, with no chance of duplicates, you should specify the ALL keyword to eliminate the overhead of removing duplicates. If SQL Server has to find the unique rows, it can do so in three ways: It can use hashing to remove the duplicates in a UNION, just the way it does for DISTINCT; it can sort the inputs and then merge them; or it can concatenate the inputs and then sort them to remove the duplicates.

For GROUP BY, DISTINCT, and UNION operations, your best bet is to let the query optimizer decide whether to use hashing or sorting and merging. If you suspect that the query optimizer might not have made the best choice, you can use query hints to force SQL Server to use a particular processing strategy and compare the performance with and without the hints. In most cases, the query optimizer will have made the best choice after all.

Maintaining Statistics

As we've seen, the query optimizer uses statistics on both indexed and nonindexed columns to determine the most appropriate processing strategy for any particular query. Whenever an index is created on a table containing data (as opposed to an empty table), the query optimizer collects statistics and stores them for that index in sysindexes.statblob. By default, the statistics are automatically updated whenever the query optimizer determines that they are out-of-date. In addition, the query optimizer generates statistics on columns used in SARGs when it needs them, and it updates them when needed.

The histogram for statistics on an index key or statistics on a nonindexed column is a set of up to 300 values for that column. SQL Server composes this set of values by taking either all occurring values or a sampling of the occurring values and then sorting that list of values and dividing it into as many as 299 approximately equal intervals. The endpoints of the intervals become the 300 histogram steps, and the query optimizer's computations assume that the actual data has an equal number of values between any two adjacent steps. These step values are stored in the statblob column of sysindexes. Even if the table or sample has more than 300 rows, the histogram might have fewer than 300 steps to achieve uniformity of step size.

You can specify the sampling interval when the indexes or statistics are initially created or at a later time using the statistics management tools. You can specify the FULLSCAN option, which means that all existing column values are sorted, or you can specify a percentage of the existing data values. By default, SQL Server uses a computed percentage that is a logarithmic function of the size of the table. If a table is less than 8 MB in size, a FULLSCAN is always done. When sampling, SQL Server randomly selects pages from the table by following the IAM chain. Once a particular page has been selected, all values from that page are used in the sample. Since disk I/O is the main cost of maintaining statistics, using all values from every page read minimizes that cost.

Statistics on nonindexed columns are built in exactly the same way as statistics on indexes. The system procedure sp_helpindex shows column statistics as well as indexes on a table. If the statistics are created automatically, their names are very distinctive, as shown in this example:

Here we have two indexes and two sets of statistics. The name of the explicitly generated statistics is s1. The name of the system-generated statistics is _WA_Sys_au_fname_07020F21. To avoid long-term maintenance of unused statistics, SQL Server ages out the statistics that are automatically created on nonindexed columns. After it updates the column statistics more than a certain number of times, the statistics are dropped rather than updated again. The StatVersion column of the sysindexes table shows how many times the statistics have been updated. Once StatVersion exceeds an internally generated threshold, the statistics are dropped. If they are needed in the future, they can be created again. There is no substantial cost difference between creating statistics and updating them. Statistics that are explicitly created (such as s1 in the preceding example) are not automatically dropped.

Statistics are automatically updated when the query optimizer determines they are out-of-date. This basically means that sufficient data modification operations have occurred since the last time they were updated to minimize their usefulness. The number of operations is tied to the size of the table and is usually something like 500 + 0.2 * (number of rows in the table). This means that the table must have at least 500 modification operations before statistics are updated during query optimization; for large tables, this threshold can be much larger.

You can also manually force statistics to be updated in one of two ways. You can run the UPDATE STATISTICS command on a table or on one specific index or column statistics. Or you can also execute the procedure sp_updatestats, which runs UPDATE STATISTICS against all user-defined tables in the current database.

You can create statistics on nonindexed columns using the CREATE STATISTICS command or by executing sp_createstats, which creates single-column statistics for all eligible columns for all user tables in the current database. This includes all columns except computed columns and columns of the ntext, text, or image datatypes, and columns that already have statistics or are the first column of an index.

Every database is created with the database options auto create statistics and auto update statistics set to true, but either one can be turned off. You can also turn off automatic updating of statistics for a specific table in one of two ways:

sp_autostats This procedure sets or unsets a flag for a table to indicate that statistics should or should not be updated automatically. You can also use this procedure with only the table name to find out whether the table is set to automatically have its index statistics updated.

UPDATE STATISTICS In addition to updating the statistics, the option WITH NORECOMPUTE indicates that the statistics should not be automatically recomputed in the future. Running UPDATE STATISTICS again without the WITH NORECOMPUTE option enables automatic updates.

However, setting the database option auto update statistics to FALSE overrides any individual table settings. In other words, no automatic updating of statistics takes place. This is not a recommended practice unless thorough testing has shown you that you don't need the automatic updates or that the performance overhead is more than you can afford.

The Index Tuning Wizard

We've seen that the query optimizer can come up with a reasonably effective query plan even in the absence of well-planned indexes on your tables. However, this does not mean that a well-tuned database doesn't need good indexes. The query plans that don't rely on indexes frequently consume additional system memory, and other applications might suffer. In addition, having appropriate indexes in place can help solve many blocking problems.

Designing the best possible indexes for the tables in your database is a complex task; it not only requires a thorough knowledge of how SQL Server uses indexes and how the query optimizer makes its decisions, but it requires that you be intimately familiar with how your data will actually be used. SQL Server 7 provides two tools to help you design the best possible indexes.

The Query Analyzer includes a tool called Index Analyzer, which you can access by choosing Perform Index Analysis from the Query menu. The index analyzer recommends indexes for the queries that you've included in the current workspace. It assumes that all your existing indexes will stay in place, so it might end up not recommending any new ones. It never recommends a clustered index because the optimum choice for a clustered index must take more than a single query into account. The index analyzer displays a dialog box with its recommendations and lets you choose whether to implement the recommendations. You cannot postpone the creation of the indexes or save a script of the CREATE INDEX statements to be used. If you need to do either of these, you can use the Index Tuning Wizard.

You can access the Index Tuning Wizard from the Wizards list in the Enterprise Manager or from SQL Server Profiler under the Tools menu. The wizard tunes a single database at a time, and it bases its recommendations on a workload file that you provide. The workload file can be a file of captured trace events that contains at least the RPC and SQL Batch starting or completed events, as well as the text of the RPCs and batches. It can also be a file of SQL statements. If you use the SQL Server Profiler to create the workload file, you can capture all SQL statement submitted by all users over a period of time. The wizard can then look at the data access patterns for all users, for all tables, and make recommendations with everyone in mind.

The book's companion CD includes a whitepaper with more details about this wizard. The CD also includes a sample database called orders, which is in two files: orders_data.mdf and orders_log.ldf. You can use the procedure sp_attachdb to add the orders database to your system. Finally, the CD includes a sample workload file (Orders_Workload.sql) composed of SQL statements that access the orders database.

When the wizard gets a workload file, it tunes the first 32 KB of parsable queries and produces five reports. You can make the analysis more exhaustive or less. You can choose to have the wizard keep all existing indexes, or you can have it come up with a totally new set of index recommendations, which might mean dropping some or all of your existing indexes.

After the wizard generates its reports, which you can save to disk, you can implement the suggestions immediately, schedule the implementation for a later time, or save the script files for the creation (and optional dropping) of indexes. Once the scripts are saved, you can run them later or just inspect them to get ideas of possible indexes to create.

You can also use the wizard purely for analyzing your existing design. One of the reports, Index Usage Report (Current Configuration), shows what percentage of the queries in the submitted workload make use of each of your existing indexes. You can use this information to determine whether an index is being used at all and get rid of any unused indexes along with their space and maintenance overhead.

The current version of the wizard cannot completely tune cross-database queries. It inspects the local tables involved in the queries for index recommendations, but it ignores the tables in other databases. Also, if you are using Unicode data, the wizard does not recognize Unicode constants (such as N'mystring') in your queries and cannot optimize for them. Future versions of the wizard will address these issues and probably add new features as well.

Note: If you're going to use SQL Server Profiler to capture a workload file, you should consider tracing events at several periods throughout the day. This can give you a more balanced picture of the kinds of queries that are actually being executed in your database. If you define your trace using the SQL Server Profiler extended procedures, you can set up a SQL Server Agent job to start and stop the trace at predefined times, appending to the same workload file each time.

Monitoring Query Performance

Before you think about taking some action to make a query faster, such as adding an index or denormalizing, you should understand how a query is processed. You should also get some baseline performance measurements so you can compare behavior both before and after making your changes. SQL Server provides these tools (SET options) for monitoring queries:

STATISTICS IO

STATISTICS TIME

Showplan

You enable any of these SET options before you run a query, and they will produce additional output. Typically, you run your query with these options set in a tool such as the Query Analyzer. When you are satisfied with your query, you can cut and paste the query into your application or into the script file that creates your stored procedures. If you use the SET commands to turn these options on, they apply only to the current connection. The Query Analyzer provides check boxes you can use to turn any or all of these options on and off for all connections.

STATISTICS IO

Don't let the term statistics fool you. STATISTICS IO doesn't have anything to do with the statistics used for storing histograms and density information in sysindexes. This option provides statistics on how much work SQL Server did to process your query. When this option is set to ON, you get a separate line of output for each query in a batch that accesses any data objects. (You don't get any output for statements that don't access data, such as PRINT, SELECT the value of a variable, or call a system function.) The output from SET STATISTICS IO ON includes the values Logical Reads, Physical Reads, Read Ahead Reads, and Scan Count.

Logical Reads

This value indicates the total number of page accesses needed to process the query. Every page is read from the data cache, whether or not it was necessary to bring that page from disk into the cache for any given read. This value is always at least as large and usually larger than the value for Physical Reads. The same page can be read many times (such as when a query is driven from an index), so the count of Logical Reads for a table can be greater than the number of pages in a table.

Physical Reads

This value indicates the number of pages that were read from disk; it is always less than or equal to the value of Logical Reads. The value of the Buffer Cache Hit Ratio, as displayed by Performance Monitor, is computed from the Logical Reads and Physical Reads values as follows:

Cache-Hit Ratio = (Logical Reads – Physical Reads) / Logical Reads

Remember that the value for Physical Reads can vary greatly and decreases substantially with the second and subsequent execution because the cache is loaded by the first execution. The value is also affected by other SQL Server activity and can appear low if the page was preloaded by read ahead activity. For this reason, you probably won't find it useful to do a lot of analysis of physical I/O on a per-query basis. When looking at individual queries, the Logical Reads value is usually more interesting because the information is consistent. Physical I/O and achieving a good cache-hit ratio is crucial, but they are more interesting at the all-server level. Pay close attention to Logical Reads for each important query, and pay close attention to physical I/O and the cache-hit ratio for the server as a whole.

STATISTICS IO acts on a per-table, per-query basis. You might want to see the physical_io column in sysprocesses corresponding to the specific connection. This column shows the cumulative count of synchronous I/O that has occurred during the spid's existence, regardless of the table. It even includes any Read Ahead Reads that were made by that connection.

Read Ahead Reads

The Read Ahead Reads value indicates the number of pages that were read into cache using the read ahead mechanism while the query was processed. These pages are not necessarily used by the query. If a page is ultimately needed, a logical read is counted but a physical read is not. A high value means that the value for Physical Reads is probably lower and the cache-hit ratio is probably higher than if a read ahead was not done. In a situation like this, you shouldn't infer from a high cache-hit ratio that your system can't benefit from additional memory. The high ratio might come from the read ahead mechanism bringing much of the needed data into cache. That's a good thing, but it could be better if the data simply remains in cache from previous use. You might achieve the same or a higher cache-hit ratio without requiring the Read Ahead Reads.

You can think of read ahead as simply an optimistic form of physical I/O. In full or partial table scans, the table's IAMs are consulted to determine which extents belong to the object. The extents are read with a single 64-KB scatter read, and because of the way that the IAMs are organized, they are read in disk order. If the table is spread across multiple files in a file group, the read ahead attempts to keep at least eight of the files busy instead of sequentially processing the files. Read ahead reads are asynchronously requested by the thread that is running the query; because they are asynchronous, the scan doesn't block while waiting for them to complete. It blocks only when it actually tries to scan a page that it thinks has been brought into cache and the read hasn't finished yet. In this way, the read ahead neither gets too ambitious (reading too far ahead of the scan) nor too far behind.

Scan Count

The Scan Count value indicates the number of times that the corresponding table was accessed. Outer tables of a nested loop join have a Scan Count of 1. For inner tables, the Scan Count might be the number of times "through the loop" that the table was accessed. The number of Logical Reads is determined by the sum of the Scan Count times the number of pages accessed on each scan. However, even for nested loop joins, the Scan Count for the inner table might show up as 1. SQL Server might copy the needed rows from the inner table into a worktable in cache and use this worktable to access the actual data rows. When this step is used in the plan, there is often no indication of it in the STATISTICS IO output. You must use the output from STATISTIC TIME, as well as information about the actual processing plan used, to determine the actual work involved in executing a query. Hash joins and merge joins usually show the Scan Count as 1 for both tables involved in the join, but these types of joins can involve substantially more memory. You can inspect the memusage value in sysprocesses while the query is being executed, but unlike the physical_io value, this is not a cumulative counter and is valid only for the currently running query. Once a query finishes, there is no way to see how much memory it used.

STATISTICS TIME

The output of SET STATISTICS TIME ON is pretty self-explanatory. It shows the elapsed and CPU time required to process the query. (In this context, it means the time not spent waiting for resources such as locks or reads to complete.) The times are separated into two components: the time required to parse and compile the query, and the time required to execute the query. For some of your testing, you might find it just as useful to look at the system time with getdate() before and after a query if all you need to measure is elapsed time. However, if you want to compare elapsed vs. actual CPU time or if you are interested in how long compilation and optimization took, you must use STATISTICS TIME.

Showplan

In SQL Server 7, there is not just a single option for examining the execution plan for a query. You can choose to see the plan in a text format, with or without additional performance estimates, or you can see a graphical representation of the processing plan.

SHOWPLAN_TEXT and SHOWPLAN_ALL

The two SET options SHOWPLAN_TEXT and SHOWPLAN_ALL let you see the estimated query plan without actually executing the query. Both options also enable the SET NOEXEC option, so you don't see any results from your query—you see only the way that SQL Server has determined is the best method for processing the query. SHOWPLAN_TEXT shows you all the steps involved in processing the query, including the type of join used, the order of table accesses, and which index or indexes are used for each table. Any internal sorts are also shown. SHOWPLAN_ALL provides this information plus estimates of the number of rows that are expected to meet the queries' search criteria, the estimated size of the result rows, the estimated CPU time, and the total cost estimate that was used to compare this plan to other possible plans.

Warning: Since turning on SHOWPLAN TEXT or SHOWPLAN ALL implies that NOEXEC is also on, you must set the SHOWPLAN option to OFF before you do anything else. For example, you must set SHOWPLAN_TEXT to OFF before setting SHOWPLAN_ALL to ON. You must also set these options to OFF before using the graphical Showplan, discussed on the next page.

The output from these options shows the order in which tables are accessed and how they are joined, which indexes are used, which tables are scanned, and what worktables are created. Showplan is your primary tool for determining which indexes are useful.

Typically, you add one or more indexes that you think might help speed up a query and then you use one of these Showplan options to see whether any of them were actually used. If an index is not going to be used (and you're not adding it to maintain a PRIMARY KEY or UNIQUE constraint), you might as well not add it. If an index is not useful for queries, the index is just overhead in terms of space and maintenance time. After you add indexes, be sure to monitor their effect on your updates, since indexes can add overhead to data modification (inserts, deletes, and updates).

If the index is useful, also consider whether it is worthwhile in terms of the effect on your data modification operations. If a change to indexing alone is not the answer, you should look at other possible solutions, such as using an index hint or changing the general approach to the query. (In Chapter 12, you saw that several different approaches can be useful to some queries—ranging from the use of somewhat tricky SQL to the use of temporary tables and cursors. If these approaches also fail, you should consider changes to the database design using the denormalization techniques discussed earlier.)

Graphical Showplan

SQL Server 7 provides a graphical representation of a query's estimated execution plan. Like the SET options for Showplan, by default your query is not executed when you choose to display the graphical Showplan output. You can choose to display this graphical information by choosing Display Estimated Execution Plan from the Query menu in the Query Analyzer or by clicking the corresponding toolbar button in the Query Analyzer. The graphical representation contains all the information available through SHOWPLAN_ALL, but not all of it is visible at once. You can, however, move your cursor over any of the icons in the graphical plan to see the additional performance estimates, as shown in Figure 14-6 on the following page.

Displaying the Actual Plan

All of the Showplan options discussed so far show the estimated query plan. As long as the query is not actually being executed, you can only see an estimate. Conditions such as memory resources might change before you actually run the query, so the estimated plan might not be the actual plan. If you need to see the actual plan at the time the query is executed, you have three options:

Choose Show Execution Plan from the Query menu in the Query Analyzer. You'll see two tabs of results for every query you run until you deselect Show Execution Plan. The first tab shows the actual query output, and the second shows the graphical Showplan.

Set STATISTICS PROFILE to ON. This option gives you the query results followed by the STATISTICS_ALL output in the same results pane.

Set up a trace in the SQL Server Profiler to capture the Execution Plan events. This displays all the information from SHOWPLAN_ALL in the trace output for every query that is executed.

Using Query Hints

As you know, locking and query optimization are done automatically by SQL Server. But because the query optimizer is probability-based, it sometimes makes wrong predictions. For example, to eliminate a deadlock, you might want to force an update lock. Or you might want to tell the query optimizer that you value the speed of returning the first row more highly than total throughput. You can specify these and other behaviors by using query hints. The word "hint" is a bit of a misnomer in this context, because the hint is handled as a directive rather than as a suggestion. SQL Server provides four general types of hints:

Join hints Specify the join technique that will be used.

Index hints Specify one or more specific indexes that should be used to drive a search or a sort.

Lock hints Specify a particular locking mode that should be used.

Processing hints Specify that a particular processing strategy should be used.

If you've made it this far in this book, you probably understand the various issues related to hints—locking, how indexes are selected, overall throughput vs. the first row returned, and how join order is determined. Understanding these issues is the key to effectively using hints and intelligently instructing SQL Server to deviate from "normal" behavior when necessary. Hints are simply syntactic hooks that override default behavior; you should now have insight into those behaviors so you can make good decisions about when such overrides are warranted.

Query hints should be used for special cases—not as standard operating procedure. When you specify a hint, you constrain SQL Server. For example, if you indicate in a hint that a specific index should be used, and later you add another index that would be even more useful, the hint prevents the query optimizer from considering the new index. In future versions of SQL Server, you can expect new query optimization strategies, more access methods, and new locking behaviors. If you bind your queries to one specific behavior, you forgo your chances of benefiting from such improvements. SQL Server offers the nonprocedural development approach—that is, you don't have to tell SQL Server how to do something. Rather, you tell it what to do. Hints run contrary to this approach. Nonetheless, the query optimizer isn't perfect and never can be. In some cases, hints can make the difference between a project's success and failure, if they are used judiciously.

When you use a hint, you must have a clear vision of how it might help. It's a good idea to add a comment to the query to justify the hint. Then test your hypothesis by watching the output of STATISTICS IO, STATISTICS TIME, and one of the Showplan options both with and without your hint.

Tip Since your data is constantly changing, a hint that worked well today might not indicate the appropriate processing strategy next week. In addition, SQL Server's optimizer is constantly evolving, and the next upgrade or service pack you apply might invalidate the need for that hint. You should periodically retest all queries that rely on hinting, to verify that the hint is still useful.

Join Hints

You can use join hints only when you use the ANSI-style syntax for joins—that is, when you actually use the word JOIN in the query. In addition, the hint comes between the type of join and the word JOIN, so you can't leave off the word INNER for an inner join. Here's an example of forcing SQL Server to use a HASH JOIN:

Alternatively, you can specify a LOOP JOIN or a MERGE JOIN. You can use another join hint, REMOTE, when you have defined a linked server and are doing a cross-server join. REMOTE specifies that the join operation is performed on the site of the right table (that is, the table after the word JOIN). This is useful when the left table is a local table and the right table is a remote table. You should use REMOTE only when the left table has fewer rows than the right table.

Index Hints

You can specify that one or more specific indexes be used by naming them directly in the FROM clause. You can also specify the indexes by their indid value, but this makes sense only when you specify not to use an index or to use the clustered index, whatever it is. Otherwise, the indid value is prone to change as indexes are dropped and re-created. You can use the value 0 to specify that no index should be used—that is, to force a table scan. And you can use the value 1 to specify that the clustered index should be used regardless of the columns on which it exists. The index hint syntax looks like this:

SELECT au_lname, au_fname
FROM authors (INDEX(0))
WHERE au_lname LIKE 'C%'

This example forces the query to use the index named aunmind:

SELECT au_lname, au_fname
FROM authors (INDEX(aunmind))
WHERE au_lname LIKE 'C%'

The following example forces the query to use both indexes 2 and 3. Note, however, that identifying an index by indid is dangerous. If an index is dropped and re-created in a different place, it might take on a different indid. If you don't have an index with a specified ID, you get an error message. Also, you cannot specify index 0 (no index) along with any other indexes.

A second kind of index hint is FASTFIRSTROW, which tells SQL Server to use a nonclustered index leaf level to avoid sorting the data. This hint has been preserved only for backward compatibility; it has been superseded by the processing hint FAST n, which we'll discuss later.

Lock Hints

You can specify a lock type or duration with syntax similar to that for specifying index hints. Chapter 13 explains lock compatibility and other issues that are essential to using lock hints properly. Lock hints work only in the context of a transaction, so if you use these hints, you must use BEGIN TRAN/END TRAN blocks (or you must run with implicit transactions set to ON). The lock hint syntax is as follows:

SELECT select_list
FROM table_name [(lock_type)]

Tip Remember to put the lock hint in parentheses; if you don't, it will be treated as an alias name for the table instead of as a lock hint.

You can specify one of the following keywords for lock_type:

HOLDLOCK Equivalent to the SERIALIZABLE hint described on the following page. This option is similar to specifying SET TRANSACTION ISOLATION LEVEL SERIALIZABLE , except the SET option affects all tables, not only the one specified in this hint.

UPDLOCK Takes update page locks instead of shared page locks while reading the table and holds them until the end of the transaction. Taking update locks can be an important technique for eliminating conversion deadlocks.

TABLOCK Takes a shared lock on the table even if page locks would be taken otherwise. This option is useful when you know you'll escalate to a table lock or if you need to get a complete snapshot of a table. You can use this option with HOLDLOCK if you want the table lock held until the end of the transaction block (REPEATABLE READ).

PAGLOCK Takes shared page locks when a single shared table lock might otherwise be taken. (There is no hint for taking an exclusive page lock. Instead, you hint to take an update page lock using UPDLOCK. Once the lock is acquired, you know that UPDLOCK can be automatically upgraded to an exclusive lock if required.)

TABLOCKX Takes an exclusive lock on the table that is held until the end of the transaction block. (All exclusive locks are held until the end of a transaction, regardless of the isolation level in effect.)

ROWLOCK Specifies that a shared row lock be taken when a single shared page or table lock is normally taken.

READUNCOMMITTED | READCOMMITTED | REPEATABLEREAD | SERIALIZABLE These hints specify that SQL Server should use the same locking mechanisms as when the transaction isolation level is set to the level of the same name. However, the hint controls locking for a single table in a single statement, as opposed to locking of all tables in all statements in a transaction.

NOLOCK Allows uncommitted, or dirty, reads. Shared locks are not issued by the scan, and the exclusive locks of others are not honored. This hint is equivalent to READUNCOMMITTED.

Although it is not specifically a query hint, SET LOCK_TIMEOUT also lets you control SQL Server locking behavior. By default, SQL Server does not time out when waiting for a lock; it assumes optimistically that the lock will be released eventually. Most client programming interfaces allow you to set a general timeout limit for the connection so that a query is automatically canceled by the client if no response comes back after a specified amount of time. However, the message that comes back when the time period is exceeded does not indicate the cause of the cancellation; it could be because of a lock not being released, it could be because of a slow network, or it could just be a long running query.

Like other SET options, SET LOCK_TIMEOUT is valid only for your current connection. Its value is expressed in milliseconds and can be accessed by using the system function @@LOCK_TIMEOUT. This example sets the LOCK_TIMEOUT value to 5 seconds and then retrieves that value for display:

SET LOCK_TIMEOUT 5000
SELECT @@LOCK_TIMEOUT

If your connection exceeds the lock timeout value, you receive the following error message:

Server: Msg 1222, Level 16, State 50, Line 0
Lock request time out period exceeded.

Setting the LOCK_TIMEOUT value to 0 means that SQL Server does not wait at all for locks. It basically cancels the entire statement and goes on to the next one in the batch. This is not the same as the READPAST hint, which skips individual rows.

Warning: Exceeding the LOCK_TIMEOUT value does not automatically roll back a transaction, although it cancels the current statement. If you have a user-defined transaction containing multiple UPDATE statements, the first one might be canceled due to a lock timeout and the second one might succeed and commit. If this is unacceptable to your applications, you should program a check for error 1222 and then explicitly roll back the transaction. If you do not change the LOCK_TIMEOUT value, you don't have to worry about this behavior because locks will never time out.

The following example illustrates the difference between READPAST, READUNCOMMITTED, and setting LOCK_TIMEOUT to 0. All of these techniques let you "get around" locking problems, but the behavior is slightly different in each case.

In a new query window, execute the following batch to lock one row in the titles table:

USE pubs
SET LOCK_TIMEOUT 0
SELECT * FROM titles
SELECT * FROM authors

Notice that after error 1222 is received, the second select statement is executed, returning all the rows from the authors table.

Open a third connection, and execute the following statements:

USE pubs
SELECT * FROM titles (READPAST)
SELECT * FROM authors

SQL Server skips (reads past) only one row, and the remaining 17 rows of titles are returned, followed by all the authors rows.

Open a fourth connection, and execute the following statements:

USE pubs
SELECT * FROM titles (READUNCOMMITTED)
SELECT * FROM authors

In this case, SQL Server does not skip anything. It reads all 18 rows from titles, but the row for title BU1032 shows the dirty data that you changed in step 1. This data has not yet been committed and is subject to being rolled back.

Note: The NOLOCK, READUNCOMMITTED, and READPAST table hints are not allowed for tables that are targets of delete, insert, or update operations.

Combining Hints

You can combine index and lock hints and use different hints for different tables, and you can combine HOLDLOCK with the level of shared locks. Here's an example:

Query processing hints follow the word OPTION at the very end of your SQL statement and apply to the entire query. If your query involves a UNION, only the last SELECT in the UNION can include the OPTION clause. You can include more than one OPTION clause, but you can specify only one hint of each type. For example, you can have one GROUP hint and also use the ROBUST PLAN hint. Here's an example of forcing SQL Server to process a GROUP BY using hashing:

There are eight different types of processing hints, most of which are fully documented in SQL Server Books Online. The key aspects of these hints appear in the following list:

Grouping hints You can specify that SQL Server use a HASH GROUP or an ORDER GROUP to process GROUP BY operations.

Union hints You can specify that SQL Server form the UNION of multiple result sets by using HASH UNION, MERGE UNION, or CONCAT UNION. HASH UNION and MERGE UNION are ignored if the query specifies UNION ALL because UNION ALL does not need to remove duplicates. UNION ALL is always carried out by simple concatenation.

Join hints You can specify join hints in the OPTION clause as well as in the JOIN clause, and any join hint in the OPTION clause applies to all the joins in the query. OPTION join hints override those in the JOIN clause. You can specify LOOP JOIN, HASH JOIN, or MERGE JOIN.

FAST number_rows This hint tells SQL Server to optimize so that the first rows come back as quickly as possible, possibly reducing overall throughput. You can use this option to influence the query optimizer to drive a scan using a nonclustered index that matches the ORDER BY clause of a query rather than using a different access method and then doing a sort to match the ORDER BY clause. After the first number_rows are returned, the query continues execution and produces its full result set.

FORCE ORDER This hint tells SQL Server to process the tables in exactly the order listed in your FROM clause. However, if any of your joins is an OUTER JOIN, this hint might be ignored.

MAXDOP number Overrides the max degree of parallelism configuration option for only the query specifying this option. We'll discuss parallelism when we look at server configuration in Chapter 15.

ROBUST PLAN Forces the query optimizer to attempt a plan that works for the maximum potential row size, even if it means degrading performance. This is particularly applicable if you have wide varchar columns. When the query is processed, some types of plans might create intermediate tables. Operators might need to store and process rows that are wider than any of the input rows, which might exceed SQL Server's internal row size limit. If this happens, SQL Server produces an error during query execution. If you use ROBUST PLAN, the query optimizer will not consider any plans that might encounter this problem.

KEEP PLAN This option ensures that a query is not recompiled as frequently when there are multiple updates to a table. This can be particularly useful when a stored procedure does a lot of work with temporary tables, which might cause frequent recompilations.

Summary

This chapter walked you through performance-related topics that you should consider while designing and implementing a high-performance application. As a starting point, skilled team members are vital. More than anything else, a skilled and experienced staff is the best predictor and indicator of a project's potential success. You must establish performance criteria and prototype, test, and monitor performance throughout development.

We also reviewed techniques that you can use in database design and in making indexing decisions. We saw how to deal with lock contention issues and how to resolve deadlocks. We also discussed how the query optimizer chooses a plan and what you can do to ensure that it performs as well as possible.

As you make and monitor changes, remember to change just one variable at a time. When you change multiple variables at the same time, it becomes impossible to evaluate the effectiveness of any one of them. This goes for all changes, whether they are related to design, indexing, the use of an optimizer hint, or a configuration setting. This is a fundamental aspect of all performance optimization. In the next chapter, we'll look at configuration and performance monitoring.

We at Microsoft Corporation hope that the information in this work is valuable to you. Your use of the information contained in this work, however, is at your sole risk. All information in this work is provided "as -is", without any warranty, whether express or implied, of its accuracy, completeness, fitness for a particular purpose, title or non-infringement, and none of the third-party products or information mentioned in the work are authored, recommended, supported or guaranteed by Microsoft Corporation. Microsoft Corporation shall not be liable for any damages you may sustain by using this information, whether direct, indirect, special, incidental or consequential, even if it has been advised of the possibility of such damages. All prices for products mentioned in this document are subject to change without notice.