Introduction

The motivation behind writing these sessions is that I have found that relatively few people have a good intuition for the way the optimizer works, partly because the official documentation is rather sparse, and partly because what information is available is dispersed across many books and blog posts. The content presented here is very much geared to my preferred way of learning – it shows the concepts in what seems to me to be a reasonably logical sequence, and provides tools to enable the interested reader to explore further, if desired.

When we write a query, we are not writing a computer program that can be directly executed. SQL is a declarative language used to logically describe the results we want to see. SQL Server goes through a process of compilation and optimization to create a physically executable implementation of the stated logical requirement. To put it another way, a SQL query is the specification for an executable program that SQL Server writes for us. The SQL Server component responsible for finding an efficient physical execution plan for a given logical query requirement is the Query Optimizer.

The SQL Server Core Engine

The diagram below shows conceptually where the Query Optimizer sits in the core SQL Server engine:

Language Processing is concerned with things like parsing the text of the query to make sure it is syntactically valid, binding table references to real database objects, deriving types for expressions, semantic analysis (for example checking that the query is not trying to execute a table), and binding GROUP BY references to the appropriate logical scope. The Query Optimizer aims to find a good physical execution plan that matches the logical semantic of the query, and the Query Executor is responsible for running that plan to completion. Below that, the Storage Engine provides access to physical storage, together with locking and transaction services. Finally, SQL-OS is the layer that provides threading, memory management, and scheduling services.

Using the same colours for the components shown above, a very high-level overview of query processing looks like this:

This series of blog posts is mostly concerned with the “Optimize” step above. It takes as its input a tree of logical operations produced by the previous stages.

Optimization Pipeline

The diagram below shows the principal stages within query optimization, starting with the bound tree produced by the parser and algebrizer. The coloured steps are the ones we will explore in depth – the right hand side of the diagram lists a bit more detail that I talk through when presenting this material live:

Input Tree

We start by looking at the bound logical tree (input tree). To make the discussion a little less abstract, we will use a test query issued against the AdventureWorks sample database. This simple query shows products and the quantity in stock, limited to products with names that start with a letter between A and G:

-- Test query

SELECT

p.Name,

Total = SUM(inv.Quantity)

FROM

Production.Product AS p,

Production.ProductInventory AS inv

WHERE

inv.ProductID = p.ProductID

AND p.Name LIKE N'[A-G]%'

GROUPBY

p.Name;

The input tree generated by this query is shown below, slightly simplified:

The operations in this tree are all purely logical, and have their roots in Relational Theory. The GET operation logically reads an entire table, though you should not yet be thinking in terms of physical operations such as a scan or a seek. The JOIN is a cartesian product, logically joining every row in one table with every row in the other. The SELECT is not a SQL statement, it is a relational operation: filtering rows based on some condition (known as a predicate). Here, we are applying two SELECT operations to filter on matching product IDs and rows where the product name starts with a letter between A and G. The next logical operation is a Group-By Aggregate, an extension to the basic relational algebra that logically groups rows by some common attribute, and optionally computes an aggregate expression for each group. Finally, we have a relational Project operation, which filters columns – here we only need the Name and Total columns, and those are the only two returned in the logical result set.

To see this tree directly, we can use undocumented trace flag 8605. This flag also requires the common trace flag 3604 to produce output to the SSMS messages window. We can use another undocumented syntax, the QUERYTRACEON hint, to set the trace flag just for this query:

-- Input tree (ISO-92)

SELECT

p.Name,

Total = SUM(inv.Quantity)

FROM Production.Product AS p

JOIN Production.ProductInventory AS inv ON

inv.ProductID = p.ProductID

WHERE

p.Name LIKE N'[A-G]%'

GROUPBY

p.Name

OPTION (RECOMPILE, QUERYTRACEON 8605);

Now, in the SSMS messages tab, we see a textual representation of the tree of logical operators I drew with coloured boxes before:

There are a few minor differences between this and the previous diagram. In particular, the expression computed by the aggregate is added by a separate projection (omitted for clarity before), and the SQL-92 join syntax means the join predicate is more tightly bound to the logical join. Rewriting the query using SQL-89 syntax highlights the second difference, producing a tree closer to the cartesian product followed by a relational SELECT:

-- Input tree (ISO-89)

SELECT

p.Name,

Total = SUM(inv.Quantity)

FROM

Production.Product AS p,

Production.ProductInventory AS inv

WHERE

inv.ProductID = p.ProductID

AND p.Name LIKE N'[A-G]%'

GROUPBY

p.Name

OPTION (RECOMPILE, QUERYTRACEON 8605);

Simplification

The first step of the optimization pipeline we will look at is simplification, where the optimizer looks to rewrite parts of the logical tree to remove redundancies and move logical operations around a bit to help later stages. Major activities that occur around the time simplification occurs (I have taken a little artistic licence here to group a few separate stages together) include:

Constant folding

Domain simplification

Predicate push-down

Join simplification

Contradiction detection

Constant folding is the process of evaluating an expression during optimization, rather than repeatedly at execution time. In the following example, the complex expression in the WHERE clause can safely be evaluated early, resulting in “WHERE p.Name LIKE ‘D%’:

Domain simplification enables the optimizer to reason about the range of valid values a column or expression can take. In the next query, the only value for Product ID that logically matches all the predicates in the extended WHERE clause is the single value ‘400’. The query plan shows that simplification has reduced the three predicates to a single equality predicate, “WHERE p.ProductID = 400”:

SELECTTOP (10) *

FROM Production.Product AS p

WHERE p.ProductID BETWEEN 300 AND 400

AND p.ProductID BETWEEN 200 AND 500

AND p.ProductID BETWEEN 400 AND 600;

Predicate push-down involves pushing relational SELECTs (logically filtering rows based on a predicate) down the logical tree toward the leaves. The heuristic here is that this is always a good thing to do (at least as far as expressions that reference a column are concerned) because it eliminates rows early, reducing the number for later logical operations. Moving SELECTs closer to the logical GETs (reading tables) also helps index and computed-column matching.

Join simplification allows the optimizer to remove unnecessary joins, convert logical outer joins to simpler inner joins where the NULLs introduces by the outer join are later rejected by another feature in the query, and remove provably empty subexpressions. This is an incredibly powerful optimizer facility, with its roots again in relational theory. Rob Farley has written (SQL Server MVP Deep Dives) and presented (http://bit.ly/SimpleRob) about ways to use this feature effectively. In particular, he talks about how to write views to benefit from compile-time simplification, and avoid the views-on-views-on-views problem. The following queries illustrate some of the join simplifications:

-- Remove unnecessary joins

SELECT

th.ProductID,

SUM(th.ActualCost)

FROM Production.TransactionHistory AS th

JOIN Production.Product AS p ON

p.ProductID = th.ProductID

GROUPBY

th.ProductID;

GO

-- Outer join to join (null rejection)

SELECT *

FROM Production.Product AS p

LEFTJOIN Production.TransactionHistory AS th ON

th.ProductID = p.ProductID

WHERE th.ProductID < 10;

GO

-- Complex example combining multiple simplifications

WITH Complex AS

(

SELECT

pc.ProductCategoryID, pc.Name AS CatName,

ps.ProductSubcategoryID, ps.Name AS SubCatName,

p.ProductID, p.Name AS ProductName,

p.Color, p.ListPrice, p.ReorderPoint,

pm.Name AS ModelName, pm.ModifiedDate

FROM Production.ProductCategory AS pc

FULLJOIN Production.ProductSubcategory AS ps ON

ps.ProductCategoryID = pc.ProductCategoryID

FULLJOIN Production.Product AS p ON

p.ProductSubcategoryID = ps.ProductSubcategoryID

FULLJOIN Production.ProductModel AS pm ON

pm.ProductModelID = p.ProductModelID

)

SELECT c.ProductID, c.ProductName

FROM Complex AS c

WHERE c.ProductName LIKE N'G%';

That last example, with three full outer joins in a CTE (in-line view), simplifies down to a simple index seek on one table.

Cardinality Estimation

The optimizer only has direct information about table cardinality (total number of rows) for base tables. Statistics can provide additional histograms and frequency statistics, but again these are only maintained for base tables. To help the optimizer choose between competing strategies later on, it is essential to know how many rows are expected at each node in the logical tree, not just at the GET leaf nodes. In addition, cardinality estimates above the leaves can be used to choose an initial join order (assuming the query contains several joins).

Cardinality estimation computes expected cardinality and distribution statistics for each node, working up from the leaves one node at a time. The logic used to create these derived estimates and statistics uses a model that makes certain assumptions about your data. For example, where the distribution of values is unknown, it is assumed to be uniform across the whole range of potential values.

You will generally get more accurate cardinality estimates if your queries fit the model used, and, as with many things in the wider optimizer, simple and relational is best. Complex expressions, unusual or novel techniques, and using non-relational features often means cardinality estimation has to resort to guessing.

Most of the choices made by the optimizer are driven by cardinality estimation, so if these are wrong, any good execution plans you might see are highly likely to be down to pure luck.

Trivial Plan

The next stage of the optimizer pipeline is Trivial Plan, which is a fast path to avoid the time and effort involved in full cost-based optimization, and only applies to logical query trees that have a clear and obvious ‘best’ execution plan. The details of which types of query can benefit from Trivial Plan change frequently, but things like joins, subqueries, and inequality predicates generally prevent this optimization. The queries below show some examples where Trivial Plan can, and cannot be applied:

-- Trivial plan

SELECT p.ProductID

FROM Production.Product AS p

WHERE p.Name = N'Blade';

GO

-- Still trivial

SELECT

p.Name,

RowNumber =

ROW_NUMBER() OVER (

ORDERBY p.Name)

FROM Production.Product AS p

WHERE p.Name LIKE N'[A-G]%';

GO

-- 'Subquery' prevents a trivial plan

SELECT (SELECT p.ProductID)

FROM Production.Product AS p

WHERE p.Name = N'Blade';

GO

-- Inequality

SELECT p.ProductID

FROM Production.Product AS p

WHERE p.Name <> N'Blade';

One interesting case where a Trivial Plan is not applied is where the estimated cost of the Trivial Plan query exceeds the configured ‘cost threshold for parallelism’. The reasoning here is that any trivial plan that would qualify for a parallel plan suddenly presents a choice, and a choice means the query is no longer ‘trivial’, and we have to go through full cost-based optimization. Taking this to the extreme, a SQL Server instance with the parallelism cost threshold set to zero will never produce a trivial plan (since all queries have a cost greater than zero). At the risk of stating the slightly obvious, note that a plan produced at this stage will always be serial. For experimentation only, the trivial plan stage can also be disabled using trace flag 8757.

You can check whether a query used a trivial plan or not by inspecting the Properties window in SSMS, and clicking on the root node of the graphical query plan. The ‘optimization level’ property will show ‘Trivial’ or ‘Full’ – the latter shows that a trip through full cost-based optimization was required. If a query results in a trivial plan, a physical execution plan is produced and the optimization process ends here.

End of Part 1

The original slides and demo code are contained in the zip file attached below. Links to other parts of this series: Part 2Part 3Part 4

Great stuff, Paul. I have been wondering for some time how to get better visibility into the plan decision process, and recently discovered your blog. These TFs are new to me, and I like working through your examples. On that note, are you aware of other blogs with a focus on "advanced" query tuning/plan exploration in this vein? I am aware of Conor's, and the Query Opt and Programmability team blogs, but other than that it seems that Query Opt internals have always taken a back seat to the storage engine (and to a lesser extent, Sqlos).