Physical Join Operators in SQL Server – Nested Loops

SQL Server implements three
different physical operators to perform joins. In this article series we will
examine how
each operators works, its advantages and challenges. We will try to
understand the logic behind the optimizer’s decisions on which operator to use
for various joins using (semi) real life examples and how to avoid common
pitfalls.

I’m sure you are all very familiar with T-SQL joins so I’ll
skip the introductory part this time. For this article, I’ll focus on the most
common type of join, the ‘E-I-J’ (or ‘Equi-Inner-Join’) which is just a cool
sounding name for an inner join that uses only equality predicates for the join
conditions. With sadness, I will skip some extremely interesting issues
concerning joins such as logical processing order of outer joins, NULL value
issues, join parallelism and others. This article focuses on helping you make
your E-I-J joins faster. We’ll accomplish this by helping you understand how
the SQL Server optimizer decides which physical operators it will use to carry
out your query joins. We’ll show you situations where the SQL Server optimizer
is occasionally tricked into choosing a slower method by the characteristics of
the data and SQL you wrote. If you understand these pitfalls, you can code to
overcome them, thus speeding up your queries – and impressing your colleagues
and bosses.

As amazing as the SQL Server optimizer can sometimes be,
it’s not making its decisions intuitively. It’s using the information it has
to work with. Recall that a join is basically ‘looking for matching rows’ from
two inputs, based on the join condition. The creators of the optimizer have
made three techniques available to it for carrying out these joins:

Nested Loops

Merge
Hash

Match

For this article series I will use an analogy of two sets of standard playing cards. One set with
a blue back and another with a red back that need to be ‘joined’
together according to various join conditions. The two sets of cards simply
represent the rows from the two joined tables, the red input and the blue input
- “Now Neo – which input do you choose”?

Nested Loops

Probably the simplest and most obvious operator is the
nested loops operator. The algorithm for this operator is as follows:

Flowchart
1 – Nested Loops

The ‘outer loop’ consists of going through all rows from the
blue input and for each row; some mysterious ‘inner operator’ is performed to
find the matching rows from the red input. If we would use this operator to
join our sets of cards, we would browse through all the blue-back cards and for
each one we would have to go through all the red-back cards to find its
matches. Not the most efficient way of matching cards but it probably would be
most people’s first choice… if there are only a handful of cards to
match.

Let’s take a closer look and see when nested loops are a
good choice. The first parameter to consider, as with any iterative loop, is
the number of required iterations. For nested loops to be efficient, it
requires at least one relatively small row set to be used as the outer loop
input that determines the number of iterations. Remember that this does not
necessarily reflect the number of rows in the table but the number of rows that
satisfy the query filters.

The second parameter to consider is the (intentionally)
vague “Find matching rows in red input” part in Flow Chart 1
above. Assuming that we do have a small input for the outer loop, hence a small
number of required iterations, we now must consider how much work is required
to actually find the matching rows in every iteration. ”Find matching rows”
might consist of a highly efficient index seek but it might require a full
table scan if the join columns are not properly indexed; making nested loops a
far less optimal plan. Since it is very common for joins to be performed on one-to-many
relationships, and since the parent node in these relationships is often the
smaller input (the ‘one’ in the ‘one-to-many’), and since this parent node is
always indexed (must be a primary key or a unique constraint), the well known best
practice rule that requires that FK columns should be indexed, now makes
perfect sense, doesn’t it?

Let’s see a few examples:

*Note: All the queries used in this article can be found in
the demo code file, plus a few more. I highly
recommend that you play around with the code, change parameters and select list
columns, try to use outer joins and even create some indexes and see how they
affect the query plans. Just remember to drop your indexes before proceeding to
the next query so that the demo plans will remain valid.

Small outer loop with a well indexed inner input

The most obvious case for nested loops is when one input is
very small (is 1 row small enough?) and the other input is ideally indexed for
the join. In Execution Plan 1 you can see that the
optimizer retrieved the single row for product 870 from the Product
table using a PK clustered index seek (needed to retrieve the Product Name)
and all the matching Order IDs were retrieved using the non clustered
index on the Product ID column of the Order Detail table.

*Tip: The optimizer was clever enough to duplicate the
filter for Product ID to
both tables although the predicate specifies only the Product table. Since the filter and the join
condition are based on the same column and since the join is an equi-join, the
optimizer knows that when constructing its physical operators, it can add a
second predicate ‘AND SOD.ProductID = 870’ to optimize the data access for the Order Detail table.