Divide and Conquer Halloween

The classic algorithm design paradigm known as Divide and Conquer (D&C) takes a problem and reduces it by dividing it into two or more sub-problems in a recursive/iterative manner. A variation called Decrease and Conquer reduces the original problem into one sub-problem—again, in a recursive/iterative manner.

There's a common recursive/iterative solution pattern used in SQL Server to handle graph-related tasks that could be considered as being based on a Decrease and Conquer algorithm. Examples of such tasks include returning a subgraph (find subordinates of a manager, perform a parts explosion in a Bill of Materials, or BOM), returning a path (find managers of an employee), and others. The common solution pattern involves iterating through the graph levels, and with the help of a temporary table or a table variable, handling one level at a time.

Recently I observed that query plans of solutions based on this pattern contain extra work to deal with Halloween protection. (As background, I recommend reading Paul White's excellent series on the topic of Halloween protection.) When the source and target of a modification are the same and there's potential for the same rows to be handled more than once (e.g., updated multiple times, read and inserted multiple times), the optimizer uses Halloween protection to avoid the problem. Such Halloween protection can sometimes involve extra work added to the plan that otherwise wouldn't be necessary.

I wondered whether I could come up with an alternative generic pattern that removes the need for Halloween protection, and I managed to find one. I split the handling of each level into two parts, alternating between two temporary tables/table variables—thus the source and target are always different. Before I wrote this article that explains the technique, I described it to my wife and asked for her advice regarding how to name the pattern. Interestingly, she intuitively said "Divide and Conquer," without any prior knowledge of the algorithm design pattern of the same name. I liked her suggestion; therefore, I'll refer to the technique as Divide and Conquer Halloween.

The Problem

I'll use a generic representation of a graph for my sample data. Use the code in Listing 1 to create a helper function called GetNums that returns a sequence of integers in a requested range. This function is used to help create the sample data.

SET NOCOUNT ON;
USE tempdb;
GO

IF OBJECT_ID(N'dbo.GetNums', N'IF') IS NOT NULL DROP FUNCTION dbo.GetNums;
GO

CREATE FUNCTION dbo.GetNums(@low AS BIGINT, @high AS BIGINT) RETURNS TABLE
AS
RETURN
WITH
L0 AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)),
L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM L5)
SELECT TOP(@high - @low + 1) @low + rownum - 1 AS n
FROM Nums
ORDER BY rownum;
GO

Run the code in Listing 2 to create a table called Graph and populate it with 10,000,000 nodes. The degree of this graph (maximum depth) is 7.

Suppose you need to implement a solution for a subgraph request. Given an input node @root, return the nodes in the subgraph of @root. For our purposes, in order not to generate very large outputs, instead of returning all individual nodes in the subgraph, compute the sum of the values in their val column and their count. Note that the sample data I populated in the table Graph represents a graph that's a tree, meaning that only one path can lead to each node; however, the techniques I describe in the article are just as applicable to the more generic case known as a directed acyclic graph (DAG), meaning that multiple paths can lead to a node. A typical way to handle such a subgraph request is to use a recursive query such as the one shown in Listing 3.

Figure 1 shows the execution plan for this query. You'll notice in the plan that SQL Server creates an Index Spool (a B-tree-based work table) where it stores the intermediate result sets. Using a Nested Loops operator, the plan iteratively processes one level of nodes at a time. In each iteration, it reads from the spool to collect the nodes from the previous level, joining to the Graph table to collect the nodes from the current level, and inserts those nodes into the spool.

The problem with this solution is that you can control only the optimization aspects of the user table Graph in terms of indexing, statistics, and so on. You have no say in terms of the work table's indexing and statistics, or any other aspect of its handling. The end result is a slow-running solution. The solution took 19 seconds to complete on my system for the input @root = 5, performing a total of 13,337,269 logical reads (measured with an Extended Events session).

Despite the elegance and conciseness of a recursive query, you'll tend to get better performance implementing the same algorithm with a loop and a temporary table or a table variable, as the code in Listing 4 demonstrates. SQL Server maintains distribution statistics (histograms) on temporary tables but not on table variables; therefore, estimates tend to be better with temporary tables.

The first plan represents the insertion of the first node into the temp table. The next seven plans represent the iterative insertions of the "next level." Finally, the last plan represents the query computing the sum and the count. Observe the highlighted Table Spool (Eager Spool) operators in the plans for the first six iterations of the INSERT statement. The Eager Spool is a work table that eagerly (completely) consumes all input rows before passing them on to the operator to the left of it. It's a blocking (stop-and-go) operator.

SQL Server uses a blocking operator for Halloween protection purposes—to ensure that data that was read from and written to the same table won't be read more than once. The Eager Spool naturally adds work, mainly I/O activity in tempdb, latches, and so on—resources that often tend to be in contention in SQL Server. The plan for the seventh iteration of the INSERT statement doesn't include an Eager Spool operator for Halloween protection purposes. That’s because the Hash Match operator is a blocking operator for the build input, and hence is already providing the needed protection.

This solution ran for 8 seconds on my system, performing 4,232,753 logical reads. The work done by the Eager Spool operators for Halloween protection purposes accounts for a few hundreds of thousands of those reads. I was intrigued by the challenge to try and come up with a generic solution pattern for such tasks, but one that somehow avoids the need for Halloween protection and its associated overhead.

The Solution: Divide and Conquer Halloween

As I mentioned, the potential for Halloween protection is only there when the source and target are one and the same. To avoid the need for Halloween protection, I use two temporary tables rather than one, alternating between them in each iteration of the loop. Based on the parity of the level counter (@lvl), I know which of the two to use as the source and which as the target. Then the final query that computes the sum and count simply applies the calculations to the unified sets from both tables. Listing 5 contains the complete solution.

SELECT COUNT(*) AS cnt, SUM(val) AS total_val
FROM (SELECT * FROM #T1 UNION ALL SELECT * FROM #T2) AS U;

DROP TABLE #T1, #T2;

Figure 3 shows the execution plans for the queries in this solution. Observe that Halloween protection with an Eager Spool was completely removed from all plans. This solution ran for 6 seconds on my system, reducing the number of logical reads by over half a million to 3,651,072.

Avoid Unnecessary Costs

In this article, I described a common pattern used in iterative T-SQL solutions to handle graph-related tasks and other tasks. The problem with this pattern is that you iteratively read from and write to the same temporary table, requiring SQL Server to employ Halloween protection techniques, which come at a cost. I provided a modified pattern that I referred to as Divide and Conquer Halloween, in which I alternate between two temporary tables, using a different source and target in each iteration. Because the solution never uses the same table as both the source and the target, SQL Server doesn't need to add any Halloween protection controls, thereby avoiding unnecessary costs.