Recursive Queries

If you're not excited about CTEs yet, then you will be shortly, because CTEs provide the ability to do something that SQL developers have been praying for a very long time: querying hierarchical data. Didn't you just cringe when your boss asked you to create a company employee org chart? At long last the prayers of SQL developers have been answered.

In straightforward terms, a recursive CTE is a CTE that references itself. It is executed over and over until no more results are returned. Just like a normal CTE, a recursive CTE can be used with a normal SELECT, UPDATE, INSERT, AND DELETE statement, and even a CREATE VIEW statement.

A recursive CTE, while similar to that of a normal of a normal CTE, contains three parts versus the one of a normal CTE. Those three parts are explained below.

common_table_expression_name: No different than that of a normal CTE, this is the identifier for the common table expression. This name must be different than any other table or view used within the CTE. It can, however, be named the same as a base table or view, but when the common table expression is referenced in the normal execution statement, the common table expression name must be used, not the base name.

query_definition_anchor_member: The anchor member is the query definition that forms the base result set. Anchor members must precede the recursive member definition joined together by a UNION ALL operator.

query_definition_recursive_member: The recursive query definition includes the query definitions joined to the anchor member by a UNION ALL operator.

A correctly coded recursive CTE must contain at least one anchor member query definition and one recursive member query definition.

A Simple Example

To understand the recursive CTE structure and they work, a simple sample is in order.

Examining the example above you should see that a recursive CTE works as follows:

First, the anchor query definition returns a base result set. Next, the recursive query definition queries and returns information based on the anchor query definition, accomplished by joining the recursive query definition to the anchor query definition via a UNION ALL operator. The recursive query is executed again using the results returned from the first iteration as the base query (or input value to the next iteration of the recursive query). The recursive query is executed again and again until it finds no more results, at which point the entire result set is returned to the execution statement.

This example, a scaled down version of the example found in the SQL Server 2005 BOL, follows the steps outlined in the previous paragraph. The anchor query returns the base result set, in this case, the big cheese since he has no ManagerID. The recursive query is then executed, joining on the anchor query to return all employees who has a ManagerID of the person found in the anchor query (in this case, the big cheese). The recursive query is then executed again using those employees recently found in the first iteration of the recursive query, using those as the base query or input value to the current recursive query.

As stated above, a recursive CTE is executed over and over until no more results are returned. It does this by providing an internal, implicit termination check. With each iteration of the CTE, it basically asks itself, "Hey, have we retrieved all the rows?" And if the results of that check are true, then it says, "OK, we're done!" and exits the recursion.

During the 1960s, there was a popular TV show called Lost in Space in which an unnamed robot and his human counterpart, Will Robinson, traveled through space tackling one adventure after another. Part of the responsibility of the robot was to "watch the back" of Will and warn him of any potentially dangerous situations. When a situation did arise that could be or indeed was dangerous, the robot would utter the phrase "Danger, Will Robinson!" and our hero would spring into action.

Why does this have to do with recursive CTEs? Because as cool as recursive CTEs are, there exists the possibility of mistakenly creating a recursive CTE that results in an infinite loop. Not good. "Danger, Will Robinson!" This happens when a recursive member of the CTE returns the same value for both the parent and child columns. Bam, you're in an infinite loop.

Summary

OK, there you have it. CTEs in a nutshell. The intent of this article is to provide a helpful explanation and overview of common table expressions in SQL Server 2005 and show the true power and flexibility they provide with recursive query capability. We started out covering simple common table expressions and showed how much easier and clean they are to use over what you used to have to use in previous versions of SQL Server, and from there we discussed recursive CTEs and how to use them to retrieve hierarchical data. Very cool stuff. Try not to drool when you start using common table expressions. They have that affect.

Scott has been involved with databases for over ten years. He started using Microsoft Access prior to beta 1, then started using SQL Server when version 4.2 came out and has been using it heavily ever since. XML caught his attention about three years ago and most of his development since that time has been around XML and SQL Server. Scott is the author of the recently published Wrox book Professional SQL Server 2005 XML.