A common table expression (CTE) is an ANSI SQL-99
expression that produces a table that is referred to by name within the
context of a single query. The general syntax for a CTE is as follows:

The WITH clause, in effect, defines a table and its columns. Note that the syntax of the WITH
clause is similar to that of a view. You can think of a CTE as a
temporary view that lasts only for the life of the query that defines
the CTE. Listing 1
shows an example of a simple CTE. This CTE is used to return the
average and maximum sales quantities for each store. The CTE is then
joined to the sales table to return the average and maximum sales quantity for the store, along with sales records for a specific title_id.

Note

If the WITH
clause for a CTE is not the first statement in the batch, you should
delimit it from the preceding statement by placing a semicolon (;) in front of it. The semicolon is used to avoid ambiguity with other uses of the WITH
clause (for example, for table hints). Including a semicolon is not
necessary in all cases, but it is recommended that you use it
consistently to avoid problems.

It is also possible to define
multiple CTEs in a single query, with each CTE delimited by a comma.
Each CTE is able to refer to previously defined CTEs. Listing 2 shows an example of a nested CTE that calculates the minimum, maximum, and difference of counts of store orders.

MN MX Diff----------- ----------- -----------1 22 21

A CTE must be followed by a single SELECT, INSERT, UPDATE, or DELETE statement that references some or all of the CTE columns. A CTE can also be specified in a CREATE VIEW statement as part of the defining SELECT statement of the view.

Most valid SELECT statement constructs are allowed in a CTE, except the following:

COMPUTE or COMPUTE BY

ORDER BY (except when a TOP clause is specified)

INTO

OPTION clause with query hints

FOR XML

FOR BROWSE

Recursive Queries with CTEs

Nonrecursive CTEs
are ANSI SQL-99 compliant expressions that provide T-SQL coding
flexibility. However, for each nonrecursive CTE, there is usually
another T-SQL construct that can be used to achieve the same results
(for example, derived tables). The real power and capability of CTEs is
revealed when you use them to create recursive queries.

A recursive CTE can help simplify the code required to run a recursive query within a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW
statement. Recursive queries are often useful for expanding a hierarchy
stored in a relational table (for example, displaying employees in an
organizational chart). In previous versions of SQL Server, a recursive
query usually required using temporary tables, cursors, and logic to
control the flow of the recursive steps.

A CTE is considered
recursive when it refers to itself within the CTE definition. Recursive
CTEs are constructed from at least two queries. One is a nonrecursive
query, also referred to as the anchor member (AM). The other is the
recursive query, also referred to as the recursive member (RM). The
queries are combined using the UNION ALL operator.

The following pseudocode defines the basic structure of a recursive CTE:

Logically, you can think of the algorithm implementing the recursive CTE as follows:

1.

The anchor member is activated, and the initial result set (R) is generated.

2.

The recursive member is activated, using the initial result set (Rn) as input and generating result set Rn+1.

3.

The logic of step 2 is run repeatedly, incrementing the step number (n) until an empty set is returned.

4.

The outer query is executed, getting the cumulative (UNION ALL) result of all the previous steps when referring to the recursive CTE.

You can have more than two members in a recursive CTE, but only the UNION ALL operator is allowed between a recursive member and another recursive or nonrecursive member. Other operators, such as UNION, are allowed only between nonrecursive members. Recursive CTEs also require an exact match of the columns in all members, including the same data type, length, and precision.

Listing 4
shows a simple recursive CTE that generates a list of sequential
numbers. Note that the AM generates the base result, and the RM
following the UNION ALL controls the recursion. It is important in this example that a valid endpoint be defined to avoid infinite recursion.

Listing 4. An Example of a Simple Recursive CTE

val-----------12345678910

The following sections present some examples and uses of recursive CTEs.

Using Recursive CTEs for Expanding a Hierarchy

For this hierarchy example, we use the parts table in the bigpubs2008 database. This table contains a simplified hierarchy of car parts, as shown in Figure 1.

Figure 1. The parts table hierarchy.

In the parts table, any part that is a subpart of another part has the parent part ID stored in the parentpartid column. The parentpartid column is a foreign keythat references the partid column. Therefore, the parentpartid must either correspond to a valid partid within the table or be NULL. For example, the car itself has NULL in the parentpartid column.

Following are some common requests that might be run on the parts table:

Return all the parts for the engine.

Show me all parts that are two levels below the drivetrain.

Show me all the parts in such a way that it is easy to see their hierarchical dependencies.

The first request is probably the most common one: returning a part (for example, the engine, which has partid = 2) and all subparts. The recursive CTE shown in Listing 5 provides a solution to this request.

Notice that the lvl
value is repeatedly incremented with each recursive invocation of the
CTE. You can use this level counter to limit the number of iterations in
the recursion. For example, Listing 6 shows an example of a CTE that returns all parts two levels below the drivetrain.

Listing 6. A Recursive CTE to Return All Subparts Two Levels Below a Part

In Listing 6, the filter WHERE lvl < 2 in the recursive member is used as a recursion termination check; recursion stops when lvl = 2. The filter on the outer query (WHERE lvl = 2) is used to remove all parts up to the second level. Logically, the filter in the outer query (lvl = 2)
is sufficient by itself to return only the desired rows, but for
performance reasons, you should include the filter in the recursive
member to stop the recursion as soon as two levels below the drivetrain
are returned.

SQL Server allows the use of local
variables in a CTE to help make the query more generic. For example,
you can use variables instead of constants for the part ID and level, as
shown in Listing 7.

WITH PartsCTE(parentpartid, lvl)AS( SELECT parentpartid, 0 FROM PARTS WHERE parentpartid is not null UNION ALL SELECT P.parentpartid, lvl+1 FROM Parts as P JOIN PartsCTE as PP ON PP.parentpartid = P.Partid WHERE P.parentpartid is not null)SELECT C.parentpartid, P.PartName, COUNT(*) AS cntFROM PartsCTE CJOIN PArts P on C.ParentPartID = P.PartIDGROUP BY C.parentpartid, P.PArtNamego

In the example in Listing 8, the anchor member returns a row with the parentpartid for each part, being sure to filter out the NULL value in the parentpartidcolumn because it is essentially the top of the hierarchy and represents no parent part. The recursive member returns the parentpartid of each parent of thepreviously returned parts, again excluding any NULL
values. Eventually, the CTE contains, for each part, as many
occurrences as their direct or indirect number of subparts. The outer
query is then left with the tasks of grouping the results by parentpartid and returning the count of occurrences. A join to Parts is included to get the corresponding partname for each parent part to provide more meaningful results.

Suppose you want to generate a
report that is a bit more readable, with the subparts sorted and
indented according to hierarchical dependencies. Listing 9provides a way you could accomplish this.

In this example, you use a varbinary string as the sortcol to sort subparts according to the partid value. The anchor member is the starting point, generating a binary value for the partid
of the root part. In each iteration, the recursive member appends the
current part ID, converted to a binary value, to the parent part ID’s sortcol. The outer query then sorts the result by sortcol, which groups the subparts under each immediate parent part.

Setting the MAXRECURSION Option

To help avoid infinite recursion in CTEs, SQL Server, by default, sets a MAXRECURSION value of 100. If a recursive CTE attempts to perform more than 100 recursions, it is aborted, with the following error message:

Msg 530, Level 16, State 1, Line 1The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

You can override the default MAXRECURSION setting by using the OPTION(MAXRECURSION value) query hint to force termination of the query after a specific number of recursive iterations have been invoked. Listing 10 shows an example.

Msg 530, Level 16, State 1, Line 2The statement terminated. The maximum recursion 10 has been exhausted before statement completion.

Keep in mind that if you use MAXRECURSION
to control the number of levels of recursion in a CTE, your application
receives the error message. It is not considered good programming
practice to use code that returns errors in valid situations. Certain
applications may discard query results if an error message is received.
Instead, it is recommended that you use the level counter to limit
recursion, in Listing 7. You should use the MAXRECURSION hint as a safeguard against infinite loops due to bad data or as a coding safeguard.