Simplify queries with SQL Server 2005 common table expressions (CTEs)

Common table expressions (CTEs) are a handy alternative to using derived tables and views in SQL Server 2005 for retrieving data. There's no need to repeat complex code because CTEs separate code into unique units and they're self-referencing within your query. Here's a CTE how-to with examples, including details for one of its most valuable uses – creating a recursive query.

By submitting my Email address I confirm that I have read and accepted the Terms of Use and Declaration of Consent.

By submitting your personal information, you agree that TechTarget and its partners may contact you regarding relevant content, products and special offers.

You also agree that your personal information may be transferred and processed in the United States, and that you have read and agree to the Terms of Use and the Privacy Policy.

that can be a handy alternative to derived tables and views. By using CTEs, you can create named result sets to reference from within your SELECT, INSERT, UPDATE, and DELETE statements without having to persist any metadata about the result set structure. In this article, I explain how to create CTEs in SQL Server 2005 – including how to create a recursive query using a CTE – and provide several examples that demonstrate how they work. Note that for these examples I use the AdventureWorks sample database that ships with SQL Server 2005.

Creating a basic CTE in SQL Server 2005

You construct CTEs by adding a WITH clause before the SELECT, INSERT, UPDATE, or DELETE statement. The following syntax shows the basic structure of the WITH clause and CTE definition:

As the syntax shows, you can define multiple CTEs within the optional WITH clause. The CTE definition includes the name of the CTE, the CTE column names, the AS keyword and the CTE query enclosed in parentheses. Note that the number of CTE column names must match the number of columns returned by the CTE query. In addition, the column names are optional if the CTE query supplies all column names.

Now that you have a basic idea of the CTE syntax in SQL Server, let's look at an example of a CTE definition to help better understand this syntax. The following example defines a CTE named ProductSold and then references the CTE within a SELECT statement:

As you can see, a WITH clause precedes the SELECT statement that references the CTE. The first line of the WITH clause includes the name of the CTE (ProductSold) and the names of the two columns within the CTE (ProductID and TotalSold). Next comes the AS keyword, followed by the CTE query in parentheses. In this case, the CTE query returns the total number of individual products sold.

The SELECT statement that follows the WITH clause references the CTE by name when joining the Product table to the CTE, based on the ProductID columns. The statement calls the CTE as it would a table or view. However, unlike a table or view, the CTE is available only to the statement that immediately follows the WITH clause. If you reference the CTE in a subsequent statement -- without redefining the CTE -- you'll receive an error.

One advantage to using common table expressions is that you can reference a CTE multiple times in the calling statement. For example, the following statement defines a CTE named Employees and then calls that CTE twice in the SELECT statement that follows the WITH clause:

In this example, the SQL Server CTE query returns a list of employee IDs, names and email addresses, as well as their managers' IDs. The SELECT statement following the WITH clause then joins the CTE with itself to return the managers' names and email addresses. You can achieve the same results by using derived tables (subqueries), but that means repeating the same subquery multiple times and working with code that's more complex.

Creating multiple CTEs in a WITH clause

As you saw in the CTE syntax, you can define multiple CTEs in your WITH clause and then call each of those CTEs as often as necessary in the statement that follows. Take a look at the example below that demonstrates how this works. The following WITH clause includes two CTE definitions:

The first CTE definition defines a CTE named Cost, and the second definition defines one named Sold. The Cost CTE returns the average cost of each product based on its cost history. The Sold CTE returns the average number of products sold per order. The SELECT statement after the WITH clause joins these two CTEs with the Product table to return the total cost of each product based on the average number sold and the average cost of the product.

You can easily define multiple CTEs within the WITH clause, but you can take this a step further by defining CTEs that reference the CTEs defined before it. For example, the WITH clause below defines three CTEs (Cost, Sold, and Total):

Notice that the CTE query in the Total CTE definition joins the Cost and Sold CTEs, whose CTE definitions precede Total. You cannot reference an undefined CTE. For example, you cannot reference the Total CTE within the Sold CTE. Now, the SELECT statement that follows the WITH clause needs to join the Product table to the Total CTE only, rather than joining to both the Cost and Sold CTEs.

Creating a recursive common table expression

One of the most valuable features of a CTE in SQL Server is its ability to create a recursive query -- a type of query that repeatedly references itself in order to return subsets of data. A recursive query is most commonly used to return hierarchical data. For instance, the Employee table in the AdventureWorks database includes the manager ID of each employee. The manager ID is actually the employee ID of that manager. As a result, the Employee table contains the entire hierarchical direct reports structure from the CEO on down.

You can define a CTE to retrieve this hierarchical structure by creating a CTE query that uses a UNION ALL, UNION, INTERSECT, or EXCEPT operator to join multiple SELECT statements. The best way to show you how this works is through an example.

In this WITH clause, the CTE query includes two SELECT statements joined by a UNION ALL operator:

In other words, this is the person who does not directly report to another manager. Note: The first column in the SELECT statement is 1. This is used to designate that the employee returned by this query is at the top level, Level 1.

The second SELECT statement (after the UNION ALL operator) joins the Employee table to the Reports CTE itself based on the manager and employee IDs. By self-referencing the CTE in this way, SQL Server automatically treats this as a recursive query and repeats the query as many times as necessary to return each level of employees. Every time the query runs, the first column adds 1 to the value so that each level is incremented by 1.

The SELECT statement that follows the WITH clause joins the Reports CTE to the Contact table to retrieve the employees' name. The following query results show a sample of the data returned by this statement:

EmpLevel

EmpID

EmpName

MgrID

1

109

Ken Sanchez

NULL

2

6

David Bradley

109

2

12

Terri Duffy

109

2

42

Jean Trenary

109

2

140

Laura Norman

109

2

148

James Hamilton

109

2

273

Brian Welcker

109

3

2

Kevin Brown

6

3

46

Sariya Harnpadoungsataya

6

3

106

Mary Gibson

6

3

119

Jill Williams

6

3

203

Terry Eminhizer

6

3

269

Wanida Benshoof

6

3

271

John Wood

6

3

272

Mary Dempsey

6

3

3

Roberto Tamburello

12

3

66

Janaina Bueno

42

3

102

Dan Bacon

42

3

117

François Ajenstat

42

3

128

Dan Wilson

42

3

149

Ramesh Meyyappan

42

3

150

Stephanie Conroy

42

3

176

Karen Berg

42

3

30

Paula Barreto de Mattos

140

3

71

Wendy Kahn

140

3

103

David Barber

140

3

139

David Liu

140

3

21

Peter Krebs

148

3

44

A. Scott Wright

148

3

200

Hazem Abolrous

148

3

218

Gary Altman

148

3

268

Stephen Jiang

273

3

284

Amy Alberts

273

3

288

Syed Abbas

273

4

4

Rob Walters

3

4

9

Gail Erickson

3

4

11

Jossef Goldberg

3

The results are listed according to the employee's level. Notice that the second through seventh rows show a MgrID value of 109, which is the ID of the top-level employee shown in the first row. The subsequent rows reflect the same hierarchical nature of the data.

Recursive CTEs, like other common table expressions in SQL Server, provide powerful tools for retrieving data. Unlike views, you don't have to persist the metadata. And unlike derived tables, you don't have to repeat code unnecessarily. CTEs help simplify complex code by letting you more easily separate your code into discrete units. And when it comes to recursive queries, CTEs can't be beat. When you first start using CTEs, you might have to play around a little to become comfortable with them, but once you do, you'll never go back.

ABOUT THE AUTHORRobert Sheldon is a technical consultant and the author of numerous books, articles, and training material related to Microsoft Windows, various relational database management systems, and business intelligence design and implementation. You can find more information at http://www.rhsheldon.com.

0 comments

Register

Login

Forgot your password?

Your password has been sent to:

By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy