Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

8 Answers
8

This is pretty broad, but I'll give you as general an answer as I can.

CTEs...

Are unindexable (but can use existing indexes on referenced objects)

Cannot have constraints

Are essentially disposable VIEWs

Persist only until the next query is run

Can be recursive

Do not have dedicated stats (rely on stats on the underlying objects)

#Temp Tables...

Are real materialized tables that exist in tempdb

Can be indexed

Can have constraints

Persist for the life of the current CONNECTION

Can be referenced by other queries or subprocedures

Have dedicated stats generated by the engine

As far as when to use each, they have very different use cases. If you will have a very large result set, or need to refer to it more than once, put it in a #temp table. If it needs to be recursive, is disposable, or is just to simplify something logically, a CTE is preferred.

Also, a CTE should never be used for performance. You will almost never speed things up by using a CTE, because, again, it's just a disposable view. You can do some neat things with them but speeding up a query isn't really one of them.

@NickChammas - Not much. That is another inline table expression. You can't reference the same derived table more than once (i.e. you can't self join) and support for recursion.
–
Martin SmithFeb 15 '12 at 17:30

1

If you need a CTE and want to reuse the CTE results (usually recursion in my scenarios), you could always insert the results into a temp table or table variable. So it is easy to use one or the other or both.
–
RyanFeb 16 '12 at 15:26

2

One reason CTE's can be employable for performance reasons is because their (possibly forced) materialization could be more easily parallelized whereby temp tables are generally computed serially before their use.
–
crokusekApr 5 '13 at 19:03

1

@crokusek they don't get materialized at all AFAIK - they are just a table expression
–
JNK♦Apr 5 '13 at 19:05

1

One place where a CTE could be faster is if it is used instead of a correlated subquery.
–
HLGEMJul 10 '13 at 16:49

CTE

A CTE creates the table being used in memory, but is only valid for the specific query following it. When using recursion, this can be an effective structure, but bear in mind that it will need to be recreated everytime it's needed.

You might also consider here a table variable. This is used as a temp table is used, but is also in-memory only, but can be used multiple times without needing to be recreated every time. Also, if you need to persist a few records now, add a few more records after the next select, add a few more records after another op, then return just those handful of records, then this is a handy in-memory structure.

Temp Table

A temp table is literally a table created on disk, just in a specific database that everyone knows can be deleted. It is the responsibility of a good dev to destroy those tables when they are no longer needed, but a DBA can also wipe them.

Temporary tables come in two variety: Local and global. In terms of MS Sql Server you use a #tableName designation for local, and ##tableName designation for global (note the use of a single or double # as the identifying characteristic).

Notice that with temp tables, as opposed to table variables or CTE, you can apply indexes and the like, as these are legitimately tables in the normal sense of the word.

Generally I would use temp tables for longer or larger queries, and CTEs or table variables if I had a small dataset already and wanted to just quickly script up a bit of code for something small.

The CTE is not materialised as a table in memory. It is just a way of encapsulating a query definition. In the case of the OP it will be inlined and the same as just doing SELECT Column1, Column2, Column3 FROM SomeTable
–
Martin SmithFeb 15 '12 at 16:55

6

Also CTEs are not read only. They have the same semantics as updatable views. Updating or deleting from them affects the base tables as the CTE definition just gets expanded out into the query and they do not exist as objects in their own right.
–
Martin SmithFeb 15 '12 at 17:02

2

Most of the time they do not get materialised up front, which is why this returns no rows WITH T(X) AS (SELECT NEWID())SELECT * FROM T T1 JOIN T T2 ON T1.X=T2.X, also check the execution plans. Though sometimes it is possible to hack the plan to get a spool. There is a connect item requesting a hint for this.
–
Martin SmithFeb 15 '12 at 17:08

3

Also sorry but table variables aren't in memory necessarily either. They are created in tempdb too and have exactly the same page structure as #temp tables. For both of them the data pages may or may not get flushed to disc.
–
Martin SmithFeb 15 '12 at 17:12

A CTE may be called repeatedly within a query and is evaluated every time it is referenced - this process can be recursive. If it is just referred once then it behaves much like a sub-query, although CTEs can be parameterised.

A temporary table is physically persisted, and may be indexed. In practice the query optimiser may also persist intermediate join or sub-query results behind the scenes, such as in spool operations, so it is not strictly true that the results of CTEs are never persisted to disk.

... and may be indexed. My understanding is only global temp tables can be indexed, not local ones.
–
SQLAlanMay 21 '13 at 0:51

@SQLAlan - I've built systems that created local temp tables and built indexes on them. You have to create the index at runtime, so it's only beneficial in some circumstances.
–
ConcernedOfTunbridgeWellsMay 21 '13 at 9:44

with reallyfastcte as (
select *,
row_number() over (partition by groupingcolumn order by sortingcolumn) as rownum
from sometable
)
select *
from reallyfastcte
where rownum = 1;

You can run a similar query to the above using a correlated subquery or by using a sub-query but the CTE will be faster in almost all scenarios.

Additionally, CTEs can really help simplify your code. This can lead to performance gains because you understand the query more and can introduce more business logic to help the optimizer be more selective.

Additionally, CTEs can boost performance if you understand your business logic and know which parts of the query should be run first - typically, put your most selective queries first that lead to result sets that can use an index in their next join and add the option(force order) query hint

Finally, CTEs don't use tempdb by default so you reduce contention on that bottleneck through their use.

Temporary tables should be used if you need to query the data multiple times, or alternatively if you measure your queries and discover that by inserting to a temp table and then adding an index that your performance is improved.

The accepted answer here says "a CTE should never be used for performance" - but that could mislead. In the context of CTEs versus temp tables, I've just finished removing a swathe of junk from a suite of stored procs because some doofus must've thought there was little or no overhead to using temp tables. I shoved the lot into CTEs, except those which were legitimately going to be re-used throughout the process. I gained about 20% performance by all metrics. I then set about removing all the cursors which were trying to implement recursive processing. This was where I saw the greatest gain. I ended up slashing response times by a factor of ten.

CTEs and temp tables do have very different use cases. I just want to emphasise that, while not a panacea, the comprehension and correct use of CTEs can lead to some truly stellar improvements in both code quality/maintainability and speed. Since I got a handle on them, I see temp tables and cursors as the great evils of SQL processing. I can get by just fine with table variables and CTEs for almost everything now. My code is cleaner and faster.

+1 for this: I just want to emphasise that, while not a panacea, the comprehension and correct use of CTEs can lead to some truly stellar improvements in both code quality/maintainability and speed.
–
Dave HilditchOct 8 '13 at 1:43

My understanding of a CTE is that it's basically a kind of adhoc view. SQL is both a declarative and a set based language. CTE's are a great way of declaring a set! Not being able to index a CTE is actually a good thing because you don't need to! It's really a kind of syntactic sugar to make the query easier to read/write. Any decent optimizer will work out the best access plan using indexes on the underlying tables. This means you could effectively speed up your CTE query by following the index advice on the underlying tables.

Also, just because you defined a set as a CTE, it doesn't mean that all rows in the set must be processed. Dependent on the query the optimizer might process "just enough" rows to satisfy the query. Maybe you only needed the first 20 or so for your screen. If you built a temp table then you really do need to read/write all those rows!

Based on this I would say that CTE's are a great feature of SQL and can be used anywhere they make the query easier to read. I would only think about a temp table for a batch process that would really need to process every single record. Even then afaik it's not really recommended because on a temp table it's far harder for the database to help you with caching and indexes. It might be better to have a permanent table with a PK field unique to your transaction.

I have to admit that my experience is mainly with DB2 so I'm assuming that CTE's work in a similar way in both products. I will happily stand corrected if CTE's are somehow inferior in SQL server. ;)