When should I use a CTE?

If I had a dollar for every CTE solution to a simple query in the forums (pick any SQL forum), I’d be Mark Cuban. With that said I thought it would be a good idea to ask if a CTE was actually the right solution or if it’s just the new trend and everyone uses them just because they can and all the cool kids are too. Let’s put it this way: If I had a dollar for every time a CTE was misused, I’d be Mark Cuban… or at least Robert Herjavec.

Can you tell I like Shark Tank? Haha. I was happy to see that Shark Tank was on TV when I went to Toronto then at a commercial break I was shocked to see the show was named “Dragons’ Den”, but enough about that… Let’s get down to business.

Should I use a CTE for everything?

Hey! Here’s a SQL question were you don’t get to answer “It depends”. That is unless every query you run needs to be recursive. haha!

Have you ever been in an interview where you were asked about how you would build a copy of IMDB? That’s the Internet Movie Database in case you didn’t already know. I’ve had this question a few times and I liked it so much that I sometimes use it when I’m interviewing a candidate in person. Once I had a candidate that said they wouldn’t know where to start because they don’t watch or like movies; so, I changed it from IMDB to a music collection which is similar in concept. They replied “I don’t like or listen to music.”

Let’s say you build a very simple database to store your movies and meta data about them. You’d probably have something that looked like this:

You could add on to that with a table that described the person’s title, such as director, actor, key grip, and more. There’s a lot you could do to mature the data model. But let’s keep this very simple and just like it is. You have a list of movie titles, people associated with the movie, and a many-to-many join table.

Now let’s ask ourselves, Can this query be written better? Did this query require recursion?

Let’s say we want to make this query extremely modular. We want to bolt on as many names from the application with a very simple and straight forward way of coding. Something that a code first developer would be proud of.

We already know how to pull the data for one name. Why not use the INTERSECT function?

The IO here is a bit higher. Is it actually faster? Let’s use STATISTICS TIME and see.

It’s very close but these are very light weight queries as well. The CTE ranges from 38 to 118 ms execution time on my Surface Pro 3 256 and the “classic” statement ranges from 33 to 110 ms.

I think the key learning points here are there are many ways to accomplish the desired query results but they aren’t all created equal. If you happen to have a developer that writes everything as a CTE because all the cool kids are doing the same you may want to coach them on some of the “old school” query techniques.