Monday, July 09, 2018

Telling The Story with CTE's

I
can divide the history of my knowledge of SQL in two ages, before and
after Common Table Expressions (CTE's). I shake my head in disbelief when
I think of all the times I could have used CTE's and used subqueries
instead. To think that many years ago I was in a conversation and I said
"I don't
like subqueries" but didn't know there was a better way. I was still new with SQL and I found a subquery like
an hole to another dimension. But the skill of SQL seemed to be measured in the
ability to manage multiple nested subqueries so I had to learn the hard way. Having the mind of a
developer, I am more interested in being able to make sense of what the
query is doing based on its readability.

SELECT TranDate, sum(Amount)
FROM Sales s
join (SELECT ID FROM Customers where State in
(SELECT State from Regions
where AvgTemp > 70)
) f
on s.ID = f.ID

What does this do?
I can understand why people might not see much difference between a
subquery and a CTE. Aren't there two ways of doing the same? Yes and no.
A CTE is defined by RedGate software as "a temporary named result set that you can reference within a SELECT, INSERT, UPDATE, or DELETE statement". To be fair, both approaches are supposedly useful for different
purposes, but I still prefer CTE's. Why?

Encapsulation

You
can tell me that a subquery is encapsulated and it's correct, although
you have to give it an alias and it has to be short, such as "x", unless you want to see long references. With a
CTE you can make more declarative. It makes your query easier to read. Not to mention that the CTE is separated from the block of the main query, and you abstract it as if it were a table.

CTE's Tell the Story

Do
you still put a lot of comments in your SQL code? Why not let the code
structure be self explanatory? when you can write a function name that
describes just enough? In SQL, CTE's act like functions. You can get
super descriptive with the name, and when you call the CTE, you can
alias it.

A CTE
is more readable than a subquery. If I open up an existing view and it
has CTE's, then kudos to you. But then there is something else that can
make your CTE's go to the next level. Let me give you an example.

Naming matters

Supposed
that in your new job you need to alter a view for a report because a
source table has changed names. You open the view to edit it, and see
that the previous developer was thoughtful and used CTE's. But you're
not too happy when you see that the first CTE is called "pyt". What does
that stand for? Should anyone care? Yes, if you want to know what it
does. Since there are no comments, you do some analysis. The view is a
Join between 5 tables, with a calculated date as a parameter, based on
today's date but for one year prior, and it's all grouped. It's
possible for me to deduce that after doing some analysis and making an
educated guess as to what is the intent.

Imagine
another scenario. You open the view, and the CTE is called
PreviousYearTotals. You don't need to do analysis and guesswork because
someone had the good sense of naming the CTE. The long name is not a
problem because you can alias it to a short name when you join to it.

I
haven't gotten into the topic of performance, although an advantage
with a CTE is that, unlike a subquery, it only makes one trip to get the
data. A subquery has to call data each time it repeats.

With the flexibility of CTE's you can do something that can't be done with subqueries: a more self-explanatory structure.