Change data through CTE’s

Even though I’ve written a number of blog posts about CTE (Common Table Expression), there’s always more functionality to discover. Last week I read an article about updating values in a table through a CTE, and I decided to try it out and see how far I could take this.

How to use a CTE
A question I get a lot is “When should you use a CTE?”. Personally I use a CTE to create a temporary resultset, that I can use as a source for another query. For example, when I need to transform a datetime column into a separate year, month and day columns, that I need as separate columns in my following query, I use a CTE to create a resultset I can use in my second query. Inserting the transformed data into a a temporary table is a little bit overkill, so a CTE is the easiest option. But there are a lot of different ways to use a CTE.

Update
Updating your source data through a CTE sounds like a long shot. Normally I prefer to update my source tables directly, but updating my source data through a CTE might make your queries more readable in some cases. So how does this work? Let’s look at an example, using the AdventureWorks2012 database. Using a CTE, we’re only selecting one specific employee: Angela. Because she worked some extra time, we’re updating here vacation hours, and increase them with 2 extra hours:

If we now search for the source table for the currency code we just created, you’ll see that it’s inserted through the CTE:

This can shorten your insert statements a lot. If you have a table with 10 columns for example, and you only want to insert 4 values in the table, you can select these columns in a CTE, and just insert those columns. On the other hand, this can make your queries unreadable, because you hide your source table in your query. So if you use this in combination with source control, it can be hard to track down where you update or insert a record in any given table.

Delete data
Deleting data from a CTE is just as easy. As an example, let’s delete the currency we created:

The limits
Both inserting and updating data through a CTE can be a powerful help in making your queries shorter and easier to read. But it can also backfire on you. Updating your source tables through a CTE can also make your queries hard to read, or even impossible to decipher, when you look at it a few weeks later. But what are the technical limits?

Updating a CTE that contains a join is not a problem. Except when you want to update a column in multiple tables at once:

The same goes for an insert. If you want to insert data in 2 or more tables at once, that’s impossible through a CTE.

Another limit you can encounter, is updating columns that aren’t actual source data. For example, if you try to update a column that is transformed with a LEFT() function. If you select the LEFT(CurrencyCode, 2) in the CTE, and you try to update that, you’ll see this error:

Because the field is a derived column (a computed/transformed version of the source column), you can’t update it. So if you try to update a transformed column, you should use an update statement on your source table.

Should you use it?
Even though this can be a cool feature to use, you must be careful with this. If you use CTE’s to alter data, you should be aware of the fact that your queries can become unreadable or hard to decipher. And even though you might find it readable, your colleagues might have other ideas about that. Also, as mentioned earlier, it might make it harder for you to find all inserts/updates on a specific table in your source control system.

If you want to read more about this #SQLCOOP topic, don’t forget to check out these blog posts: