Chris Bell (MCITP) has been working with Microsoft technologies for over 20 years and using SQL Server for over 18 of those years. He is currently the Owner and Lead consultant for WaterOx Consulting, a Maryland based company focused on helping clients, that can’t afford or justify a full time data team, get the most from their SQL Server environments. For more information please visit WateroxConsulting.com.

I was going through some notes I had from previous projects and came across a sample script for created a Type 2 Slow Changing Dimension (SCD) in a database or data warehouse.
A Type 2 SCD is one where new records are added, but old ones are marked as archived and then a new row with the change is inserted. This allows for a complete historical trail of the row’s changes in detail. Sometimes this can be overkill, but in some cases it is required. The client was doing medical claims processing, so we were required to keep a full history of key information.

In the past (read: pre-2008 versions of SQL) this used to have to be done with a bunch of little code and triggers to make sure the old data was kept.
On one of my recent projects I decided I wanted to use the new MERGE statement to create this same effect, with a LOT less code to maintain.

What actually made this possible is the MERGE statement’s output option. The merge statement actually captured all the changes it makes and tags them in an output table that you can reference. This feature let me capture information that was in the merge portion and pass it through to another statement.

OK, still a little confusing so I’m going to describe what’s going on in plain English but first a quick note:
The current_flag field was added to the SCD so that an index could be created on it for real time reporting rather than spending the time searching for the current record through the detailed history. This is one of those cases that knowing what a customer really wants, regardless of what they ask for can make for simple performance tuning.

Now for the logic of the “Joe on the street” description of the merge statement:

If I do find a match, and the current_flag is ‘y’ and the last names do not match, then I want to update the Master table with the new values and then move to the next record.
Every other match/no match scenario in this sample, I don’t need so there are no further checks done.

When I’m done with the merge, I want to access the internal table and pass certain values out to any other statements. Note the $Action field in the output, this is what tells us the action the merge statement performed and will be what is used on the outer insert statement to perform the new record insert.

Hopefully this will help you see just how powerful the merge statement in SQL Server can be.
By using this process we streamlined not only our code base for the data warehouse being built, but also the performance of maintaining a Type 2 SCD.