Friday, August 22, 2008

Handle Slowly Changing Dimension in DB2

Excerpt from WWW about Slowly Changing Dimension:

The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below:

Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:

Customer Key Name State 1001 Christina Illinois

At a later date, she moved to Los Angeles, California on January, 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem.

There are in general three ways to solve this type of problem, and they are categorized as follows:

Type 1: The new record replaces the original record. No trace of the old record exists.

Type 2: A new record is added into the customer dimension table. Therefore, the customer is treated essentially as two people.

Type 3: The original record is modified to reflect the change.

Well, you might be naive (talented?) enough to write your own framework to manage the slowly changing dimensions in your environment, however it is good to know that there is a handy tool at your disposal to minimize the plumbing works for those data management scenarios.

As of IBM DB2 version 8.1 FixPack 2 or later, MERGE statement is introduced.

The example below illustrates how Merge can take data from a staging table and use it to update or create dimension members accordingly.

MERGE INTO DWH.DIM_COMPANY AS TARGET USING (SELECT ID, NAME, PHONE FROM DWHSTAG.COMPANY) AS SOURCE ON SOURCE.ID = TARGET.ID WHEN MATCHED THEN UPDATE SET (NAME,PHONE)=(SOURCE.NAME,SOURCE.PHONE) WHEN NOT MATCHED THEN INSERT (ID, NAME, PHONE) VALUES (SOURCE.ID, SOURCE.NAME, SOURCE.PHONE) ELSE IGNORE;