Thursday, 13 September 2012

In Type 2 Slowly Changing Dimension, a
new record is added to the table to represent the new information. Therefore,
both the original and the new record will be present. The new record gets its
own primary key.

In our example, recall we originally
have the following table:

Customer Key

Name

State

1001

Williams

New
York

After Williams moved from New York to
Los Angeles, we add the new information as a new row into the table:

Customer Key

Name

State

1001

Williams

New
York

1005

Williams

Los
Angeles

Advantages

This allows us to accurately keep all
historical information.

Disadvantages

This
will cause the size of the table to grow fast. In cases where the number
of rows for the table is very high to start with, storage and performance
can become a concern.

This
necessarily complicates the ETL process.

Usage

About 50% of the time.

When to use Type 2

Type 2 slowly changing dimension should
be used when it is necessary for the data warehouse to track historical
changes.

7 comments:

In a history table, we should have a column called CURR_ROW_IND (Current Row Indicator) as a CHAR (Y/N). The current row would be Y, and all other history rows would be N. In this example, we assume that the most current address would be the one with the largest CUSTOMER_KEY for the customer. To build a query, we would have to use a GROUP BY clause and a MAX function on the CUSTOMER_KEY. Returning the results could be sluggish with a large history file. Using the clause WHERE CURR_RO_IND = Y, returns results much faster.