Wednesday, May 30, 2012

How to handle inferred member for “Late Arriving Dimension”

ContextThis is very general practice in warehouse project: Sometimes we will get the transactional data without dimensional data. It is called as “early arriving facts” or “Late Arriving Dimension”. There are a few of good solutions though, all of which are handled during ETL:

Hold onto the fact until all dimensions arrive

Create a dimension called ‘unknown’ or ‘not available’ with a primary key of -1

Insert a dummy row in the dimension and populate it with whatever you can

Usually, the first approach is not applied, as business want to get result even some dimensions not available. The exception mainly is handled during the populate fact table when there is NO match found in dimension table. An inferred member is used in third approach. This document is intended to provide overview to handle Inferred member, along with ETL process.

Detail

Add new table as Admin_Exception to save all exception records

Column name

Data type

Description

Log_Id

Bigint

Auto ID

SrcTableName

varchar(80)

Staging table name

SrcColumnName

varchar(80)

Staging column name

Type

Int

1: Null or Blank; 2: Inferred Dimension

RecordID

Int

Unique record ID from each staging table

Batch_Number

Int

Reference to Admin_Batch_Status.Batch_Number

Message

Text

Datetime

Datetime

Default as getdate()

Add new SP to insert data into Admin_Exception

Input:

[@RecordID = ] RecordID

[@SrcTableName = ]’ SrcTableName’

[@SrcColumnName = ]’ SrcColumnName’

[@Type = ]’ Type’

[@Message = ] ‘Message’

Output: NONE

Logic: add Batch_Number

While there will not be any changing for current existing package at all, a few adjustments will be added, When populating fact tables, all records that are not matched to individual dimension, will be insert into Admin_Exception using usp_InsertExecption. Please see sample below:

Use conditional split to split not matched dimension into NULL or Blank and inferred Member

Use Ole Command to add Exception into table Admin_Exception using usp_InsertExecption, the difference between two exceptions are the description.

Use derived column to set Surrogate key as 0

Use Ole Command to insert Inferred member using usp_InsertInferred<Dimtable> to Insert new dimension into correspondent dimension table, with IsInferred = 1. Each dimension has one stored procedure accordingly. In addition, Logic is already built in to avoid duplicated insertion.

Use lookup again to get Surrogate key, while the lookup condition is with NO cache, and only apply for inferred member