Mapping generation for Data Vault demo: part 5 (hubs and satellites)

by ·
Published April 19, 2010
· Updated June 14, 2012

After the successful creation of a history layer, it is time to focus on the core parts of the datawarehouse: the Data Vault. In this example a ‘raw’ Data Vault is created: no changes or transformations are done here. The first script in this layer creates the Hubs and the corresponding Satellite. Currently this is done based on the history layer, which is probably not the best approach. It would be better to have the Data Vault load directly from the staging area and keep the history layer as a pure archive. But for demonstration purposes this will do just fine. And it works, although always the most recent record is selected. This is erroneous functionally speaking because many changes in the history date might only lead to a few records in the Data Vault. If I have time in the future I will try to fix this.

In any case, the script generate Data Vault hubs (download: 3_generate_datavault_hubs). The main idea behind this script is that it creates an array of the available history tables and asks the user whether each table should lead to a Hub. If not the table will be skipped but if answered positively the script will generate a Hub and Satellite table and mapping. The user is first required to enter a name for the new Hub. The Hub entity is derived off the unique key in the source table, because we used this ‘tag’ to find the source (logical) key. The Hub table and mapping is based upon this fact.

When specified that a table is to be a Hub the source key is split of to form the Hub. After that the rest of the attributes will be put in the Satellite table and corresponding mapping. This will work in many occasions, and for this demo as well but it might not be true for every source system. In time I want to add a function to keep transaction attributes out of the Satellite. This can be easily done by showing attributes and entering a list of attributes to be left out. These attributes can be put in an array to store in a Link table.