Recently an acquaintance of mine asked for my thoughts on the approach for creating an MDM data model and the requirements artifacts for loading MDM.

I believe that with this client, as with most clients, the first thing we have to help them understand is that they have to take an organic, evolving (a.k.a. "agile") approach, utilizing Extreme Scoping, to developing and fleshing out their MDM data model as well as the requirements.

High-level logical data models for customer, location, product, etc., such as the Open Data Model, are readily available and generally not heavily customized or differentiated.

If we look at the MDM aspect of data modeling and requirements, I see it as three layers:

1. The logical data model.

2. The sub-models (reference models), which detail the localization in the sources for each domain; in essence, our source-based reference models.

3. An associative model that provides the nexus/mapping between the sources' domain values and the master domain model.

It doesn’t really get interesting until you dive into creating the sub-models, the associative models, and their relationships, and understand the issues and anomalies you’re dealing with in the source data, which inherently involves quite a bit of data profiling and metadata analysis.

In my experience, the creation of the MDM model and sub-models involves three parallel tracks.

1. Meet with users, perform metric decomposition (defined in my slides), define the information value chain, and create logical definitions of the metrics, groupings of dimensions and their attributes, and subsequently the hierarchies required from an analysis perspective; in short, a data dictionary.

1a. The business deliverable here is a business matrix, which shows the relationship between the metrics and the dimensions in relation to the business processes and/or KPIs. I’ll send an example.

1b. The relationship of the above deliverable to the logical model is that the business deliverables, specifically the business matrix and the associated hierarchies, will drive the core required data elements from the KPI, and therefore the MDM, perspective. In addition, they will drive the necessary hierarchies (relationships) and thereby the associations required at the next level down, which involves the source-to-target mapping needed for loading the reference tables.

1c. Lastly, once we truly understand the client’s requirements in enough detail in terms of metric definitions and hierarchies, we can select and incorporate reference models or industry experts for validating and vetting our model; this should be relatively straightforward.

1d. From an MDM perspective, where multiple teams are simultaneously gathering information/requirements, we would want to use and have access to these artifacts, especially for any new business process modeling or BI requirements modeling.
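To make the business matrix idea concrete, here is a toy sketch with hypothetical metric and dimension names standing in for the real Excel deliverable. The useful derivation is that any dimension shared by multiple metrics is a conformed dimension, and those are exactly the domains that must be mastered:

```python
from collections import Counter

# Business matrix: which dimensions each metric is analyzed by.
# Metric and dimension names are illustrative only.
business_matrix = {
    "Net Sales":        {"Customer", "Product", "Location", "Time"},
    "On-Time Delivery": {"Customer", "Location", "Time"},
    "Inventory Turns":  {"Product", "Location", "Time"},
}

# A dimension appearing under more than one metric must be conformed,
# i.e. it is an MDM-critical domain.
usage = Counter(d for dims in business_matrix.values() for d in dims)
conformed = sorted(d for d, n in usage.items() if n > 1)
print(conformed)  # ['Customer', 'Location', 'Product', 'Time']
```

In the real deliverable this cross-tabulation is the pivot table the users review, but the logic is the same: the matrix tells you which domains the MDM model must carry.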

2. Introduce the metadata mart concept and utilize local techniques and published capabilities (I have several that can easily be incorporated) to instantiate a metadata mart, and incorporate metadata and data profiling into both the data modeling process and the user-awareness process.

2a. The deliverable here is a set of Excel pivot tables and/or reports that allow the users to analyze and understand the sources for the domains and how they are going to relate to the master data model, even at the logical level.
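The profiling pass that feeds such a metadata mart can be sketched with nothing but the standard library. The column name and sample values below are hypothetical; the point is the shape of the per-column profile (row count, null rate, cardinality, top values) that ends up in the pivot tables:

```python
from collections import Counter

def profile_column(name: str, values: list) -> dict:
    """Basic profile of one source column for the metadata mart."""
    non_null = [v for v in values if v not in (None, "", "N/A")]
    freq = Counter(non_null)
    return {
        "column": name,
        "rows": len(values),
        "null_rate": round(1 - len(non_null) / len(values), 3),
        "distinct": len(freq),
        "top_values": freq.most_common(3),
    }

# Illustrative source data: free-text country in a CRM extract.
crm_country = ["USA", "USA", "US", "U.S.A.", None, "USA"]
profile = profile_column("CRM.country", crm_country)
print(profile["distinct"])  # 3 -- three spellings of one country
```

A profile like this is what surfaces the anomalies (three spellings of one country) that the associative model ultimately has to resolve.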

3. Capture the relationships that define the master domains, documenting the required mappings and transformations in source-to-target matrices.

This process will result in a comprehensive set of deliverables whose final state will be the logical and physical data models for MDM; in addition, the client will have the necessary data dictionary definitions as well as high-level source-to-target mappings.

My concern with presenting a high-level logical MDM model to users is that it is intuitively too simplistic and too straightforward. Obviously, at the high level the MDM constructs look very logical for customer, location, product, etc.

They don’t expose the complexity and, frankly, the magic (hard work), from an MDM perspective, of the underlying and supporting source reference models and associative models that are really the heart and soul of the final model. The sooner we can engage the client in an organized, facilitated, methodical process of simultaneously profiling and understanding their existing data and defining the final state, a.k.a. the MDM model they are looking for, the better positioned we will be to manage long-term expectations.

There’s an old saying that I think specifically applies here, especially in relation to users and expectations: “They get what they inspect, not what they expect.”

I know that this went on a bit long; I apologize, and I certainly don’t mean to lecture. But I think the crux of the problem with a waterfall or traditional approach to creating an MDM model is overcome by following a more iterative, or agile, approach.

The reality is that the MDM model is the heart of the integration engine.