The “springboard” of data quality and data governance stemming from a data migration

Interview by Dylan Jones, CEO of Data Quality Pro.

Bryn Davies, Managing Director of InfoBluePrint, discusses the “springboard” of data quality and data governance stemming from a data migration.

Dylan Jones: Thanks for taking time out to talk, Bryn. Let’s start by looking at some examples of recent data migration projects: why were you folks called in, and what was the challenge the company faced?

Bryn Davies: Typically what we have seen is that companies have burnt their fingers on recent application implementations due to having little or no focus on data quality. The project goes over time and budget mainly due to data quality issues in the target application, and this of course negatively affects user and management acceptance of the new system from the get-go. Very often management has been sold on the idea that a new application, besides all the other reasons for implementation, will also “sort out our data problems”. But without a formal approach to data quality from early on in the project, this will clearly not happen by magic. So there appears to be a growing recognition of this, and of the need to budget for data quality and bring it in as a formal stream on the next project.

Dylan Jones: Do you find that a lot of Systems Integrators prefer to leave the data quality aspects of a system integration or transformation in the hands of the customer?

Bryn Davies: Yes, that is typically the case – somewhere in their project charter they will exclude this as not their problem and it is pushed back on the customer. It is a risk to their project milestones and payment schedule, so it’s safer to just leave it out. Whilst the SI or their subcontractor will have to do the data mapping and transformation work, this typically only covers enough to make the data physically “fit” into the target models, regardless of whether it is rubbish or not. This is particularly the case for master data, and to a lesser extent reference data. The fact is that data in the target needs to not only work structurally, but it also has to serve the business purposes of the new application, which often include new capabilities that users have been sold on. If the source data quality has not been assessed as to whether it can properly support these new purposes, and then not corrected or improved during the migration, then it’s a recipe for disaster.

Dylan Jones: Okay, based on your data migration experience, what are some of the pitfalls you see organisations falling into when they attempt to tackle data quality during a data migration project without the necessary skills and technology?

Bryn Davies: Without experienced planning and foresight leading to appropriate support structures, sponsorship, change management and oversight, data cleansing during a migration is approached haphazardly and generally lags seriously behind the rest of the project, ultimately leading to project overruns and even failure. Data migration is often last on the project agenda, but in reality it should be addressed early on in the project. The sooner source data quality is assessed through generic and business rule specific data profiling, the earlier the red flags will come up, allowing for proactive and effective management of data issues. Technology plays an important enabling and productivity role, and without capable data profiling and data quality software, all data engineering and associated project controls get done manually in a mix of SQL and Excel – this is typically difficult to control and leads to even more errors, inconsistent data cleansing and standardisation, and unreliable or non-existent matching and de-duping. Whilst ETL tools are often readily present in a data migration, they generally fall short of the richer functionality found in proper data profiling, cleansing, fuzzy matching and data quality reporting technologies. That covers programmatic cleansing and de-duping, but there will always be a need for manual remediation, be it in source or staging, to cater for those issues that simply cannot be dealt with programmatically. For that an effective data steward interface into the data quality tool is essential, as it guides a controlled and formally agreed workflow for manual resolution within the tool, instead of in yet another out-of-control spreadsheet.
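
As an illustration of the profiling and matching ideas in the answer above, here is a minimal Python sketch. It is not taken from the interview or from any particular data quality product: the sample records, the field names, the two business rules, the company-suffix list and the 0.9 similarity threshold are all assumptions made for the example, and the standard library's `difflib` stands in for the richer fuzzy matching a dedicated tool would provide.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical source records; the fields and values are invented for illustration.
customers = [
    {"id": 1, "name": "Acme Trading Ltd", "email": "info@acme.example", "country": "ZA"},
    {"id": 2, "name": "ACME Trading Limited", "email": "", "country": "ZA"},
    {"id": 3, "name": "Blue Crane Logistics", "email": "ops@bluecrane.example", "country": "South Africa"},
    {"id": 4, "name": "Karoo Foods", "email": "sales-at-karoo", "country": "ZA"},
]


def profile_column(rows, column):
    """Generic profiling: fill rate, distinct count and most common values."""
    values = [row.get(column) or "" for row in rows]
    filled = [v for v in values if v.strip()]
    return {
        "fill_rate": len(filled) / len(values),
        "distinct": len(set(filled)),
        "top_values": Counter(filled).most_common(3),
    }


def rule_violations(rows, rule):
    """Business-rule profiling: ids of the rows that fail the given rule."""
    return [row["id"] for row in rows if not rule(row)]


def normalise(name):
    """Crude standardisation before matching (assumed suffix list)."""
    suffixes = {"ltd", "limited", "pty", "inc"}
    words = name.lower().replace(".", "").split()
    return " ".join(w for w in words if w not in suffixes)


def likely_duplicates(rows, threshold=0.9):
    """Pairwise fuzzy matching on normalised names; fine for small samples only."""
    pairs = []
    for i, a in enumerate(rows):
        for b in rows[i + 1:]:
            score = SequenceMatcher(None, normalise(a["name"]), normalise(b["name"])).ratio()
            if score >= threshold:
                pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


country_profile = profile_column(customers, "country")
# Assumed rule 1: an email, when present, must contain "@".
bad_emails = rule_violations(customers, lambda r: r["email"] == "" or "@" in r["email"])
# Assumed rule 2: the target system expects a two-letter upper-case country code.
bad_countries = rule_violations(customers, lambda r: len(r["country"]) == 2 and r["country"].isupper())

print(country_profile)
print("Email rule failures:", bad_emails)
print("Country rule failures:", bad_countries)
print("Likely duplicates:", likely_duplicates(customers))
```

Even on four rows the sketch raises the kind of red flags described: a free-text country value, a malformed email, and two records that are probably the same customer. The pairwise comparison grows quadratically with the number of rows, which is one reason purpose-built matching technology is preferred over ad hoc scripts on real volumes.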

Dylan Jones: Obviously the focus of this interview is discovering ways to build a data quality and data governance capability after the migration has completed. Is there any advice on approaches that can be taken during the migration that will make it easier to grow data quality and governance post-migration?

Bryn Davies: Firstly, it is very important to campaign and sensitise end users, as early as possible in the project planning phases, about the pending data migration and associated data quality requirements, and what roles they may be expected to play in helping to resolve data issues going forward. The programme’s change management team is a very useful ally in this process, and they should also be educated about what data migration means and why quality is critical. Then during the project, manual data cleansing by end users or by a dedicated team must be carefully controlled by the data migration stream – this requires artefacts to, for example, show data cleansing progress and how identified data load risks are reducing as the cleanse proceeds. In order to instil a permanent awareness in the users involved, these artefacts should be designed and published so that they can serve as the basis for ongoing data quality reporting and monitoring post go-live. It is also helpful to introduce a fun element by, for example, branding the data quality programme and having prizes for individuals or teams who perform the best in the cleansing process. In one project we produced a user-friendly “data quality handbook” which educated and guided, and was adopted and enhanced post go-live to be used in the particular business area on an ongoing basis. Using these techniques consistently and visibly throughout the data migration leads naturally to the evolution of some of the core foundations of data governance such as data stewardship, because for the first time individuals are being made accountable for data in a controlled and monitored environment. On large programmes this invariably touches, sooner or later, on the big issue of master data that is shared across the enterprise, leading to a practical understanding of the need for the upper echelons of a data governance organisation.
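
The cleansing progress artefacts mentioned above could take many forms. As one hedged sketch, assuming weekly counts of unresolved load-blocking issues per stewardship team, the Python below produces a simple progress line per week and a ranking of teams. The team names, the weekly figures and the `CleansingSnapshot` structure are hypothetical and are not drawn from any project described in the interview.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CleansingSnapshot:
    """Open (unresolved) load-blocking issues per team at the end of a week."""
    week: int
    open_issues: Dict[str, int]


def progress_report(snapshots: List[CleansingSnapshot]) -> List[str]:
    """One line per week: open issues and percentage resolved against week one."""
    baseline = sum(snapshots[0].open_issues.values())
    lines = []
    for snap in snapshots:
        remaining = sum(snap.open_issues.values())
        pct = 100 * (baseline - remaining) / baseline if baseline else 100.0
        lines.append(f"Week {snap.week}: {remaining} open issues, {pct:.0f}% resolved")
    return lines


def leaderboard(first: CleansingSnapshot, latest: CleansingSnapshot) -> List[Tuple[str, float]]:
    """Rank teams by the share of their own starting issues they have resolved."""
    scores = []
    for team, start in first.open_issues.items():
        if start == 0:
            continue  # nothing to resolve, so no meaningful share
        resolved = start - latest.open_issues.get(team, 0)
        scores.append((team, round(resolved / start, 2)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Invented figures for illustration only.
snapshots = [
    CleansingSnapshot(week=1, open_issues={"Finance": 400, "Sales": 600}),
    CleansingSnapshot(week=2, open_issues={"Finance": 300, "Sales": 300}),
    CleansingSnapshot(week=3, open_issues={"Finance": 100, "Sales": 120}),
]

for line in progress_report(snapshots):
    print(line)
print(leaderboard(snapshots[0], snapshots[-1]))
```

Ranking by share of each team's own baseline, rather than by raw counts, is a design choice made here so that teams with smaller data sets are not disadvantaged. The same report can keep running after go-live as the starting point for permanent data quality monitoring.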

Dylan Jones: What has been the reaction from customer management? Do they see the value of leveraging data quality post-migration or is there any initial rejection or blockading?

Bryn Davies: Relevant management attends project progress sessions and steercos throughout the project, so it is important to use these opportunities to market the data quality work taking place by, for example, presenting the readily available reporting and monitoring artefacts already in use. They are also invariably involved in some of the more important data decisions needed by the data teams and stewards, and so the education process is organic. Progress meetings are also opportunities to proactively raise the most critical data issues from a business impact perspective, backed by their top-of-mind expectations of the new system. As long as these opportunities are leveraged and the communication occurs, the support will be there post-migration.

Dylan Jones: Finally, in hindsight, after completing many migrations, are there any steps or measures you wish you could go back and introduce? Any lessons you can share for other companies who want to leverage data quality post-migration?

Bryn Davies: Start the data migration and data quality discussion as early as possible, preferably before the project even kicks off. Point out the expected shortcomings due to the SI’s almost certain exclusion of data quality remediation, and engage management on the business level early on, thereby ensuring that there is an appreciation for the funding that will be required to do the data migration properly. Profile as much of the source data as possible early during project planning, even if the scope is not yet clear, so that there is factual input to the planning and further phases, rather than the usual guesstimates and the all-time classic line: “it should be OK”.