KPIs Are In Range

Key Performance Indicator Assessment (KPIA) Process

It was an attention-seeking statement: “Ladies and gentlemen, I have just found a wallet at the front of the cabin.” I looked up at the flight attendant who was speaking on the public-address system as if he were talking directly to me about my wallet. This strategy has been used by the lead flight attendant on several flights I have taken recently. Each time I react in the same way, by looking up, even though I know that the strategy is used to get people’s attention for the flight safety presentation that is about to begin. Why do I continue to have this small moment of panic and react the same way even though I know it is a way of getting my attention? Because there is a possibility that it is true. I am not 100% sure. Instead, I feel a sense of panic at discovering that something may be missing, whether it be an item or a piece of information.

The same sense of panic results after reviewing a data set that is missing certain information. In essence, it is a data gap. In cities with mass transit rail lines (e.g. light rail, commuter rail, subway), warning signs are posted to remind passengers to be alert to the gap between the platform and the train car. Sometimes these gaps are minimal. At other times they can be as large as a foot. Regardless, the warning should be taken seriously at all times. Data gaps require the same level of care.

When a data gap is discovered, it should be documented. An analyst’s mind will be the most objective about the data gap at the time of discovery. As soon as the analyst gets involved in the details of working out why the data gap occurred, the clarity needed to capture descriptive details becomes clouded. At the time of discovery, the data gap documentation should include the name of the data set, the columns and rows where the data is missing, and any other details that may be relevant to the discovery (e.g. data values that contain out-of-place characters).
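To make the documentation step concrete, here is a minimal sketch of a data gap record captured at the moment of discovery. It is only an illustration: the record layout, the function name `document_gap`, the file name `survey_2019.csv`, and the set of values treated as missing are all assumptions, not part of any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date

# Values treated as "missing". This set is an assumption for illustration;
# adjust it to whatever your source system uses for blanks.
MISSING = {None, "", "NA", "N/A"}


@dataclass
class DataGapRecord:
    dataset: str          # name of the data set
    discovered_on: date   # when the gap was noticed
    # column name -> list of 1-based row numbers where the value is missing
    missing: dict = field(default_factory=dict)
    # free-text details, e.g. values with out-of-place characters
    notes: list = field(default_factory=list)


def document_gap(dataset, rows, notes=None, discovered_on=None):
    """Scan rows (a list of dicts) and record where values are missing."""
    record = DataGapRecord(
        dataset, discovered_on or date.today(), notes=list(notes or [])
    )
    for row_number, row in enumerate(rows, start=1):
        for column, value in row.items():
            # Ignore surrounding whitespace so " NA " counts as missing too.
            cleaned = value.strip() if isinstance(value, str) else value
            if cleaned in MISSING:
                record.missing.setdefault(column, []).append(row_number)
    return record


# Hypothetical example data.
rows = [
    {"id": "1", "score": "10"},
    {"id": "2", "score": ""},
    {"id": "3", "score": " NA "},
]
gap = document_gap("survey_2019.csv", rows, notes=["Row 3 score has stray spaces"])
print(gap.missing)
```

The point of writing the record down in a fixed shape is that it preserves what was observed before the investigation starts to color the description.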

After initial discovery, concerns about the missing data should be discussed with your data team, and a plan for investigating the data gap should be developed. The first step in the investigation is to check the data dictionary for any notes from colleagues who were involved in the initial capture of the data or in the data extraction. If the data dictionary contains no explanation, then the next step is to determine the data source. Was the data extracted by:

The analyst directly from a source system?

If yes, then review the extract parameters to make sure that all of the data that was supposed to be pulled from the system was extracted.

If the information is pulled again and the gap still exists, then conduct item-by-item reviews in the system. If more data is found in an item-by-item review, there must be something wrong with the extract parameters. Contact the source system vendor and work through their help desk process.

A programmer, who pulled it from a source system and provided it to the analyst?

If yes, the analyst should discuss the data gap with the programmer and work together to determine whether the data truly is missing or was not extracted from the system correctly.
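The item-by-item review described above can be thought of as a comparison between what the extract returned and what is actually in the system. The sketch below assumes that every record carries a unique identifier and that the identifiers found during the manual review have been written down; the function name and the sample identifiers are hypothetical.

```python
def compare_extract_to_source(extract_ids, source_ids):
    """Compare record IDs in an extract with IDs found by reviewing the source system."""
    extract_set = set(extract_ids)
    source_set = set(source_ids)
    return {
        # In the system but not in the extract: points to the extract parameters.
        "missing_from_extract": sorted(source_set - extract_set),
        # In the extract but not seen in the review: worth a second look.
        "unexpected_in_extract": sorted(extract_set - source_set),
    }


# Hypothetical identifiers for illustration only.
result = compare_extract_to_source(
    extract_ids=["A1", "A2", "A4"],
    source_ids=["A1", "A2", "A3", "A4"],
)
print(result)
```

If `missing_from_extract` is not empty, that supports the conclusion that the extract parameters, rather than the underlying data, are the problem.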

If the investigatory process yields no technical cause for the data gap, then data capture needs to be examined. How was the data captured into the source system? Was the data entered by:

A data entry specialist?

If yes, have a discussion with the specialist to determine a possible cause of the data gap.

Individuals?

If yes, then the investigation has reached a roadblock. When individual respondents are involved in the data capture process, there is no feasible way to check where the data capture went astray.

If, after reviewing the data capture process, the cause of the data gap still cannot be determined, it is highly unlikely that the missing data can be recovered. It is now time to enter into a decision-making process to determine how to move forward. Why wouldn’t you move forward? The answer to this question relies heavily on the data that is missing. Consider the following questions that are now relevant to the data set in question:

Is it more likely that the quality of the entire data set is questionable given the data that is missing?

Are the data values that are missing linked to a key data variable that is crucial to the assessment of the KPI(s) in question?

Can the other data values be analyzed in absence of the data values that are missing?

Will a large part of the data file need to be eliminated from the analysis due to the missing data?

Will the limitations outweigh the conclusions that may be made?
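Some of these questions can be informed by a quick count before the team discussion. The sketch below estimates how much of the data file would be eliminated if every row missing a key variable were dropped. The function name, the column names, and the definition of "missing" are assumptions for illustration; what share is too large is a judgment for the team, not something the code decides.

```python
def missing_data_impact(rows, key_columns, missing_values=("", None)):
    """Count rows that would be dropped because a key column is missing."""
    total = len(rows)
    dropped = sum(
        1
        for row in rows
        # A row is dropped if any key column is absent or holds a missing value.
        if any(row.get(col) in missing_values for col in key_columns)
    )
    share = dropped / total if total else 0.0
    return {"total_rows": total, "rows_dropped": dropped, "share_dropped": share}


# Hypothetical data: "visits" stands in for a variable the KPI depends on.
rows = [
    {"id": "1", "visits": "12"},
    {"id": "2", "visits": ""},
    {"id": "3", "visits": "7"},
    {"id": "4", "visits": "3"},
]
print(missing_data_impact(rows, key_columns=["visits"]))
```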

The analyst should work with their team to reach consensus on how to move forward. If consensus cannot be reached, then the team should agree to suspend analysis for the KPI(s) in question.

The process of identifying the cause of data gaps and deciding how to move forward is time consuming. When multiple people are involved, time will be spent researching past practice and reviewing metadata. This will involve waiting; however, it is important to figure out why the data gap exists. The process of discovery should not be abandoned because it is complex and taking too long. Assessing a KPI based on flawed data will cause major issues down the road and should be avoided.

Blog #7 Question: Are you taking care to mind the data gap?

Blog #8 Sneak Peek: Initial Run Through