5 Reasons Healthcare Data Is Unique and Difficult to Measure

Editor’s Note: Below are both parts one and two of our series on the uniqueness of healthcare data.

Those of us who work with data tend to think in very structured, linear terms. We like B to follow A and C to follow B, not just some of the time, but all the time. Healthcare data isn’t that way. It’s both diverse and complex making linear analysis useless.

There are several characteristics of healthcare data that make it unique. Here are five, in particular:

1. Much of the data is in multiple places.

Healthcare data tends to reside in multiple places. From different source systems, like EMRs or HR software, to different departments, like radiology or pharmacy. The data comes from all over the organization. Aggregating this data into a single, central system, such as an enterprise data warehouse (EDW), makes this data accessible and actionable.

Sometimes the same data exists in different systems and in different formats. Such is the case with claims data versus clinical data. A patient’s broken arm looks like an image in the medical record, but appears as ICD-9 code 813.8 in the claims data.

And it looks like the future holds even more sources of data, like patient-generated tracking from devices like fitness monitors and blood pressure sensors.

2. The data is structured and unstructured.

Electronic medical record software has provided a platform for consistent data capture, but the reality is data capture is anything but consistent. For years, documenting clinical facts and findings on paper has trained an industry to capture data in whatever way is most convenient for the care provider with little regard for how this data could eventually be aggregated and analyzed. EMRs attempt to standardize the data capture process, but care providers are reluctant to adopt a one-size-fits-all approach to documentation. Thus, unstructured data capture is often allowed to appease the frustrated EMR users and avoid hindering the care delivery process. As a result, much of the data captured in this manner is difficult to aggregate and analyze in any consistent manner. As EMR products improve, as users become trained to standard workflows, and as care providers become more accustomed to entering data in structured fields as designed, we will have more and better data for analytics.

An example of the above phenomenon is found in a recent initiative to reduce unnecessary C-sections at a large health system in the Northwest. The first task for the team was to understand how the indications for C-section were documented in the EMR. It turned out that there were only two options to choose from: 1) fetal indication and 2) maternal indication. Because these were the only two options, delivering clinicians would often choose to document the true indication for C-section in a free text form, while others did not document it at all. Well, this was not conducive to understanding the root cause of unnecessary C-sections. So, the team worked with an analyst to modify the list of available options in the EMR so that more detail could be added. After making this slight modification to the data capture process, the team gained tremendous insight, and identified opportunities to standardize care delivery and reduce unnecessary C-sections.

3. Inconsistent/variable definitions; Evidence-based practice and new research is coming out every day.

Oftentimes, healthcare data can have inconsistent or variable definitions. For example, one group of clinicians may define a cohort of asthmatic patients differently than another group of clinicians. Ask two clinicians what criteria are necessary to identify someone as a diabetic and you may get three different answers. There may just not be a level of consensus about a particular treatment or cohort definition.

Also, even when there is consensus, the consenting experts are constantly discovering newly agreed-upon knowledge. As we learn more about how the body works, our understanding continues to change of what is important, what to measure, how and when to measure it, and the goals to target. For example, this year most clinicians agree that a diabetes diagnosis is an Hg A1c value above 7, but next year it’s possible the agreement will be something different.

There are best practices established in the industry, but there’s always ongoing discussion in the way those things are defined. Which means you’re trying to create order out of chaos and hit a target that’s not only moving, but seems to be moving in a way you can’t predict.