How to think about organizing data

I saw many cool demos at the SAS Analytics event. Image courtesy of SAS.

This week was full of travel to sunny and not-so-sunny locales. On Tuesday, I spent the day in San Diego at the SAS Analytics conference. (They are a sponsor of the show and the newsletter, but I paid my own way.) While there, I spent a lot of time talking about how to use data as well as about common misconceptions around data.

Thanks to the internet of things and all of these cheap sensors, businesses are awash with data. And yet, many are still struggling to figure out what to do with it. Bill Roberts, director of the global IoT practice at SAS, told me that one of his manufacturing customers was tracking 125 different data streams about eight to 10 years ago but today tracks 5,000. When it was tracking only 125 streams, it had a clear understanding of how each stream of information affected its manufacturing process.

Now that this company has 5,000 streams of data, it has lost a lot of its understanding of how the measurements affect its manufacturing line. Consumers can likely relate to this issue simply by thinking about their own lives. I have lots of devices giving me data, from my fitness tracker to my air quality monitors, but I have little to no understanding of how my number of steps relates to my overall feeling of health.

Roberts’ takeaway is that there are no one-to-one relationships between data and a single result, and that if you want to figure out how to read different streams of data, you’re going to have to develop a keen understanding of the business process you are trying to measure. In complex systems such as factories, the human body, public health, and even large organizations, that will be ridiculously hard.

But there’s room for optimism. Another common trope of the data industry is that data is the new oil. And it leads businesses to agonize over partnerships as they try to collect data needed to build new products and services.

Yet, for many uses, sensors are cheap, which means there are multiple ways to get the data you need. So while companies might embed sensors and proprietary software into their products in an effort to capture data for later use, it’s also possible to get similar information from sensors outside of a device. For example, one might track a machine’s health using the proprietary sensors from the machine’s manufacturer, or one might install a sensor that measures nearby noise and get similar information from another company entirely.

In many cases, it’s possible to track something indirectly using algorithms and more generalized sensors. The key is that you have to get a large enough install base — and good data scientists — to develop the algorithms in the first place.
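
To make the idea of indirect tracking a little more concrete, here is a minimal Python sketch of how a generic noise sensor placed near a machine might stand in for the machine's own health data. Everything in it is an assumption for illustration: the function name `flag_unusual_noise`, the decibel readings, the 20-reading baseline, and the threshold of three standard deviations are all made up, and a real product would need far more data and far better algorithms.

```python
from statistics import mean, stdev


def flag_unusual_noise(readings_db, baseline_size=20, threshold=3.0):
    """Return the indices of readings that stray far from a healthy baseline.

    readings_db: noise levels in decibels, oldest first (hypothetical data).
    The first `baseline_size` readings are assumed to come from a healthy machine.
    """
    if baseline_size < 2 or len(readings_db) < baseline_size:
        raise ValueError("need at least two baseline readings")

    baseline = readings_db[:baseline_size]
    mu = mean(baseline)
    sigma = stdev(baseline)

    flagged = []
    for i, value in enumerate(readings_db[baseline_size:], start=baseline_size):
        # Flag a reading when it sits more than `threshold` standard
        # deviations away from the baseline average.
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up readings: a steady hum around 70 dB, then one loud spike.
healthy = [70.0 + (i % 5) * 0.2 for i in range(20)]
later = [70.2, 70.4, 78.5, 70.6]
print(flag_unusual_noise(healthy + later))
```

The point of the sketch is only that the sensor knows nothing about the machine itself. It learns what normal sounds like and flags readings that don't fit, which is roughly the sort of indirect signal a third party could offer.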

So there’s hope there. Once they have the algorithms making sense of the data, companies need to think about how they present the insights derived from that information. For many businesses, even if they understand their processes and how the data they collect relates to those processes, they fall down when it comes to communicating the results to employees. Line workers on a factory floor don’t need Tableau tables; they need dashboards.

Finally, whatever the data says, it’s not enough. The business has to develop a process around those insights. So if the dashboard tells a line worker there is a problem, the business needs to offer a way to deal with it along with incentives to ensure that the process works. In other words, while data is indeed a challenge, in most cases it’s the easiest part of a digital transformation.