January 29, 2018

Data Mining & Easter Eggs

A: No, it's data mining. Knowledge is two steps up the added value chain. Once something becomes knowledge, you don't really need data mining. So let's first talk about the added value chain.

Data is less valuable than information. You can have bad data. It’s still data. Once you cleanse the data and organize it, you have information. Encrypted data is still data. It’s not information until you decrypt it. That’s the difference between data and information.

Good, clean, organized data = information. It’s not knowledge until it tells a story. That means a human can understand what it is. Knowledge is information in context. If I say “35 Units are in Department 7” and “27 Units are in Department 8” I have information. But it’s not knowledge until I have complete information. I don’t know the date. I don’t know how many departments there are.
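To make the first two steps of the chain concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the raw record format, the `cleanse` and `missing_context` helpers, and the idea that a date and a department count are the context you would need. It only reuses the Department 7 and Department 8 figures from above.

```python
# Hypothetical raw records: inconsistent formatting, a duplicate, and a bad row.
raw_records = [
    " dept 7 , 35 ",
    "DEPT 8,27",
    "dept 8, 27",    # same fact as the previous row once normalized
    "dept ?, n/a",   # bad data: still data, but not usable as information
]

# Assumed context fields needed before the information tells a story.
REQUIRED_CONTEXT = {"date", "department_count"}


def cleanse(records):
    """Turn raw strings into a {department: units} mapping, dropping bad rows."""
    info = {}
    for line in records:
        dept_part, _, units_part = line.partition(",")
        dept = dept_part.lower().replace("dept", "").strip()
        units = units_part.strip()
        if not (dept.isdigit() and units.isdigit()):
            continue  # skip rows that cannot be parsed
        info[int(dept)] = int(units)
    return info


def missing_context(context):
    """Return the context fields still missing from the given dict."""
    return REQUIRED_CONTEXT - context.keys()


information = cleanse(raw_records)
print(information)          # {7: 35, 8: 27}
print(missing_context({}))  # both fields are missing, so this is not yet knowledge
```

The cleansed dictionary is information. It only starts to become knowledge once the missing context is filled in.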

Intelligence is actionable knowledge with appropriate qualifications. We could get into what makes something actionable. But that has been done here[1].

All of these elevations in value come from humans organizing data into information, information into knowledge, and knowledge into intelligence. But humans have a particular way of organizing and thinking about where to find intelligence. Think of how children search on an Easter egg hunt. They don’t brute-force search every square inch for things that look like eggs; they try to emulate the thinking of the ‘Easter Bunnies’, the parents who hid them.

Machines, on the other hand, search more exhaustively and find patterns that humans don’t consider relevant. A machine might notice, for example, that light-colored eggs were hidden an average of 2 inches from tall vertical surfaces. Children would not notice; they just count the eggs. But a machine can surface patterns in the DATA that create a different kind of information, knowledge and intelligence.

For example, data mining an Easter egg hunt might show, based on the proximity of egg colors to the colors of occluding objects, that some of the people who hid the eggs might be colorblind. Their inability to distinguish certain colors made the eggs easier for non-colorblind children to find, and therefore the hunt rewarded faster-running children, even those without sophisticated hunting tactics. That is surfacing information in the data that humans would never have the inclination or patience to discover, which is the point of having machines do the mining rather than humans.
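Here is a rough Python sketch of the kind of pattern a machine might surface. The observations, the `LIGHT_COLORS` set, and the `mean_distance_by_shade` helper are all made up for illustration; a real mining job would work on far more data and would not be told in advance which grouping to try.

```python
from statistics import mean

# Hypothetical observations: (egg color, inches to the nearest tall vertical surface).
eggs = [
    ("yellow", 1.5), ("pink", 2.5), ("white", 2.0),
    ("navy", 9.0), ("purple", 14.0), ("brown", 7.0),
]

# Assumption: which colors count as "light" for this toy example.
LIGHT_COLORS = {"yellow", "pink", "white"}


def mean_distance_by_shade(observations):
    """Group eggs into light and dark shades and average their distances."""
    groups = {"light": [], "dark": []}
    for color, distance in observations:
        shade = "light" if color in LIGHT_COLORS else "dark"
        groups[shade].append(distance)
    # Leave out any shade that has no observations.
    return {shade: mean(values) for shade, values in groups.items() if values}


summary = mean_distance_by_shade(eggs)
print(summary)  # {'light': 2.0, 'dark': 10.0}
```

No child on the hunt would bother to measure this, but the machine does it for every attribute it is given.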

As soon as you have a machine-sensed pattern in the data, it changes how you organize the data into information. That is what adds value, and it changes the way you think about analyzing Easter egg hunts in the future.