Yesterday, Alexander van Elsas wrote a post about our pointless need for real-time information. As usual, it is a very interesting read, and he gets you thinking. I commented on his post, but I felt like there was more to say. To start, Alexander raises an interesting question about whether information really has any value if anyone can have access to it:

If anyone can have access to any information at any time, what is then the value of that information? As transaction costs to produce, distribute and consume information drop to zero the question arises if the information value itself drops to zero too? My guess is that in many cases the data itself will have less value. That same data all platforms are now fighting a war over, the data that makes web 2.0 more important than the destinations of web 1.0.

Because I have small children, I tend to relate things to simple quotes from cartoons or kids’ movies. Alexander’s quote reminds me of something said by Dash in The Incredibles. He states that if everyone is special, that is the same as saying nobody is special. Both this quote and the quote from Alexander seem to generalize the idea a little too broadly. If everyone has access to information, that does not mean that people get the same things from the information. In the case of news information, some people just skim headlines to stay on top of the general happenings in the world. Other people will read several articles in order to gain knowledge about a topic. They still have access to the same information, but the end result is very different. In the end, knowledge is the differentiator. Alexander does talk about this in relation to Stephen Hawking and information about black holes. If you have all published information about black holes readily available, that does not mean that you understand them as well as Stephen Hawking does. I totally agree with this idea.

The main problem I have is that information is not knowledge. If I understand Alexander’s post, he feels the same way. However, he does not take this a step further to determine what this real-time access may become. Real-time data is just data. The real-time part of it just means that we have access to it more quickly. However, quicker access to information is not the problem. Without knowing what to do with the data, the data is useless. I am going to make the same comparison I did in my comment on Alexander’s post.

There is knowledge to be mined

We are currently aggregating all of this real-time information into sites like FriendFeed. This is very similar to a trend we saw in the 90s with databases. Many major corporations had various departmental databases. So, marketing may have had some interesting information on the various advertising campaigns, but they did not have any sales information. In order to get the sales information, they had to make a special request to the sales department for a report on the sales during specific periods of time. The sales department had all of this data in their own database, and just needed to write a custom report for the marketing team. People found that the delay in getting these reports was fairly lengthy, and they wondered why the data could not be brought together.

So, data warehousing arose as a way to centralize the data being generated by various loosely related departmental databases. The simple benefits were immediately obvious. That same report for the marketing group now took a few hours to generate instead of the team waiting for a week to get the report, convert it into a readable format and load it into their own database. On top of this, data mining started to become a more formalised discipline. Once the data from the various departments was aggregated, people noticed that their reports only contained a small subset of the data available. There was a large amount of information that they had never seen before. What did this other information tell them?

For example, large pharmaceutical companies run advertisements all the time. How do they know if they are effective? In the data warehouse model, they can review the sales information for the timeframe of the advertising campaign and the few months after the campaign. If there is a non-seasonal increase in sales of the drug, then the campaign was probably effective. The golden nugget that they are after, though, is the other information that became available in the data warehouse: the actual prescription data. These companies can receive daily or monthly feeds of anonymized prescription data from various pharmacy chains. This data will tell them which areas of the country purchase a specific drug more often. In addition, this data can be correlated to the advertising campaign to see if the advertising helped in those areas or even if it helped in areas where a competitor’s drug is selling better.
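To make that correlation a little more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the regions, dates, unit counts, and the uplift_by_region helper are all hypothetical, and a real analysis would also need to account for seasonality and use comparable time windows.

```python
from collections import defaultdict
from datetime import date

# Hypothetical anonymized prescription feed: (region, fill date, units).
PRESCRIPTIONS = [
    ("Northeast", date(2008, 9, 10), 100),
    ("Northeast", date(2008, 11, 12), 150),
    ("Midwest", date(2008, 9, 15), 80),
    ("Midwest", date(2008, 11, 20), 82),
]

# Hypothetical campaign record: where the ads ran and when they started.
CAMPAIGN = {"regions": {"Northeast"}, "start": date(2008, 10, 1)}


def uplift_by_region(prescriptions, campaign_start):
    """Return {region: fractional change in units after vs. before the start date}."""
    before = defaultdict(int)
    after = defaultdict(int)
    for region, filled_on, units in prescriptions:
        bucket = after if filled_on >= campaign_start else before
        bucket[region] += units
    # Skip regions with no sales before the campaign to avoid dividing by zero.
    return {
        region: (after[region] - before[region]) / before[region]
        for region in before
        if before[region] > 0
    }


uplift = uplift_by_region(PRESCRIPTIONS, CAMPAIGN["start"])
for region, change in sorted(uplift.items()):
    label = "campaign" if region in CAMPAIGN["regions"] else "no campaign"
    print(f"{region} ({label}): {change:+.1%}")
```

The point is not the arithmetic, which is trivial. The point is that the comparison only becomes possible once the campaign data and the prescription data sit in the same place.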

What does this have to do with real-time information access? First, we are still in the aggregation stage with tools like FriendFeed. Once the aggregation problem is fundamentally solved, people will start clamoring for better tools to help them understand and filter this data. We are currently building our data warehouses of real-time information. We are still waiting for the effective reporting and data mining. Some of this could come from the semantic web technologies and other pieces we probably have not seen yet. However, the mining of the real-time data is the reason we need to collect it. The problem is that you have to collect the information before you can understand what is in it.

4 responses to "Real Time Information is Just Data, Knowledge Comes Later"

Hi Rob, thanks for the followup. I already used too many words so I couldn’t work out the stuff you wrote about 😉 I think you present good examples how data mining can be very useful and valuable (in a business context).
In a private context I am not so sure yet where this will get us. We are going from an evolution where the only information we had was information we shared in conversation, art, via print, to a world in which we can have access to any information that is available online. And now that we have that, we need to find new ways to structure it, reduce it to a level where we can actually cope with it. The quality of information is bound to increase as we invent better algorithms for that.

My original thought however is that even with better mining tools, the information isn’t real knowledge. It doesn’t really provide us insight in what to do with it. In the end it seems to me there are no shortcuts to knowledge and understanding.

And if you then compare that value to the information we exchange with the people we really know well in real-life, I’m inclined to think that that information is more valuable. Simple exchanges of ‘useless’ information, but they tend to have a much bigger impact on our lives than anything found online. Making that exchange ‘priceless’.

Alexander,
I too have too many words in my post and still missed things I wanted to talk about. Knowledge of a topic is useless by itself. It still requires someone that will do something with the knowledge. So we agree there. Information without any action is just data sitting in a database waiting to be reported on. If you share that information with someone, you have taken action. In addition there are assumptions made about that information. The receiver puts a higher value on the information if you are a trusted source. That is a significantly different situation than what we do with the aggregated data in a service like FriendFeed.

I like this article! I think having data is good to look into so that you can dissect it and interpret it. Now working on this technology and waiting to see how accurate it is may be worthwhile.
