I was really looking forward to the keynote by Clay Shirky, and I was not disappointed. The title of his talk was “To Make Sense of Data, First Make Sense of People”.
His central theme is that for a business, knowledge management is no longer purely about managing knowledge; it is becoming more and more bound up with people management. Change is getting messier, more human, and more social. New tools and techniques for problem solving are needed, and are becoming available.

He started with a story of a collaboration challenge. DARPA offered a $40K prize for a team to find all 10 weather balloons that had been distributed around the country. They wanted to see how quickly it could be done, and to observe exactly how it was accomplished. The MIT team decided to leverage social networks. They offered to pay people who had seen the balloons, and also to pay people who introduced them to people who had seen them, and even a third level: people who found the people who found the people who saw the balloons. What they in fact set up was a distributed query through people across a very large, very badly formatted database. It worked brilliantly. Even more important, DARPA had set aside 30 days to accomplish the challenge, but the MIT team solved it in 9 hours.

What we are tapping into here is an extra capability we might call “cognitive surplus”. People have always had time to think and work on things, but never before has there been a network that would allow them to collaborate on this scale (and at any time or place they wanted).

What is the size of the cognitive surplus that is available? One way to get a feeling for this is to compare it to another big job. Start by asking the question: how many person-hours did it take to create Wikipedia? A study was done to estimate this, and concluded that Wikipedia represents about 100 million hours of work. That is enormous. But how does it compare to the amount of cognitive surplus that is available? We might compare it to the amount of time spent watching TV. It is estimated that America spends about 200 billion hours watching TV every year. Think about that: a Wikipedia could be created in one year with just 1/2000 of the time spent watching TV. In fact, a Wikipedia's worth of time is wasted every weekend just watching ads. People have never before been networked at this scale, and it has never before been possible to share this information so easily.
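To make the comparison concrete, here is the arithmetic as a small Python sketch. The two figures are the estimates quoted in the talk; the function and constant names are my own, chosen only for illustration.

```python
# Estimates quoted in the talk (rough figures, not measurements).
WIKIPEDIA_HOURS = 100_000_000          # about 100 million person-hours
TV_HOURS_PER_YEAR = 200_000_000_000    # about 200 billion hours per year


def projects_per_year(tv_hours: int = TV_HOURS_PER_YEAR,
                      project_hours: int = WIKIPEDIA_HOURS) -> float:
    """How many Wikipedia-sized projects fit into one year of TV time."""
    return tv_hours / project_hours


def share_of_tv_time(tv_hours: int = TV_HOURS_PER_YEAR,
                     project_hours: int = WIKIPEDIA_HOURS) -> float:
    """Fraction of one year of TV time that one such project needs."""
    return project_hours / tv_hours


print(projects_per_year())   # 2000.0
print(share_of_tv_time())    # 0.0005, i.e. 1/2000
```

So one year of TV time is, on these estimates, about 2000 Wikipedias' worth of hours.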

Information management and people management now overlap. Knowledge management is not something that is extracted at the end of Friday and dumped into an archive. Instead, knowledge management is a real-time activity that people carry out all the time while they work.

This opens up some important new opportunities for collecting data on a scale, and with an ease, that has not been possible before, and it turns the traditional ideas about data collection around. Consider a project to distribute the collection of data for flu surveillance. Flu is typically tracked by officials reporting for a county to a region, and from there to a larger area, and so on. Someone decided to try something different: go to each local clinic, and just ask them to fax in a single number every day representing how many cases of flu they observed. The data collection experts pointed out all sorts of problems: there were no controls to assure that the data was consistent, or even that it represented the same thing. However, cleanliness of data does not matter when you care only about the trends. If you are not trying to measure exactly how many cases there are, but instead just want to know whether the number is increasing or decreasing, then this is good enough. The results clearly showed that the peaks of flu cases matched the official records. The real advantage is how much faster the data is collected. It comes in and can be viewed every day instead of a couple of weeks later. This is especially important in the case of a brewing epidemic, where you need to identify trends as quickly as possible.
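The point that messy counts can still carry the trend is easy to sketch in Python. This is only an illustration, not the method the project used: the two series below are numbers I invented, and the helper names `directions` and `peak_index` are hypothetical.

```python
from typing import List, Sequence


def directions(counts: Sequence[float]) -> List[int]:
    """Return +1, 0 or -1 for each day-over-day change in the counts."""
    return [(cur > prev) - (cur < prev)
            for prev, cur in zip(counts, counts[1:])]


def peak_index(counts: Sequence[float]) -> int:
    """Return the position of the highest count in the series."""
    return max(range(len(counts)), key=lambda i: counts[i])


# Invented data for illustration only.
official = [100, 140, 210, 320, 280, 190, 120]   # careful, slow reporting
faxed = [30, 45, 70, 110, 90, 65, 35]            # partial, inconsistent counts

# The absolute numbers disagree badly, but the shape is the same.
print(directions(official) == directions(faxed))   # True
print(peak_index(official), peak_index(faxed))     # 3 3
```

The faxed series captures only about a third of the cases in this made-up example, yet it rises, peaks, and falls on the same days, which is all a trend watcher needs.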

Consider what they do at Bit.ly, the URL shortening company. They took a look at this question: when a user clicks on X, what do they click on next? From that they got a good map of strong correlations between topics. This is a map showing real answers to one of the hardest questions for people who deal with information. Normally you have to collect and refine the data, and it takes a lot of time. It makes a dramatic difference to be able to get this kind of information instantly.
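As a rough sketch of the "clicks on X, then clicks on what next" idea, one could count consecutive click pairs per user session. I do not know how Bit.ly actually computes its map; the code below uses raw pair counts rather than a real correlation measure, and the session data and function names are invented for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def next_click_counts(sessions: Iterable[List[str]]) -> Counter:
    """Count how often each topic is immediately followed by another."""
    pairs: Counter = Counter()
    for clicks in sessions:
        for current, following in zip(clicks, clicks[1:]):
            pairs[(current, following)] += 1
    return pairs


def strongest_followers(pairs: Counter, topic: str,
                        top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the topics most often clicked right after the given topic."""
    followers = Counter({nxt: n for (cur, nxt), n in pairs.items()
                         if cur == topic})
    return followers.most_common(top_n)


# Invented sessions: each list is one user's clicks, in order, by topic.
sessions = [
    ["politics", "economy", "sports"],
    ["politics", "economy"],
    ["sports", "politics", "economy"],
    ["politics", "sports"],
]

pairs = next_click_counts(sessions)
print(strongest_followers(pairs, "politics"))
# [('economy', 3), ('sports', 1)]
```

With enough sessions, the heavily weighted pairs start to form the kind of topic map described in the talk, and the counts can be updated as each click arrives.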

He told a story about the Stack Overflow web site. Even though Microsoft engineers have access to their own source code, many of them hang out at Stack Overflow because that is where reality is best represented. Inside Microsoft they don’t have access to the real world of what users are doing and needing.

No matter who you are, most of the smart people work for someone else. -Bill Joy

Value is building up outside the firewall. Industry after industry will have to face this. Stack Overflow is a general-purpose architecture; its creators found they could re-purpose it for other things. They started looking for groups of people who have knowledge of a particular domain or subject. Find the group, then build a Stack Exchange site. This is one of the big changes we are seeing. If you as a company do not offer a way for such groups to collaborate around your products, someone else will.

In practice, information management has often been a write-only medium. Lots of stuff gets stored, but people almost never ask for anything back. What is happening out in the world is a rethinking of the feedback loop.

This is showing up in unusual places, like Foursquare. They are able to collect incredible information on user behavior, and they present it graphically back to the user to close the feedback loop. Now there is a whole generation that has more accurate information about their drinking habits than about their working habits. What if this were done for workers? It is ironic that our social life is better instrumented and better designed for feedback than what goes on inside the organization. There is an opportunity to collect information and give it to users before they ask for it. People don’t normally know what kind of questions they might even ask, and so you have to show them first.

Another example is the “Patients Like Me” web site, where people can share health information. The outcome is information on what treatments people report using, and how well those treatments are working. This is hard and expensive to collect otherwise. All of the data is self-reported, but aggregated in near real time. There was a large venting thread about depressed people needing support. This aspect of the site is not very medical, but it turns out to be an important part of making the whole thing work. People enter their data for aggregation BECAUSE of the venting threads. Don’t make the mistake of eliminating the support for human interactions in a social environment.

All of this defies a lot of the traditional assumptions of aggregating and collecting data. That is a real challenge. It is a huge opportunity as well. Managing knowledge, managing people, together.

Q&A

What about my company of 300 people: is there a critical threshold for making this kind of collection work? Work with intensity rather than scale. Show the partial results to employees. This can generate value even inside the firewall with smaller groups. Monitor data that allows you to see the trends: where are there unmet needs? Try to identify a small trend, try small interventions instead of large ones, and then monitor the change. For example, hand washing and hand sanitation stations for flu prevention. This can be effective because the feedback loops allow for detecting more detailed changes. Start “small and good” and not “large and mediocre”.