Journalism Sites

All of a sudden, “curation” is one of the hottest words in the Web 2.0 world. That’s because it’s an idea that addresses a problem humans have never confronted before: too much information. In the process, it’s creating some compelling new ways to derive value from content.

Amount of data published in 2010 depicted as iPads stacked on the playing field of Wembley Stadium

Content curation is about filtering what people really need out of the noise that surrounds it. In the same way that museum curators choose which items from a collection to put on display, content curators select and publish information that’s of interest to a particular audience.

This function is becoming more and more critical as the volume of information on the Internet explodes. It’s projected that the amount of digital information that will be created in 2010 could fill 75 billion 16 GB Apple iPads (fun infographic here). Yet, as influencer relations expert Katie Paine points out, 90% of it is crap. As more and more crappy content pervades the Internet, the value of curation should grow.
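For a sense of scale, the iPad figure can be turned into a raw byte count with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes "16 GB" means 16 decimal gigabytes (16 × 10^9 bytes), and the variable names are ours.

```python
# Back-of-the-envelope conversion of "75 billion 16 GB iPads" into bytes.
# Assumption: 1 GB = 10**9 bytes (decimal gigabytes), not 2**30.
IPADS = 75_000_000_000          # 75 billion devices
BYTES_PER_IPAD = 16 * 10**9     # 16 GB each

total_bytes = IPADS * BYTES_PER_IPAD
zettabytes = total_bytes / 10**21  # 1 zettabyte = 10**21 bytes

print(f"{zettabytes:.1f} zettabytes")
```

Under that assumption the projection works out to roughly 1.2 zettabytes.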

The problem is that curation is labor-intensive. Someone has to sift through all that source information to decide what to keep and what to throw away, and human decision-making isn’t easy to automate. Keyword filtering has all kinds of shortcomings, and RSS feeds, while useful in many contexts, are basically headline services.

We’ve recently been working with a startup that’s developed an innovative technology that vastly improves the speed and quality of content curation. CIThread has spent the last 15 months building an inference engine that uses artificial intelligence principles to give curators a kind of intelligent assistant. The company is attacking the labor problem by making curators (or you can call them “editors”) more productive rather than trying to replace them.

Full disclosure: We have received a small equity stake and a referral incentive from CIThread as compensation for our advice. Other than that, the pay has amounted to a couple of free lunches. We make no money unless this idea is as good as we think it is.

CIThread (the name stands for “Collective Intelligence Threading” and yeah, they know they have to change it) essentially learns from choices that an editor or curator makes and applies that learning to delivering better source material.

The curator starts by presenting the engine with a basic set of keywords. CIThread scours the Web for relevant content, much like a search engine does. Then the curator combs through the results to make decisions about what to publish, what to promote and what to throw away.

As those decisions are made, the engine analyzes the content to identify patterns. It then applies that insight to delivering a better quality of source content. In effect, it learns to “think” like the curator. CIThread can be linked to popular content management systems to make it possible to automatically publish content to a website and even syndicate to Twitter and Facebook without leaving the curation dashboard.
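To make the idea of learning from a curator's choices concrete, here is a deliberately simple sketch of a feedback loop. It is not CIThread's engine, whose internals we haven't described here. It is a toy word-weighting model, and every name in it (CuratorModel, record_decision and so on) is hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class CuratorModel:
    """Toy model: words gain weight when the curator publishes an item
    and lose weight when the curator throws one away."""

    def __init__(self, seed_keywords):
        # The curator's starting keywords get an initial weight of 1.0.
        self.weights = Counter({kw.lower(): 1.0 for kw in seed_keywords})

    def score(self, text):
        """Average word weight of the text; unseen words count as 0."""
        tokens = tokenize(text)
        if not tokens:
            return 0.0
        return sum(self.weights[t] for t in tokens) / len(tokens)

    def record_decision(self, text, published, step=0.5):
        """Nudge the weight of each distinct word up or down."""
        delta = step if published else -step
        for token in set(tokenize(text)):
            self.weights[token] += delta

    def rank(self, candidates):
        """Return candidates ordered from most to least promising."""
        return sorted(candidates, key=self.score, reverse=True)


model = CuratorModel(["curation"])
model.record_decision("local news editors praise new tools", published=True)
model.record_decision("celebrity gossip roundup", published=False)

local = "curation tools for local news editors"
gossip = "curation of celebrity gossip photos"
ranked = model.rank([gossip, local])
print(ranked[0])
```

After just two decisions, the local-news item outranks the gossip item even though both match the original keyword. A real inference engine would use far richer signals than word counts, but the loop is the same: decide, learn, re-rank.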

That’s what happens on the back end, but there’s intelligence on the audience side, too. CIThread can also tie in to Web analytics engines to fold audience behavior into its decision-making. For example, the curator can set the engine to give extra weight to content that generates a lot of views or clicks, and to deliver more source material like it. All of these factors can be controlled via a dashboard.
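One way to picture that weighting is as a blend of two numbers: how well an item matches the curator's preferences, and how much audience engagement similar content has earned. The sketch below is our own illustration, not CIThread's formula. The function name, the doubling of clicks relative to views and the default weight of 0.3 are all assumptions chosen for the example.

```python
import math


def blended_score(relevance, views, clicks, engagement_weight=0.3):
    """Blend a relevance score (0 to 1) with audience engagement.

    engagement_weight plays the role of a dashboard dial: 0 ignores the
    audience entirely, 1 ignores the curator's learned preferences.
    """
    if not 0.0 <= engagement_weight <= 1.0:
        raise ValueError("engagement_weight must be between 0 and 1")
    # Assumption: a click counts twice as much as a view. The log damps
    # the effect of runaway hits so one viral item doesn't dominate.
    raw = math.log1p(views + 2 * clicks)
    engagement = raw / (1.0 + raw)  # squash into the range [0, 1)
    return (1.0 - engagement_weight) * relevance + engagement_weight * engagement


quiet = blended_score(relevance=0.5, views=10, clicks=0)
popular = blended_score(relevance=0.5, views=1000, clicks=50)
print(popular > quiet)
```

Two items with identical relevance are separated by their traffic, and turning the dial changes how much that traffic matters.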

Shhhhh!

CIThread is still pretty early stage. It has some test customers, but none can yet be identified. Here’s a general description of what one of them is doing, though.

This company owns a portfolio of properties throughout the US and uses localized websites as both a marketing and customer service tool. Each site contains frequently updated news about the region, but the portfolio is administered centrally for cost and quality reasons.

Using CIThread, individual editors can now maintain literally dozens of these websites at once. The more the engine learns about their preferences, the more sites they can support. That’s one of the coolest features of inference engines: they get smarter the more they’re used.

The technical brain behind CIThread is Mike Matchett, an MIT-educated developer with a background in computational linguistics and machine learning. The CEO is Tom Riddle (no relation to Lord Voldemort), a serial entrepreneur with a background in data communications, storage and enterprise software.

The two founders started out targeting professional publishers, and that’s a pretty safe bet. But we think the opportunity is much bigger. Nearly any company or organization today can develop unique value for its constituents by delivering curated content. Using tools like CIThread, they can do it more quickly and productively than by training humans. They can also capture the knowledge of their editors so that experience doesn’t walk out the door due to resignation or layoff.

Since we first wrote this, a couple of other tools have come to our attention that attack this same curation task. Curata has an engine that scours the Web for content and auto-posts it to blogs and social network sites. The company has a shipping product and real customers. Curata is positioning its service as more of a lead generation tool than an editorial productivity aid.

CurationStation looks a lot like Curata. It’s a low-cost service that filters content based upon keywords and publishes automatically to multiple destinations. The $2.99 signup incentive is attractive, but set a reminder on your calendar, because it turns into a $279 monthly fee after the first 30 days. If anyone has experience with either of these products, or is aware of other solutions, please comment.

Comments


This entry was posted on Friday, July 30th, 2010 at 5:12 am and is filed under Future of Journalism, Newspapers, Solutions.

[…] of Information requests from James Ball: Some great advice on making requests for information. Tools to Empower a New Kind of Journalism: Interesting piece on curation by Paul Gillin. Not-so lazy journalism: A great spot over at […]

In my experience, editors stopped editing, other than via automated content- and copy-editing many years ago. Not sure what they were doing, other than occasionally holding forth in meetings … kidding. SK