Update: Forget Jeeves, ask Powerset

Last month, I blogged about Powerset, the first Google competitor that really nails natural-language search. VentureBeat reports that Microsoft just purchased the Silicon Valley start-up at the rumored price of $100 million. To follow up on this news, here's my introduction to Powerset from May, with some fresh analysis on what this buyout means.

Remember Ask Jeeves? The search engine branded itself as the web’s trusty maître d’. Type in your query – feel free to phrase it as a question – and Jeeves suggested where you could find an answer. But Jeeves turned off many users by directing them toward rather irrelevant websites. Since then, most search engines – including Jeeves’s replacement, Ask.com – ignore all of those whos, whats, whens, wheres, and whys. They just pluck out keywords.

Along comes Powerset. This startup website actually reads what you wrote. The search engine encourages you to write the way you speak, and then uses your phrasing to search entries in Wikipedia.

Type in “Who started Google?” and Powerset’s first response is portraits of Sergey Brin and Larry Page, Google’s co-founders. The labeled pictures link to their Wikipedia entries. If that wasn’t quite what you wanted, Powerset offers other links, just like any other search engine.

Google has incorporated a few instant replies – answers that don’t require you to click through to another page. The search field can convert measurements, calculate exchange rates, and answer “What is the population of Chile?”

But if you rephrase the question and ask Google “How many people are in Chile?” the search engine doesn’t answer “16,284,741,” as it did before. Powerset, on the other hand, answers both versions with the correct number.

The site is not perfect. For one, it only searches Wikipedia entries, so its pool of knowledge is incomplete and possibly inaccurate. The startup hopes to refine the algorithm and expand its sources, maybe one day allowing Powerset to search the entire web.

Update: Another strike against Powerset is that it's really good at answering some questions and awful at others. I must have been lucky back in May because it answered almost all of my questions correctly the first time around. Now, as I play with it again, it's batting about .400. It seems happier with "who" questions than with "when" questions. I bet this is because of its source – Wikipedia has comprehensive entries relating to people and places, and pretty meager posts on individual years. Let's hope Microsoft helps the search engine improve its swing.

For now, though, Powerset is an early player in what’s called “semantic search” – that’s an ugly, technical term for teaching computers to understand natural language. Many futurists think that the “semantic Web” will be the next wave in Internet sites – a so-called Web 3.0. Expect a lot more of these natural-language options to come.

Update: Google has been a bit dismissive of semantic search, preferring (for now at least) its quick keyword approach. But this Microsoft news puts a lot of weight – and $100 million – behind the notion that web users want to ask a search engine questions, not just feed it keyword clues. We have yet to see if Microsoft will keep the Powerset name or, more likely, integrate the technology into its Live Search. That site certainly needs some help. The company has fought a losing battle against Google and Yahoo for years now. Despite its best efforts and even cash incentives, Microsoft has not been able to distinguish itself. Offering a strong semantic search option is a good way to reboot the challenge.

But it had better get moving. Other semantic startups, such as Hakia and TextDigger, might not have the media attention that has been given to Powerset, but a sudden breakthrough or similar buyout from a competitor could quickly change the equation. Also, Google will continue to tinker with its search options. It already suggests that users try fill-in-the-blank search terms like "the parachute was invented by *" and let Google hunt for an answer to the asterisk.