What Problem Does Natural Language Search Solve?

from the just-wondering dept

Matt Marshall recently posted a story about a new search engine looking to raise a lot of money at a very high valuation, which has created quite a bit of buzz as people argue over whether or not the company has a chance, or deserves such a high valuation. Matt followed up with more details on the company, though he still expresses some reasonable skepticism. Like many people, my first reaction on hearing about it was that I can't remember a year that's gone by without someone claiming to have come out with a revolution in natural language search. However, when it comes to search engine news, no one can go through the history and explain why something is a bad idea quite like Danny Sullivan can. He lists out all the attempts at natural language search, and shows how each one failed (in some cases, miserably). He also points out that the problem with natural language search is that it requires everyone to change their behavior. As with any startup, when you're looking at their chances, the big question to ask is pretty simple: what problem does it solve? Plenty of people have figured out how to search with keywords. In fact, many of us find it more natural and faster than trying to construct a natural language query. So, while all the natural language search engines that come along insist that searches suck because they can't understand the searcher, it's not clear that's the real problem. When people use a search engine, they want to find what they're looking for. That means being able to search quickly. Dumping two or three keywords into a box is always going to be a lot faster than figuring out the natural language equivalent. So, perhaps someone can enlighten us. What is the problem natural language search solves?

I don't agree... if implemented right, natural language search is very powerful.

For example:

I want to find out why Pres. Bush is ineffective... with a keyword search, the search engine will spit out all the pages containing "bush" and "ineffective", which might not contain the answer to my question... on the other hand, a natural language search engine will give me the pages that have information about why the Bush presidency is not working, even if those pages don't have "bush" and "ineffective" in them.

We definitely need something better than what we h

It's not so much that we need "natural language" search but something more than simple "page ranked query string matching". For example, I am looking for the answer to the following question: "What makes a cell decide that it is a good time to divide itself into two? Is it something based on timing, or some chemical signal or pressure inside the cell or what?" I have tried various keyword combinations and nothing has quite answered it.

I agree that the real problem is not the search query, but the index. Something may be a relevant document but not have the search terms embedded in it. Imagine someone asking you a question. Would you use the exact same words in your answer? Not always. This is the essence of the search problem, and what "natural language" usually claims to fix (and fails).
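To make the mismatch concrete, here is a rough sketch in Python. It is not how any real engine works; the synonym table, the sample document, and the function names are all made up for illustration. It only shows a strict keyword match missing a relevant page, and a crude expansion of the query terms catching it.

```python
import re

def tokens(text):
    """Lowercase a string and return its words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def keyword_match(query, document):
    """Strict keyword search: every query term must appear in the document."""
    return tokens(query) <= tokens(document)

# Hypothetical, hand-written synonym table (an assumption for this sketch).
SYNONYMS = {
    "divide": {"divide", "division", "mitosis"},
    "cell": {"cell", "cells"},
    "signal": {"signal", "trigger", "triggers"},
}

def expanded_match(query, document):
    """Match if each query term, or one of its listed synonyms, is in the document."""
    doc = tokens(document)
    return all(SYNONYMS.get(term, {term}) & doc for term in tokens(query))

# Made-up document that answers the question without using the query's words.
doc = "Growth factors trigger mitosis once cells pass a size checkpoint."
query = "cell divide signal"

print(keyword_match(query, doc))   # False: none of the exact terms appear
print(expanded_match(query, doc))  # True: each term is covered by a synonym
```

A hand-built table like this obviously doesn't scale, which is more or less the point: the hard part is in the index, not in how the question is typed.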

RE: dot dot dot

Autonomy?

OK folks. Thanks for all the examples of why natural language search will work.

Autonomy does that. We implemented that in our company... good stuff.

They can even read the contents of video and audio files and index the words spoken in the files. Imagine being able to jump right to the spot where the words were spoken in a video or audio file... Autonomy can do that.

I assume nobody is suggesting that everybody should start using natural language searches instead of keyword searches. The natural language search would be solving the problem that there are people who want to use a natural language search. My wife almost always uses complete sentences in searches on Google. It’s a little embarrassing...

Something may be a relevant document but not have the search terms embedded in it. Imagine someone asking you a question. Would you use the exact same words in your answer? Not always.

What would be ideal would be if a search engine matched up the search term/phrase used with the keywords in the resulting page of a successful search. The next time somebody enters a similar search phrase, the pages that answered the first user's query would be given more weight for the second user. It would involve somehow guessing whether a search was successful or not, which may or may not be possible.
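Something like the following Python sketch is what I have in mind. It is only a toy under big assumptions: the class and method names are invented, "similar" just means sharing words, and it takes for granted that something else has already decided a search was successful, which is exactly the hard part.

```python
class FeedbackRanker:
    """Toy re-ranker that boosts pages which satisfied earlier, similar queries."""

    def __init__(self):
        # successes[term][page] = number of successful searches containing
        # `term` that ended on `page`
        self.successes = {}

    def record_success(self, query, page):
        """Remember that `page` answered `query` (the success signal is assumed)."""
        for term in query.lower().split():
            per_term = self.successes.setdefault(term, {})
            per_term[page] = per_term.get(page, 0) + 1

    def rerank(self, query, candidates):
        """Move pages that answered queries sharing terms with `query` to the top."""
        terms = query.lower().split()

        def boost(page):
            return sum(self.successes.get(t, {}).get(page, 0) for t in terms)

        # sorted() is stable, so pages with equal boost keep their original order.
        return sorted(candidates, key=boost, reverse=True)


ranker = FeedbackRanker()
# Placeholder page names, not real URLs.
ranker.record_success("why do cells divide", "biology.example/mitosis")
results = ranker.rerank(
    "when do cells divide",
    ["news.example/story", "biology.example/mitosis", "shop.example/item"],
)
print(results[0])  # biology.example/mitosis
```

Counting shared words is about the crudest possible notion of a "similar search phrase", and common words like "do" would need to be filtered out, but it shows where the extra weight would come from.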

The real issue

isn't the search engines we currently have right now... it's the people using the search engines. I don't know how many times I've had someone tell me they can't find "such and such about such" online, and I google such, such and such and what they're looking for is within the first few pages. On a rare occasion I have to do a broader search for something and do a little perusing through pages, but really, it takes 10 minutes tops to find ANYTHING AT ALL ONLINE with today's search engines.

What it solves

Natural language search solves the problem of information retrieval as opposed to "search".
Earlier today, I needed to find what time trains leave New York Penn Station for Metropark on October 25. To do so, I googled "new jersey transit". Google was smart enough to give me, as an option under the njtransit.com website, a deeplink to the rail schedules. But I then had to go to the schedules, select New York Penn Station as the departure station, Metropark as the arrival station, and the Weekday schedule, and then submit the request to get the schedules I wanted.
That's fine - but the real question is "what time do trains leave new york penn station for metropark on october 25," and it would be quicker and easier to enter that question and have it give me the schedules I wanted.

One other point to be made in favor of natural language search is that it serves the purpose of the so-called "advanced search" interfaces to most search engines far better than trying to teach the general public about boolean search terms or regular expressions.

I am quite at home using the more advanced search features of most search engines to pull out the specific details I am interested in, but my parents wade through pages and pages of crap trying to get to the document they are looking for. This is good for Google et al. because the user is exposed to more ads, but only because their interface fails to serve anything but the most primitive queries for the general user.

Natural Language? Naturally!

Powerset, even by the name, is going to offer a lot more than natural language capability. The holy grail is to harness AI in ways Google, Yahoo, and MSN (with their amazing neural net approach) have yet to do. Also significant is that most people on earth don't use online searches yet. This will change and online search will be the overwhelming choice of an info hungry world. When it does, natural language is the obvious approach to queries.

Re: What it solves

The trouble with your transit question is that the page probably doesn't exist. The database has the info and the search engine can only get you to the gateway page. I typed "yyz arrival flights" (Google and Yahoo) and got the link to the gateway page in the first result. Then I typed in the departure city and got the result with the updated arrival time. Hard to beat. I tried the same searches but with the departure city included and didn't get the result on the first page of either engine. Too much detail confused them into returning too many results.