This type of pattern matching and fact extraction is part of how Google uses the Web as a database of information. By extracting facts and storing them in a repository like Google’s Knowledge Graph, the search engine can serve those facts as direct answers.
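To make the idea concrete, here is a minimal sketch of pattern-based fact extraction feeding a direct-answer lookup. The regex patterns and the relation names are hypothetical illustrations of the technique, not Google’s actual extraction pipeline.

```python
import re

# Hypothetical extraction patterns mapping a sentence shape to a relation.
# Real systems use far richer NLP; this only illustrates the concept.
PATTERNS = [
    # "X was born in Y" -> (X, born_in, Y)
    (re.compile(r"(\w[\w ]*?) was born in (\w[\w ]*)"), "born_in"),
    # "X is the capital of Y" -> (X, capital_of, Y)
    (re.compile(r"(\w[\w ]*?) is the capital of (\w[\w ]*)"), "capital_of"),
]

def extract_facts(text):
    """Return (subject, relation, object) triples matched in the text."""
    facts = []
    for pattern, relation in PATTERNS:
        for subj, obj in pattern.findall(text):
            facts.append((subj.strip(), relation, obj.strip()))
    return facts

# A tiny "fact repository": triples indexed for direct-answer lookup.
repository = {}
for s, r, o in extract_facts(
    "Ada Lovelace was born in London. Helsinki is the capital of Finland."
):
    repository[(s, r)] = o

print(repository[("Helsinki", "capital_of")])  # Finland
```

Once facts sit in such a repository, answering “What is the capital of Finland?” becomes a key lookup rather than a document search.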

The search engine was described and demoed on 60 Minutes. The segment runs only five minutes and can be viewed at CBS: “DARPA: Nobody’s safe on the Internet.” Note that the objective is to help law enforcement track down crime, and to do so through data mining.

Developers haven’t given up on data visualization for the search interface. Etsimo, a Finnish startup, is a new contender with its SciNet interface, which is available as a demo. This TechCrunch article includes a short video.

The SciNet approach to the increasingly hard problem of effective search is to involve human users more, having them steer the algorithmic results by signaling multiple intents as the process progresses. This generates a dynamic, visible spectrum of results, depending on what they are looking for or interested in, and lets them selectively drill down into complex queries in an informed, self-guided way. The basic idea is that human-steered results are better than algorithms alone.

Visual search interfaces have never seemed to grab users, though. It will be interesting to see whether this company succeeds.

– Google has patents on something it calls a “browseable fact repository” – this became the Knowledge Graph.

– Google engineers considered a query language for this. It would have had to be something like SPARQL, which is to a graph of facts what SQL is to a relational database. Bill Slawski describes the key bits, which we can be pretty sure almost no one would ever learn.
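A toy example makes it clear why such a query language would be hard for ordinary users to learn. Below is a minimal sketch in Python of the triple-pattern matching a SPARQL-style language performs over a fact repository; the data and variable syntax are illustrative, since Google’s internal language was never published.

```python
# A toy triple store and a SPARQL-style triple-pattern query over it.
# Variables start with "?", as in SPARQL: SELECT ?x WHERE { ?x capital ?y }
TRIPLES = [
    ("Finland", "capital", "Helsinki"),
    ("Finland", "type", "Country"),
    ("Norway", "capital", "Oslo"),
    ("Norway", "type", "Country"),
]

def match(pattern, triples):
    """Yield one dict of variable bindings per triple matching the pattern."""
    for triple in triples:
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                binding[want] = have      # bind the variable
            elif want != have:
                break                     # constant mismatch: skip triple
        else:
            yield binding

# "Which entities have a capital, and what is it?"
for b in match(("?country", "capital", "?city"), TRIPLES):
    print(b["?country"], "->", b["?city"])
# Finland -> Helsinki
# Norway -> Oslo
```

Even this stripped-down form demands that users think in variables and graph patterns, which supports Slawski’s skepticism that anyone outside engineering would ever write such queries.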

This article provides a good overview, with illustrations, of how search works. The classic types of search – navigational, informational, and transactional – are noted, as is the fact that Google addresses all of these “using semantic and exploratory techniques to information retrieval”. Search has changed greatly over the past 5 to 10 years to become much more personalized, more monitored, and more commercialized – all concerns explored here.

Entities are the key to search-engine placement – not keyword trigger words, but content that names people, places, and events, and provides other important information. Bill Slawski gives the example of building a page on Black History that will rank well based entirely on meaningful content.

The best-designed sites provide a clear navigation structure (such as a taxonomy or table of contents) to direct users to content. That structure, as we learn in this article, informs and guides users in ways that keyword search doesn’t. Keyword search requires that the user already have domain knowledge and query-writing skill.

Navigation serves important functions: it shows people what they can find on the site and teaches them the structure of the search space. Using the navigation categories is often faster and easier than formulating a good search query. Moreover, site search often does not work well, or requires users to understand its limitations.

Search engines today – especially Google and Bing – seek to identify entities and their relationships. This posting distinguishes between implicit and explicit entities: explicit entities are identified from structured markup, while implicit entities must be inferred from the text on the page.

Google does index tweets, but not a substantial percentage of them and not very quickly, according to this analysis by Eric Enge at Stone Temple Consulting. This may mean that Google doesn’t use tweets as a ranking factor.

Google search guru Daniel Russell spoke on “The Revolution in Asking and Answering Questions” in this video, showing how new search capabilities empower us to ask new questions and get “deep answers”. The lecture has many good examples from Google search and closes with a demo of a conversational style of question answering.