Archives for September 2014

Google’s Pierre Far announced on his Google+ page that Google was releasing a new Panda update that supposedly included some new signals that could potentially help “identify low-quality content more precisely.”

The Google+ post also tells us that this change can help lead to a “greater diversity of high-quality small- and medium-sized sites ranking higher, which is nice.”

A new patent application shows off a quality scoring approach for content, based upon phrases. More on that patent filing below, but it might have something to do with this update.

In the past few years, Google has been busy building what has become known as the Google Brain team, which started out by having its deep learning system watch videos until it learned to recognize cats.

In creating a knowledge base, there seem to be a number of approaches that can be used to supply entities and facts from sources like web pages and query logs.

In my last post, I wrote about how search queries might be used, along with linguistic patterns, to extract attributes about facts from those search queries, as described in a patent titled Inferring attributes from search queries.
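To make the idea concrete, here is a minimal sketch of pulling entity and attribute pairs out of queries with a few hand-written patterns. The three patterns and the function names `extract_attribute` and `collect_attributes` are my own assumptions for illustration; the patent does not spell out its patterns here, and a real system would be far more involved.

```python
import re
from collections import defaultdict

# Hypothetical linguistic patterns, chosen only for illustration.
# Each one names the part of the query treated as the entity and
# the part treated as the attribute.
PATTERNS = [
    re.compile(r"^what is the (?P<attribute>[\w ]+?) of (?P<entity>[\w ]+)$"),
    re.compile(r"^(?P<attribute>[\w ]+?) of (?P<entity>[\w ]+)$"),
    re.compile(r"^(?P<entity>[\w ]+?)'s (?P<attribute>[\w ]+)$"),
]


def extract_attribute(query):
    """Return (entity, attribute) if the query fits a pattern, else None."""
    normalized = query.strip().lower().rstrip("?")
    for pattern in PATTERNS:
        match = pattern.match(normalized)
        if match:
            return match.group("entity").strip(), match.group("attribute").strip()
    return None


def collect_attributes(queries):
    """Count how often each attribute is asked about for each entity."""
    counts = defaultdict(lambda: defaultdict(int))
    for query in queries:
        result = extract_attribute(query)
        if result is not None:
            entity, attribute = result
            counts[entity][attribute] += 1
    return counts
```

Counting how often an attribute shows up for an entity across many queries is one plausible way to decide which attributes are worth keeping, though that threshold step is left out of this sketch.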

A Microsoft paper from 2009, Named Entity Recognition in Query, describes a manual analysis of 1,000 queries, which found that 70% of those queries contained named entities.

So entities do appear in queries, and Google receives a lot of queries a day (as do Microsoft and Yahoo).

Millions of searches stream into Google every day as people try to meet their informational and situational needs. But those searches don’t disappear once they have been answered. They provide Google with some very interesting and useful information in return. For instance, they tell Google what people are interested in right at this moment, in real time.

Those queries can help Google populate its knowledge base with more information as well.

When Google collects information about entities (people, places, and things, including products and brands), it might also collect information about the attributes associated with those entities.

A couple of days ago, the Google Research Blog told us about how it might include that kind of factual information in search results, in a feature it calls Structured Snippets. In that post, Google told us that it finds information like this in tables across the web.

Entities change all the time, and facts about them do as well. Imagine that when Derek Jeter retires from playing baseball, he decides to become a coach. Or that Tom Cruise acts in a new movie, and decides to try directing and producing it as well. Or that Scotland decides whether or not it should be independent of the UK after 300 years.

How we think of entities can change over time, both in the type of entity they are and in the facts associated with them. When populations of places change, and they do on a regular basis, how does that information get updated? And unfortunately, some information never quite makes it to Google’s knowledge base.

A patent application published last week looks at some ways that a knowledge base might be updated when a question-answering query is asked of it, and the search system notices that some information is missing.
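As a rough sketch of that idea, a knowledge base could record a gap whenever a question asks for a fact it does not hold, then try to fill those gaps later from some other source. Everything here is an assumption for illustration: the `KnowledgeBase` class, its `answer` and `fill_gaps` methods, and the `lookup` callable standing in for whatever source the real system would consult. It is not the method described in the patent application.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    # Maps (entity, attribute) to a value.
    facts: dict = field(default_factory=dict)
    # (entity, attribute) pairs that were asked about but not found.
    missing: set = field(default_factory=set)

    def answer(self, entity, attribute):
        """Return the stored value, or None after recording the gap."""
        key = (entity, attribute)
        if key in self.facts:
            return self.facts[key]
        self.missing.add(key)
        return None

    def fill_gaps(self, lookup):
        """Try to fill every recorded gap.

        `lookup` is a hypothetical callable taking (entity, attribute)
        and returning a value, or None if it has nothing to offer.
        Returns the number of gaps that were filled.
        """
        filled = 0
        # sorted() copies the set, so it is safe to modify it in the loop.
        for key in sorted(self.missing):
            value = lookup(*key)
            if value is not None:
                self.facts[key] = value
                self.missing.discard(key)
                filled += 1
        return filled
```

Recording the gap at question time, rather than crawling for everything up front, means the questions people actually ask decide which missing facts get looked up first.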