Google brain simulator identifies cats on YouTube

When computer scientists at Google's mysterious X lab built a neural network of 16,000 computer processors with one billion connections and let it browse YouTube, it did what many web users might do -- it began to look for cats.

The "brain" simulation was exposed to 10 million randomly selected YouTube video thumbnails over the course of three days and, after being presented with a list of 20,000 different items, it began to recognise pictures of cats using a "deep learning" algorithm. This was despite being fed no information on distinguishing features that might help identify one.

Picking up on the most commonly occurring images featured on YouTube, the system achieved 81.7 percent accuracy in detecting human faces, 76.7 percent accuracy when identifying human bodies and 74.8 percent accuracy when identifying cats. "Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not," the team says in its paper, "Building high-level features using large scale unsupervised learning", which it will present at the International Conference on Machine Learning in Edinburgh, 26 June-1 July. "The network is sensitive to high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained it to obtain 15.8 percent accuracy in recognising 20,000 object categories, a leap of 70 percent relative improvement over the previous state-of-the-art [networks]."
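The "70 percent relative improvement" in the quote can be unpacked with a little arithmetic. The sketch below assumes "relative improvement" means (new - old) / old, which is the usual reading but is not spelled out in the article; the function name `previous_accuracy` is purely illustrative. Under that assumption, the quoted figures imply the earlier best result on the same 20,000-category task was a little over nine percent.

```python
def previous_accuracy(new_accuracy, relative_improvement):
    """Back out the earlier accuracy implied by a relative improvement.

    Assumes relative_improvement = (new - old) / old, so old = new / (1 + r).
    """
    return new_accuracy / (1.0 + relative_improvement)


# Figures quoted from the paper: 15.8 percent accuracy, 70 percent relative gain.
baseline = previous_accuracy(15.8, 0.70)
print(f"Implied previous state of the art: {baseline:.1f} percent")
print(f"Absolute gain: {15.8 - baseline:.1f} percentage points")
```

In other words, a 70 percent relative leap here corresponds to a gain of only a few percentage points in absolute terms, because the task of telling 20,000 categories apart starts from a very low base.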


The findings -- which could be useful in the development of speech and image recognition software, including translation services -- echo the "grandmother cell" theory, which holds that certain human neurons are programmed to identify objects considered significant. The "grandmother" neuron is a hypothetical neuron that activates every time it experiences a significant sound or sight. The concept would explain how we learn, through repetition, to discriminate between and identify objects and words.

"We never told it during the training, 'This is a cat,'" Jeff Dean, the Google fellow who led the study, told the New York Times. "It basically invented the concept of a cat."

"The idea is that instead of having teams of researchers trying to find out how to find edges, you instead throw a ton of data at the algorithm and you let the data speak and have the software automatically learn from the data," added Andrew Ng, a computer scientist at Stanford University involved in the project. Ng has been developing algorithms for learning from audio and visual data for several years at Stanford.
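Ng's description of letting "the data speak" can be illustrated with a toy example. The sketch below is not Google's network and makes no claim about its architecture: it is a minimal, hypothetical one-unit autoencoder written in plain Python. It is shown unlabelled two-dimensional points and is only asked to compress each point to a single number and reconstruct it. All function names and the synthetic data are assumptions made for illustration.

```python
import random


def encode(w, x):
    """Compress a point to one hidden activation (a weighted sum)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def reconstruction_error(w, data):
    """Mean squared error after encoding and decoding with the same weights."""
    total = 0.0
    for x in data:
        h = encode(w, x)
        total += sum((wi * h - xi) ** 2 for wi, xi in zip(w, x))
    return total / len(data)


def train(data, steps=300, lr=0.02, seed=0):
    """Gradient descent on reconstruction error; no labels are used anywhere."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.1, 0.1) for _ in data[0]]
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x in data:
            h = encode(w, x)
            err = [wi * h - xi for wi, xi in zip(w, x)]  # reconstruction minus input
            err_dot_w = sum(e * wi for e, wi in zip(err, w))
            for i, xi in enumerate(x):
                # Derivative of the squared error with respect to w[i]
                grad[i] += 2.0 * (err[i] * h + err_dot_w * xi)
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def make_unlabelled_data(n=50, seed=1):
    """Synthetic points scattered along the line y = 2x, with a little noise."""
    rng = random.Random(seed)
    data = []
    for k in range(n):
        t = -1.0 + 2.0 * k / (n - 1)
        data.append([t, 2.0 * t + rng.gauss(0.0, 0.05)])
    return data


data = make_unlabelled_data()
w = train(data)
print("learned direction:", w)
print("reconstruction error:", reconstruction_error(w, data))
```

Nobody tells the program that the points lie along a line, yet the learned weights end up pointing along it, because that is the structure that best explains the data. The system described in the paper is vastly larger and works on images, but the principle Ng describes is the same: the features come from the data rather than from hand-written rules.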

Since its existence became public in 2011, the secretive Google X lab -- thought to be located in the California Bay Area -- has released research on the Internet of Things, a space elevator and autonomous driving.

Its latest venture, though not nearing the number of neurons in the human brain (thought to be over 80 billion), is one of the world's most advanced brain simulators. In 2009, IBM developed a brain simulator that replicated one billion human brain neurons connected by ten trillion synapses.

However, Google's latest offering appears to be the first to identify objects without hints and additional information. The network continued to correctly identify these objects even when they were distorted or placed on backgrounds designed to disorientate. "So far, most [previous] algorithms have only succeeded in learning low-level features such as 'edge' or 'blob' detectors," says the paper.


Ng remains skeptical and says he does not believe they have yet hit on the perfect algorithm.

Nevertheless, Google considers it such an advance that the research has made the giant leap from the X lab to its main labs.