
Using large-scale brain simulations for machine learning and AI

June 27, 2012

"We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art." -- Google researchers (Credit: Google Research)
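The abstract compresses a lot of machinery. As a rough illustration of its core building block, a sparse autoencoder trained on unlabeled data by stochastic gradient descent, here is a minimal single-layer sketch. The actual model is nine layers, locally connected, with pooling and local contrast normalization; every size, constant, and dataset below is invented for illustration.

```python
import numpy as np

# Hedged sketch: one fully connected autoencoder layer with an L1-style
# sparsity penalty, trained on synthetic unlabeled "patches" by SGD.

rng = np.random.default_rng(0)

n_inputs, n_hidden = 64, 16          # e.g. 8x8 patches -> 16 features (made up)
W_enc = rng.normal(0.0, 0.1, (n_hidden, n_inputs))
W_dec = rng.normal(0.0, 0.1, (n_inputs, n_hidden))
b_enc = np.zeros(n_hidden)
b_dec = np.zeros(n_inputs)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_step(x, lr=0.05, lam=1e-3):
    """One SGD step on reconstruction error plus an L1-style sparsity penalty."""
    global W_enc, W_dec, b_enc, b_dec
    h = sigmoid(W_enc @ x + b_enc)            # encode
    x_hat = W_dec @ h + b_dec                 # decode (linear output)
    err = x_hat - x
    loss = 0.5 * err @ err + lam * h.sum()
    dh = W_dec.T @ err + lam                  # backprop: decoder + sparsity term
    dz = dh * h * (1.0 - h)                   # through the sigmoid
    W_dec -= lr * np.outer(err, h)
    b_dec -= lr * err
    W_enc -= lr * np.outer(dz, x)
    b_enc -= lr * dz
    return loss

# Unlabeled data: noisy mixtures of a few hidden templates; no labels anywhere.
templates = rng.normal(size=(4, n_inputs))
patches = rng.random((500, 4)) @ templates + 0.1 * rng.normal(size=(500, n_inputs))

first_pass = [sgd_step(x) for x in patches]
second_pass = [sgd_step(x) for x in patches]   # same data again: loss should drop
```

No label ever enters the update rule; the network improves purely by learning to reconstruct its input sparsely, which is the sense in which such features are "self-taught."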

Today’s machine learning technology takes significant work to adapt to new uses. For example, say we’re trying to build a system that can distinguish between pictures of cars and motorcycles.

In the standard machine learning approach, we first have to collect tens of thousands of pictures that have already been labeled as “car” or “motorcycle” — what we call labeled data — to train the system. But labeling takes a lot of work, and there’s comparatively little labeled data out there.

Fortunately, recent research on self-taught learning and deep learning suggests we might be able to rely instead on unlabeled data, such as random images fetched from the web or pulled out of YouTube videos. These algorithms work by building artificial neural networks, which loosely mimic the way neurons in the brain learn.

Neural networks are computationally costly, so to date, most networks used in machine learning have had only 1 to 10 million connections. “But we suspected that by training much larger networks, we might achieve significantly better accuracy,” said the Google team.

“So we developed a distributed computing infrastructure for training large-scale neural networks. Then, we took an artificial neural network and spread the computation across 16,000 of our CPU cores (in our data centers), and trained models with more than 1 billion connections.”
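The full distributed infrastructure is far beyond a snippet, but a toy single-machine analogue of asynchronous SGD conveys the key idea: workers compute gradients against possibly stale shared parameters and push updates without synchronizing with each other (in the spirit of lock-free, "Hogwild"-style training, a swapped-in stand-in for the real multi-machine scheme). The regression problem, sizes, and learning rate below are invented.

```python
import threading
import numpy as np

# Hedged sketch: several worker threads share one parameter vector and apply
# gradient updates with no locking. The real system shards both the model and
# the data across 16,000 cores; everything here is a toy stand-in.

rng = np.random.default_rng(1)
true_w = rng.normal(size=8)
X = rng.normal(size=(4000, 8))
y = X @ true_w                       # noiseless targets, so SGD can converge

w = np.zeros(8)                      # shared parameters ("parameter server" state)

def worker(indices, lr=0.01, epochs=3):
    for _ in range(epochs):
        for i in indices:
            grad = (X[i] @ w - y[i]) * X[i]    # gradient from possibly stale w
            np.subtract(w, lr * grad, out=w)   # update shared state, no lock

shards = np.array_split(rng.permutation(4000), 4)   # one data shard per worker
threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
for t in threads:
    t.start()
for t in threads:
    t.join()

max_err = float(np.max(np.abs(w - true_w)))   # small despite racy updates
```

Occasional lost or stale updates slow convergence slightly but do not prevent it, which is why asynchrony scales so well: no worker ever waits for another.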

“We then ran experiments that asked, informally: If we think of our neural network as simulating a very small-scale ‘newborn brain,’ and show it YouTube video for a week, what will it learn? Our hypothesis was that it would learn to recognize common objects in those videos.

“Indeed, to our amusement, one of our artificial neurons learned to respond strongly to pictures of… cats. Remember that this network had never been told what a cat was, nor was it given even a single image labeled as a cat. Instead, it “discovered” what a cat looked like by itself from only unlabeled YouTube stills. That’s what we mean by self-taught learning.

“Using this large-scale neural network, we also significantly improved the state of the art on a standard image classification test—in fact, we saw a 70 percent relative improvement in accuracy. We achieved that by taking advantage of the vast amounts of unlabeled data available on the web, and using it to augment a much more limited set of labeled data. This is something we’re really focused on—how to develop machine learning systems that scale well, so that we can take advantage of vast sets of unlabeled training data.”
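The recipe described in that paragraph, learning features from plentiful unlabeled data and then training a classifier on a much smaller labeled set, can be sketched end to end. In this toy version, principal components stand in for the deep learned features, and all data, dimensions, and constants are synthetic and invented for illustration.

```python
import numpy as np

# Hedged two-phase sketch: (1) learn features from unlabeled data alone,
# (2) train a classifier on a scarce labeled subset using those features.

rng = np.random.default_rng(2)

# 2,000 unlabeled points drawn around two hidden 50-D clusters.
centers = 3.0 * rng.normal(size=(2, 50))
pool_labels = rng.integers(0, 2, size=2000)          # hidden from phase 1
X_pool = centers[pool_labels] + rng.normal(size=(2000, 50))

# Phase 1: learn features from the unlabeled pool alone.
mean = X_pool.mean(axis=0)
_, _, Vt = np.linalg.svd(X_pool - mean, full_matrices=False)

def encode(X):
    return (X - mean) @ Vt[:5].T                     # project onto 5 components

# Phase 2: only 20 labeled examples train a logistic classifier on features.
Xl, yl = encode(X_pool[:20]), pool_labels[:20]
w, b = np.zeros(5), 0.0
for _ in range(500):
    z = np.clip(Xl @ w + b, -30.0, 30.0)             # avoid exp overflow
    g = 1.0 / (1.0 + np.exp(-z)) - yl                # logistic-loss gradient
    w -= 0.1 * Xl.T @ g / len(yl)
    b -= 0.1 * g.mean()

# Accuracy on 1,000 points whose labels were never used during training.
pred = encode(X_pool[1000:]) @ w + b > 0
acc = float((pred == (pool_labels[1000:] == 1)).mean())
```

The classifier generalizes well despite seeing only 20 labels because the feature learner already did most of the work from the unlabeled pool, which is exactly the leverage the Google team describes.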

“We’re actively working on scaling our systems to train even larger models. To give you a sense of what we mean by ‘larger’—while there’s no accepted way to compare artificial neural networks to biological brains, as a very rough comparison an adult human brain has around 100 trillion connections. So we still have lots of room to grow.

“And this isn’t just about images — we’re actively working with other groups within Google on applying this artificial neural network approach to other areas such as speech recognition and natural language modeling. Someday this could make the tools you use every day work better, faster and smarter.”

The research happened inside Google’s secretive X laboratory, known for inventing self-driving cars and augmented reality glasses, where a small group of researchers began working several years ago on a simulation of the human brain, the New York Times reports.

Google researchers are not alone in exploiting the techniques, which are referred to as “deep learning” models, the Times said. Last year Microsoft scientists presented research showing that the techniques could be applied equally well to build computer systems to understand human speech.

“The Stanford/Google paper pushes the envelope on the size and scale of neural networks by an order of magnitude over previous efforts,” said David A. Bader, executive director of high-performance computing at the Georgia Tech College of Computing. He said that rapid increases in computer technology would close the gap within a relatively short period of time: “The scale of modeling the full human visual cortex may be within reach before the end of the decade.”
