Deep Learning in Artificial Neural Networks (NNs) is about
credit assignment across many (not just a few)
subsequent computational stages or layers,
in deep or recurrent NNs. To our knowledge,
the expression "Deep Learning"
was introduced to
the NN field by the book of Aizenberg, Aizenberg & Vandewalle (2000) - more
in this G+ post.
The field itself is much older though.
The first Deep Learning systems of the feedforward multilayer perceptron type were created half a
century ago
(Ivakhnenko et al., 1965, 1967, 1968, 1971). The 1971 paper already described an adaptive deep network with
8 layers of neurons. (Our first recurrent Deep Learners were published
much later, in 1991 - more below.)

Recently the field has experienced a resurgence.
Since 2009, our Deep Learning team has won 9
(nine) first prizes in important
and highly competitive international pattern recognition competitions
(with secret test set known only
to the organisers), far more than any other team.
Our neural nets also were the first Very Deep Learners to win such contests (e.g., on classification, object detection, segmentation),
and the first machine learning methods to reach
superhuman performance in such a contest.
Here is the list of competitions won
(details in the rightmost column):

Remarkably, none of 1-9 & A-D above required the traditional
sophisticated computer vision techniques developed over the past six decades or so.
Instead, our rather biologically plausible systems are inspired by human brains,
and learn to recognize objects from numerous training examples.
We use deep, artificial, supervised,
feedforward or recurrent (deep by nature) neural networks with many
non-linear processing stages.

We started work on Very Deep Learning a quarter-century ago. Back then,
Sepp Hochreiter (now professor)
was an undergrad student working on Schmidhuber's neural net project. His 1991 thesis [2b]
is a Deep Learning milestone:
it formally showed that deep networks like the above are hard to train
because they suffer
from the now famous problem of vanishing or exploding gradients.
Since then we have developed various techniques to overcome this obstacle
(more here). The first system of 1991 used a stack of recurrent neural networks (RNNs) pre-trained in unsupervised fashion [2a,2c] to compactly encode input sequences, where lower layers of the hierarchy learn to extract compact sequence representations fed to higher layers. This can greatly accelerate subsequent supervised learning [2a,2c].
See [2c] for
an experiment with 1200 nonlinear virtual layers.
The 1991 system also was the first Neural Hierarchical Temporal Memory.
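The vanishing-gradient effect identified in that 1991 thesis can be illustrated with a small numerical sketch (this toy demo is ours, not taken from the thesis): when an error signal is backpropagated through many sigmoid layers, it is repeatedly multiplied by weight matrices and by sigmoid derivatives (at most 0.25), so its norm shrinks roughly geometrically with depth.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A deep chain of fully connected sigmoid layers sharing one weight matrix.
depth, width = 50, 20
W = rng.normal(0.0, 1.0 / np.sqrt(width), size=(width, width))

# Forward pass, keeping activations for the backward pass.
h = rng.normal(size=width)
acts = []
for _ in range(depth):
    h = sigmoid(W @ h)
    acts.append(h)

# Backward pass: each layer multiplies the error signal by W^T
# and by the sigmoid derivative h * (1 - h), which is at most 0.25.
grad_norms = []
delta = np.ones(width)
for h in reversed(acts):
    delta = (W.T @ delta) * h * (1.0 - h)
    grad_norms.append(np.linalg.norm(delta))

# Gradient norm at the topmost layer vs. the deepest layer:
print(grad_norms[0], grad_norms[-1])
```

Running this shows the gradient norm at the deepest layer is many orders of magnitude smaller than at the top, which is exactly why naive gradient descent in very deep or recurrent nets stalls without remedies such as the unsupervised pre-training described above.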

Today's computers are over a million times faster than those of 1991. We also use
graphics cards or GPUs (mini-supercomputers
for video games, see picture in 2nd column) to speed up learning by a factor of up to 50 compared to standard
CPUs. In 2010, we broke the MNIST (benchmark C) record through GPU-based plain backprop for standard NNs (no unsupervised pre-training, no convolution, etc.) [6].
Our committees of networks improve the results even further [9-13,17-19].
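The mechanics of such a committee are simple: several independently trained nets each output class probabilities, and the committee averages them before picking a class. A minimal sketch (the three "networks" below are hypothetical stand-ins, not our trained models):

```python
import numpy as np

def committee_predict(prob_list):
    """Average the class-probability outputs of several nets,
    then pick the most probable class for each input."""
    avg = np.mean(prob_list, axis=0)   # shape: (n_inputs, n_classes)
    return avg.argmax(axis=1)

# Softmax outputs of three hypothetical nets on two inputs,
# for a 3-class problem (each row sums to 1).
net_a = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
net_b = np.array([[0.4, 0.5, 0.1], [0.1, 0.7, 0.2]])
net_c = np.array([[0.7, 0.2, 0.1], [0.4, 0.3, 0.3]])

preds = committee_predict([net_a, net_b, net_c])
print(preds)  # -> [0 1]
```

Averaging tends to cancel the uncorrelated errors of the individual nets, which is why committees usually beat their best single member.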

Pattern recognition through Deep Learning
is becoming important for thousands of
practical applications. For example,
the future of search engines lies in image and video recognition as opposed to
traditional text search. Autonomous robots such as
driverless cars can greatly profit from it, too (see competitions 4,6). Deep Learning may even have lifesaving impact through
medical applications such as cancer detection, perhaps the application area with the highest potential impact (see competitions 8,9).
Reference [14] uses fast deep nets
to achieve superior hand gesture recognition.
Reference [16] uses them to achieve superior
steel defect detection, three times better than
support vector machines (SVM)
trained on commonly used feature descriptors.

Our simple training algorithms for
deep, wide, often recurrent, artificial neural networks
similar to biological brains have won
competitions on a routine basis and yield
the best known results on many famous benchmarks for computer vision, speech recognition, etc. Shown on this page are example images of traffic signs, Chinese characters,
connected handwriting, human tissue, sliced brains, etc.


We are currently experiencing a second Neural Network
ReNNaissance (title of JS' IJCNN 2011 keynote) - the first one happened in the 1980s and early 90s.
In many applications, our deep NNs are now outperforming all other methods
including
the theoretically less general and less powerful support vector machines
(which for a long time had the upper hand, at least in practice).
Check out the predictions of our RNNaissance workshop at NIPS 2003,
which in hindsight were not too optimistic,
and compare the RNN book preface.

[2d] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. Based on TR FKI-207-95, TUM (1995). PDF. Led to a lot of follow-up work, and is now used by leading IT companies all over the world.

[2c] J. Schmidhuber. Habilitation thesis, TUM, 1993. PDF. An ancient experiment with credit assignment across 1200 time steps or virtual layers and unsupervised pre-training for a stack of recurrent NNs
can be found here (in German; try Google Translate).

Ongoing work on active perception.
While Deep Learning methods tend to work well in many applications,
they are passive learners: they do not learn to actively
search for the most informative parts of the sensory input. Humans, however,
use sequential gaze shifts for visual pattern recognition,
which can be much more efficient than the fully parallel one-shot approach.
That's why we want to combine the algorithms above with variants of our old method of 1990 [1a] - back then
we built what to our knowledge was
the first artificial fovea sequentially steered by a learning neural controller.
Without a teacher, it used a variant of reinforcement learning
to create saccades and find targets in a visual scene (and to track moving targets), although computers were a million times slower back then.

Copyright notice (2013):
This page was derived from the computer vision page.
Fibonacci web design by
J.S.,
who
will be delighted if you use this web page
for educational and non-commercial purposes, including
articles for
Wikipedia and similar sites.
Last update December 2015.

COMPETITION DETAILS

Links to the original datasets of competitions 1-9 and benchmarks
A-D
mentioned in the leftmost column,
plus more information on the world records set by our team:

9. 22 Sept 2013: our deep and wide MC GPU-MPCNNs [8,17,18]
won the MICCAI 2013 Grand Challenge on Mitosis Detection (important for cancer prognosis etc).
This was made possible through the efforts of Dan and Alessandro [20].
Do not confuse this with the earlier ICPR 2012 contest below!
Comment: When we started our work on Very Deep Learning over two decades ago, limited computing power forced us to focus on tiny toy applications to illustrate the benefits of our methods. How things have changed!
It is gratifying to observe that today
our techniques may actually help to improve healthcare and save lives.

D. As of 1 Sep 2013, our Deep Learning Neural Networks are the best artificial offline recognisers of Chinese characters from the
ICDAR 2013 competition
(3755 classes), approaching human performance [23].
This is relevant for smartphone producers who want to build phones that can translate photos of foreign texts and signs.
As always in such competitions, GPU-based pure supervised gradient descent (40-year-old backprop) was applied to deep and wide multi-column networks with interleaving max-pooling layers and convolutional layers (multi-column GPU-MPCNNs) [8,17]. Many leading IT companies and research labs are now using this technique, too.
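The core building block of such an MPCNN, a convolutional layer followed by a non-overlapping max-pooling layer, can be sketched in a few lines of numpy (a toy forward pass with random filters, purely illustrative of the architecture, not our GPU implementation):

```python
import numpy as np

def conv2d_valid(img, kernels):
    """Valid 2D convolution of a single-channel image with a bank of
    kernels; returns one feature map per kernel, squashed by tanh."""
    kh, kw = kernels.shape[1:]
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((kernels.shape[0], oh, ow))
    for k, ker in enumerate(kernels):
        for i in range(oh):
            for j in range(ow):
                out[k, i, j] = np.sum(img[i:i+kh, j:j+kw] * ker)
    return np.tanh(out)

def maxpool2(maps):
    """Non-overlapping 2x2 max-pooling on each feature map."""
    c, h, w = maps.shape
    trimmed = maps[:, :h // 2 * 2, :w // 2 * 2]
    return trimmed.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))

rng = np.random.default_rng(0)
img = rng.normal(size=(12, 12))        # toy single-channel input
kernels = rng.normal(size=(4, 3, 3))   # 4 (normally learned) 3x3 filters

fmaps = conv2d_valid(img, kernels)     # -> shape (4, 10, 10)
pooled = maxpool2(fmaps)               # -> shape (4, 5, 5)
print(fmaps.shape, pooled.shape)
```

A full MPCNN simply stacks several such conv/pool pairs before the final fully connected classification layers; a multi-column net runs several of these stacks in parallel and averages their outputs.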

8. ICPR 2012 Contest on Mitosis Detection in Breast Cancer Histological Images (MITOS Aperio images).
There were 129 registered companies / institutes / universities from 40 countries, and 14 results.
Our team (with Alessandro & Dan) clearly
won the contest (over 20% fewer errors than the second best team, first Deep Learner to win a contest on object detection).
See ref [20], as well as the later MICCAI 2013 Grand Challenge above.

7. ISBI 2012
Segmentation of neuronal structures in EM stacks challenge.
See the TrakEM2 data sets of INI.
Our team won the contest on all three evaluation metrics
by a large margin,
with superhuman performance in terms of pixel error (March 2012) [15].
(First pure image segmentation competition won by a Deep Learner; ranks 2-6 for researchers at ETHZ, MIT, CMU, Harvard.)
This is relevant for the recent huge brain projects in Europe and the US, which try to build 3D models of real brains.

C. The MNIST dataset of NY University, 1998. Our team set the new record (0.35% error rate) in 2010 [6] (through plain backprop without convolution or unsupervised pre-training), tied it again
in January 2011 [8], broke it again in March 2011 (0.31%) [9], and again (0.27%, ICDAR 2011) [12],
and finally achieved the first human-competitive result: 0.23% [17] (mean of many runs; many individual runs
yield better results, of course, down to 0.17% [12]).
This represented a dramatic improvement, since by then the
MNIST record had hovered around 0.4% for almost a decade.

B. NORB object recognition dataset for stereo images, NY University, 2004.
Our team set the new record on the standard set (2.53% error rate) in January 2011 [8],
and achieved 2.7% on the full set [17] (best previous result by others: 5%).

A. The
CIFAR-10 dataset of Univ. Toronto, 2009.
Our team set the
new record (19.51% error rate) on these rather challenging data in January 2011 [8],
and improved this to 11.2% [17].

Three Connected Handwriting Recognition Competitions at ICDAR 2009 were won by
our multi-dimensional LSTM recurrent neural networks [3,3a,4] through
the efforts of Alex. This was the first RNN system ever to win an official international pattern recognition competition. To our knowledge, this also was the first Very Deep Learning system ever (recurrent or not) to win such contests:
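The LSTM cell at the heart of these recurrent nets [2d] protects its internal state with multiplicative gates, so error signals can flow through many time steps without vanishing. A minimal single-layer forward step in the now-standard formulation (with forget gate, which postdates the original 1997 cell; toy sizes and random weights for illustration only):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM layer.
    W has shape (4*n_hidden, n_in + n_hidden), b has shape (4*n_hidden,)."""
    n = h_prev.size
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0*n:1*n])   # input gate
    f = sigmoid(z[1*n:2*n])   # forget gate
    o = sigmoid(z[2*n:3*n])   # output gate
    g = np.tanh(z[3*n:4*n])   # candidate cell update
    c = f * c_prev + i * g    # gated cell state: the additive path lets
    h = o * np.tanh(c)        # gradients flow through time almost unchanged
    return h, c

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 5
W = rng.normal(0.0, 0.1, size=(4 * n_hidden, n_in + n_hidden))
b = np.zeros(4 * n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for t in range(4):            # run a short input sequence through the cell
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h.shape, c.shape)
```

The multi-dimensional LSTM used in the handwriting competitions generalizes this recurrence from a 1D sequence to 2D scans over the page, but the gated cell update is the same.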

Our algorithms not only were the first Very Deep Learners to win official international competitions with secret test sets (since 2009) and to become human-competitive, they also have numerous immediate industrial and medical applications. Apple & Google and many others adopted our techniques.
Are you an industrial company that wants to solve
interesting pattern recognition problems? Don't hesitate to contact
JS.
We already developed: