Neural networks have the ability to modify their behavior through learning, which gives them a powerful information-processing capability. There are two types of learning: supervised (with a teacher) and unsupervised (without a teacher).

The present talk focuses on fundamental mathematical aspects of learning in neural networks. We begin with a general theory of learning for a single neuron, followed by the mechanisms of unsupervised learning of feature extractors and self-organizing maps.
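As one classical example of such an unsupervised feature-extraction rule (given here for illustration; the talk may treat a more general formulation), Oja's rule drives the weight vector of a single linear neuron toward the principal component of its inputs:

\[
  y_t = w_t^{\top} x_t, \qquad
  w_{t+1} = w_t + \eta\, y_t \left( x_t - y_t\, w_t \right),
\]

where x_t is the input, \eta is a small learning rate, and the correction term -y_t^2 w_t keeps the weight vector normalized without any teacher signal.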

The MLP (multilayer perceptron) is a model of supervised learning with the universal power of approximating any continuous function. Backpropagation is a well-known learning method; however, its learning behavior is known to be very slow, because it is trapped in plateaus and takes a long time to escape from them.
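For concreteness, in its standard form (the notation below is generic and not taken from the talk), a one-hidden-layer MLP and the ordinary backpropagation update read

\[
  f(x;\theta) = \sum_{i=1}^{k} v_i\, \varphi(w_i^{\top} x + b_i),
  \qquad
  \theta_{t+1} = \theta_t - \eta\, \nabla_{\theta} L(\theta_t),
\]

where \varphi is a sigmoidal activation, \theta collects all weights, and L is the training loss. Plateaus occur where the gradient \nabla_{\theta} L nearly vanishes although the loss is still far from its minimum, typically near parameter configurations in which hidden units become redundant (e.g., w_i \approx w_j or v_i \approx 0).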
The present talk uses information geometry to understand the dynamical behavior of learning in MLPs. The set of MLPs forms a Riemannian manifold, in which the trajectories of learning are described. The manifold includes continua of singular points, which cause the plateau phenomena. Such a strange geometrical structure is ubiquitous in hierarchical systems. We analyze the dynamics of learning near the singularities explicitly, and propose a modified learning method, called the natural gradient method, which is free of the plateau phenomenon.
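As a sketch of the contrast with ordinary backpropagation (again in generic notation), the natural gradient update premultiplies the Euclidean gradient by the inverse of the Fisher information matrix G(\theta), which plays the role of the Riemannian metric on the manifold of MLPs:

\[
  \theta_{t+1} = \theta_t - \eta\, G(\theta_t)^{-1} \nabla_{\theta} L(\theta_t),
  \qquad
  G(\theta) = \mathrm{E}\!\left[ \nabla_{\theta} \log p(x, y; \theta)\, \nabla_{\theta} \log p(x, y; \theta)^{\top} \right].
\]

Because G encodes the geometry of the underlying statistical model, the update follows the steepest-descent direction with respect to the Riemannian metric rather than the Euclidean one, which is what allows it to pass through the neighborhoods of the singularities without the long plateaus of plain gradient descent.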