The weight transport problem

Jun 30, 2017
• Aidan Rocke

Introduction:

In an excellent paper published less than two years ago, Timothy Lillicrap, a theoretical neuroscientist at DeepMind, proposed a simple yet surprisingly effective solution to the weight transport problem. Essentially, Lillicrap and his co-authors showed that it's
possible to do backpropagation with fixed random feedback weights and still obtain very competitive results on various benchmarks [2].
This is significant because it marks an important step towards biologically plausible deep learning.

The weight transport problem:

While backpropagation is a very effective approach for training deep neural networks, at present it’s not at all clear whether
the brain might actually use this method for learning. In fact, backprop has three biologically implausible requirements [1]:

1. feedback weights must be the same as feedforward weights

2. forward and backward passes require different computations

3. error gradients must be stored separately from activations

A biologically plausible solution to the second and third problems is to use an error propagation network with the same topology
as the feedforward network but used only for backpropagation of error signals. However, there is no known biological mechanism
for this error network to know the weights of the feedforward network. This makes the first requirement, weight symmetry, a
serious obstacle.

This is also known as the weight transport problem [3].

Random synaptic feedback:

The solution proposed by Lillicrap et al. is based on two key observations:

1. Any fixed random matrix $B$ may serve as a substitute for the transposed feedforward matrix $W^T$ in backpropagation, provided that on average we have:

$$e^T W B e > 0$$

where $e$ is the error in the network's output. Geometrically, this is equivalent to requiring that the feedback signal $Be$ and the true gradient signal $W^T e$ are within $90°$ of each other.

2. Over time we get better alignment between $W^T$ and $B$ due to the modified update rules, which means that the first requirement becomes
easier to satisfy with more iterations.
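To make the geometry concrete, here is a quick numeric check (a sketch in NumPy; the dimensions and random seed are arbitrary choices of mine) that $e^T W B e$ equals the dot product of the true gradient signal $W^T e$ with the feedback signal $Be$, so its sign tells us whether the two are within $90°$ of each other:

```python
import numpy as np

rng = np.random.default_rng(0)
n_out, n_hidden = 10, 20

W = rng.standard_normal((n_out, n_hidden))  # feedforward weights (hidden -> output)
B = rng.standard_normal((n_hidden, n_out))  # fixed random feedback matrix
e = rng.standard_normal(n_out)              # error in the network's output

quadratic = e @ W @ B @ e         # e^T W B e
dot = (W.T @ e) @ (B @ e)         # (W^T e) . (B e) -- the same quantity
assert np.isclose(quadratic, dot)

cos = dot / (np.linalg.norm(W.T @ e) * np.linalg.norm(B @ e))
angle = np.degrees(np.arccos(cos))
print(f"e^T W B e = {quadratic:.3f}, angle between W^T e and Be = {angle:.1f} deg")
```

Note that for a freshly drawn random $B$ the sign of $e^T W B e$ is essentially a coin flip; it is the learning dynamics that drive $W$ into alignment with $B^T$ so that the condition holds on average.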

A simple example:

Let's consider a simple three-layer linear neural network that is intended to approximate a linear mapping $y = Tx$:

$$h = W_0 x, \qquad \hat{y} = W h$$

The loss is given by:

$$\mathcal{L} = \frac{1}{2} e^T e, \qquad e = y - \hat{y}$$

From this we may derive the following backpropagation update equations:

$$\Delta W \propto e h^T \qquad (1)$$
$$\Delta W_0 \propto (W^T e) x^T \qquad (2)$$

Now the random synaptic feedback innovation is essentially to replace step (2) with:

$$\Delta W_0 \propto (B e) x^T$$

where $B$ is a fixed random matrix. As a result, we no longer need explicit knowledge of the original feedforward weights in our update equations.
I actually implemented this method for a three-layer sigmoid (i.e. nonlinear) neural network and obtained 89.5% accuracy on the MNIST dataset
after 10 iterations, a result
that is competitive with backpropagation.
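The whole procedure can be sketched on synthetic data (a toy setup of my own, not the paper's code: a random linear target, full-batch updates, and illustrative dimensions and learning rate):

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 8, 8, 4

# synthetic data from a random linear target mapping y = Tx
T = rng.standard_normal((n_out, n_in))
X = rng.standard_normal((200, n_in))
Y = X @ T.T

W0 = 0.1 * rng.standard_normal((n_hidden, n_in))  # input -> hidden
W = 0.1 * rng.standard_normal((n_out, n_hidden))  # hidden -> output
B = rng.standard_normal((n_hidden, n_out))        # fixed random feedback matrix

def loss(W, W0):
    return 0.5 * np.mean(np.sum((Y - X @ W0.T @ W.T) ** 2, axis=1))

lr = 0.01
initial = loss(W, W0)
for _ in range(5000):
    H = X @ W0.T                         # hidden activations, one row per sample
    E = Y - H @ W.T                      # output errors
    W += lr * E.T @ H / len(X)           # the usual delta rule for the top layer
    W0 += lr * (E @ B.T).T @ X / len(X)  # B e in place of W^T e
final = loss(W, W0)
print(f"loss: {initial:.3f} -> {final:.6f}")
```

For a nonlinear network, the feedback term $Be$ would additionally be multiplied elementwise by the derivative of the activation, exactly as $W^T e$ is in ordinary backprop.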

Discussion:

In spite of its remarkable simplicity, Lillicrap's solution to the weight transport problem is very effective, and I think it
deserves further investigation. In the near future I plan to implement random synaptic feedback for much larger sigmoid and ReLU networks,
as well as recurrent neural networks, in order to build upon the work of [1].

Considering all the approaches to biologically plausible deep learning attempted so far, I believe this work represents a very important step forward.