Taking Advantage of Paradigm Shifts

November 18th, 2012

Sometimes simply waiting for technology change is the quickest way of
moving forwards.

For some computation-intensive tasks, the quickest way to finish a
given computation
(for a given dollar amount of spending)
is simply to wait until a faster machine is available:
Moore’s law will bring the cost of a suitable machine down
so quickly that waiting and then running
can finish sooner than starting the job on a machine available at today’s prices.
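The break-even can be sketched with a toy calculation. Everything below - the helper, the 18-month cost-halving rate, and the job sizes - is an invented assumption for illustration, not data from this article:

```python
# Toy model: total time to finish a fixed-size job if we wait
# `wait_months` before buying hardware at the then-current price.
# All figures are illustrative assumptions.

def months_to_finish(core_hours, cores_per_dollar, budget,
                     wait_months=0.0, halving_months=18.0):
    # Price/performance improves exponentially while we wait.
    cores = budget * cores_per_dollar * 2 ** (wait_months / halving_months)
    hours_per_month = 24 * 30
    return wait_months + core_hours / (cores * hours_per_month)

# A job of 10 million core-hours, a $10,000 budget, and
# 1 core per $100 at today's prices.
start_now = months_to_finish(10_000_000, 0.01, 10_000)
wait_then_run = months_to_finish(10_000_000, 0.01, 10_000, wait_months=18)
print(f"start now: {start_now:.1f} months; wait 18, then run: {wait_then_run:.1f} months")
```

Under these made-up numbers, waiting 18 months buys twice the compute, and the job finishes in roughly 87 months instead of 139. Where the crossover lies depends entirely on the assumed halving time and the job size.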

Not just computers…

The pace of sequencing grew enormously during the lifespan of the genome project.
The methods used didn’t just get incrementally better;
there was an entire paradigm shift in the technologies used for sequencing. A strategic outsider
might have realised that the most efficient way to get the project done
was to wait until the paradigm shift occurred,
and only then spend wildly to get to the finish line first.

The stuck-in-a-rut feeling

After an initial rush of excitement when it was first devised,
the neural network training technique of backpropagation (BP) was soon
found to present a large number of computational hurdles.

For one, as a gradient descent method, BP proved slow to converge.
For another, networks tend to over-learn,
becoming brittle when presented with unseen data. A large number of techniques
have been devised to circumvent these problems -
adding momentum terms, scaling or clamping weights, or ‘early stopping’. But at some point,
these ‘tricks of the trade’ become more of a hindrance
to understanding - despite the fact that they seem to help with the problem at hand.
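As an illustration, one of those tricks - early stopping - can be sketched in a few lines. The helper and the synthetic loss curve below are invented for the example, not taken from any particular system:

```python
# A minimal sketch of early stopping: track the best held-out
# validation loss seen so far, and stop training once it has failed
# to improve for `patience` consecutive steps.

def early_stopping(val_losses, patience=5):
    best, best_step, since_best = float("inf"), 0, 0
    for step, v in enumerate(val_losses):
        if v < best:
            best, best_step, since_best = v, step, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # give up: validation loss has stopped improving
    return best_step, best

# A synthetic validation-loss curve: it improves, bottoms out at
# step 30, then rises again as the (imaginary) network over-learns.
curve = [(s - 30) ** 2 / 100 + 1.0 for s in range(100)]
best_step, best_loss = early_stopping(curve)
print(best_step, best_loss)  # best step is 30; training halts soon after
```

The trick clearly helps with the problem at hand - but note that it tells us nothing about *why* the network over-learns, which is exactly the complaint above.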

To some extent, the proliferation of new techniques to solve a given problem,
none of which directly address the core issues (e.g. that the chain rule dilutes
information about how to change weights), should act as a signal that
we’re busily exploring a cul-de-sac.

Knowing that the brain has actually solved the problem of consciousness
is a very encouraging piece of data. It tells us that there must almost
certainly be a few paradigm shifts ahead: we can make a decent guess at whether the sum of the current
(and prospective) set of ‘hacks’ could possibly get us to our goal,
and it seems that we’re going to fall short (since we’re likely entering
a new plateau phase).

We know there’s a better way: we just don’t know where to jump next (yet).

New paradigms

The current resurgence of neural network research has been spurred by
the application of ever larger amounts of CPU/GPU power to problems
that had previously been ‘stuck’. Not only has computing power grown rapidly, but there has also been a reassessment
of the overall tactics the brain uses to solve problems.

Previously, people were interested in solving problems using minimal
representations and efficient networks. Now, a new take on the brain’s use of simple neurons is that
nature has chosen a mechanism that is inherently wasteful of computing resources,
but very efficient in terms of power and robustness.

So, making use of this new perspective,
the commoditization of networked computers,
and the availability of vast datasets,
people can once again tackle old problems with renewed vigour.

New ruts

But let’s not imagine that blindly throwing computing resources at these
problems will be the last innovation required. Soon
(if it has not already started), these methods will evolve principally
through the accumulation of ‘fixes’, each of which increases efficiency
marginally, but at the expense of clarity of purpose.

What fundamental assumption are we currently making that will be
discovered to be holding us back 5 years from now?