A Brief History of Neuromorphic Computing

Mimicking Nature’s Computers

How does nature compute? Attempting to answer this question naturally leads one to consider biological nervous systems, although examples of computation abound in other manifestations of life, including plants [1–5], bacteria [6], protozoa [7], and swarms [8]. Most attempts to understand biological nervous systems fall along a spectrum. One end of the spectrum attempts to mimic the observed physical properties of nervous systems. These models necessarily contain parameters that must be tuned to match the biophysical and architectural properties of the natural model. Examples of this approach include Boahen’s neuromorphic circuits at Stanford University, including the Neurogrid processor [9], the mathematical spiking neuron model of Izhikevich [10], and the large-scale modeling of Eliasmith [11]. The other end of the spectrum abandons biological mimicry in an attempt to algorithmically solve the problems brains solve, such as perception, planning and control. This is generally referred to as machine learning. Algorithmic examples include support vector maximization [12], k-means clustering [13] and random forests [14].

Motivation For Neuromorphics

Many approaches fall somewhere along the spectrum between mimicry and machine learning, such as the CAVIAR [15] and Cognimem [16] neuromorphic processors as well as IBM’s neurosynaptic core [17]. For over a decade, we have been considering an alternative approach outside of the typical spectrum by asking ourselves a simple but important question: How can a brain compute given that it is built of volatile components? Exploring this question has led us to a formalized theory of “AHaH Computing”, designs for a neuromorphic co-processor called “Thermodynamic RAM” and promising results from our first memristive “Knowm Synapses”. But before we talk about what the future landscape of neuromorphic computing may look like, let’s take a look backwards in time at the groundwork and events that led to the current state of the field.

Standing on the Shoulders of Giants

In 1936, Turing, best known for his pioneering work in computation and his seminal paper ‘On computable numbers’ [18], provided a formal proof that a machine could be constructed to perform any conceivable mathematical computation that is representable as an algorithm. This work rapidly evolved to become the computing industry of today. Few people are aware that, in addition to the work leading to the digital computer, Turing anticipated connectionism and neuron-like computing. In his paper ‘Intelligent machinery’ [19], which he wrote in 1948 but which was not published until 1968, well after his death, Turing described a machine consisting of artificial neurons connected in any pattern, with modifier devices that could be configured to pass or destroy a signal. The neurons were composed of NAND gates, which Turing chose because any other logic function can be created from them.
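The functional completeness of NAND is easy to demonstrate. The sketch below is our own illustration, not Turing's construction: every other basic Boolean gate falls out of NAND alone.

```python
# Illustrative sketch (ours, not Turing's machine): NAND is functionally
# complete, so NOT, AND, OR and XOR can all be built from it.
def nand(a: int, b: int) -> int:
    return 0 if (a and b) else 1

def not_(a):
    return nand(a, a)              # NOT(a) = NAND(a, a)

def and_(a, b):
    return not_(nand(a, b))        # AND = NOT(NAND)

def or_(a, b):
    return nand(not_(a), not_(b))  # OR via De Morgan's law

def xor(a, b):
    n = nand(a, b)                 # XOR from four NAND gates
    return nand(nand(a, n), nand(b, n))
```

Composing gates this way is exactly why a network of NAND-based neurons can, in principle, realize any logic function.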

In 1944, physicist Schrödinger published the book What is Life? based on a series of public lectures delivered at Trinity College in Dublin. Schrödinger asked the question: “How can the events in space and time which take place within the spatial boundary of a living organism be accounted for by physics and chemistry?” He described an aperiodic crystal that anticipated the structure of DNA, which had yet to be discovered, as well as the concept of negentropy: the entropy that a living system exports to keep its own entropy low [20].

In 1949, only one year after Turing wrote ‘Intelligent machinery’, synaptic plasticity was proposed as a mechanism for learning and memory by Hebb [21]. Ten years later in 1958 Rosenblatt defined the theoretical basis of connectionism and simulated the perceptron, leading to some initial excitement in the field [22].

In 1953, Barlow discovered that neurons in the frog retina fire in response to specific visual stimuli [23]. This was a precursor to the experiments of Hubel and Wiesel, who showed in 1959 the existence of neurons in the primary visual cortex of the cat that respond selectively to edges at specific orientations [24]. This led to the theory of receptive fields, in which cells at one level of organization receive their inputs from cells at a lower level of organization.

In 1960, Widrow and Hoff developed ADALINE, a physical device that used electrochemical plating of carbon rods to emulate the synaptic elements, which they called memistors [25]. Unlike memristors, memistors are three-terminal devices: the conductance between two of the terminals is controlled by the time integral of the current flowing into the third. This work represents the first integration of memristive-like elements with electronic feedback to emulate a learning system.
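The memistor's behavior can be sketched in a few lines of code. This is our own illustrative discrete-time model with made-up parameters, not Widrow and Hoff's plating chemistry: the output conductance tracks the integral of the control current, saturating at physical bounds.

```python
# Hypothetical sketch of a memistor: a three-terminal element whose
# output conductance is set by the time integral of the current into
# the third (control) terminal. All parameters are illustrative.
def memistor_conductance(control_currents, dt=1e-3, g0=1e-3, k=0.5,
                         g_min=1e-4, g_max=1e-2):
    g = g0
    trace = []
    for i in control_currents:
        g += k * i * dt                 # integrate the control current
        g = min(max(g, g_min), g_max)   # real devices saturate
        trace.append(g)
    return trace
```

Driving the control terminal with a positive current raises the conductance; a negative current lowers it until the device bottoms out, which is the basic adaptive element ADALINE exploited.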

In 1969, the initial excitement over perceptrons was tempered by the work of Minsky and Papert, who analyzed some of the properties of perceptrons and showed that a single-layer perceptron cannot compute the XOR function [26]. The reaction to Minsky and Papert diverted attention away from connectionist networks until the emergence of a number of new ideas, including Hopfield networks (1982) [27], back propagation of error (1986) [28], adaptive resonance theory (1987) [29], and many other permutations. The wave of excitement in neural networks began to fade as the key problem of generalization versus memorization became better appreciated and the computing revolution took off.
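The XOR limitation is easy to verify numerically. The brute-force sketch below (ours, not Minsky and Papert's formal argument) searches a grid of weights for a single threshold unit that reproduces a truth table: one exists for AND, but none exists for XOR, because no line can separate its outputs.

```python
import itertools

def separable(truth_table, weights):
    """Brute-force check: does any single threshold unit
    w1*x1 + w2*x2 + b > 0 reproduce the truth table?"""
    for w1, w2, b in itertools.product(weights, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(y)
               for (x1, x2), y in truth_table.items()):
            return True
    return False

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
grid = [x / 4 for x in range(-8, 9)]  # weights and bias in [-2, 2]
```

Running `separable(AND, grid)` succeeds while `separable(XOR, grid)` fails, which is the linear separability problem in miniature; adding a hidden layer (as back propagation later made trainable) removes the limitation.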

In 1971, Chua postulated, on the basis of symmetry arguments, the existence of a missing fourth two-terminal circuit element called the memristor (memory resistor), whose resistance depends on the integral of the input applied to its terminals [30,31].

VLSI pioneer Mead published with Conway the landmark text Introduction to VLSI Systems in 1980 [32]. Mead teamed with Hopfield and Feynman to study how animal brains compute. This work helped to catalyze the fields of Neural Networks (Hopfield), Neuromorphic Engineering (Mead) and Physics of Computation (Feynman). Mead created the world’s first neural-inspired chips, including an artificial retina and cochlea, work documented in his book Analog VLSI and Neural Systems, published in 1989 [33].

Bienenstock, Cooper and Munro published a theory of synaptic modification in 1982 [34]. Now known as the BCM plasticity rule, this theory attempts to account for experiments measuring the selectivity of neurons in primary sensory cortex and its dependence on neuronal input. When presented with data from natural images, the BCM rule converges to selective oriented receptive fields. This provides compelling evidence that the same mechanisms are at work in cortex, as observed in the experiments of Hubel and Wiesel. In 1989 Barlow reasoned that such selective response should emerge from an unsupervised learning algorithm that attempts to find a factorial code of independent features [35]. Bell and Sejnowski extended this work in 1997 to show that the independent components of natural scenes are edge filters [36]. This provided a concrete mathematical statement on neural plasticity: neurons modify their synaptic weights to extract independent components. Building a mathematical foundation of neural plasticity, Oja and collaborators derived a number of plasticity rules by specifying statistical properties of the neuron’s output distribution as objective functions. This led to the principle of independent component analysis (ICA) [37,38].
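The flavor of these objective-function-derived plasticity rules can be seen in Oja's rule, one of the best-known examples. The sketch below is a toy illustration with made-up data, not a reproduction of any experiment in [37,38]: a single linear neuron updated by the rule converges to the first principal component of its input, with the weight vector self-normalizing to unit length.

```python
import random

# Toy sketch of Oja's plasticity rule: dw = eta * y * (x - y * w).
# The weight vector converges to the input's first principal component.
random.seed(0)

def oja_train(samples, eta=0.01, epochs=50):
    w = [0.5, 0.5]
    for _ in range(epochs):
        for x in samples:
            y = w[0] * x[0] + w[1] * x[1]          # neuron output
            w = [w[i] + eta * y * (x[i] - y * w[i])  # Hebbian + decay
                 for i in range(2)]
    return w

# Synthetic 2-D data whose variance is concentrated along (1, 1)
samples = [(t + 0.1 * random.gauss(0, 1), t + 0.1 * random.gauss(0, 1))
           for t in [random.gauss(0, 1) for _ in range(200)]]
w = oja_train(samples)
```

After training, `w` points along the dominant direction of the data (roughly (0.71, 0.71) here), which is the sense in which such rules "extract" the statistically dominant, and under the ICA view, independent, components of their input.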

At roughly the same time, the theory of support vector maximization emerged from earlier work on statistical learning theory from Vapnik and Chervonenkis and has become a generally accepted solution to the generalization versus memorization problem in classifiers [12,39].

In 2004, Nugent et al. showed how the AHaH plasticity rule is derived via the minimization of a kurtosis objective function and can be used as the basis of self-organized fault tolerance in support vector machine network classifiers, demonstrating the connection between margin maximization, independent component analysis and neural plasticity [40,41]. In 2006, Nugent first detailed how to implement the AHaH plasticity rule in memristive circuitry and demonstrated that the AHaH attractor states can be used to configure a universal reconfigurable logic gate [42–44].

In 2008, HP Laboratories announced the production of Chua’s postulated electronic device, the memristor [45], and explored its use as a synapse in neuromorphic circuits [46]. Several memristive devices demonstrating the tell-tale hysteresis loops had already been reported by this time [47–51], but they were not described as memristors. In the same year, Hylton and Nugent launched the Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE) program with the goal of demonstrating large-scale adaptive learning in integrated memristive electronics at biological scale and power. Since 2008 there has been an explosion of worldwide interest in memristive devices [52–56], device models [57–62], their connection to biological synapses [63–69], and their use in alternative computing architectures [70–81].
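The linear ion-drift picture HP used to describe their device can be sketched in a few lines. This is an illustrative toy model with parameters of our choosing, not HP's published device data: a state variable tracks the fraction of the film in its low-resistance phase, and sweeping a sinusoidal voltage produces the tell-tale pinched hysteresis loop.

```python
import math

# Toy linear ion-drift memristor model (illustrative parameters, ours).
# x in [0, 1] is the fraction of the film in its low-resistance phase.
def simulate_memristor(r_on=100.0, r_off=16e3, mu=1e-14, d=1e-8,
                       steps=2000, freq=1.0, v0=1.0):
    x, dt = 0.1, 1.0 / steps
    voltages, currents = [], []
    for n in range(steps):
        v = v0 * math.sin(2 * math.pi * freq * n * dt)  # drive voltage
        r = r_on * x + r_off * (1 - x)                  # mixed-phase R
        i = v / r
        x += mu * r_on / d ** 2 * i * dt                # linear drift
        x = min(max(x, 0.0), 1.0)                       # film boundaries
        voltages.append(v)
        currents.append(i)
    return voltages, currents
```

Because the state integrates the current, the same applied voltage yields a different current on the rising and falling parts of the sweep, which is exactly the history-dependent resistance Chua postulated.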

In early 2014, we published AHaH Computing — From Metastable Switches to Attractors to Machine Learning, a formal introduction to a new approach to computing we call “AHaH Computing” in which, unlike in traditional computers, memory and processing are combined. The idea is based on the attractor dynamics of volatile dissipative electronics inspired by biological systems, presenting an attractive alternative architecture that is able to adapt, self-repair, and learn from interactions with the environment. We demonstrated high-level machine learning functions including unsupervised clustering, supervised and unsupervised classification, complex signal prediction, unsupervised robotic actuation and combinatorial optimization. Later that same year, we published Thermodynamic RAM Technology Stack and Cortical Computing with Thermodynamic-RAM, outlining a design and full-stack integration including the kT-RAM instruction set and the Knowm API for putting AHaH-based neuromorphic learning co-processors into existing computer platforms.

kT-RAM

In late 2014, IBM announced their “spiking-neuron integrated circuit” called TrueNorth, which circumvents the von Neumann bottleneck, boasting a power consumption of 70 mW and roughly 1/10,000th the power density of conventional microprocessors. It is, however, not currently capable of on-chip learning or adaptation.

What About Quantum Computers?

Quantum computers are an absolutely amazing and beautiful idea. But as Yogi Berra wisely said, “In theory there is no difference between theory and practice. In practice there is.” QCs rely on the concept of a qubit. A qubit can exhibit a most remarkable property called quantum entanglement. Entanglement allows quantum particles to become ‘linked’ and behave not as isolated particles but as a system. The problem is that a particle can become linked to anything, like a stray molecule or photon floating around. So long as we can exercise extreme control over the linking, we can exploit the collective to solve problems in truly extraordinary ways.

Sounds great, right? Absolutely! If it can be built. In the 30 years since physics greats like Richard Feynman started talking about it, we have yet to realize a practical QC that works better than off-the-shelf hardware. Why is this? Because Nature abhors qubits. It stomps them out as fast as we can make them. Every living cell, every neuron, pretty much every atom and molecule on the planet is constantly interacting with everything around it, and indeed it is the process of this interaction (i.e. decoherence) that defines particles in the first place. Using as a base unit of computation a state of matter that Nature clearly abhors is really hard! It is why we end up with machines that expend tremendous amounts of energy cooling their circuits to near absolute zero. Does this mean a QC is impossible? Of course not. But it has significant practical problems and associated costs that must be overcome and certainly not overlooked.

In a Google Tech Talk, Seth Lloyd says that quantum computing is inherently difficult to understand.

So will a technology that is hard to understand and use (assuming it even delivers on its promises) gain wide adoption if only a few people, the ones who built it, know how to use it? Wouldn’t a technology that is “straightforward, familiar and intuitive” be better? We’re not saying that quantum computing should not be pursued, just that people should keep in mind all the practical considerations when evaluating an AI technology.

Looking Ahead

The race is definitely on to build and commercialize the world’s first truly neuromorphic chips and open a door to unimaginable possibilities in low-power, low-volume and high-speed machine learning applications. We’re obviously biased toward our own path and believe that our methods and results so far stand their ground and provide a solid foundation to build upon.

Alex’s original idea and inspiration over a decade ago was to “reevaluate our preconceptions of how computing works and build a new type of processor that physically adapts and learns on its own”. In a nervous system (and all other natural systems), the processor and memory are the same machinery: the distance between processor and memory is zero. Whereas modern chips must maintain absolute control over internal states (ones and zeros), Nature’s computers are volatile; the components are analog, their states decay, and they heal and build themselves continuously. Driven by the second law of thermodynamics, matter spontaneously configures itself to optimally dissipate the flow of energy. The remaining challenge was to figure out how to recreate this phenomenon on a chip and to understand it sufficiently to interface with existing hardware and solve real-world machine learning problems. Years of work designing various chip architectures and validating capabilities led to the specification of Thermodynamic RAM, or kT-RAM for short, a co-processor that can be plugged into existing hardware platforms to accelerate machine learning tasks. Validated capabilities of kT-RAM include unsupervised clustering, supervised and unsupervised classification, complex signal prediction, anomaly detection, unsupervised robotic actuation and combinatorial optimization, all key capabilities of biological nervous systems and modern machine learning algorithms with real-world application.

Neuromorphic Computing Trends

If you’re still not convinced, you don’t have to take our word for it anymore! The new White House grand challenge for future computing has just been announced, and it aligns exactly with the motivations for developing a new type of computer.

Neuromorphics Grand Challenge

Similar challenges are appearing in Europe and other parts of the world as well. The first call for proposals has just come out, co-sponsored by SRC and the NSF. Note the two main points it highlights:

- severely limited by physics of managing data
- constrained by physical and economical limits

This says what we’ve been saying for a long time: you need to build a brain, where the distance between memory and processor is zero, and stop trying to simulate a brain on a digital platform!