Harel Shouval, Ph.D.

Professor, Neurobiology & Anatomy

My research focuses on identifying the rules by which changes in synaptic strength, believed to be the basis of learning, memory and development in the cortex, take place. Synapses are the means by which one neuron communicates with another, and changes in their strengths (or weights) are called synaptic plasticity. I concentrate on theoretical and computational approaches to the study of synaptic plasticity and its implications for learning, memory and development. I study synaptic plasticity at many levels, from its molecular basis to its functional implications, and I believe that theoretical studies are essential for forming the link between these different levels of description.

Our ability to survive depends on our ability to predict the future and make appropriate decisions based on these predictions. The world, though, is ever changing, and it is not feasible to make such predictions on the basis of a genetically encoded algorithm. Therefore, the ability of brains to learn is their central design principle. In my lab we study how brains are able to learn. Our approach is mechanistic and closely linked to experimental observations. We study this question at different levels, from the molecular up to the systems level. Our projects include:

Mechanistic Models of Synaptic Plasticity

Synaptic plasticity was first proposed as the basis of learning and memory on theoretical grounds, but has since garnered significant experimental support. We know that synaptic plasticity is bi-directional and includes both long-term potentiation (LTP) and long-term depression (LTD). We also know that synaptic efficacies change during learning and that inhibiting synaptic plasticity can inhibit learning. Moreover, we now have extensive knowledge about the cellular and molecular underpinnings of synaptic plasticity. Most theoretical models of synaptic plasticity used today do not take such knowledge into account and remain phenomenological models that are based on correlations or on simple spike-timing-dependent kernels. Our lab has led the development of biophysical synaptic plasticity rules (Castellani et al 2001, Shouval et al 2002). We develop rules that are based on the actual cellular biophysics, and then simplify them to obtain rules that are tractable and often intuitive, yet retain a connection with the experimental reality. Such rules are able to account for many experimental observations and induction protocols (see Shouval, Wang and Wittenberg, 2008 for a review). Such simple rules typically cannot account for all experimental observations at the cellular level, but adding further biophysical features can often account for additional experimental observations (Shouval and Kalantzis, 2005; Cai et al, 2007). Our aim is to use such rules to account for higher-level phenomena such as receptive field development (Yeung et al, 2004) or place field plasticity (Yu et al 2008).

Mechanistic Models of Reinforcement Learning

Computational models of synaptic plasticity are usually models of unsupervised learning, that is, models in which plasticity depends only on the input, and not on reward, punishment or performance. At the cellular and molecular level much less is known about reinforcement learning, and most of what we do know is related to the responses of dopaminergic neurons, which release dopamine, a neuromodulator implicated in some forms of reinforcement learning. Recent experimental results have shown that a reinforcing signal can lead to changes in network dynamics, and our lab is interested in modeling these results. Models of reinforcement learning have existed for many decades; however, such models are typically highly abstract and are not easily related to biophysical models of synaptic plasticity. Models of reinforcement learning face two basic problems. The first is how to associate a stimulus with a delayed reward; this is called the “Temporal Credit Assignment Problem”. The second is how to stop learning once the goal is attained. We have developed two models of reinforcement learning. Both of these models solve the temporal credit assignment problem using synaptic eligibility traces, an idea proposed many years ago. Both of our models are embedded in a recurrent network that learns the temporal dynamics of expected rewards (see section below). In one model we assume that the network that learns to predict reward inhibits the actual reward once it is active enough during reward (Gavornik et al 2009). Our other model assumes that there are two distinct eligibility traces in each synapse, one for LTP and one for LTD. These two traces temporally compete with each other to reach a stable steady state. Elements of this two-trace model have been confirmed experimentally (He et al 2015).
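The idea of two competing eligibility traces can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the published model: the exponential trace shapes, the time constants, the amplitudes and the function names below are all hypothetical choices made for illustration. A stimulus at time zero sets both traces, they decay at different rates, and a reward arriving after some delay converts the difference between the traces into a weight change.

```python
def eligibility_traces(delay, dt=1e-3, tau_ltp=1.0, tau_ltd=0.5,
                       amp_ltp=1.0, amp_ltd=1.5):
    """Euler-integrate two decaying traces from stimulus (t=0) to reward (t=delay).

    All parameter values are illustrative assumptions, not fitted quantities.
    """
    trace_ltp, trace_ltd = amp_ltp, amp_ltd   # both traces are set by the stimulus
    steps = int(round(delay / dt))
    for _ in range(steps):
        trace_ltp -= dt * trace_ltp / tau_ltp  # slow decay of the LTP trace
        trace_ltd -= dt * trace_ltd / tau_ltd  # faster decay of the LTD trace
    return trace_ltp, trace_ltd


def weight_change(delay, learning_rate=0.1, **trace_params):
    """Weight change when a reward arrives `delay` seconds after the stimulus."""
    trace_ltp, trace_ltd = eligibility_traces(delay, **trace_params)
    return learning_rate * (trace_ltp - trace_ltd)


if __name__ == "__main__":
    for delay in (0.2, 0.405, 1.0):
        print(f"reward delay {delay:.3f} s -> dw = {weight_change(delay):+.4f}")
```

With these assumed parameters the LTD trace dominates at short delays and the LTP trace at long delays, and the two cancel near a delay of about 0.4 s. At that point the reward produces no net weight change, which is one simple way a pair of competing traces can both bridge the delay to the reward and bring learning to a stop.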

Learning Network Dynamics from the Environment

Predicting means that, given the current state of the world, one can predict what is likely to happen and when it is likely to occur. Our hypothesis is that a major function of the brain is to serve as a dynamical system that emulates the world in order to predict likely outcomes. In this context, learning means adjusting the parameters of this dynamical system so that it matches reality as closely as possible. Our approach is not to solve this grand challenge at this very abstract level, but instead to address concrete experimental examples. The first example that we have extensively addressed is based on the experimental results of Shuler and Bear (2006). Shuler and Bear showed that neurons in the primary visual cortex of a rodent learn new dynamics as a result of a visual stimulus paired with a delayed reward, and that the dynamics of such cells can be used to predict the expected reward time. We have developed models that couple recurrent spiking network models with novel reinforcement learning algorithms (see above). Our theoretical work on this subject (Gavornik et al 2009, 2011, Huertas et al 2015) is coupled with ongoing experiments (Chubykin et al 2013, Namboodiri et al 2015, Liu et al 2015), which in turn are used to modify and refine our theory. We intend to keep drawing on different specific experiments; for example, Gavornik and Bear (2014) showed that when animals passively observe temporal sequences of stimuli, their cortical responses come to reflect that experience.

Maintenance of Synaptic Plasticity

Memories that last a lifetime are thought to be stored, at least in part, as persistent potentiation of the efficacies of particular synapses. The synaptic mechanism of these persistent changes, late long-term potentiation (L-LTP), depends on the state and number of specific synaptic proteins. Synaptic proteins have limited dwell times due to molecular turnover and diffusion, leading to a fundamental question: how can this transient molecular machinery store memories lasting a lifetime? Our lab has studied several theoretical approaches to address this problem. Currently our approach is based on the assumption that the long-term maintenance of synaptic plasticity relies on a bi- or multi-stable switch in each synapse. The substrate of the switch is a positive feedback loop at the level of translation (Aslam et al, 2009). Currently we are focusing on PKMz, an atypical subtype of protein kinase C, as the substrate for this switch (Jalil et al 2015). There is overwhelming experimental evidence that PKMz indeed plays a fundamental role in maintenance. Our model can account for many of the properties of L-LTP, including its sensitivity to protein synthesis inhibitors and the ability of PKMz inhibitors to erase memory.
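A generic positive-feedback switch shows how a bistable element can hold a state for far longer than the lifetime of any individual molecule. The sketch below is a toy under stated assumptions, not the lab's published model: the Hill-type feedback term, every parameter value and the function name `simulate_switch` are hypothetical. The variable x stands for the level of an active protein that promotes its own synthesis and is degraded at a constant rate.

```python
def simulate_switch(x0, pulses=(), t_end=60.0, dt=0.01,
                    basal=0.02, v_max=1.0, half_max=0.5, hill=4, decay=1.0):
    """Euler-integrate dx/dt = scale * (basal + feedback(x)) + extra - decay * x.

    `pulses` is a sequence of (start, stop, synthesis_scale, extra_input):
    synthesis_scale = 0 mimics a synthesis inhibitor, and extra_input > 0
    mimics a transient inducing stimulus. All values are illustrative.
    """
    x = x0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        scale, extra = 1.0, 0.0
        for start, stop, pulse_scale, pulse_extra in pulses:
            if start <= t < stop:
                scale, extra = pulse_scale, pulse_extra
        # Self-promoted synthesis: a sigmoidal (Hill) function of x
        feedback = v_max * x**hill / (half_max**hill + x**hill)
        x += dt * (scale * (basal + feedback) + extra - decay * x)
    return x


if __name__ == "__main__":
    print("no stimulus:       ", round(simulate_switch(0.02), 3))
    print("after induction:   ", round(simulate_switch(0.02, [(5, 10, 1.0, 1.0)]), 3))
    print("after inhibition:  ", round(simulate_switch(0.95, [(5, 15, 0.0, 0.0)]), 3))
```

In this toy, a brief stimulus flips the switch to its high state, where it remains even though each molecule turns over with a time constant of one time unit. Transiently blocking synthesis lets the protein level fall below the switching threshold, so the high state is lost and does not return when synthesis resumes, which is loosely analogous to erasure by an inhibitor.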

The Contribution of Synaptic Plasticity to Receptive Field Development

Many properties of receptive fields in visual cortex, as well as in other cortical areas, are experience dependent. We have previously accounted for such properties using more traditional, rate-based models of synaptic plasticity, in visual environments composed of natural images. We have previously addressed the formation of receptive fields with biophysical models (Yeung et al, 2004), but not using simplified visual environments. Currently we are examining whether the calcium-dependent model, or other biophysically realistic models with spiking cells, can account for the development of receptive fields as well.

On the Origin and Scaling of Sensory Errors

When we estimate sensory variables such as the weight of an object, the brightness of an image, the orientation of a bar or the temporal duration of a sound, we make mistakes, and our errors often scale linearly with the magnitude of the stimulus (Weber's law). These errors have been systematically measured since the 19th century, yet their physiological origin is unknown. We have developed a theory that relates the behavioral errors to the statistics of the physiological substrate and the tuning curves of sensory neurons (Shouval et al 2013). We have shown how to analytically calculate population tuning curves given the statistics of the encoding neurons and the scaling of the behavioral errors. We have also shown that linear scaling of errors is only optimal for very specific statistics of the natural world. From our theoretical framework we derived a method for determining the statistics of the physiological substrate that gives rise to perceptual errors. We hypothesized that these statistics would resemble the statistics of sensory neurons; that is, they would be Poisson-like. We carried out psychophysical experiments to test the statistics of the ‘sensory’ noise. Surprisingly, we find that our experimental results are consistent with a constant noise model. These results indicate that perceptual errors do not arise from the variability of sensory neurons and are more likely to arise from a downstream process such as decision making.
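The link between noise statistics and error scaling can be made concrete with a standard textbook-style argument. It is given here only as an illustration under simplifying assumptions, not as the population theory of Shouval et al 2013. Assume a single internal response r = f(s) + noise, where f is a monotonically increasing encoding function and the noise has standard deviation σ_r(s); the symbols w, σ_0, c and c' are introduced for this sketch.

```latex
\begin{align}
  % Weber's law: the estimation error grows linearly with the stimulus s
  \sigma_s(s) &= w\, s \\
  % Error propagation through the assumed encoding function f
  \sigma_s(s) &\approx \frac{\sigma_r(s)}{f'(s)} \\
  % Constant noise, \sigma_r(s) = \sigma_0
  f'(s) = \frac{\sigma_0}{w\, s}
    \quad &\Rightarrow \quad
    f(s) = \frac{\sigma_0}{w} \ln s + c \\
  % Poisson-like noise, \sigma_r^2(s) = f(s) (unit Fano factor assumed)
  \frac{f'(s)}{\sqrt{f(s)}} = \frac{1}{w\, s}
    \quad &\Rightarrow \quad
    f(s) = \left( \frac{\ln s + c'}{2 w} \right)^{2}
\end{align}
```

Under these assumptions, the same linear scaling of errors implies a different encoding function for each noise model. This is why the scaling of behavioral errors alone does not determine the underlying noise statistics, and why an independent experimental test of those statistics is informative.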