"... How does the brain know what firing patterns of what neurons are responsible for the reward if 1) the patterns are no longer there when the reward arrives and 2) all
neurons and synapses are active during the waiting period to the
reward? Here, we show how the conundrum is resolved by a model network
of cortical spiking neurons with spike-timing-dependent plasticity
(STDP) modulated by dopamine (DA). Although STDP is triggered by
nearly coincident firing patterns on a millisecond timescale, slow
kinetics of subsequent synaptic plasticity is sensitive to changes in
the extracellular DA concentration during the critical period of a few
seconds. ... This study emphasizes the importance of precise firing
patterns in brain dynamics and suggests how a global diffusive
reinforcement signal in the form of extracellular DA can selectively
influence the right synapses at the right time." See the paper for more details.
Reference:
1. Izhikevich EM (2007) Solving the distal reward problem through linkage of STDP and dopamine signaling. Cereb Cortex 17:2443-52

This is the readme for a model associated with the publication:
Izhikevich E.M. (2007) Solving the Distal Reward Problem through
Linkage of STDP and Dopamine Signaling. Cerebral Cortex,
doi:10.1093/cercor/bhl152
Abstract:
In Pavlovian and instrumental conditioning, reward typically comes
seconds after reward-triggering actions, creating an explanatory
conundrum known as "distal reward problem": How does the brain know
what firing patterns of what neurons are responsible for the reward if
1) the patterns are no longer there when the reward arrives and 2) all
neurons and synapses are active during the waiting period to the
reward? Here, we show how the conundrum is resolved by a model network
of cortical spiking neurons with spike-timing-dependent plasticity
(STDP) modulated by dopamine (DA). Although STDP is triggered by
nearly coincident firing patterns on a millisecond timescale, slow
kinetics of subsequent synaptic plasticity is sensitive to changes in
the extracellular DA concentration during the critical period of a few
seconds. Random firings during the waiting period to the reward do not
affect STDP and hence make the network insensitive to the ongoing
activity, the key feature that distinguishes our approach from previous
theoretical studies, which implicitly assume that the network be quiet
during the waiting period or that the patterns be preserved until the
reward arrives. This study emphasizes the importance of precise firing
patterns in brain dynamics and suggests how a global diffusive
reinforcement signal in the form of extracellular DA can selectively
influence the right synapses at the right time.
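
The mechanism described in the abstract can be illustrated with a
minimal single-synapse sketch. This is not the daspnet program itself.
It assumes that an STDP event sets a slowly decaying eligibility trace
c, that a reward adds a pulse to a decaying extracellular DA level d,
and that synaptic strength changes at rate c*d. The function name, the
time constants, and the pulse sizes are illustrative assumptions, not
values taken from this readme.

```python
def simulate_synapse(pairing_ms, reward_ms, t_end_ms=5000,
                     tau_c_ms=1000.0, tau_d_ms=200.0,
                     stdp_amp=1.0, da_pulse=0.5, dt_ms=1.0):
    """Euler-integrate one synapse with a DA-gated eligibility trace.

    Hypothetical helper for illustration only; all parameter values
    are assumptions. Returns the net change in synaptic strength.
    """
    pairing_step = round(pairing_ms / dt_ms)
    reward_step = None if reward_ms is None else round(reward_ms / dt_ms)

    c = 0.0  # eligibility trace, set by a near-coincident pre/post firing
    d = 0.0  # extracellular dopamine level above baseline
    s = 0.0  # accumulated change in synaptic strength

    for k in range(int(t_end_ms / dt_ms)):
        if k == pairing_step:
            c += stdp_amp      # STDP event tags the synapse
        if reward_step is not None and k == reward_step:
            d += da_pulse      # delayed reward releases DA
        # Strength changes only while the trace and DA overlap.
        s += c * d * dt_ms / 1000.0
        # Both variables decay exponentially toward zero.
        c -= c / tau_c_ms * dt_ms
        d -= d / tau_d_ms * dt_ms
    return s


if __name__ == "__main__":
    print("no reward:       ", simulate_synapse(100, None))
    print("reward after 1 s:", simulate_synapse(100, 1100))
    print("reward after 3 s:", simulate_synapse(100, 3100))
```

Under these assumptions, a synapse whose STDP event is followed by a
reward within a second or so changes more than one rewarded several
seconds later, and a synapse that is never rewarded does not change at
all.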
Usage: The model replicates figure 1 in the paper.
Simply start MATLAB, cd to the daspnet directory, and run the program
by typing daspnet in the command window. The program then begins
drawing, in particular, figure 1d (synaptic strength vs. time) as the
1000-neuron network runs.