Abstract

The striatum has an essential role in the neural control of instrumental behaviors by reinforcement learning. Adenosine A(2A) receptors (A(2A)Rs) are highly enriched in striatopallidal neurons and are implicated in the control of instrumental behavior. However, the temporal importance of A(2A)R signaling in relation to the reward, and the specific contributions of striatopallidal A(2A)Rs in the dorsolateral striatum (DLS) and the dorsomedial striatum (DMS) to the control of instrumental learning, are not defined. Here, we addressed the temporal relationship and sufficiency of transient activation of A(2A)R signaling precisely at the time of the reward for the control of instrumental learning, using our newly developed rhodopsin-A(2A)R chimeras (optoA(2A)R). We demonstrated that transient light activation of optoA(2A)R signaling in striatopallidal neurons in a 'time-locked' manner with reward delivery (but not random optoA(2A)R activation) was sufficient to change the animal's sensitivity to outcome devaluation without affecting the acquisition or extinction phases of instrumental learning. We further demonstrated that optogenetic activation of striatopallidal A(2A)R signaling in the DMS suppressed goal-directed behavior, whereas focal genetic knockdown of striatopallidal A(2A)Rs in the DMS enhanced goal-directed behavior in the devaluation test. By contrast, optogenetic activation or focal AAV-Cre-mediated knockdown of striatopallidal A(2A)Rs in the DLS had relatively limited effects on instrumental learning. Thus, striatopallidal A(2A)R signaling in the DMS exerts inhibitory and predominant control of goal-directed behavior by acting precisely at the time of reward, and may represent a therapeutic target to reverse the abnormal habit formation that is associated with obsessive-compulsive disorder and drug addiction.

Targeted expression and phospho-MAPK (p-MAPK) signaling of optoA2AR in striatopallidal neurons. (a) Schematic illustration of the optoA2AR chimera, constructed by replacing intracellular loops 1, 2, and 3 and the C terminus of bovine rhodopsin with those of the adenosine A2A receptor (A2AR) to achieve control of A2AR signaling by 473 nm light (left panel). Representative fluorescent image shows the expression of mCherry-optoA2AR in the striatum 2 weeks after injection of AAV5-DIO-mCherry-optoA2AR into adora2a-cre mice (right panel). (b) Quantification shows that 88% of mCherry-positive cells (n=114, from four mice) were colocalized with enkephalin (ENK), whereas only 17% of mCherry-positive cells (n=106, from four mice) were colocalized with substance P (SP). (c) Double immunostaining for mCherry and the specific markers (ENK or SP) showed that optoA2ARs were specifically expressed in ENK-positive striatopallidal neurons (white arrows, upper panels) but not in SP-positive striatonigral neurons (yellow arrows, lower panels). (d) Following injection of AAV-DIO-mCherry-optoA2AR virus into the dorsomedial striatum (DMS) of adora2a-cre mice, the mCherry fluorescence of striatopallidal projection terminals was detected specifically in the globus pallidus (GP) but not in the substantia nigra pars reticulata (SNr). The green fluorescence of striatonigral projection terminals containing endogenous SP was detected specifically in the SNr. (e) The expression of p-MAPK was induced by optoA2AR stimulation (white arrows, left panels) or by injection of the A2AR agonist CGS21680 (white arrows, right panels). Quantitative analysis showed that light-induced p-MAPK activation was detected in 57% of mCherry-optoA2AR-positive cells (n=1218 from four mice).

‘Time-locked’ but not random optogenetic activation of striatopallidal adenosine A2A receptor (A2AR) signaling in the dorsomedial striatum (DMS) suppresses goal-directed behavior. (a) Left panel: Schematic illustration of the locations of the fiber tips for each animal in the ‘light-off’ group (red triangles) and the ‘time-locked’ activation group (blue circles). Right panel: Typical coronal section showing mCherry-optoA2AR expression in the DMS of adora2a-cre(+) mice. The white arrow indicates the optical fiber tip. (b) Schematic illustration of the timing of lever pressing, sucrose reward delivery, and optical stimulation. Light stimulation (blue flashes) was delivered to the DMS for a 2-s period either in a ‘time-locked’ manner with reward delivery (the flashes between the two red dotted vertical lines) or in a ‘random’ manner with respect to reward delivery (the flashes in the random interval periods); liquid drops indicate reward delivery. (c) Two groups of mice expressing optoA2AR in the DMS were subjected to either ‘time-locked’ light stimulation or ‘light off’ (n=8 per group) during the random interval (RI) training sessions (as indicated by the blue bar). The two groups performed indistinguishably in the acquisition phase of instrumental learning by repeated-measures analysis of variance (ANOVA) (RI period × optoA2AR stimulation interaction effect: F5,70=0.098, p>0.05; optoA2AR stimulation main effect: F1,14=0.371, p>0.05). (d) Following the RI training sessions, a 2-day devaluation test without any experimental manipulation (optoA2AR activation) was conducted as described in the Materials and Methods section. Mice without optoA2AR activation during the RI training sessions significantly reduced their lever presses in the devalued condition compared with the valued condition (normalized devaluation: t1,7=6.861, ***p<0.001, preplanned t-test). By contrast, mice with ‘time-locked’ optoA2AR stimulation showed no significant devaluation effect (normalized devaluation: t1,7=0.709, p>0.05, preplanned t-test). However, there was no normalized devaluation × optoA2AR interaction effect by repeated-measures ANOVA (F1,14=0.429, p=0.523). (e) We further performed instrumental behavioral analyses of a separate set of four experimental groups: mice expressing mCherry with ‘time-locked’ light stimulation (n=7), mice expressing optoA2AR with ‘light off’ (n=9), mice expressing optoA2AR with ‘time-locked’ light stimulation (n=8), and mice expressing optoA2AR with random light stimulation (n=8). Consistent with the result in (c), repeated-measures ANOVA indicated neither a between-subject effect (F3,28=1.481, p=0.241) nor an RI training sessions × manipulation groups interaction effect (F15,140=1.284, p=0.220) in the acquisition phase. (f) Repeated-measures ANOVA of the devaluation test revealed a significant optogenetic manipulation × (normalized) devaluation interaction effect (F3,28=3.258, p=0.036). Simple main-effect analyses of the devaluation test in the four groups indicated that only mice with optoA2AR expression in the DMS and ‘time-locked’ light stimulation performed habitually, whereas the other groups displayed goal-directed behavior (simple effect analyses: F1,8=7.141, *p<0.05 for the ‘light off’ group; F1,7=6.074, *p<0.05 for the ‘random’ stimulation group; and F1,6=16.050, **p<0.01 for the mCherry group). Data are presented as the mean±SEM. The color reproduction of this figure is available on the Neuropsychopharmacology journal online.

Focal knockdown of adenosine A2A receptors (A2ARs) in the dorsomedial striatum (DMS) enhances goal-directed behavior. (a) Left: Schematic illustration of the maximal (black) and minimal (gray) A2AR knockdown areas in the DMS. Right: Representative immunofluorescent photomicrographs show focal knockdown of A2AR expression in the DMS after injection of AAV-Cre-zsGreen into A2AR(flox/flox) mice (right panels) and A2AR(+/+) mice (left panels). The intensity of A2AR staining (red) was significantly decreased in the area overlapping with zsGreen expression (the yellow circle) in A2AR(flox/flox) mice but not in A2AR(+/+) mice. (b) Quantitative analysis showed that A2AR expression was markedly reduced in the virus-transfected regions of A2AR(flox/flox) mice compared with A2AR(+/+) mice. (c) Two to three weeks after bilateral injection of AAV-Cre-zsGreen into the DMS, A2AR(flox/flox) mice and A2AR(+/+) mice (n=8 per group) underwent the CRF-RI30-RI60 training paradigm as described in the Materials and Methods section. Both groups similarly increased their lever pressing rate during the acquisition phase (repeated-measures analysis of variance (ANOVA) revealed no random interval (RI) period × genotype interaction effect: F5,65=0.859, p>0.05; and no genotype main effect: F1,13<0.001, p>0.05). (d) Mice with DMS A2AR knockdown significantly reduced their lever pressing in the devalued condition compared with the valued condition, whereas the A2AR(+/+) mice were insensitive to the selective satiety devaluation treatment (normalized devaluation × genotype interaction effect: F1,13=9.161, p=0.01; simple effect analysis: F1,6=35.683, **p<0.01 for A2AR focal knockdown mice by repeated-measures ANOVA). Data are presented as the mean±SEM. CRF, continuous reinforcement. The color reproduction of this figure is available on the Neuropsychopharmacology journal online.