Neuroscience

Volume 282, 12 December 2014, Pages 49-59
Review
Distinct dopaminergic control of the direct and indirect pathways in reward-based and avoidance learning behaviors

https://doi.org/10.1016/j.neuroscience.2014.04.026

Highlights

  • Action selection for rewarding and aversive behaviors is controlled by the NAc.

  • The selective blockade of the direct and indirect pathways in the NAc was developed.

  • The two parallel pathways distinctly control reward and avoidance learning.

  • Distinct control by D1 and D2 receptors is essential for learning and flexibility.

  • The significance of two parallel pathways in prompt reaction selection is discussed.

Abstract

The nucleus accumbens (NAc) plays a pivotal role in reward and aversive learning and in learning flexibility. Outputs of the NAc are transmitted through two parallel routes, termed the direct and indirect pathways, and are controlled by the neurotransmitter dopamine (DA). To explore how reward-based and avoidance learning is controlled in the NAc of the mouse, we developed two techniques. In the reversible neurotransmission-blocking (RNB) technique, transmission of each pathway is selectively and reversibly blocked by pathway-specific expression of transmission-blocking tetanus toxin. In the asymmetric RNB technique, one side of the NAc is blocked by the RNB technique and the other, intact side is pharmacologically manipulated with a transmitter agonist or antagonist. Our studies demonstrated that the activation of D1 receptors in the direct pathway and the inactivation of D2 receptors in the indirect pathway are key determinants that distinctly control reward-based and avoidance learning, respectively. The D2 receptor inactivation is also critical for flexibility of reward learning. Furthermore, reward and aversive learning are regulated by a set of common downstream receptors and signaling cascades, all of which are involved in the induction of long-term potentiation at cortico-accumbens synapses of the two pathways. In this article, we review our studies that specify the regulatory mechanisms of each pathway in learning behavior and propose a mechanistic model to explain how dynamic DA modulation promotes selection of actions that achieve reward-seeking outcomes and avoid aversive ones. The biological significance of a network organization consisting of two parallel transmission pathways is also discussed from the standpoint of effective and prompt selection of neural outcomes in the neural network.

Introduction

Reward-based and aversive forms of learning are essential for animals to survive in different environments. Animals possess the innate ability not only to gain rewards such as food effectively but also to rapidly avoid uncomfortable or dangerous situations. However, when rewards are present in dangerous environments, the animal must choose whether to continue seeking the rewards or to avoid the dangerous places. The basal ganglia are the key neural substrate that controls not only motor balance but also decision making based on reward-based and aversive forms of learning (Graybiel, 2008, Bromberg-Martin et al., 2010, Gerfen and Surmeier, 2011, Aggarwal et al., 2012, Salamone and Correa, 2012). This circuitry receives and integrates neural information from the cerebral cortex and thalamus and facilitates selection of actions that achieve reward-seeking outcomes and avoid aversive ones (Graybiel, 2000, Graybiel, 2008, Bromberg-Martin et al., 2010). Dysfunction of the basal ganglia leads to severe cognitive and learning impairments, as exemplified by Parkinson’s disease, schizophrenia, and drug addiction (Hyman et al., 2006, Israel and Bergman, 2008, Simpson et al., 2010, Wichmann et al., 2011, Grueter et al., 2012).

In the basal ganglia circuitry, the projection neurons in the striatum are divided into two subpopulations, i.e., striatonigral neurons of the direct pathway and striatopallidal neurons of the indirect pathway (Albin et al., 1989, Alexander and Crutcher, 1990) (Fig. 1A). The outputs of these two parallel pathways converge at the substantia nigra pars reticulata (SNr) and the ventral tegmental area (VTA) and control the dynamic balance of the basal ganglia-thalamocortical circuitry (Graybiel, 2008, Wickens, 2009, Bromberg-Martin et al., 2010, Gerfen and Surmeier, 2011). In this circuit, dopamine (DA) from the VTA and the substantia nigra pars compacta is essential for controlling both pathways by dichotomously modulating glutamatergic synaptic plasticity of striatal neurons (Grace et al., 2007, Surmeier et al., 2007, Surmeier et al., 2009, Kreitzer and Malenka, 2008, Shen et al., 2008, Flores-Barrera et al., 2011). In the dorsal striatum, the striatonigral neurons selectively express D1 receptors and the substance P neuropeptide, in marked contrast to the predominant expression of D2 receptors and the enkephalin neuropeptide in the striatopallidal neurons (Gerfen et al., 1990, Flajolet et al., 2008, Heiman et al., 2008). The difference in expression profile, together with the distinct ligand affinities of D1 receptors (μM order) and D2 receptors (nM order), is thought to be critical for differential modulation of transmission of these two pathways (Surmeier et al., 2007, Graybiel, 2008, Kreitzer and Malenka, 2008). However, the transmission circuit is more complicated in the nucleus accumbens (NAc), the ventral part of the striatum. The D2 receptor/enkephalin-expressing NAc neurons project to the ventral pallidum (VP), but the D1 receptor/substance P-expressing NAc neurons innervate not only the SNr (from the NAc core) and the VTA (from the NAc shell) but also the VP (Lu et al., 1998, Zhou et al., 2003, Nicola, 2007, Smith et al., 2013). Thus, the SNr and the VTA exclusively receive inputs from the D1 receptor-expressing NAc neurons via the direct pathway, but the VP receives inputs from both D1 receptor- and D2 receptor-expressing NAc neurons. Interestingly, it has been suggested that the VP neurons that receive inputs from the D1 receptor-expressing NAc neurons could directly transmit their outputs to the thalamus, thereby retaining the segregated transmission characteristic of the direct and indirect pathways (Smith et al., 2013), although this possibility needs to be further investigated.
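
The role of receptor affinity described above can be illustrated with a minimal sketch, assuming simple one-site equilibrium binding in which fractional occupancy is [DA] / ([DA] + Kd). The dissociation constants (1000 nM for D1, 10 nM for D2) and the three DA concentrations are hypothetical values chosen only to match the "μM order" and "nM order" affinities quoted in the text; they are not taken from the reviewed studies, and the function and variable names are likewise illustrative.

```python
# Illustrative sketch only: every numeric value below is an assumption picked to
# match the orders of magnitude quoted in the text, not a measured quantity.


def occupancy(da_nm: float, kd_nm: float) -> float:
    """Equilibrium fraction of receptors bound, assuming one-site binding."""
    return da_nm / (da_nm + kd_nm)


# Hypothetical dissociation constants (nM): D1 low affinity, D2 high affinity.
KD_NM = {"D1": 1000.0, "D2": 10.0}

# Hypothetical extracellular DA concentrations (nM) for three firing states.
DA_NM = {"dip": 5.0, "tonic": 50.0, "burst": 1000.0}


def occupancy_table() -> dict:
    """Return occupancy for each receptor type at each assumed DA level."""
    return {
        receptor: {state: occupancy(conc, kd) for state, conc in DA_NM.items()}
        for receptor, kd in KD_NM.items()
    }


if __name__ == "__main__":
    for receptor, row in occupancy_table().items():
        cells = ", ".join(f"{state}={frac:.2f}" for state, frac in row.items())
        print(f"{receptor}: {cells}")
```

Under these assumed values, D1 occupancy stays close to zero at the dip and tonic levels and reaches one half only during the burst, whereas D2 occupancy is already high at the tonic level and changes most when DA falls. The sketch is meant only to show why receptors of different affinity could report opposite directions of DA change; it ignores receptor affinity states, kinetics, and downstream signaling.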

The two types of striatal projection neurons are morphologically indistinguishable, and how the different types of DA receptors in the two pathways control reward-based and aversive learning behaviors remains a key question. To address this fundamental question regarding control of the basal ganglia circuitry, we developed novel gene-manipulating techniques termed the reversible neurotransmission blocking (RNB) technique (Yamamoto et al., 2003, Hikida et al., 2010) and the asymmetric RNB (aRNB) technique (Yawata et al., 2012, Hikida et al., 2013). These two techniques allowed us to reversibly and separately block neurotransmission in either the direct or the indirect pathway and to investigate how reward-based and aversive learning is controlled by the different DA receptors and other transmitter receptors in a pathway-specific manner. In this article, we review our studies concerning how the two parallel pathways are involved in reward-based and aversive learning and propose a mechanistic model that could explain how dynamic DA modulation controls reward-seeking and avoidance learning behaviors in the basal ganglia circuitry (Hikida et al., 2010, Hikida et al., 2013, Yawata et al., 2012).

Section snippets

The RNB and aRNB techniques

The RNB technique was established by combining the transgenic technique and the adeno-associated virus (AAV)-mediated gene expression system (Hikida et al., 2010) (Fig. 1B). In this technique, bilateral transmission blockade of either the direct pathway or the indirect pathway is achieved by the pathway-specific expression of transmission-blocking tetanus toxin, which is driven by the interaction of the tetracycline-repressive transcription factor (tTA) and the tetracycline-responsive element

Roles of the two pathways in acute and chronic psychostimulant-induced responses

Psychostimulants such as methamphetamine and cocaine massively increase the DA level in the striatum and the NAc and induce both acute hyperlocomotion and long-lasting adaptive responses called locomotor sensitization (Kalivas and Stewart, 1991, Hikida et al., 2001, Hikida et al., 2003, Hyman et al., 2006, Kimura et al., 2011). The D-RNB and I-RNB mice showed no abnormal locomotor activity under ordinary conditions. However, when D-RNB or I-RNB mice were administered methamphetamine or

Distinct roles of the two pathways in naturally occurring reward-based learning and passive avoidance learning

DA neurons of the VTA exhibit two distinct firing patterns, tonic and phasic (Grace et al., 2007, Schultz, 2007). A burst of phasic firing is evoked by rewarding stimuli and is thought to serve as the signal involved in reward-related behavior (Mirenowicz and Schultz, 1994, Grace et al., 2007, Cohen et al., 2012). In contrast to their responses to rewarding stimuli, the responses of DA neurons to aversive stimuli are not homogeneous; i.e., some DA neurons are activated but most

The role of D1 and D2 receptors in the pathway-specific reward-based and avoidance learning behaviors

The aRNB technique was introduced to explore which DA receptor subtypes control appetitive reward learning and passive avoidance learning (Hikida et al., 2013) (Fig. 2). The unilaterally blocked D- or I-aRNB mice showed the normal ability to acquire chocolate-induced conditioned place preference (CPP) in the CPP test and to avoid the electrically shocked dark chamber, verifying that unilateral blockade of transmission had no effect on reward-based or passive avoidance learning. Then, the D1 agonist SKF81297 (SKF), the D1 antagonist

Flexibility of reward-based learning

The basal ganglia circuitry is also important for learning flexibility to effectively acquire rewards under environmental changes (Frank et al., 2007, Grace et al., 2007, Frank, 2011). A visual cue task (VCT) was performed to address how the flexibility of goal-directed reward learning is controlled by pathway-specific mechanisms of the NAc (Yawata et al., 2012) (Fig. 3). A reward was placed at one fixed arm in a four-arm cross maze in the first test, so that the mice had to learn a correct

Signaling mechanisms of the indirect pathway neurons in avoidance learning

A number of previous studies have elucidated characteristic features of key neurotransmitter receptors and intracellular signaling cascades operating in the direct and indirect pathway neurons (Kreitzer and Malenka, 2007, Surmeier et al., 2007, Higley and Sabatini, 2010, Gerfen and Surmeier, 2011, Shiflett and Balleine, 2011, Lerner and Kreitzer, 2012). In the indirect-pathway neurons, D2 receptors and adenosine A2a receptors are postsynaptically co-localized and functionally counteract each

A mechanistic model of pathway-selective DA modulation in reward-based and passive aversive types of learning

In the striatal projection neurons, low-affinity D1 receptors and high-affinity D2 receptors are exclusively expressed in the direct and indirect pathway neurons, respectively (Maeno, 1982, Richfield et al., 1989, Gerfen et al., 1990, Gerfen and Surmeier, 2011). On the basis of the characteristic feature of DA transmission in the two pathways, our studies allowed us to propose a model to explain how DA modulation of the two pathways distinctly controls reward-directed learning and passive

The significance of two parallel pathways in dynamic shift of neural information

The D1 receptors and A2a receptors are selectively expressed in the direct and indirect pathway neurons, respectively, and both receptors commonly activate the cAMP-protein kinase A (PKA) signaling cascade (Fuxe et al., 2007, Surmeier et al., 2007, Gerfen and Surmeier, 2011). Interestingly, when D2 receptors are inactivated by a reduction in DA levels, A2a receptors become a predominant receptor in the indirect pathway. Thus, the common cAMP-PKA signaling mechanism of D1 and A2a receptors

Acknowledgments

This work was supported by Research Grants-in-Aid 2222005 (to S.N.), 23120011 (to S.N., T.H., and S.Y.), 23680034 (to T.H.), and 25871080 (to S.Y.) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

References (87)

  • B.A. Grueter et al. Integrating synaptic plasticity and striatal circuit function in addiction. Curr Opin Neurobiol (2012)
  • H. Hall et al. Some in vitro receptor binding properties of [3H] eticlopride, a novel substituted benzamide, selective for dopamine-D2 receptors in the rat brain. Eur J Pharmacol (1985)
  • M. Heiman et al. A translational profiling approach for the molecular characterization of CNS cell types. Cell (2008)
  • T. Hikida et al. Distinct roles of synaptic transmission in direct and indirect striatal pathways to reward and aversive behavior. Neuron (2010)
  • J. Hyttel. SCH 23390 – the first selective dopamine D-1 antagonist. Eur J Pharmacol (1983)
  • Z. Israel et al. Pathophysiology of the basal ganglia and movement disorders: from animal models to human clinical applications. Neurosci Biobehav Rev (2008)
  • M. Joshua et al. Synchronization of midbrain dopaminergic neurons is enhanced by rewarding events. Neuron (2009)
  • P.W. Kalivas et al. Dopamine transmission in the initiation and expression of drug- and stress-induced sensitization of motor activity. Brain Res Brain Res Rev (1991)
  • A.C. Kreitzer et al. Striatal plasticity and basal ganglia circuit function. Neuron (2008)
  • T.N. Lerner et al. RGS4 is required for dopaminergic control of striatal LTD and susceptibility to parkinsonian motor deficits. Neuron (2012)
  • M. Masu et al. Specific deficit of the ON response in visual transmission by targeted disruption of the mGluR6 gene. Cell (1995)
  • S. Nakanishi. Second-order neurons and receptor mechanisms in visual- and olfactory-information processing. Trends Neurosci (1995)
  • A. Nomura et al. Developmentally regulated postsynaptic localization of a metabotropic glutamate receptor in rat rod bipolar cells. Cell (1994)
  • E.K. Richfield et al. Anatomical and affinity state comparisons between dopamine D1 and D2 receptors in the rat central nervous system. Neuroscience (1989)
  • J.D. Salamone et al. The mysterious motivational functions of mesolimbic dopamine. Neuron (2012)
  • W. Schultz. Behavioral dopamine signals. Trends Neurosci (2007)
  • M.W. Shiflett et al. Contributions of ERK signaling in the striatum to instrumental learning and performance. Behav Brain Res (2011)
  • E.H. Simpson et al. A possible role for the striatum in the pathogenesis of the cognitive symptoms of schizophrenia. Neuron (2010)
  • R.J. Smith et al. Cocaine-induced adaptations in D1 and D2 accumbens projection neurons (a dichotomy not necessarily synonymous with direct and indirect pathways). Curr Opin Neurobiol (2013)
  • D.J. Surmeier et al. D1 and D2 dopamine-receptor modulation of striatal glutamatergic signaling in striatal medium spiny neurons. Trends Neurosci (2007)
  • D.J. Surmeier et al. Dopamine and synaptic plasticity in dorsal striatal circuits controlling action selection. Curr Opin Neurobiol (2009)
  • K.R. Tan et al. GABA neurons of the VTA drive conditioned place aversion. Neuron (2012)
  • N.X. Tritsch et al. Dopaminergic modulation of synaptic transmission in cortex and striatum. Neuron (2012)
  • M.A. Ungless. Dopamine: the salient issue. Trends Neurosci (2004)
  • N.R. Wall et al. Differential innervation of direct- and indirect-pathway striatal projection neurons. Neuron (2013)
  • J.R. Wickens. Synaptic plasticity in the basal ganglia. Behav Brain Res (2009)
  • L. Zhou et al. Chemical organization of projection neurons in the rat accumbens nucleus and olfactory tubercle. Neuroscience (2003)
  • M. Aggarwal et al. Neural control of dopamine neurotransmission: implications for reinforcement learning. Eur J Neurosci (2012)
  • K.K. Anstrom et al. Restraint increases dopaminergic burst firing in awake rats. Neuropsychopharmacology (2005)
  • J. Arnt et al. Relative dopamine D1 and D2 receptor affinity and efficacy determine whether dopamine agonists induce hyperactivity or oral stereotypy in rats. Pharmacol Toxicol (1988)
  • J. Barik et al. Chronic stress triggers social aversion via glucocorticoid receptor in dopaminoceptive neurons. Science (2013)
  • A.E. Block et al. Thalamic-prefrontal cortical-ventral striatal circuitry mediates dissociable components of strategy set shifting. Cereb Cortex (2007)
  • R. Bock et al. Strengthening the accumbal indirect pathway promotes resilience to compulsive cocaine use. Nat Neurosci (2013)