NeuroImage

Volume 30, Issue 2, 1 April 2006, Pages 554-562

Controlling for individual differences in fMRI brain activation to tones, syllables, and words

https://doi.org/10.1016/j.neuroimage.2005.10.021

Abstract

Previous neuroimaging studies have consistently reported bilateral activation to speech stimuli in the superior temporal gyrus (STG) and have identified an anteroventral stream of speech processing along the superior temporal sulcus (STS). However, little attention has been devoted to the possible confound of individual differences in hemispheric dominance for speech. The present study was designed to test for speech-selective activation while controlling for inter-individual variance in auditory laterality, by using only subjects with at least 10% right ear advantage (REA) on the dichotic listening test. Eighteen right-handed, healthy male volunteers (median age 26) participated in the study. The stimuli were words, syllables, and sine wave tones (220–2600 Hz), presented in a block design. Comparing words > tones and syllables > tones yielded activation in the left posterior middle temporal gyrus (MTG) and the lateral STG (upper bank of the STS). In the right temporal lobe, the activation was located in the MTG/STS (lower bank). Comparing left and right temporal lobe cluster sizes from the words > tones and syllables > tones contrasts at the single-subject level demonstrated a statistically significant left lateralization for speech sound processing in the STS/MTG area. The asymmetry analyses suggest that dichotic listening may be a suitable method for selecting a homogeneous group of subjects with respect to left hemisphere language dominance.

Introduction

In contrast to the classic Wernicke–Geschwind model of speech processing, functional neuroimaging studies have consistently reported bilateral activation in the superior temporal gyrus (STG) in response to auditorily presented syllables or words (Wise et al., 1991, Binder et al., 1994a, Binder et al., 1994b, Binder et al., 1996, Poeppel et al., 2004). Moreover, to identify cortical regions that are selectively responsive to speech sounds, it has been common to compare activation to speech sounds with activation to tone sequences (see Binder et al., 1997, Jancke et al., 2002), random noise (Jancke et al., 2002), reversed speech (Binder et al., 2000), or noise-vocoded speech (Scott et al., 2000). Studies using this approach have typically found activation in the ventrolateral part of the STG and in the superior temporal sulcus (STS) (Binder et al., 1997, Binder et al., 2000, Scott et al., 2000), in contrast to the STG area predicted by the classic Wernicke–Geschwind model. The discrepancy between lesion and imaging data with regard to speech representation in the brain creates a problem for our understanding of how speech processing is organized in the cerebral hemispheres (cf. Hugdahl, 2000). A potential source of error variance is that no imaging study has controlled for individual differences in speech lateralization in the subjects being scanned, beyond using right-handed subjects. Individual variability in speech representation may bias the outcome of brain imaging studies: subtle activation effects could be overlooked because their hemispheric distribution may vary from subject to subject. For instance, mixing left- and right-hemisphere-speech-dominant individuals would mean mixing sub-groups with left- versus right-sided brain activation, which would cancel out in an averaged analysis. Failing to control for lateralization of speech processing could thus introduce extra variance that disguises activation in areas such as the classic Wernicke's area.

The approach taken in the present study was therefore to select only subjects with left hemisphere dominance for speech as measured with dichotic listening (DL) to CV syllables. DL is the most frequently used behavioral test for investigating auditory laterality (see O'Leary, 2003). DL involves presenting two different stimuli simultaneously, one to each ear, and asking the subject to indicate which sound was heard best or most clearly. The most common result is a greater number of correct reports from the right ear, the "right ear advantage" (REA) (Hugdahl, 2003). Strauss et al. (1987), as well as Hugdahl et al. (1997), demonstrated a correlation between left hemisphere dominance on the Wada test and REA on the DL test. This suggests that DL may be a good measure of the laterality of speech processing and therefore also suitable for pre-selecting subjects for fMRI studies of speech processing. To increase the sensitivity of the fMRI analyses with regard to the classic Wernicke area, region-of-interest (ROI) analyses were conducted. The anatomical boundaries of Wernicke's area are not clear (Williams, 1995). It is usually defined as the posterior third of the STG; however, adjacent parts of BA 39–40 in the parietal lobe have also been implicated (Mesulam, 1998). Thus, the ROI for Wernicke's area was defined as the union of the superior temporal gyrus, the angular gyrus, and the supramarginal gyrus.
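To make the selection criterion concrete, here is a minimal sketch of how a right ear advantage could be computed from dichotic listening report counts. The percentage index (R - L)/(R + L) x 100 is a common convention in the DL literature, but the paper states only the >= 10% REA cutoff, so the formula and function names below are illustrative assumptions.

```python
# Minimal sketch of dichotic-listening pre-selection. The laterality index
# LI = (R - L) / (R + L) * 100 is an assumption; the paper specifies only
# the >= 10% right ear advantage (REA) inclusion criterion.

def ear_advantage(right_reports: int, left_reports: int) -> float:
    """Percentage laterality index from correct right- and left-ear reports."""
    total = right_reports + left_reports
    if total == 0:
        raise ValueError("no correct reports")
    return 100.0 * (right_reports - left_reports) / total

def has_rea(right_reports: int, left_reports: int, threshold: float = 10.0) -> bool:
    """True if the subject shows at least `threshold` percent right ear advantage."""
    return ear_advantage(right_reports, left_reports) >= threshold

# Example: 20 right-ear vs. 14 left-ear correct reports -> LI ~ 17.6%, included.
print(has_rea(20, 14))  # True
```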

Speech stimuli with different degrees of linguistic complexity were used: consonant–vowel–consonant (CVC) words and consonant–vowel (CV) syllables. Sinusoid tones served as nonspeech comparison stimuli, controlling for activation of sensory, motor, and general-purpose executive systems with minimal automatic activation of phonological and semantic systems. A block design was used so that subjects would not expect to hear familiar words during the presentation of the syllables, thereby ruling out expectancy effects.
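As an illustration of how such block-design conditions enter a GLM analysis, the sketch below builds boxcar regressors for the three conditions and a words > tones contrast vector. The TR, block length, and condition order are hypothetical placeholders; the actual timing parameters are not given in this excerpt.

```python
# Illustrative block-design regressors for a words/syllables/tones paradigm.
# Block length, TR, and condition order are hypothetical placeholders.
import numpy as np

TR = 3.0                      # assumed repetition time in seconds
block_scans = 10              # assumed scans per block
order = ["words", "tones", "syllables", "tones", "words", "syllables"]

n_scans = block_scans * len(order)
design = {c: np.zeros(n_scans) for c in ("words", "syllables", "tones")}
for i, cond in enumerate(order):
    design[cond][i * block_scans:(i + 1) * block_scans] = 1.0  # boxcar "on"

# Design matrix with columns [words, syllables, tones]; in practice each
# boxcar would be convolved with a hemodynamic response function.
X = np.column_stack([design["words"], design["syllables"], design["tones"]])
contrast_words_gt_tones = np.array([1.0, 0.0, -1.0])  # words > tones
```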

Section snippets

Subjects

Eighteen right-handed, healthy male volunteers (median age 26) participated in the study. None of the subjects had a history of neurological or psychiatric illness, and all were native speakers of Norwegian. The subjects were screened with standard audiometry (250, 500, 1000, 2000, and 3000 Hz), and subjects with an auditory threshold higher than 20 dB, or an interaural difference larger than 10 dB, at any frequency were excluded from the study. The subjects were also screened for laterality…
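The audiometric inclusion rule can be expressed as a simple filter. A minimal sketch follows, assuming per-ear thresholds stored as frequency-to-dB mappings; the data layout is our illustration, not the authors'.

```python
# Sketch of the audiometric screening rule: exclude subjects with any
# threshold above 20 dB or an interaural difference above 10 dB at any
# tested frequency. The data structure is an illustrative assumption.

FREQS_HZ = (250, 500, 1000, 2000, 3000)

def passes_audiometry(left_db: dict, right_db: dict,
                      max_threshold: float = 20.0,
                      max_interaural_diff: float = 10.0) -> bool:
    """left_db/right_db map frequency (Hz) to hearing threshold (dB)."""
    for f in FREQS_HZ:
        if left_db[f] > max_threshold or right_db[f] > max_threshold:
            return False
        if abs(left_db[f] - right_db[f]) > max_interaural_diff:
            return False
    return True

# Example: a 25 dB threshold at 2000 Hz in one ear excludes the subject.
print(passes_audiometry({f: 10 for f in FREQS_HZ},
                        {250: 10, 500: 10, 1000: 10, 2000: 25, 3000: 10}))  # False
```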

Behavioral data

There was a significant main effect for omission errors, F(2,34) = 3.97, P = 0.028. The post-hoc test revealed significantly fewer errors in the tones condition than in the words (P = 0.017) and syllables (P = 0.024) conditions. There was also a significant main effect for commission errors, F(2,34) = 13.25, P < 0.001. The post-hoc test revealed significantly fewer errors in the tones condition than in the words (P = 0.032) and syllables (P < 0.001) conditions and significantly fewer errors in the…
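The reported F(2,34) is consistent with a one-way repeated-measures ANOVA over 18 subjects and 3 conditions. A minimal sketch of such an analysis, using simulated error counts rather than the study's data:

```python
# Within-subject ANOVA implied by F(2,34): 18 subjects, 3 conditions
# (words, syllables, tones). Error counts here are simulated placeholders,
# not the study's data; statsmodels' AnovaRM handles the repeated measures.
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
subjects = np.repeat(np.arange(18), 3)
conditions = np.tile(["words", "syllables", "tones"], 18)
# Simulated omission errors: fewer in the tones condition, as reported.
means = {"words": 3.0, "syllables": 3.0, "tones": 1.5}
errors = [rng.poisson(means[c]) for c in conditions]

df = pd.DataFrame({"subject": subjects, "condition": conditions,
                   "omissions": errors})
res = AnovaRM(df, depvar="omissions", subject="subject",
              within=["condition"]).fit()
print(res)  # F with df (2, 34), matching 18 subjects x 3 conditions
```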

Discussion

The words, syllables, and tones activated areas in the STS and MTG bilaterally when contrasted with the passive baseline condition (OFF). The activation elicited by the tones was, on average, more extensive in the right hemisphere. Contrasting words and syllables with tones, respectively, resulted in more extensive temporal lobe activation in the left hemisphere. The peak activation was observed in the left lateral superior temporal gyrus (STG) for the words > tones contrast and in the superior…
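The single-subject asymmetry analysis summarized in the abstract compares left and right temporal lobe cluster sizes. Below is a minimal sketch of one way such a test could be run, assuming a (L - R)/(L + R) laterality index and a Wilcoxon signed-rank test against zero; the excerpt does not state the authors' exact statistic, and the voxel counts are simulated.

```python
# Sketch of a cluster-size asymmetry test: per-subject laterality index
# LI = (L - R) / (L + R) from left/right temporal cluster sizes (voxels),
# tested against zero. Index formula, test choice, and data are assumptions.
import numpy as np
from scipy.stats import wilcoxon

left_voxels = np.array([120, 90, 150, 80, 200, 110, 95, 130, 140, 100,
                        85, 160, 125, 70, 115, 105, 135, 145])   # simulated
right_voxels = np.array([60, 80, 70, 75, 90, 50, 85, 65, 95, 88,
                         72, 60, 77, 69, 81, 90, 55, 62])        # simulated

li = (left_voxels - right_voxels) / (left_voxels + right_voxels)
stat, p = wilcoxon(li)  # H0: LI symmetric around zero (no lateralization)
print(f"median LI = {np.median(li):.2f}, p = {p:.4f}")
```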

References (46)

  • C. Price et al., Speech-specific auditory processing: where is it? Trends Cogn. Sci. (2005)
  • Y. Shtyrov et al., Background acoustic noise and the hemispheric lateralization of speech processing in the human brain: magnetic mismatch negativity study, Neurosci. Lett. (1998)
  • K. Specht et al., Functional segregation of the temporal lobes into highly differentiated subsystems for auditory perception: an auditory rapid event-related fMRI-task, NeuroImage (2003)
  • E. Strauss et al., Performance on a free-recall verbal dichotic listening task and cerebral dominance determined by the carotid amytal test, Neuropsychologia (1987)
  • G. Thierry et al., Hemispheric dissociation in access to the human semantic system, Neuron (2003)
  • K.J. Worsley et al., Analysis of fMRI time-series revisited-again, NeuroImage (1995)
  • J. Ashburner et al., Nonlinear spatial normalization using basis functions, Hum. Brain Mapp. (1999)
  • P. Belin et al., Voice-selective areas in human auditory cortex, Nature (2000)
  • J. Binder, The new neuroanatomy of speech perception, Brain (2000)
  • J. Binder et al., Functional neuroimaging of language
  • J.R. Binder et al., Functional magnetic resonance imaging of human auditory cortex, Ann. Neurol. (1994)
  • J.R. Binder et al., Function of the left planum temporale in auditory and linguistic processing, Brain (1996)
  • J.R. Binder et al., Human brain language areas identified by functional magnetic resonance imaging, J. Neurosci. (1997)