Controlling for individual differences in fMRI brain activation to tones, syllables, and words
Introduction
In contrast to the classic Wernicke–Geschwind model for speech processing, functional neuroimaging studies have consistently reported bilateral activation in the superior temporal gyrus (STG) by auditorily presented syllables or words (Wise et al., 1991, Binder et al., 1994a, Binder et al., 1994b, Binder et al., 1996, Poeppel et al., 2004). Moreover, in order to identify cortical regions that are selectively responsive to speech sounds, it has been common to compare activation to speech sounds with activation to tone sequences (see Binder et al., 1997, Jancke et al., 2002), random noise (Jancke et al., 2002), reversed speech (Binder et al., 2000), or noise-vocoded speech (Scott et al., 2000). Studies using this approach have typically found activation in the ventrolateral part of the STG and in the superior temporal sulcus (STS) (Binder et al., 1997, Binder et al., 2000, Scott et al., 2000), in contrast to the STG area predicted by the classic Wernicke–Geschwind model. The discrepancy between lesion and imaging data with regard to speech representation in the brain creates a problem for our understanding of how speech processing is organized in the cerebral hemispheres (cf. Hugdahl, 2000). A potential source of error variance is the fact that no imaging study has controlled for individual differences in speech lateralization in the subjects being scanned, beyond using right-handed subjects. Individual variability in speech representation may bias the outcome of brain imaging studies. Subtle activation effects could be overlooked because, across subjects, effects may be diversely distributed between the hemispheres. For instance, mixing left- and right-hemisphere-speech-dominant individuals could imply mixing sub-groups with left- versus right-sided brain activation, which would cancel out in an averaged analysis. 
Not controlling for lateralization of speech processing could thus introduce extra variance that disguises activation in areas such as the classic Wernicke's area.
The approach taken in the present study was therefore to select only subjects with left-hemisphere dominance for speech, as measured with dichotic listening (DL) to CV syllables. DL is the most frequently used behavioral test for investigating auditory laterality (see O'Leary, 2003). In DL, two different stimuli are presented simultaneously to the left and right ears, and the subject is asked to indicate which sound was heard best or most clearly. The most common result is a greater number of correct reports from the right ear, the "right ear advantage" (REA) (Hugdahl, 2003). Strauss et al. (1987), as well as Hugdahl et al. (1997), demonstrated a correlation between left-hemisphere dominance on the Wada test and an REA on the DL test. This suggests that DL is a valid measure of the laterality of speech processing and therefore also suitable for pre-selecting subjects for fMRI studies of speech processing. To increase the sensitivity of the fMRI analyses with regard to the classic Wernicke area, region-of-interest (ROI) analyses were conducted. The anatomical boundaries of Wernicke's area are not well defined (Williams, 1995). It is usually taken to be the posterior third of the STG; however, adjacent parts of BA 39–40 in the parietal lobe have also been implicated (Mesulam, 1998). The ROI for Wernicke's area was therefore defined as the union of the superior temporal gyrus, the angular gyrus, and the supramarginal gyrus.
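The article does not reproduce the ear-advantage computation itself, but a laterality index commonly used with CV-syllable dichotic listening is the normalized right-minus-left difference in correct reports. The sketch below illustrates that convention; the function name and the example scores are illustrative assumptions, not values from the study:

```python
def laterality_index(right_correct: int, left_correct: int) -> float:
    """Dichotic-listening laterality index in percent.

    Positive values indicate a right ear advantage (REA), commonly taken
    as a behavioral marker of left-hemisphere speech dominance; negative
    values indicate a left ear advantage.
    """
    total = right_correct + left_correct
    if total == 0:
        raise ValueError("no correct reports from either ear")
    return 100.0 * (right_correct - left_correct) / total

# A subject correctly reporting 60 right-ear and 40 left-ear syllables:
li = laterality_index(60, 40)  # -> 20.0, a clear REA
```

Selecting only subjects with a positive index of this kind is one way to operationalize the pre-selection for left-hemisphere speech dominance described above.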
Speech stimuli with different degrees of linguistic complexity were used: consonant–vowel–consonant (CVC) words and consonant–vowel (CV) syllables. Sinusoid tones served as nonspeech comparison stimuli, controlling for activation of sensory, motor, and general-purpose executive systems with minimal automatic activation of phonological and semantic systems. A block design was used so that subjects would not come to expect familiar words during presentation of the syllables, thereby ruling out expectancy effects.
Subjects
Eighteen right-handed, healthy male volunteers (median age 26) participated in the study. None of the subjects had a history of neurological or psychiatric illness, and all were native speakers of Norwegian. The subjects were screened with standard audiometry (250, 500, 1000, 2000, and 3000 Hz); subjects with an auditory threshold above 20 dB, or an interaural difference larger than 10 dB, at any frequency were excluded from the study. The subjects were also screened for laterality
Behavioral data
There was a significant main effect for omission errors, F(2,34) = 3.97, P = 0.028. The post hoc test revealed significantly fewer errors in the tones condition than in the words (P = 0.017) and syllables (P = 0.024) conditions. There was also a significant main effect for commission errors, F(2,34) = 13.25, P < 0.001. The post hoc test revealed significantly fewer errors in the tones condition than in the words (P = 0.032) and syllables (P < 0.001) conditions, and significantly fewer errors in the
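The main effects above come from one-way repeated-measures ANOVAs across the three stimulus conditions; note that 18 subjects and 3 conditions give exactly the reported degrees of freedom, (2, 34). The study's analysis software is not specified, so the following is a generic pure-Python sketch of how such an F statistic is computed, with illustrative (not actual) data:

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    data: one row per subject; each row holds that subject's score in
    each condition (e.g. tones, words, syllables).
    Returns (F, df_condition, df_error).
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_cond - ss_subj  # subject-by-condition residual

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    F = (ss_cond / df_cond) / (ss_error / df_error)
    return F, df_cond, df_error

# With 18 subjects and 3 conditions, df_cond = 2 and df_error = 34,
# matching the F(2,34) values reported in the text.
```

The error term here is the subject-by-condition interaction, which is what distinguishes the repeated-measures F from a between-subjects one-way ANOVA.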
Discussion
The words, syllables, and tones activated areas in the STS and the middle temporal gyrus (MTG) bilaterally when contrasted with the passive baseline condition (OFF). The activation to the tones was, on average, more extensive in the right hemisphere. Contrasting the words and the syllables, respectively, with the tones resulted in more extensive temporal lobe activation in the left hemisphere. The peak activation was observed in the left lateral STG for the words > tones contrast and in the superior
References (46)
- et al. Modeling geometric deformations in EPI time series. NeuroImage (2001).
- et al. Effects of stimulus rate on signal response during functional magnetic resonance imaging of auditory cortex. Brain Res. Cogn. Brain Res. (1994).
- et al. Lesion analysis of the brain areas involved in language comprehension. Cognition (2004).
- et al. Analysis of fMRI time-series revisited. NeuroImage (1995).
- et al. Event-related fMRI: characterizing differential responses. NeuroImage (1998).
- Lateralization of cognitive processes in the brain. Acta Psychol. (Amst.) (2000).
- et al. Phonetic perception and the temporal cortex. NeuroImage (2002).
- et al. An automated method for neuroanatomic and cytoarchitectonic atlas-based interrogation of fMRI data sets. NeuroImage (2003).
- et al. Precentral gyrus discrepancy in electronic versions of the Talairach atlas. NeuroImage (2004).
- et al. Auditory lexical decision, categorical perception, and FM direction discrimination differentially engage left and right auditory cortex. Neuropsychologia (2004).
- Speech-specific auditory processing: where is it? Trends Cogn. Sci.
- Background acoustic noise and the hemispheric lateralization of speech processing in the human brain: magnetic mismatch negativity study. Neurosci. Lett.
- Functional segregation of the temporal lobes into highly differentiated subsystems for auditory perception: an auditory rapid event-related fMRI-task. NeuroImage.
- Performance on a free-recall verbal dichotic listening task and cerebral dominance determined by the carotid amytal test. Neuropsychologia.
- Hemispheric dissociation in access to the human semantic system. Neuron.
- Analysis of fMRI time-series revisited-again. NeuroImage.
- Nonlinear spatial normalization using basis functions. Hum. Brain Mapp.
- Voice-selective areas in human auditory cortex. Nature.
- The new neuroanatomy of speech perception. Brain.
- Functional neuroimaging of language.
- Functional magnetic resonance imaging of human auditory cortex. Ann. Neurol.
- Function of the left planum temporale in auditory and linguistic processing. Brain.
- Human brain language areas identified by functional magnetic resonance imaging. J. Neurosci.