Elsevier

NeuroImage

Volume 23, Issue 3, November 2004, Pages 1046-1058

Experimental designs and processing strategies for fMRI studies involving overt verbal responses

https://doi.org/10.1016/j.neuroimage.2004.07.039

Event-related paradigms have been used increasingly in the past few years for the localization of function in tasks involving overt speech. These designs exploit the differences in the temporal characteristics between the rapid motion-induced and the slower hemodynamic signal changes. The optimization of these designs and the best way to analyze the acquired data have not yet been fully explored. The purpose of this study is to investigate various design and analysis strategies for maximizing the detection of function while minimizing task-induced motion artifacts. Both event-related and blocked paradigms can be specifically designed to meet these goals. Various event-related and blocked designs were compared, both in simulation and in experiments involving overt word reading, in their ability to detect function and to avoid speech-induced motion artifact. A blocked design with task and control durations of 10 s and an event-related design with a minimum stimulus duration (SD) of 5 s and an average interstimulus interval (ISI) of 10 s were found to optimally detect blood oxygenation level-dependent signal changes without significant motion artifact. Ignoring images acquired during speech can help recover function in areas particularly affected by motion but substantially reduces the detection power in other regions. Using the stimulus timing as an additional regressor to model the motion offers little benefit in practice due to the variability of the motion-induced signal change.

Introduction

The ability of a subject to speak aloud during fMRI time series collection is of significant utility in the study of brain function. In addition to the study of brain systems subserving the production of speech and processing of language, many studies would benefit from having the subject vocalize a response, since vocalization can provide substantially more precise and information-rich feedback than button box responses in the context of language tasks. Since there is no animal model that can adequately represent the complex task of language production, the need for a noninvasive imaging method, such as functional MRI, to assess language production is clear. The difficulty with speaking in the MR scanner in these tasks is that the repositioning of the head, jaw, tongue, and facial muscles during speech leads to distortions and misregistration in the time series MR images (Barch et al., 1999, Binder, 1995, Birn et al., 1998, Birn et al., 1999a, Birn et al., 1999b). These artifactual signal changes can both mask and mimic the blood oxygenation level-dependent (BOLD) signal changes associated with neuronal activity, making detection and localization of speech-related brain activation difficult.

A number of solutions have been proposed to overcome this problem. The most common approach has been to eliminate the motor component of speech in the tasks, relying instead on silent word production (Binder, 1995, Buckner et al., 2000). Huang et al. (2002) found that motion artifacts were reduced when subjects were trained, prior to the actual scan, to reduce speech-associated head movements, and Small et al. (1996) obtained reasonable results when head movement was severely restricted by using a bite bar.

Each of these techniques has its limitations in studies involving overt speech. While silent word production reduces the occurrence of motion artifacts, overt word production could involve the activation of additional brain regions not active during silent word processing (Barch et al., 1999, Huang et al., 2002, Palmer et al., 2001). Additionally, the restriction on speaking out loud may not be psychologically or behaviorally appropriate for the particular task being studied, for example, if the task requires the subject to receive feedback from the vocalization of the words or when it is necessary to record the subject's verbal response. Postprocessing of images using rigid-body image registration techniques cannot remove all of the image distortions arising from speaking, since the movement of the subject's head, jaw, tongue, and facial muscles also causes changes in the magnetic field. These magnetic field changes cause a warping of the image in the phase-encode direction (for echo-planar acquisitions) or a blurring of the image (for spiral acquisitions). This distortion can be significant, especially in slices in the inferior region of the brain, leading to signal changes of anywhere from 5% to 100% (Birn et al., 1998, Yetkin et al., 1996). Since this warping is not necessarily uniform across the image, the ‘apparent’ motion cannot be corrected using rigid-body image registration routines. Dynamic correction of magnetic field changes would require continuous acquisition of magnetic field maps throughout the imaging run. This would require a modification of existing imaging sequences and would be susceptible to physiologically induced phase variations. Training of subjects prior to the scan can reduce, but not completely eliminate, speech-related movement artifacts, since the movement of the jaw, tongue, and facial muscles is inherent to word production.

More recently, studies have begun to use event-related fMRI designs to separate the effects of motion from the neuronal-induced BOLD signal changes (Barch et al., 1999, Birn et al., 1999a, Birn et al., 1999b, Burgund et al., 2003, Huang et al., 2002, Palmer et al., 2001, Preibisch et al., 2003). The key to these methods is the difference in the temporal dynamics of motion-induced and hemodynamic signal changes arising from the difference in the physical mechanisms producing these changes. The BOLD response is delayed in onset by several seconds and increases to a peak value 5–6 s after the initiation of a task. In contrast, motion-induced signal changes for tasks such as overt word production, jaw clenching, tongue movement, or swallowing occur primarily during the task performance. If the task is performed only briefly, such as in an event-related paradigm, then the signal changes resulting from motion occur prior to and have a much different temporal shape than the delayed BOLD signal changes.
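The timing argument above can be illustrated with a minimal Python sketch. It is not the paper's simulation: the gamma-variate response function `gamma_hrf`, its `shape` and `scale` values, the 1 s speech period, and the idea that the artifact simply follows the speech period are all assumptions made for illustration.

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Assumed gamma-variate HRF; with these defaults it peaks at 5 s."""
    if t <= 0:
        return 0.0
    norm = scale ** shape * math.gamma(shape)
    return t ** (shape - 1) * math.exp(-t / scale) / norm

def convolve(signal, kernel, dt):
    """Causal discrete convolution, truncated to len(signal) samples."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(i + 1):
            acc += signal[j] * kernel[i - j]
        out.append(acc * dt)
    return out

dt = 0.1                                                 # simulation step (s)
times = [i * dt for i in range(200)]                     # 20 s window
speech = [1.0 if i < 10 else 0.0 for i in range(200)]    # 1 s of overt speech
motion = speech[:]                # assumption: the artifact tracks the speech
hrf = [gamma_hrf(t) for t in times]
bold = convolve(speech, hrf, dt)  # modelled BOLD response to the utterance

bold_peak_time = times[bold.index(max(bold))]
motion_end_time = max(t for t, m in zip(times, motion) if m > 0)
bold_during_speech = max(b for b, m in zip(bold, motion) if m > 0)

print(f"last sample with motion artifact: {motion_end_time:.1f} s")
print(f"modelled BOLD response peaks at: {bold_peak_time:.1f} s")
print(f"BOLD during speech / BOLD peak: {bold_during_speech / max(bold):.3f}")
```

Under these assumptions the artifact is over before the modelled BOLD response has risen appreciably, which is the separation in time that brief, event-related speech tasks rely on.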

In the simplest case, overt speech can be performed for brief periods, separated by periods of time sufficiently long to allow for the full evolution of the hemodynamic response (Birn et al., 1999a, Birn et al., 1999b). The motion-induced signal changes appear as a rapid increase or decrease in the MR signal, concomitant with the speech production. These artifactual signal changes usually occur in less than a second, much more rapidly than the slower hemodynamic response. This difference in the temporal delay and shape between the motion-induced and BOLD signal changes can then be exploited either by ignoring the images acquired during the motion or by modeling the signal as a sum of the stimulus timing (representing the motion-induced changes) and the slower ideal hemodynamic response. A major drawback of event-related techniques using constant interstimulus intervals is that tasks are limited to brief periods of word production, separated by long rest periods, which may not be appropriate for all psychological studies. Since the signals from brief stimuli are so small, the task must be repeated numerous times to reach a sufficient functional contrast-to-noise ratio, leading to long acquisition times if long interstimulus intervals are required. The hemodynamic response must also be sampled quickly enough to allow discrimination against motion-induced signal changes, limiting the TR and hence the number of slices that can be acquired.
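The two analysis options described above (ignoring images acquired during the motion, or adding the stimulus timing as a second regressor) can be sketched on a synthetic, noise-free voxel. Everything here is an illustrative assumption rather than the paper's procedure: the gamma-variate `gamma_hrf`, the 15 s constant ISI, TR = 1 s, the amplitudes `true_bold` and `true_motion`, and above all the idealization that the artifact is identical for every utterance.

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Assumed gamma-variate HRF; with these defaults it peaks at 5 s."""
    if t <= 0:
        return 0.0
    norm = scale ** shape * math.gamma(shape)
    return t ** (shape - 1) * math.exp(-t / scale) / norm

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to len(signal) samples."""
    return [sum(signal[i - k] * kernel[k] for k in range(min(i + 1, len(kernel))))
            for i in range(len(signal))]

def demean(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_one(y, x):
    """Least-squares slope of y on x, with an intercept."""
    xd, yd = demean(x), demean(y)
    return dot(xd, yd) / dot(xd, xd)

def fit_two(y, x1, x2):
    """Least-squares slopes of y on x1 and x2, with an intercept."""
    a, b, yd = demean(x1), demean(x2), demean(y)
    saa, sbb, sab = dot(a, a), dot(b, b), dot(a, b)
    say, sby = dot(a, yd), dot(b, yd)
    det = saa * sbb - sab * sab
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det

# Constant-ISI design sampled at TR = 1 s: one brief utterance every 15 s.
n = 150
speech = [1.0 if i % 15 == 0 else 0.0 for i in range(n)]
bold = convolve(speech, [gamma_hrf(t) for t in range(25)])

# Synthetic voxel: baseline + BOLD + a motion spike during each utterance.
true_bold, true_motion = 2.0, 1.5            # hypothetical amplitudes
voxel = [100.0 + true_bold * b + true_motion * s for b, s in zip(bold, speech)]

naive = fit_one(voxel, bold)                              # motion not modelled
with_motion, motion_amp = fit_two(voxel, bold, speech)    # timing as 2nd regressor
keep = [i for i in range(n) if speech[i] == 0.0]          # drop images during speech
censored = fit_one([voxel[i] for i in keep], [bold[i] for i in keep])

print(f"BOLD regressor only:          {naive:.3f}")
print(f"BOLD + stimulus timing:       {with_motion:.3f} (motion {motion_amp:.3f})")
print(f"images during speech ignored: {censored:.3f}")
```

In this idealized case both strategies recover the simulated BOLD amplitude, while the fit that ignores motion is biased. Real artifacts vary from utterance to utterance, which is the reason the abstract gives for the limited practical benefit of the extra regressor; ignoring images also discards data, which the abstract reports reduces detection power in regions not affected by motion.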

Successful functional imaging during more rapid speech is possible by employing an event-related design with a varying interstimulus interval (ISI) (Birn et al., 1999a, Birn et al., 1999b). The success of this type of design was recently demonstrated by Palmer et al. (2001) in a word stem completion task. In this study, localization of function without significant motion artifacts was achieved without explicitly modeling the motion in their analysis. The effectiveness of this strategy is based on the fact that the model hemodynamic responses of these designs have a low intrinsic correlation with the motion-induced signal changes. As a result, a linear fit of the model hemodynamic response to voxel time series will contain only a very small component of the motion-induced signal.

A key principle is that stimulus time courses can be specifically designed to minimize the correlation between anticipated motion-induced and BOLD signal changes. This minimization can also be employed for a block design, where the durations of the task and control periods can be chosen such that the stimulus timing (which is quite similar in character to the expected motion-induced signal change) is orthogonal to the expected hemodynamic response. A question that therefore arises is which stimulus design is optimal in the sense of reducing sensitivity to task-related motion and maximizing detection of BOLD signal changes. The purpose of this paper is to develop a framework for designing optimal stimulus paradigms and to evaluate different analysis strategies to provide motion artifact-free functional activation maps during task-induced motion, such as overt speaking. Several different stimulus designs and analysis strategies are presented and compared in terms of their sensitivity to motion and detection power, first in simulation, and finally in experiments involving overt word production.
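The block-design case can be explored with a short Python sketch that scans block durations and reports the correlation between the block timing (used here as a stand-in for the motion-induced signal) and the modelled BOLD response. The gamma-variate `gamma_hrf`, TR = 1 s, the 12 cycles per run, and the scanned range of 5 to 30 s are assumptions for illustration, not the paper's simulation settings.

```python
import math

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Assumed gamma-variate HRF; peak at 5 s, mean delay 6 s."""
    if t <= 0:
        return 0.0
    norm = scale ** shape * math.gamma(shape)
    return t ** (shape - 1) * math.exp(-t / scale) / norm

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to len(signal) samples."""
    return [sum(signal[i - k] * kernel[k] for k in range(min(i + 1, len(kernel))))
            for i in range(len(signal))]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

HRF = [gamma_hrf(t) for t in range(25)]      # sampled at TR = 1 s

def block_correlation(block_s, cycles=12):
    """Correlation between block timing (motion proxy) and modelled BOLD."""
    timing = ([1.0] * block_s + [0.0] * block_s) * cycles
    return pearson(timing, convolve(timing, HRF))

corr = {d: block_correlation(d) for d in range(5, 31)}
best = min(corr, key=lambda d: abs(corr[d]))

for d in (5, 10, 20, 30):
    print(f"block duration {d:2d} s: correlation {corr[d]:+.2f}")
print(f"smallest |correlation| in this scan at {best} s")
```

The reasoning behind the scan: with equal task and control periods of 10 s the design repeats every 20 s, so a response delayed by 5 s lags the task timing by a quarter cycle, and two sinusoids a quarter cycle apart are uncorrelated over a full period. The assumed response function here has a mean delay of 6 s, so the minimum found by the scan can sit somewhat above 10 s; the exact value depends on the response function chosen.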

In the first section of this paper, the effects of motion-induced signal changes in fMRI using both blocked and event-related stimulus designs with both constant and varied ISI are simulated. Two quantities of interest are computed: (1) the correlation between motion-induced and BOLD signal changes (leading to false-positives in signal detection), and (2) the efficiency of the design to detect BOLD signal changes both in the presence and in the absence of motion-induced signal changes (an assessment of the true-positives and false-negatives). The efficiency of the design optimal for minimizing the detection of motion-induced signal changes will be compared to the efficiency of the design optimal for detection of function in the absence of motion artifacts; the latter has been the subject of several recent studies (Birn et al., 2002, Dale, 1999, Friston et al., 1999, Liu et al., 2001). In the second part of this paper, experiments involving an overt word generation task are performed using a blocked design and an event-related design with either a constant or a varying ISI, and the sensitivity to motion and detection power of BOLD activation are compared.
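The two quantities can be sketched for a few example designs. This is a simplified stand-in, not the paper's definitions: efficiency is taken to be the sum of squares of the demeaned BOLD regressor (proportional to the inverse variance of the amplitude estimate in a one-regressor model with white noise), and the false-positive risk is summarized by the plain correlation between stimulus timing and the BOLD regressor. The `gamma_hrf` parameters, the run length `N`, and the `varying_isi` scheme with its seed are hypothetical.

```python
import math
import random

def gamma_hrf(t, shape=6.0, scale=1.0):
    """Assumed gamma-variate HRF; with these defaults it peaks at 5 s."""
    if t <= 0:
        return 0.0
    norm = scale ** shape * math.gamma(shape)
    return t ** (shape - 1) * math.exp(-t / scale) / norm

def convolve(signal, kernel):
    """Causal discrete convolution, truncated to len(signal) samples."""
    return [sum(signal[i - k] * kernel[k] for k in range(min(i + 1, len(kernel))))
            for i in range(len(signal))]

HRF = [gamma_hrf(t) for t in range(25)]
N = 240                                      # samples per run at TR = 1 s

def blocked(block_s):
    return [1.0 if (i // block_s) % 2 == 0 else 0.0 for i in range(N)]

def constant_isi(isi_s):
    return [1.0 if i % isi_s == 0 else 0.0 for i in range(N)]

def varying_isi(duration_s, min_gap_s, max_gap_s, seed=0):
    """Fixed-length speech periods separated by random gaps (hypothetical scheme)."""
    rng = random.Random(seed)
    timing, i = [0.0] * N, 0
    while i < N:
        for k in range(i, min(i + duration_s, N)):
            timing[k] = 1.0
        gap = min_gap_s + int(rng.random() * (max_gap_s - min_gap_s + 1))
        i += duration_s + gap
    return timing

def evaluate(timing):
    """Return the assumed efficiency measure and the timing/BOLD correlation."""
    bold = convolve(timing, HRF)
    mt, mb = sum(timing) / N, sum(bold) / N
    sbb = sum((b - mb) ** 2 for b in bold)
    stt = sum((t - mt) ** 2 for t in timing)
    sbt = sum((b - mb) * (t - mt) for b, t in zip(bold, timing))
    return {"efficiency": sbb, "motion_corr": sbt / math.sqrt(sbb * stt)}

designs = {
    "blocked 30 s": blocked(30),
    "blocked 10 s": blocked(10),
    "constant ISI 15 s": constant_isi(15),
    "varying ISI (5 s on, 5-15 s gap)": varying_isi(5, 5, 15),
}
results = {name: evaluate(timing) for name, timing in designs.items()}
for name, r in results.items():
    print(f"{name:34s} efficiency {r['efficiency']:6.2f}  "
          f"motion corr {r['motion_corr']:+.2f}")
```

Listing both numbers side by side makes the trade-off explicit: a design can score well on the efficiency measure while its BOLD regressor remains strongly correlated with the stimulus timing, which is the situation the simulations are set up to detect.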

Simulations

The effectiveness of various stimulus timing designs in reducing the false-positives and increasing the correct detection of BOLD signal changes was first tested by a series of simulations. Three types of paradigms were assessed: (1) a blocked design with equal task and control periods; (2) an event-related design with a constant interstimulus interval (ISI) and stimulus duration (SD); and (3) an event-related design with varying ISIs and varying SDs. A task block was considered to consist of

Simulations

Of particular interest is the minimization of false-positives while maximizing the detection efficiency. Fig. 5 shows the detection efficiency of various designs in the absence of motion on the horizontal axis plotted against the t statistic of detecting purely motion-induced signal changes as BOLD signal changes. Blocked designs with long task and control periods have a high detection efficiency but also show the greatest likelihood of false-positives, as indicated by the high t statistic of

Experiments

Functional images obtained from the blocked-trial paradigm with 30 s task and rest periods contained significant artifacts, most prominently at the edge of the brain. These artifacts were reduced substantially when the block duration was shortened to 10 s. Artifacts were also reduced for all three event-related techniques. In the blocked-trial paradigm with 30 s task and rest periods, the signal intensity time course of a pixel near an edge is similar to the signal time course of a pixel in the

Discussion

As demonstrated by both simulations and experimental results, the sensitivity to motion caused by overt speech can be significantly reduced by properly designing the stimulus paradigm. All of these designs work by exploiting the difference in the temporal properties (i.e., the delay and duration) of rapid motion-induced signal changes and the more sluggish hemodynamic BOLD response. In previous studies, this strategy was implemented in an event-related paradigm with long constant ISIs (Barch et

Conclusions

There are a number of paradigm design strategies for reducing the influence of task-induced motion. For blocked designs with equal task and control period durations, a block duration of 10 s offsets the hemodynamic response by precisely a quarter cycle relative to the task timing. While this strategy exploits primarily the temporal delay of the BOLD response and is therefore susceptible to variations in the onset delay, it is a simple modification for many fMRI studies. Across all subjects

References (29)

  • J. Binder. Functional magnetic resonance imaging of language cortex. Int. J. Imaging Syst. Technol. (1995)
  • R.M. Birn et al. Estimated BOLD impulse response depends on stimulus ON/OFF ratio. NeuroImage (2001)
  • R.M. Birn et al. Magnetic field changes in the human brain due to swallowing or speaking. Magn. Reson. Med. (1998)
  • R.M. Birn et al. Event-related fMRI of tasks involving brief motion. Hum. Brain Mapp. (1999)