PhD Project 2

Driving forces behind perceptual adaptation in speech


PhD-candidate: Shruti Ullas
PIs: Anne Cutler (WP1) and Elia Formisano (WP1)
Start date: 16 June 2014

(last update 2019-06-27)

Research Content

Spoken-language comprehension requires dynamic adjustments in how the acoustic signal is mapped onto abstract phonetic categories. This enables listeners to deal rapidly and effectively with many sources of variability in speech. Several specific mechanisms have been documented, demonstrating that the re-tuning of phonetic categories can be driven by a range of different kinds of information. This project asks whether these mechanisms converge or whether they are in fact separate phenomena. Four different types of adaptation will be investigated in terms of their behavioural characteristics and their effect(s) on neural processing in auditory cortex.


Lexical and audio-visual cues induce recalibration through both visual and auditory pathways
Team members involved: Ullas, Hausfeld (UM), Cutler, and Formisano

A functional MRI (fMRI) study compared lexical and audio-visual recalibration in the brain. While in the 7T MRI scanner, participants were presented with lexical and audio-visual stimuli that could induce a perceptual bias towards one of two phonemes. To measure recalibration effects, ambiguous phoneme blends were presented after the lexical and audio-visual stimuli, and participants were asked to report the phoneme they perceived.
Responses to the ambiguous phoneme blends showed recalibration effects: they largely matched the phoneme towards which the preceding audio-visual or lexical stimuli were biased. However, audio-visual stimuli were more effective than lexical stimuli in inducing this effect, as also seen previously in a behavioural study (see figure). Neural responses to the audio-visual and lexical stimuli showed activity across both the auditory and visual cortices. Visual cortex activity was more pronounced for audio-visual stimuli, but was still significant for lexical stimuli. Activity during the test blocks, that is, while participants responded to ambiguous sounds, was found primarily across auditory regions, but also in visual areas. Most notably, although all test phases were identical, test blocks following audio-visual exposure still showed strong activity in visual areas, despite the absence of visual stimuli. This result suggests that visual “traces” from the preceding audio-visual exposure underlie the strength of the recalibration effect. The study bridged multiple disciplines and established a new design for comparing two types of recalibration, both behaviourally and with neuroimaging. It also served the larger goal of gaining a deeper understanding of how listeners adjust to variability in speech, a problem that all listeners face.

Figure: Recalibration effects were measured as /t/-responses to ambiguous phoneme blends after exposure to blocks of lexical stimuli (audio recordings of words) and audio-visual stimuli (video recordings of pseudo-words) that induced a bias towards either /p/ or /t/. Audio-visual stimuli were more effective in inducing this bias than lexical stimuli, although both forms of exposure still led to significant effects, in both within- and between-subjects designs.
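
As a minimal sketch of how such a measure could be computed, the Python example below takes the recalibration effect to be the difference in the proportion of /t/-responses after /t/-biased versus /p/-biased exposure. This definition is an assumption made for illustration and is not necessarily the analysis used in the study; the function names and the trial data are hypothetical.

```python
from collections import defaultdict


def t_response_proportions(trials):
    """Return the proportion of /t/-responses for each exposure bias.

    `trials` is an iterable of (bias, response) pairs, where both values
    are "p" or "t" (a hypothetical data layout chosen for this sketch).
    """
    counts = defaultdict(lambda: [0, 0])  # bias -> [n /t/-responses, n trials]
    for bias, response in trials:
        counts[bias][1] += 1
        if response == "t":
            counts[bias][0] += 1
    return {bias: n_t / n_total for bias, (n_t, n_total) in counts.items()}


def recalibration_effect(trials):
    """Difference in /t/-response proportion: /t/-biased minus /p/-biased.

    A positive value means responses shifted towards the biased phoneme.
    This difference score is an assumed definition, not the study's own.
    """
    proportions = t_response_proportions(trials)
    return proportions["t"] - proportions["p"]


# Made-up test-phase responses to ambiguous blends (illustration only).
example_trials = [
    ("t", "t"), ("t", "t"), ("t", "t"), ("t", "p"),  # after /t/-biased exposure
    ("p", "t"), ("p", "p"), ("p", "p"), ("p", "p"),  # after /p/-biased exposure
]
effect = recalibration_effect(example_trials)
print(f"Recalibration effect: {effect:.2f}")
```

Computing this score separately for lexical and audio-visual exposure blocks would allow the two cue types to be compared on the same scale.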

Progress 2018

An fMRI study was conducted which compared lexical and audio-visual recalibration in the brain. Participants performed the same recalibration task as in the previous behavioural studies, in which lexical and audio-visual cues containing a phoneme bias were presented in alternating blocks, followed by a forced-choice test on ambiguous sounds which could resemble one of two phonemes. Behavioural results were replicated in the scanner, and listeners were able to adjust their category boundary between two phonemes following the presentation of the lexical and audio-visual cues. Both cue types led to successful recalibration, although audio-visual information was more effective than lexical information in inducing the effects. fMRI results showed that presentations of both audio-visual and lexical stimuli led to activity primarily in and around auditory regions, while areas in the visual cortex also showed activation for both types of cues, although more strongly for audio-visual stimuli. Notably, neural activity during the test phases (which consisted only of ambiguous sounds regardless of the prior cue presentation) was driven primarily by visual cortex regions. These patterns are currently being explored in follow-up analyses and a manuscript is in preparation.

Groundbreaking characteristics

This study brought together LiI and UM researchers with varied backgrounds in speech and auditory perception, extensive experience with brain imaging (particularly fMRI at 7T), and expertise in psycholinguistics and cognitive psychology.