PhD Project 20

How to slow down and speed up: the regulation of speech rate

 

PhD candidate: Joe Rodd
PIs: Antje Meyer (WP5) and Mirjam Ernestus (WP1)
Start date: 01 February 2016

(last update 2019-07-01)

Research Content

This project will investigate the psychological mechanisms underlying the regulation of speech rate. It has two tightly linked components: the development of a family of computationally implemented models of speech planning, and a series of multiple-object naming experiments designed to test the models' predictions. Dependent variables will be derived from analyses of the participants’ eye movements, their speed and accuracy of speaking, and phonetic properties of the utterances. The results will contribute to our understanding of how speech rate is regulated and to theories about the interface of the speech production system with executive control and articulation.

Highlight

Team members: Rodd, Bosker, Ten Bosch (CLS), Ernestus, and Meyer

Speakers are in control of their speaking rate, but this ability is not well explained by current theories of speech production. An existing theory of speech production was extended to account for speech rate control, and a computational implementation of the extended theory was run in simulations. The simulations indicated that speakers adopt qualitatively different configurations of their speech production apparatus to produce speech at different rates.

That speakers can vary their speaking rate is evident, but how they accomplish this has hardly been studied. Consider this analogy: when walking, speed can be continuously increased, within limits, but to speed up further, humans must run. Are there multiple qualitatively distinct speech 'gaits' that resemble walking and running? Or is control achieved by continuous modulation of a single gait?
This study investigates these possibilities through simulations of a new connectionist computational model of the cognitive process of speech production. The model was fitted to a corpus of disyllabic Dutch words produced at three different speaking rates.
After training, the model achieved good fits at all three speaking rates. The parameter settings associated with the individual speaking rates were not linearly related, which suggests that speakers use qualitatively distinct cognitive gaits rather than continuously modulating a single gait.

Upper panel: the parameter settings best suited to modelling each speaking rate (coloured dots), projected by PCA. ‘Axes’ through the parameter space were extracted (black lines). Lower panel: the model was refitted using the parameters associated with the intermediate points along the axes. The predicted durations of the syllables and the overlap between them are shown (colours). Non-linearity was assessed by modelling these data.
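The analysis in the figure can be illustrated with a minimal sketch. It assumes that each speaking rate is summarised by one vector of best-fitting parameter values; the rate labels, the four-parameter vectors, and all function names below are invented for illustration and are not the project's data or code. The sketch projects the settings by PCA, generates intermediate settings along one axis, and measures how far the third setting lies from that axis.

```python
import numpy as np

# Hypothetical best-fitting parameter vectors, one per speaking rate.
# The labels and numbers are invented; they are not the project's results.
settings = {
    "fast": np.array([0.90, 0.20, 0.35, 0.10]),
    "medium": np.array([0.40, 0.80, 0.30, 0.55]),
    "slow": np.array([0.10, 0.25, 0.70, 0.60]),
}


def pca_project(points, n_components=2):
    """Centre the points and project them onto their leading principal components."""
    centred = points - points.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


def points_along_axis(start, end, n_steps=5):
    """Evenly spaced parameter settings on the straight line from start to end."""
    weights = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - weights) * start + weights * end


def distance_from_line(point, start, end):
    """Shortest distance from point to the straight line through start and end."""
    direction = (end - start) / np.linalg.norm(end - start)
    offset = point - start
    return np.linalg.norm(offset - (offset @ direction) * direction)


matrix = np.stack(list(settings.values()))
projected = pca_project(matrix)  # one 2-D coordinate per speaking rate
axis_points = points_along_axis(settings["fast"], settings["slow"])
deviation = distance_from_line(
    settings["medium"], settings["fast"], settings["slow"]
)
print(projected)
print(deviation)
```

In this toy setting, a deviation near zero would mean that the intermediate rate sits on the straight line between the two extremes, as continuous modulation of a single gait would predict. A clearly positive deviation corresponds to settings that are not linearly related.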

The project crosses discipline boundaries by combining careful phonetic analysis and psycholinguistic experimentation with computational simulation. By examining variation in speech (in this case speaking rate variation) the project brings typically abstract psycholinguistic models a step closer to describing the speaking process anchored in its natural context.

Progress 2018

The phonetic analysis underlying various aspects of the project was validated and tested. A report of this work was submitted to JASA and will be published in February 2019. This validation will support the publishability of the other components of the project.
In a parameter learning paradigm, simulations of a computationally implemented, chronometric variant of the Dell, Burger and Svec (1997) connectionist model of serial order revealed that speakers adopt qualitatively different configurations to achieve different speaking rates. Further modelling work has reinforced this conclusion. Models in which the parameter optimization algorithm was free to find qualitatively different values for each speaking rate outperformed otherwise similar models in which the optimizer was constrained to solutions where the parameter values for the different speaking rates were linearly related.
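The logic of comparing a free optimizer with a linearly constrained one can be sketched on a toy problem. The sketch below does not use the Dell, Burger and Svec model: a random linear map stands in for the production model, and the rate labels, rate positions, parameter values, and function names are all assumptions made for illustration. The free fit estimates a separate parameter vector for each rate, whereas the constrained fit forces the three vectors onto one straight line.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the production model: observables = design @ parameters.
# This is NOT the project's connectionist model; it only illustrates the comparison.
design = rng.normal(size=(6, 3))  # 6 observables, 3 parameters

# Assumed positions of the rates on a single linear axis (hypothetical).
rate_position = {"fast": 0.0, "medium": 0.5, "slow": 1.0}

# Invented generating parameters; "medium" is deliberately off the fast-slow line.
true_params = {
    "fast": np.array([1.0, 0.2, 0.5]),
    "medium": np.array([0.3, 0.9, 0.4]),
    "slow": np.array([0.1, 0.3, 1.2]),
}
observed = {
    rate: design @ params + rng.normal(scale=0.01, size=6)
    for rate, params in true_params.items()
}


def sse_free():
    """Fit a separate parameter vector to each rate and sum the squared errors."""
    total = 0.0
    for y in observed.values():
        theta, *_ = np.linalg.lstsq(design, y, rcond=None)
        total += float(np.sum((y - design @ theta) ** 2))
    return total


def sse_linear():
    """Fit theta_rate = intercept + position * slope jointly across all rates."""
    blocks = [np.hstack([design, pos * design]) for pos in rate_position.values()]
    stacked_x = np.vstack(blocks)
    stacked_y = np.concatenate([observed[rate] for rate in rate_position])
    coef, *_ = np.linalg.lstsq(stacked_x, stacked_y, rcond=None)
    return float(np.sum((stacked_y - stacked_x @ coef) ** 2))


free_error = sse_free()
linear_error = sse_linear()
print(free_error, linear_error)
```

Because the constrained model is nested within the free one, its error can never be lower. The informative question is how much worse it is, and a real comparison would also need to account for the free model's additional parameters.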

Groundbreaking characteristics

This interdisciplinary project combines phonetics, psycholinguistics, and computational modelling. It is innovative in aiming to build and test the first psycholinguistic model of speech production that explicitly implements speech rate variation. By elucidating how speakers control fluctuations in speech rate, it will contribute to LiI's overarching aim of explaining variability and universality at different levels of language processing.