PhD Project 19

Uncovering the mechanisms of language switching

PhD-candidate: Chara Tsoukala
PIs: Stefan Frank (Tenure Track) and Antal van den Bosch (WP7)
Start date: 01 September 2015

(last update 2019-06-27)

Research Content

Multilingual environments are becoming more prevalent and it is increasingly common to acquire at least one foreign language. Yet, much language research (and in particular formal modelling) takes the monolingual perspective, ignoring the possible presence of additional languages. Explicitly modelling the effect of multilingualism on sentence production provides a more realistic insight into the human processing system.
People who speak multiple languages are able to switch from one to the other, and often do so unconsciously. Although much is known about the factors that facilitate or inhibit language switching in both comprehension and production, the underlying mechanisms are not well understood. The goal of this project is to develop a computational cognitive model that captures the mechanisms and representations underlying language switching and thereby simulates multilingual behaviour. The final result is a thoroughly validated and formal model of the language-switching process, which will greatly increase understanding of how multiple language systems interact in a single speaker.


Simulating auxiliary phrase asymmetry in code-switched Spanish-English
Team members: Tsoukala, Kroff (UoF), Frank, Broersma (CLS), and Van den Bosch

Spanish-English bilinguals are likely to code-switch in auxiliary phrases after the auxiliary “estar” (“to be”) but not after “haber” (“to have”). This asymmetry remains to be explained, but one hypothesis is that it arises because “haber” is used exclusively as an auxiliary. This hypothesis was tested using the dual-path model of bilingual sentence production.

The model showed a stronger preference for switches after “estar” than after “haber”, mirroring the human data (Fig. left). When “haber” was replaced with “tener” (the Spanish non-auxiliary verb “to have”), the model no longer showed a preference for a participle code-switch after the auxiliary verb “estar” (Fig. right). This supports the hypothesis that the code-switching asymmetry is caused by the exclusively auxiliary use of “haber”. These simulations demonstrate the importance of computational modelling, because the same manipulation (i.e., changing verb use in Spanish) is impossible to apply to human participants.

Figure: Percentage of Spanish-to-English code-switches after “estar” (green) and after “haber”/“tener” (purple) as a function of the amount of network training on Spanish and English sentences. Left: “haber” is used only as an auxiliary, as in real Spanish. Right: “haber” is replaced as the auxiliary by “tener”, which is also a regular verb (unlike in real Spanish).

This research highlight arises from an interdisciplinary combination of linguistic analysis, psycholinguistic experiments, and computational modelling. It applies an innovative modelling methodology in which a linguistic hypothesis that is not experimentally testable (“the phenomenon is caused by property X of the language”) is tested in a simulation: Varying only property X of the language makes the simulated effect disappear.
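As a rough illustration of the quantity plotted in the figure, the sketch below computes the percentage of auxiliary occurrences that are immediately followed by a word in the other language. It is not the project's analysis code: the function name `switch_rate_after`, the (word, language tag) token format, and the toy sentences are all assumptions made for this example, and the toy data are not model output.

```python
from typing import List, Set, Tuple

# Hypothetical token format: (word, language tag), e.g. ("está", "es").
Token = Tuple[str, str]


def switch_rate_after(sentences: List[List[Token]], auxiliaries: Set[str]) -> float:
    """Percentage of auxiliary occurrences directly followed by a token
    whose language tag differs from that of the auxiliary."""
    total = 0
    switched = 0
    for sentence in sentences:
        # Walk over adjacent token pairs: (current token, next token).
        for (word, lang), (_, next_lang) in zip(sentence, sentence[1:]):
            if word in auxiliaries:
                total += 1
                if next_lang != lang:
                    switched += 1
    # Avoid division by zero when the auxiliary never occurs.
    return 100.0 * switched / total if total else 0.0


# Toy sentences invented for illustration only (not simulation results).
toy_sentences = [
    [("la", "es"), ("niña", "es"), ("está", "es"), ("running", "en")],
    [("el", "es"), ("niño", "es"), ("está", "es"), ("corriendo", "es")],
    [("la", "es"), ("niña", "es"), ("ha", "es"), ("corrido", "es")],
    [("el", "es"), ("niño", "es"), ("ha", "es"), ("comido", "es")],
]

estar_rate = switch_rate_after(toy_sentences, {"está"})
haber_rate = switch_rate_after(toy_sentences, {"ha"})
```

In a simulation setting, a measure of this kind would be computed separately for each auxiliary and each amount of training, and compared between the language variant with “haber” and the variant with “tener”.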

Progress 2018

The implemented model of bilingual sentence production was investigated for its ability to produce code-switched sentences. Surprisingly, the model code-switched spontaneously, even though it had never experienced (i.e., was never trained on) code-switched sentences. Moreover, the general patterns of code-switches match findings from human bilingual corpora. The model also accounts for the difference in the types of code-switches produced by early versus late bilinguals: it produced more, and more complex, code-switches when simulating early bilinguals than when simulating late bilinguals.
During a three-month research visit to the University of Florida, Tsoukala collaborated with Kroff on a study regarding an interesting and unexplained phenomenon in Spanish-English code-switching. The code-switching results were presented at several conferences and an invited research talk. A paper is in preparation for the workshop Cognitive Modelling and Computational Linguistics.

Groundbreaking characteristics

This work would not have been possible without collaboration between experimental psycholinguists, computational linguists, and specialists in the field of multilingualism (in particular code-switching). The research is also interdisciplinary and innovative with respect to research methods, which combine computational modelling technologies with collecting behavioural data from multilingual speakers. The issue of bilingual sentence production has never before been approached with computational modelling.