The nature of the mental lexicon: How to bridge neurobiology and psycholinguistic theory by computational modelling?
This Big Question addresses how to use computational modelling to link levels of description, from neurons to cognition and behaviour, in understanding the language system. The focus is on the mental lexicon, and the aim is to characterize its structure in a way that is precise and meaningful in neurobiological and (psycho)linguistic terms. The overarching goal is to devise causal/explanatory models of the mental lexicon that can explain neural and behavioural data. This will significantly deepen our understanding of the neural, cognitive, and functional properties of the mental lexicon, lexical access, and lexical acquisition.
The BQ1 team takes advantage of recent progress in modelling realistic neural networks, improvements in neuroimaging techniques and data analysis, and developments in accounting for the semantic, syntactic and phonological properties of words and other items stored in the mental lexicon. Using one common notation (high-dimensional numerical vectors), neurobiological and computational (psycho)linguistic models of the mental lexicon are integrated, and methods are developed for comparing model predictions to large-scale neuroimaging data.
BQ1 thus comprises three main research strands, focusing respectively on models of lexical representation, models of neural processing, and methods for bridging between model predictions and neural data. The work takes into account that lexical items rarely occur in isolation but form parts of (and are interpreted in the context of) sentences and discourse. Moreover, the BQ1 team refrains from prior assumptions about what the lexical items are: they need not be equivalent to words but may be smaller or larger units.
Thus, the Big Question is tackled from three directions:
(i) by investigating which vector representations of items in the mental lexicon are appropriate to encode their linguistically salient (semantic, combinatorial, and phonological) properties;
(ii) by developing neural processing models of access to, and development of, the mental lexicon; and
(iii) by designing novel evaluation methods and accessing appropriate data for linking the models to neuroimaging and behavioural data.
The BQ1 endeavour is inherently interdisciplinary in that it applies computational research methods to explain neural, behavioural, and linguistic empirical phenomena. One of its main innovative aspects consists of bringing together neurobiology, psycholinguistics, and linguistic theory (roughly corresponding to different levels of description of the language system) using a single mathematical formalism; a feat that requires extensive interdisciplinary team collaboration. Thus, BQ1 integrates questions of a Linguistic, Psychological, Neuroscientific, and Data-analytic nature.
Dr. Stefan Frank
Tenure track researcher
Dr. Jelle Zuidema
Tenure track researcher
Prof. dr. Rens Bod
Prof. dr. Mirjam Ernestus
Dr. Raquel Fernández
Prof. dr. Peter Hagoort
PI / Coordinator BQ2
Research Highlights (2020)
Quantifying Attention Flow in Transformers (understanding what’s happening inside state-of-the-art models)
Team members: Samira Abnar and Jelle Zuidema
Transformers are the state-of-the-art technology in Natural Language Processing, and also the backbone of the models that currently give the best predictions of brain activity associated with language processing, as measured through ElectroEncephaloGraphy (EEG), Magnetic Resonance Imaging (MRI) or ElectroCorticoGraphy (ECoG). But how do we interpret the internal representations of Transformers? Abnar & Zuidema present two techniques to better analyze and visualize the so-called ‘attention network’ inside these models.
A Transformer model (in our case GPT2 style, with 24 layers) trained on a large amount of text can predict which word to expect at a masked location in a sentence. In the example, the model strongly predicts “his” at the masked position in the sentence “The author talked to Sara about MASK book” (Figure 1, leftmost panel of (a)), presumably because of an expected anaphoric relation with “author” and an unfortunate gender bias. Visualizing raw attention scores (second panel), as was standard in NLP before our paper was published, does not reveal the fact that the model views “author” rather than “Sara” as the likely antecedent.
Our new methods, Attention Rollout (Figure 1, third panel) and Attention Flow (Figure 1, fourth panel), do this successfully, in this example as well as many others. The paper also discusses limitations, including cases (Figure 1 (b)) where the two methods disagree.
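As a rough illustration of the idea behind Attention Rollout, the sketch below follows the commonly described formulation: each layer's attention matrix is mixed with the identity matrix to account for residual connections, renormalized, and then multiplied through the layers. It is not the authors' code. The function name, the equal 0.5/0.5 weighting, and the use of head-averaged attention matrices as input are assumptions made for this example.

```python
import numpy as np


def attention_rollout(layer_attentions):
    """Propagate attention through layers (illustrative sketch).

    layer_attentions: list of (n_tokens, n_tokens) arrays, ordered from the
    first to the last layer. Each array is assumed to be averaged over heads,
    with rows summing to 1 (row i = how much token i attends to each token).
    """
    n_tokens = layer_attentions[0].shape[0]
    identity = np.eye(n_tokens)
    rollout = identity.copy()
    for attention in layer_attentions:
        # Assumption: the residual connection is modelled by mixing the
        # attention matrix with the identity using equal weights.
        mixed = 0.5 * attention + 0.5 * identity
        # Renormalize so each row is again a distribution over tokens.
        mixed = mixed / mixed.sum(axis=-1, keepdims=True)
        # Compose with the attention accumulated over the lower layers.
        rollout = mixed @ rollout
    return rollout


# Hypothetical usage with random attention matrices for 3 layers, 4 tokens.
rng = np.random.default_rng(0)
raw = rng.random((3, 4, 4))
example_layers = [m / m.sum(axis=-1, keepdims=True) for m in raw]
example_rollout = attention_rollout(example_layers)
```

Row i of the result can be read as an estimate of how much each input token contributes to the representation of token i at the top layer. Attention Flow, which treats the attention graph as a flow network, is not covered by this sketch.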
The paper brings together insights from various branches of computer science, Artificial Intelligence (AI) and cognitive neuroscience, to propose two simple but useful algorithms for interpreting deep learning models in Natural Language Processing (NLP). These models are state-of-the-art for predicting brain imaging data; by making them more interpretable, this work helps us get closer to understanding the neurobiological basis of language processing.
This work builds on much earlier work in Big Question 1, showing the promise of Transformer models (Merkx & Frank; Abnar, Beinborn & Zuidema) and studying ways of interpreting deep learning models (Hupkes & Zuidema).
Towards naturalistic speech decoding from brain data
Team members: Julia Berezutskaya, Nick Ramsey, and Marcel van Gerven
Decoding speech from brain activity can enable the development of brain-computer interfaces (BCIs) that restore naturalistic communication in paralyzed patients. In this project we propose and validate a novel speech decoding scheme that relies on a Generative Adversarial Neural Network (GAN) to generate speech from neural activity. We first trained a GAN on a publicly available speech dataset. Then, using intracranial neural data recorded during a speech listening task, we trained a decoder network to predict latent vectors, which served as input to the GAN generator. The generator reconstructed speech spectrograms, which were synthesized into speech using an external vocoder. We compared the resulting sound reconstructions with several simpler speech decoding baselines. The GAN-based model (GAN-Z) achieved the best decoding accuracy in terms of recovering high-level sound properties and perceptual quality of sound (see Table 1 and Figure 2), in contrast to the baseline models (Vanilla and GAN-D), which were trained to decode speech spectrograms directly.
These results demonstrate the potential of GAN-based models to advance the BCI field and make continuous speech decoding from the brain in naturalistic noisy environments more plausible.
The present study is among the first attempts to leverage advances in automatic sound generation with GANs for reconstructing naturalistic continuous speech from brain recordings.
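The two-stage decoding scheme described above (a decoder that maps neural activity to latent vectors, followed by a fixed generator that turns latent vectors into spectrograms) can be illustrated with a toy sketch on synthetic data. Everything here is a stand-in chosen for the example: the "generator" is a fixed random nonlinear map rather than a trained GAN, the decoder is a ridge regression rather than a neural network, the target latent vectors are assumed to be known during training, and all sizes and names are hypothetical. No vocoder step is included.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for the toy problem.
n_samples, n_electrodes, latent_dim, n_freq_bins = 200, 16, 4, 8

# Stand-in for a pretrained, frozen generator: latent vector -> spectrogram frame.
generator_weights = rng.normal(size=(latent_dim, n_freq_bins))


def generator(latents):
    """Map latent vectors to toy spectrogram frames (not a real GAN)."""
    return np.tanh(latents @ generator_weights)


# Synthetic data: neural features are a noisy linear mixture of the latents.
true_latents = rng.normal(size=(n_samples, latent_dim))
mixing = rng.normal(size=(latent_dim, n_electrodes))
neural = true_latents @ mixing + 0.1 * rng.normal(size=(n_samples, n_electrodes))


def fit_ridge(features, targets, alpha=1.0):
    """Closed-form ridge regression; stands in for the decoder network."""
    gram = features.T @ features + alpha * np.eye(features.shape[1])
    return np.linalg.solve(gram, features.T @ targets)


train, held_out = slice(0, 150), slice(150, 200)

# Stage 1: learn to predict latent vectors from neural activity.
decoder_weights = fit_ridge(neural[train], true_latents[train])

# Stage 2: feed predicted latents to the frozen generator.
predicted_latents = neural[held_out] @ decoder_weights
reconstructed = generator(predicted_latents)
reference = generator(true_latents[held_out])

# Simple reconstruction score: correlation between flattened spectrograms.
score = np.corrcoef(reconstructed.ravel(), reference.ravel())[0, 1]
```

The point of the sketch is the division of labour: the decoder only has to predict a low-dimensional latent vector, and the generator supplies the structure of the output. A direct baseline would instead regress from the neural features straight to the spectrogram bins.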