Big Question 1

(last update 2019-06-18)

The nature of the mental lexicon: How to bridge neurobiology and psycholinguistic theory by computational modelling?

This Big Question addresses how to use computational modelling to link levels of description, from neurons to cognition and behaviour, in understanding the language system. The focus is on the mental lexicon, and the aim is to characterize its structure in a way that is precise and meaningful in neurobiological and (psycho)linguistic terms. The overarching goal is to devise causal/explanatory models of the mental lexicon that can explain neural and behavioural data. This will significantly deepen our understanding of the neural, cognitive, and functional properties of the mental lexicon, lexical access, and lexical acquisition.

The BQ1 team takes advantage of recent progress in modelling realistic neural networks, improvements in neuroimaging techniques and data analysis, and developments in accounting for the semantic, syntactic, and phonological properties of words and other items stored in the mental lexicon. Using one common notation (high-dimensional numerical vectors), neurobiological and computational (psycho)linguistic models of the mental lexicon are integrated, and methods are developed for comparing model predictions to large-scale neuroimaging data.
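
To illustrate the kind of bridge this common notation affords, the sketch below shows one standard way of comparing vector-based model predictions to neuroimaging data, representational similarity analysis. The data, dimensions, and variable names are purely illustrative assumptions, not BQ1 code or results.

    # Minimal illustration: compare the similarity structure of model word
    # vectors with the similarity structure of per-word brain responses.
    # All data below are random stand-ins.
    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    n_words, model_dim, n_voxels = 50, 300, 2000
    model_vectors = rng.standard_normal((n_words, model_dim))   # e.g. distributional semantics
    brain_patterns = rng.standard_normal((n_words, n_voxels))   # e.g. per-word fMRI patterns

    # Representational dissimilarity matrices (condensed upper triangles).
    model_rdm = pdist(model_vectors, metric="cosine")
    brain_rdm = pdist(brain_patterns, metric="correlation")

    # Second-order comparison: does the model's representational geometry
    # predict the geometry of the neural responses?
    rho, p = spearmanr(model_rdm, brain_rdm)
    print(f"model-brain RSA correlation: rho={rho:.3f} (p={p:.3g})")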

BQ1 thus comprises three main research strands, focusing respectively on models of lexical representation, models of neural processing, and methods for bridging between model predictions and neural data. It is taken into account that lexical items rarely occur in isolation but form part of (and are interpreted in the context of) sentences and discourse. Moreover, the BQ1 team refrains from prior assumptions about what the lexical items are; that is, lexical items need not be equivalent to words but may be smaller or larger units.

Thus, the Big Question is tackled from three directions:
(i) by investigating which vector representations of items in the mental lexicon are appropriate to encode their linguistically salient (semantic, combinatorial, and phonological) properties;
(ii) by developing neural processing models of access to, and development of, the mental lexicon; and
(iii) by designing novel evaluation methods and accessing appropriate data for linking the models to neuroimaging and behavioural data.

The BQ1 endeavour is inherently interdisciplinary in that it applies computational research methods to explain neural, behavioural, and linguistic empirical phenomena. One of its main innovative aspects is bringing together neurobiology, psycholinguistics, and linguistic theory (roughly corresponding to different levels of description of the language system) using a single mathematical formalism; a feat that requires extensive interdisciplinary team collaboration. Thus, BQ1 integrates questions of a linguistic, psychological, neuroscientific, and data-analytic nature.

Highlights

Highlight 1: Neuronal memory for language processing

A neurobiological read-write memory for sentence processing

Team members: Fitz, Uhlmann, Van den Broek (MPI), Duarte (Research Centre Jülich), Hagoort, and Petersson

The neurobiological basis of short-term memory for language comprehension was investigated using biological networks of spiking neurons. Specifically, the hypothesis was tested that memory on short timescales might be rooted in neuronal adaptation (intrinsic plasticity) rather than in excitatory synaptic feedback or changes in functional connectivity.

Network simulations on a stream of language input show that neuronal processing memory with suitable time constants enables context-dependent sentence interpretation and the resolution of binding relations between words. The approach grounds central concepts such as memory span, trace decay, and interference in neurobiological terms. An account of neural computation and memory is proposed in which action potentials write information into slower dynamic variables that are coupled to the cell membrane. From these memory registers (storage), past input is continuously retrieved back into the active network state (computation). This contrasts with traditional accounts of short-term memory in which information is maintained in persistent spiking activity. The paper is available on bioRxiv and is currently under review.
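
The sketch below illustrates the read-write idea in its simplest form: a leaky integrate-and-fire neuron whose spikes write into a slow adaptation variable that decays over seconds and is read back into the spiking dynamics as an adaptive threshold. It is a minimal illustration with assumed parameter values, not the published model.

    # Minimal sketch: spikes "write" into a slow variable a; a is "read"
    # back as an adaptive threshold, so past input shapes current processing
    # long after the fast membrane dynamics have forgotten it.
    import numpy as np

    dt = 1.0                        # ms
    tau_v, tau_a = 20.0, 2000.0     # fast membrane vs. slow adaptation time constants
    v_th, v_reset, b = 1.0, 0.0, 0.1

    T = 4000
    drive = np.zeros(T)
    drive[:500] = 1.5               # input present only during the first 500 ms

    v = a = 0.0
    n_spikes = 0
    a_trace = np.empty(T)
    for t in range(T):
        v += dt / tau_v * (-v + drive[t])
        if v > v_th + a:            # read: the slow variable shapes current processing
            n_spikes += 1
            v = v_reset
            a += b                  # write: each spike increments the slow variable
        a -= dt / tau_a * a         # slow decay -> processing memory over seconds
        a_trace[t] = a

    print(f"{n_spikes} spikes during stimulation; "
          f"adaptation trace at t=3.5 s is still {a_trace[3500]:.3f}")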

This work is highly interdisciplinary since it combines methods from computational neuroscience with language research, neurobiology, and computability theory. The results challenge widely held views in both neuroscience and language modelling about the infrastructure of processing memory.

The project involved collaboration between researchers from very diverse areas of expertise. This causal modelling approach is unique in language research and, given the complexity of issues across levels of description, is only feasible within a collaborative team of researchers.

Highlight 2: Visualisation and ‘diagnostic classifiers’ reveal how recurrent and recursive neural networks process hierarchical structure

Performance of different model types on the arithmetic task as a function of formula length

Team members: Hupkes, Veldhoen (ILLC), and Zuidema

This project investigates how neural networks can learn and process languages with hierarchical, compositional semantics. To this end, the artificial task of processing nested arithmetic expressions was defined, and it was studied whether different types of neural networks can learn to compute their meaning. An approach (diagnostic classification) was developed in which multiple hypotheses about the information encoded and processed by the network were formulated and tested.
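
For illustration, the sketch below generates nested arithmetic expressions together with their meanings (integer outcomes); the exact expression format and number range are assumptions made for the example, not necessarily those of the published task.

    # Minimal sketch of the nested arithmetic task: expressions such as
    # "( 5 - ( 2 + 3 ) )" paired with their value, so a network can be
    # trained and tested on compositional interpretation.
    import random

    def make_expression(depth, rng=random):
        """Return (tokens, value) for a nested arithmetic expression."""
        if depth == 0:
            n = rng.randint(-10, 10)
            return [str(n)], n
        left, lv = make_expression(depth - 1, rng)
        right, rv = make_expression(depth - 1, rng)
        if rng.random() < 0.5:
            return ["("] + left + ["+"] + right + [")"], lv + rv
        return ["("] + left + ["-"] + right + [")"], lv - rv

    tokens, value = make_expression(depth=3)
    print(" ".join(tokens), "=", value)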

It was found that recursive neural networks can implement a generalising solution to arithmetic problems. As a next step, recurrent neural networks were investigated, and it was shown that a gated recurrent unit, which processes its input incrementally, also performs very well on this task. The diagnostic classification results indicate that the networks follow a cumulative strategy, which explains the network's high accuracy on novel expressions, its generalisation to expressions longer than those seen in training, and the mild deterioration with increasing length. This in turn shows that diagnostic classifiers can be a useful technique for opening up the black box of neural networks.
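
A diagnostic classifier in its simplest form is sketched below: a linear model is fitted on the network's hidden states to predict a hypothesised intermediate variable, here the running (cumulative) result of the expression, and its held-out accuracy is taken as evidence for or against that hypothesis. The hidden states and hyperparameters are stand-ins, not the published setup.

    # Minimal sketch of diagnostic classification (data are random stand-ins).
    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Stand-ins for the per-timestep hidden states of a trained GRU and for
    # the value the cumulative-strategy hypothesis says they should encode.
    hidden_states = rng.standard_normal((5000, 128))       # (timesteps, hidden_dim)
    cumulative_result = rng.integers(-20, 20, size=5000)   # hypothesised variable

    X_tr, X_te, y_tr, y_te = train_test_split(
        hidden_states, cumulative_result, test_size=0.2, random_state=0)
    diagnostic = Ridge(alpha=1.0).fit(X_tr, y_tr)
    print("diagnostic R^2 on held-out steps:", diagnostic.score(X_te, y_te))
    # A high score supports the hypothesis that the network tracks the
    # cumulative result; a low score counts against it.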

Diagnostic classification, unlike most visualisation techniques, does scale up from small networks in a toy domain to larger and deeper recurrent networks dealing with real-life data, and may therefore contribute to a better understanding of the internal dynamics of current state-of-the-art models in natural language processing. This will be a very valuable tool for analysing the neural models of language that are being developed in BQ1.

Highlight 3: Cortical information flow for system identification in neuroscience

Features of V1 extracted using the Cortical Information Flow model

Team members: Ambrogioni and Van Gerven

Cortical information flow (CIF) is a new framework for system identification in neuroscience. CIF models represent neural systems as coupled brain regions that each embody neural computations, and each region is coupled to the observed data specific to that region. The neural computations are estimated via stochastic gradient descent. Using a large-scale fMRI dataset, it was shown that, in this manner, models can be estimated that learn meaningful neural computations. The framework is general in the sense that it can be used in conjunction with any (combination of) neural recording techniques. It is also scalable, providing neuroscientists with a principled approach to making sense of high-dimensional neural datasets.
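
The sketch below illustrates the general shape of such a model under assumed architectural choices (a feed-forward chain of regions, linear readouts, mean-squared-error losses): each region transforms the activity it receives and is simultaneously coupled to its own observed signal, and all parameters are fitted jointly by stochastic gradient descent. It is an illustrative sketch, not the published CIF implementation.

    # Minimal sketch: coupled "region" modules pass activity downstream while
    # each region is also read out to its own (stand-in) recorded signal.
    import torch
    import torch.nn as nn

    class Region(nn.Module):
        def __init__(self, n_in, n_units, n_obs):
            super().__init__()
            self.compute = nn.Sequential(nn.Linear(n_in, n_units), nn.ReLU())
            self.readout = nn.Linear(n_units, n_obs)   # couples region to its data

        def forward(self, x):
            h = self.compute(x)
            return h, self.readout(h)

    stimulus_dim, obs_dims = 100, [30, 20, 10]          # e.g. voxels per region
    regions = nn.ModuleList()
    n_in = stimulus_dim
    for n_obs in obs_dims:
        regions.append(Region(n_in, 64, n_obs))
        n_in = 64

    opt = torch.optim.SGD(regions.parameters(), lr=1e-2)
    stimulus = torch.randn(256, stimulus_dim)            # stand-in input
    observed = [torch.randn(256, d) for d in obs_dims]   # stand-in recordings

    for step in range(100):
        x, loss = stimulus, 0.0
        for region, y in zip(regions, observed):
            x, pred = region(x)                          # activity flows downstream
            loss = loss + ((pred - y) ** 2).mean()       # each region fits its own data
        opt.zero_grad()
        loss.backward()
        opt.step()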

Using fMRI data collected during prolonged naturalistic stimulation, it was shown that BOLD responses across different brain regions could successfully be predicted. Furthermore, meaningful receptive fields emerged after model estimation. Importantly, the learnt receptive fields are specific to each brain region but collectively explain all of the observed measurements. These results demonstrate for the first time that biologically meaningful neural information processing systems can be estimated directly from neural data. CIF allows neuroscientists to specify hypotheses about neuronal interactions and test these by quantifying how well the resulting models explain observed measurements.

The cortical information flow model has proven to be a powerful encoding model that can be used to jointly reproduce brain and behavioural data. The approach is very flexible as it learns all its parameters from the data. Consequently, this approach can be used to analyse neural data acquired using heterogeneous measurement devices and experimental settings in a common computational framework.

The project is the fruit of a close collaboration between researchers with complementary fields of expertise. This is necessary because the development of the Cortical Information Flow model requires both sophisticated mathematical expertise and deep knowledge of the computational principles of the human cortex.

Synergy with other Big Questions

  • The developed models can be evaluated against neuroimaging data that is collected in BQ2 and BQ4.
  • BQ1 PhD student Merkx explored possible collaborations with Piai (Tenure) and Camerino (PhD) on using Distributional Semantics Models to analyse patients’ word-association data.
  • The role of prior structure in the language network, which is being investigated within BQ2, is also of great importance to the neural processing models developed in BQ1, and it may be possible to integrate findings and test specific hypotheses from BQ2 (e.g., on connectivity) within the BQ1 computational models.
  • Individual differences can be captured by variance in the orthographic/phonological, morpho-syntactic, and semantic vector representations developed in BQ1 (e.g., due to differences in training data or parameter settings), which may account for findings from BQ4.
  • At a technical level, some of the modelling proposed in BQ3 can benefit from the extensive modelling expertise that exists and will further be developed in BQ1.
  • Language models have come to play an important role in one of the work packages of the BQ5 proposal, with Zuidema (Tenure) and Frank (Tenure) involved as co-investigators.

People involved

Collaborators

Dr. Renato Duarte, Forschungszentrum Jülich, Germany
Prof. dr. Abigail Morrison, Bernstein Center, Freiburg, Germany