PhD Project 22

Encoding and decoding the neural signatures of natural language comprehension


PhD-candidate: Alessandro Lopopolo
PIs: Antal van den Bosch (WP7) and Karl-Magnus Petersson (WP3)
Start date: 01 February 2016

(last update 2019-07-01)

Research Content

Recent computational advances have made it possible to reconstruct naturalistic stimuli from neural responses. This approach is now being extended to the reconstruction of auditory and linguistic features from brain activity measured while subjects listen to narratives. Work by members of the team shows (a) that neural responses can be described by characterizing the stimulus with a computational language model, and (b) that conceptual representations can be decoded from brain activity. The current project joins and extends these studies, paving the way for the development of brain-computer interfaces driven by internal speech, and leading to a fuller understanding of the brain basis of language comprehension under naturalistic conditions.


Using Stochastic Language Models to map lexical, syntactic, and phonological information processing in the brain
Team members: Lopopolo, Frank, Van den Bosch, and Willems

This project combines computational linguistics and the neurobiology of language. The work aims to identify a subnetwork within the language network of the brain that is sensitive to the statistical structure of language, computed at different levels of description of the stimulus.

Language comprehension involves the simultaneous processing of information at the phonological, syntactic, and lexical levels. These three distinct streams of information are tracked in the brain using stochastic measures derived from computational language models, in order to detect neural correlates of phoneme, part-of-speech, and word processing in an fMRI experiment. Probabilistic language models have proven to be useful tools for studying how language is processed as a sequence of symbols unfolding in time. The conditional probability of a word given the words that precede it underlies probabilistic measures such as surprisal and perplexity, which have been used successfully as predictors of several behavioural and neural correlates of sentence processing. Here perplexity is computed from sequences of words, their parts of speech, and their phonemic transcriptions. Brain activity time-locked to each word is regressed on the three model-derived measures. It was observed that the brain keeps track of the statistical structure of lexical, syntactic, and phonological information in distinct areas.

Brain areas sensitive to lexical (green), part-of-speech (blue) and phonological (red) perplexity.
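
To make the model-derived measures concrete, the following is a minimal sketch of how surprisal and perplexity can be obtained from a probabilistic sequence model. It is an illustration only, not the project's actual pipeline: the bigram order, the add-one smoothing, the toy sentences, and the function names train_bigram, surprisal, and perplexity are all assumptions introduced here.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical sequence-start symbol


def train_bigram(sequences):
    """Collect context and bigram counts from a list of symbol sequences."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = [START] + list(seq)
        vocab.update(seq)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(seq, model):
    """Per-item surprisal in bits: -log2 P(item | previous item).

    Assumption: add-one smoothing over the training vocabulary. Items that
    never occurred in training are not handled properly by this toy model.
    """
    context_counts, bigram_counts, vocab = model
    padded = [START] + list(seq)
    values = []
    for prev, cur in zip(padded, padded[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))
        values.append(-math.log2(p))
    return values


def perplexity(seq, model):
    """Sequence perplexity: 2 raised to the mean per-item surprisal."""
    s = surprisal(seq, model)
    return 2 ** (sum(s) / len(s))


# Toy data (invented for illustration): the same sentences described at
# the word level and at the part-of-speech level.
words = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
pos = [["DET", "NOUN", "VERB"], ["DET", "NOUN", "VERB"]]

word_model = train_bigram(words)
pos_model = train_bigram(pos)

word_surprisal = surprisal(words[0], word_model)  # one value per word
word_ppl = perplexity(words[0], word_model)
pos_ppl = perplexity(pos[0], pos_model)
```

The same functions apply unchanged to sequences of phoneme symbols. Because each measure yields one value per item, the values can be aligned with word onsets and used as regressors for the time-locked brain signal.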

Progress 2018

In the last year, the analyses of the MEG data continued with the implementation of a classifier, based on a convolutional neural network, designed to extract linguistic information from brain activity. Preliminary results confirmed the feasibility of this approach at both the single-subject and the group level.
In parallel, a series of studies has been conducted on the differential effects of dependency structures and probabilistic sequential models on behavioural and neural data. The number and typology of dependency relations between the words in a sentence and their preceding sentential context seem to predict the number of eye regressions from the same words, suggesting a role for eye movements during online sentence parsing beyond reanalysis. Similar measures seem to correlate with activity in the precuneus and in the posterior cingulate cortex.
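
One possible way to operationalize the number and typology of dependency relations linking a word to its preceding context is sketched below. This is an assumption made for illustration, not the measure used in the studies: each dependency arc is attributed to the later of its two words, the parse is given as 1-based head indices with 0 for the root, and the function name relations_to_context, the example sentence, and its labels are hypothetical.

```python
from collections import Counter


def relations_to_context(heads, labels):
    """For each word, count the dependency arcs (by label) that connect it
    to a word earlier in the sentence.

    heads[i] is the 1-based index of the head of word i + 1, or 0 for the root.
    Each arc is attributed to whichever of its two words comes later, i.e.
    the point at which the relation to the preceding context can be built.
    """
    per_word = [Counter() for _ in heads]
    for dependent, (head, label) in enumerate(zip(heads, labels), start=1):
        if head == 0:  # the root has no incoming arc from another word
            continue
        later = max(dependent, head)
        per_word[later - 1][label] += 1
    return per_word


# Hypothetical parse of "the dog chased a cat"
tokens = ["the", "dog", "chased", "a", "cat"]
heads = [2, 3, 0, 5, 3]
labels = ["det", "nsubj", "root", "det", "obj"]

per_word = relations_to_context(heads, labels)
counts = [sum(c.values()) for c in per_word]  # number of relations per word
```

Here the per-word Counter objects carry the typology (which relation labels are involved), while counts gives the number of relations; either could serve as a word-level predictor of eye regressions or of brain activity.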

Groundbreaking characteristics

This project combines state-of-the-art computational linguistic techniques with advanced neuroimaging methodologies. Given the multidisciplinary nature of this endeavor, the project involves the active collaboration of researchers from different communities (computational linguistics, cognitive science, and computational neuroscience). Coupling computational features with patterns of brain activity will make it possible to probe where and when linguistic information is encoded and processed in the human brain. This will provide a novel evaluation, from a computational point of view and using naturalistic stimuli, of different approaches to language representation and processing.