PhD Project 26

Brains, machines & language: Using machine learning to learn about language processing in the brain


PIs: Jelle Zuidema (Tenure Track), Marcel van Gerven (BQ1), and R. Fernandez (BQ1)
PhD-candidate: Samira Abnar
Start date: April 01, 2017

(last update 2019-07-01)

Research Content

The key question in this PhD project is how vector representations should be adapted to make them encode the combinatorial properties of words. What should the vector representation be for function words (prepositions, determiners, pronouns, quantifiers, conjunctions)? How can vectors encode that words occur in certain contexts but not in others? How can vector representations be used to encode the difference between transitive and intransitive verbs? To answer these questions, the project takes inspiration from syntactic and semantic theory and from computational (psycho)linguistics, and uses modern machine learning techniques to discover vector representations and composition functions that encode the required information.
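As an illustration of what a composition function over word vectors might look like, the sketch below contrasts simple additive composition with a verb-as-matrix composition in the spirit of compositional distributional semantics. This is purely illustrative and not one of the project's models; all names, dimensions and random vectors are made up.

```python
# Illustrative sketch (not the project's actual models): two ways a
# composition function can combine word vectors. All names and
# dimensions are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
DIM = 50  # embedding dimensionality (arbitrary choice)

# Content words as plain vectors.
dog = rng.normal(size=DIM)
ball = rng.normal(size=DIM)

def compose_additive(u, v):
    """Simplest composition: element-wise addition of two word vectors."""
    return u + v

# A transitive verb can instead be modelled as a matrix that maps its
# object vector to a new vector, so the verb's representation encodes
# its combinatorial behaviour rather than just its context counts.
chases = rng.normal(size=(DIM, DIM))

def compose_transitive(verb_matrix, subj_vec, obj_vec):
    """Verb-as-matrix composition: apply the verb to its object,
    then combine with the subject additively."""
    return subj_vec + verb_matrix @ obj_vec

phrase_additive = compose_additive(dog, ball)
phrase_transitive = compose_transitive(chases, dog, ball)
print(phrase_additive.shape, phrase_transitive.shape)  # (50,) (50,)
```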

Highlight

Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity
Team members: Abnar and Zuidema

In this project we studied how well a range of methods for creating word vectors (a.k.a. word embeddings, or distributional semantic representations of words) allows us to predict and decode brain activity, using the Mitchell et al. (2008) dataset.
The project brings together expertise from corpus linguistics, machine learning and neuroimaging, to address a key question of BQ1: bridging behavioural and neural levels of description of word meaning.
The approach uses semantic regularities (modelled with the best-performing word embedding model, dependency-based Word2Vec) together with brain images obtained for a subset of the words to make predictions about the remaining words. For example, knowing the brain activation associated with hand, arm and foot, can we predict the activation associated with leg, given a semantic model that relates hand, arm, foot and leg?
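A minimal sketch of this kind of prediction setup, in the spirit of Mitchell et al. (2008), is given below: a linear (ridge) map from word embeddings to voxel activations, evaluated with a leave-two-out ("2 vs. 2") test. The toy data and variable names are placeholders, and the actual pipeline differs in preprocessing, voxel selection and model details.

```python
# Minimal sketch of embedding-to-brain prediction with leave-two-out
# evaluation. The random toy data stands in for real embeddings and fMRI
# activations; expect chance-level (~0.5) accuracy on it.
import itertools
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(1)
N_WORDS, EMB_DIM, N_VOXELS = 60, 300, 500
embeddings = rng.normal(size=(N_WORDS, EMB_DIM))  # word vectors
brain = rng.normal(size=(N_WORDS, N_VOXELS))      # fMRI pattern per word

def two_vs_two_accuracy(X, Y):
    """For every pair of held-out words, train on the rest, predict both
    brain images, and check whether the correct pairing of predictions to
    observations scores higher (by cosine similarity) than the swapped one."""
    correct, total = 0, 0
    for i, j in itertools.combinations(range(len(X)), 2):
        train = [k for k in range(len(X)) if k not in (i, j)]
        model = Ridge(alpha=1.0).fit(X[train], Y[train])
        pred = model.predict(X[[i, j]])
        sim = cosine_similarity(pred, Y[[i, j]])
        # correct pairing: pred_i <-> obs_i and pred_j <-> obs_j
        if sim[0, 0] + sim[1, 1] > sim[0, 1] + sim[1, 0]:
            correct += 1
        total += 1
    return correct / total

print(two_vs_two_accuracy(embeddings, brain))
```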

We found that so-called experiential word embedding models have quite a different error profile from distributional and syntax-based embedding models in this prediction/decoding task.

Figure: Brain activations associated with concrete nouns (data: Mitchell et al., 2008).
Figure: Accuracies under the different word embedding models.

Progress 2018

This project made a flying start by first replicating, with some modifications, the foundational Mitchell et al. (2008) study, which showed that the neural activation associated with concrete nouns can be predicted reasonably well from a model based on co-occurrence counts of these nouns with 25 hand-picked verbs in a large corpus. The existing results can be improved by using modern word embeddings, such as GloVe and dependency-based Word2Vec, and the different word embedding methods show interesting differences in the errors they make. This provides an important baseline for follow-up studies, as well as a stepping stone for developing similar techniques for other categories of words and for words in context.
A similar pipeline was applied to cases where different types of words are presented in a more naturalistic context, e.g. a story-reading task. In this setting, we studied the relation between word embeddings obtained from different models and the brain activity patterns of people performing the story-reading task. The word embeddings used in this work are so-called 'contextualised' word embeddings, obtained from the internal states of various neural models trained with different objectives. This setup is used both for predictive modelling (where linear models are trained to transform word vectors into brain vectors, or vice versa) and for representational similarity analysis (to measure the structural similarity of the word embedding spaces with each other and with the brain representations). The models and findings are described in two manuscripts: one in preparation and the other under review for NAACL'19.
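For the representational similarity analysis step mentioned above, a minimal sketch is given below: build a representational dissimilarity matrix (RDM) for the embedding space and one for the brain data, then correlate their condensed upper triangles with Spearman's rho. The toy arrays are placeholders for real contextualised embeddings and fMRI patterns, and the distance metrics are one common choice, not necessarily the ones used in the project.

```python
# Minimal representational similarity analysis (RSA) sketch on toy data.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
N_WORDS = 40
embeddings = rng.normal(size=(N_WORDS, 768))  # e.g. contextualised vectors
brain = rng.normal(size=(N_WORDS, 500))       # voxel patterns per word

# Condensed pairwise dissimilarities (upper triangle of each RDM).
rdm_model = pdist(embeddings, metric="cosine")
rdm_brain = pdist(brain, metric="correlation")

# Rank correlation between the two dissimilarity structures.
rho, pval = spearmanr(rdm_model, rdm_brain)
print(f"RSA score (Spearman rho): {rho:.3f} (p = {pval:.3f})")
```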