PhD Project 17

Working towards a neurally plausible parser


PhD-candidate: Dieuwke Hupkes
PI: Jelle Zuidema (Tenure Track)
Start date: 01 July 2015

(last update 2019-06-27)

Research Content

The meaning of a sentence depends on the meaning of the words it is composed of, the way they are combined, and the context in which they are uttered. Computing such sentence meanings ("semantic parsing") in naturally occurring texts and dialogues is challenging, in part because of the enormous size of the required lexicon (containing at least tens of thousands of different words), and in part because of the complex rules of combination. The most successful computer models of this process to date rely on symbolic combination rules (for which possible corresponding mechanisms in the brain are not easily identified) and on elaborate, hand-crafted lexicons.


Visualisation and ‘diagnostic classifiers’ reveal how recurrent and recursive neural networks process hierarchical structure
Team members: Hupkes, Veldhoen (ILLC), and Zuidema

It was investigated how neural networks can learn and process languages with a hierarchical, compositional semantics. To this end, an artificial task, processing nested arithmetic expressions, was defined, and it was studied whether different types of neural networks can learn to compute the meaning of such expressions. An approach, diagnostic classification, was developed in which multiple hypotheses about the information encoded and processed by the network are formulated and tested.
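The toy domain can be illustrated with a short sketch. The generator below is illustrative, not the project's actual data pipeline: it recursively builds fully bracketed expressions over small integers with + and -, returning both the surface string a network would read and its ground-truth value.

```python
import random

def gen_expr(depth, rng):
    """Recursively generate a nested arithmetic expression.

    Returns a fully bracketed string such as '( 5 - ( 2 + 3 ) )'
    together with its ground-truth integer value.
    """
    if depth == 0:
        n = rng.randint(-9, 9)  # leaf: a small integer (range is illustrative)
        return str(n), n
    left, left_val = gen_expr(depth - 1, rng)
    right, right_val = gen_expr(depth - 1, rng)
    op = rng.choice(['+', '-'])
    value = left_val + right_val if op == '+' else left_val - right_val
    return f"( {left} {op} {right} )", value

rng = random.Random(0)
expr, value = gen_expr(2, rng)
print(expr, '=', value)
```

Pairs of such strings and values serve as training data; the network never sees the tree structure, only the token sequence.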

It was found that recursive neural networks can implement a generalising solution to such arithmetic problems. As a next step, recurrent neural networks were investigated, and it was shown that a gated recurrent unit, which processes its input incrementally, also performs very well on this task. The diagnostic classification results indicate that the networks follow a cumulative strategy, which explains the high accuracy of the network on novel expressions, the generalisation to expressions longer than those seen in training, and the mild deterioration of performance with increasing length. This in turn shows that diagnostic classifiers can be a useful technique for opening up the black box of neural networks.
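The cumulative strategy can be made concrete with a small symbolic reconstruction. This is a sketch of the hypothesised strategy, not the networks' actual computation: instead of first resolving the innermost brackets, the reader keeps a running total and a stack of signs, so each number can be added or subtracted the moment it is read.

```python
def cumulative_eval(tokens):
    """Evaluate a fully bracketed +/- expression strictly left to right,
    keeping a running total and a stack of signs -- the 'cumulative
    strategy' that diagnostic classifiers suggest the networks follow."""
    total = 0
    sign = 1      # sign to apply to the next number read
    stack = []    # sign context saved at each opening bracket
    for tok in tokens:
        if tok == '(':
            stack.append(sign)        # remember the context we entered with
        elif tok == ')':
            stack.pop()               # leave the bracketed subexpression
        elif tok == '+':
            sign = stack[-1]          # '+' keeps the enclosing sign
        elif tok == '-':
            sign = -stack[-1]         # '-' flips the enclosing sign
        else:
            total += sign * int(tok)  # incorporate the number immediately
    return total

print(cumulative_eval("( 5 - ( 2 + 3 ) )".split()))  # -> 0
```

Because every number is folded into the total as soon as it appears, the strategy needs only a shallow sign stack rather than a full parse tree, which is consistent with the observed generalisation to longer expressions and the mild degradation with length.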

[Figure: performance of different model types on the arithmetic task, as a function of arithmetic formula length]

Diagnostic classification, unlike most visualisation techniques, does scale up from small networks in a toy domain to larger and deeper recurrent networks dealing with real-life data, and may therefore contribute to a better understanding of the internal dynamics of current state-of-the-art models in natural language processing, making it a valuable tool for analysing neural models of language.
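At its core, a diagnostic classifier is a simple probe trained on a network's hidden states to predict a hypothesised variable; if the probe reaches high accuracy, the hypothesis that the variable is encoded (linearly decodably) in the states is supported. The sketch below uses synthetic "hidden states" rather than real network activations, and the logistic-regression probe is a minimal hand-rolled illustration of the idea, not the project's implementation.

```python
import numpy as np

def train_diagnostic_classifier(states, targets, lr=0.1, epochs=500):
    """Fit a logistic-regression probe mapping hidden states to a
    hypothesised binary feature via batch gradient descent."""
    w = np.zeros(states.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(states @ w + b)))   # predicted probabilities
        w -= lr * states.T @ (p - targets) / len(targets)
        b -= lr * np.mean(p - targets)
    return w, b

# Synthetic 'hidden states': the hypothesised feature is carried by
# dimension 0 only; the other dimensions are noise.
rng = np.random.default_rng(1)
H = rng.normal(size=(200, 10))
y = (H[:, 0] > 0).astype(float)

w, b = train_diagnostic_classifier(H, y)
acc = np.mean(((H @ w + b) > 0) == (y > 0.5))
print(f"probe accuracy: {acc:.2f}")
```

A high probe accuracy here reflects the fact that the feature is linearly present in the states; on real networks, comparing probe accuracies across hypotheses is what distinguishes candidate strategies such as the cumulative one.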

Progress 2018

Building on the work from 2016 and 2017, much further progress was made in understanding how a neural architecture may represent the hierarchical structure of language – thus situating this work firmly between the disciplines of linguistics, neurobiology and machine learning.
In the work on diagnostic classification and hierarchical structure, the focus shifted to neural language models, and it was investigated if and how these models represent richer structural linguistic information, such as number agreement and negative polarity. Furthermore, the properties of neural sequence-to-sequence models (such as those powering recent successes in machine translation) were investigated using a number of diagnostic tasks proposed in the emerging literature on compositionality in deep learning models. Finally, promising results were obtained with a variant of the 'symbolic guidance' studied earlier. In this variant, called 'attentive guidance', learning algorithms equipped with an attention mechanism (as virtually all state-of-the-art algorithms in natural language processing now are) receive an additional supervision signal on the attention weights.
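One way such a supervision signal on the attention can be realised is as an auxiliary cross-entropy term between the model's attention distributions and a gold alignment pattern, added to the ordinary task loss. The function below is a minimal illustrative sketch under that assumption; its name and numpy formulation are hypothetical, not the project's actual implementation.

```python
import numpy as np

def attentive_guidance_loss(attn, gold):
    """Cross-entropy between the decoder's attention weights (one
    distribution over input positions per output step) and a 'gold'
    alignment pattern; rows of both arrays sum to one."""
    eps = 1e-9  # numerical safety for log(0)
    return -np.mean(np.sum(gold * np.log(attn + eps), axis=-1))

# Attention that matches the one-hot gold alignment incurs (near-)zero
# extra loss; diffuse, uniform attention is penalised.
gold = np.array([[0.0, 1.0, 0.0, 0.0]])
matching = np.array([[0.0, 1.0, 0.0, 0.0]])
uniform = np.full((1, 4), 0.25)
print(attentive_guidance_loss(matching, gold))  # ~0
print(attentive_guidance_loss(uniform, gold))   # ~log(4)
```

During training this term would be weighted and added to the sequence-to-sequence loss, nudging the attention mechanism towards the intended alignments.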

Groundbreaking characteristics

In this project, new techniques will be developed that make it possible to define a semantic parser that is neurally more plausible and relies on learned neural representations for words, yet can still deal accurately and efficiently with the variation and complexity of naturally occurring utterances. To accomplish these goals, we build on developments in computational linguistics, machine learning and cognitive science, including work on 'deep' neural networks, vector symbolic architectures and reservoir computing.
The project thus combines insights from computational linguistics, neurobiology and formal semantics to create a completely new framework for semantic parsing that is adequate at both the linguistic and neural levels of description.