Big Question 3

(last update 2020-05-12)

Creating a shared cognitive space: How is language grounded in and shaped by communicative settings of interacting people?

Language is a key socio-cognitive human function predominantly used in interaction. Yet linguistics and cognitive neuroscience have largely focused on how individuals encode and decode signals according to their structural dependencies. Understanding the communicative use of language requires shifting the focus of investigation to the mechanisms used by interlocutors to share a conceptual space.

This big question considers the influence of two dimensions over multiple communicative resources (speech, gestures, gaze) and linguistic structures (from phonology to pragmatics), namely the temporal structure of communicative interactions and the functional dynamics of real-life communicative interactions.

There is deep collaboration between all BQ3 subprojects. The qualitative results of the simulation studies will be related to the empirical findings from the other subprojects; conversely, the empirical observations from the other subprojects will inspire the qualitative hypotheses to be tested. The cognitive agent-based simulation studies go beyond the empirical paradigm in the BQ3 project: they allow us to test for qualitative differences in interactive behaviour by manipulating the cognitive capacities of the agents (something that is difficult to do with human test subjects), while simultaneously leading to explicit theories of computational mechanisms.

Highlights

Highlight 1: Cognitive agent-based modelling of communication

Team members: Blokpoel, Dingemanse, Woensdregt, Kachergis (Stanford University), Bögels, Toni, and van Rooij

How can we understand each other in the face of word ambiguity and knowledge asymmetry? Using cognitive agent-based models of probabilistic pragmatic inference, we systematically vary ambiguity and asymmetry in the lexicons of interacting agents. We show that pragmatic communicators successfully deal with ambiguity by means of recursive pragmatic reasoning, and, remarkably, exploit the ambiguity of their lexicons to overcome knowledge asymmetries. Our formalization offers a principled explanation for the success of pragmatic inference in adverse conditions, even before recourse to computationally costlier contextual scaffolding and interactive feedback.

We investigated the effect of ambiguity, asymmetry and order of pragmatic inference on the agents' ability to communicate successfully. Agents produce and interpret signals based on the Rational Speech Act framework (Frank & Goodman, 2012). Simulations show that pragmatic communicators can overcome asymmetry by exploiting the ambiguity in their lexicons. Figure 1 shows the results: pragmatic agents with moderately ambiguous lexicons have a clearly increased tolerance for asymmetry. This shows that there are computationally leaner mechanisms that allow interlocutors to counter the detrimental effects of asymmetry before resorting to more costly forms of pragmatic inference and interactive repair.


Figure 1. Main simulation results. A: Mean communicative success for interacting agents with moderately ambiguous lexicons. Marked points indicate the peak performance of literal (zero-order) agents and the first point where the performance of pragmatic (first-order) agents drops below that peak; the dotted line between them is the additional amount of asymmetry pragmatic agents can tolerate by exploiting ambiguity. B: Increased asymmetry tolerance for each possible combination of lexicon ambiguity, showing that for pragmatic agents there is a 'sweet spot' where moderate ambiguity on both sides helps rather than hinders communication. Surrounding smaller panels illustrate communicative success for different combinations of lexicon ambiguity.
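The production and interpretation steps described above can be illustrated with a minimal sketch of Rational Speech Act reasoning. This is not the simulation code used in the project: the toy lexicon, the uniform prior over referents, the absence of signal costs and the function names are all assumptions made for illustration, and the sketch leaves out lexicon asymmetry between agents.

```python
# Minimal Rational Speech Act sketch (illustrative assumptions throughout).
# lexicon[s][r] is 1 if signal s can literally refer to referent r, else 0.

def normalize(row):
    """Scale a list of non-negative scores so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row] if total > 0 else [0.0 for _ in row]

def literal_listener(lexicon):
    """Zero-order listener: P(referent | signal) follows literal meaning."""
    return [normalize(row) for row in lexicon]

def pragmatic_speaker(lexicon, alpha=1.0):
    """First-order speaker: P(signal | referent) ~ L0(referent | signal) ** alpha."""
    l0 = literal_listener(lexicon)
    n_signals, n_referents = len(lexicon), len(lexicon[0])
    speaker = []
    for r in range(n_referents):
        scores = [l0[s][r] ** alpha for s in range(n_signals)]
        speaker.append(normalize(scores))
    return speaker  # indexed as [referent][signal]

def pragmatic_listener(lexicon, alpha=1.0):
    """First-order listener: P(referent | signal) ~ S1(signal | referent),
    assuming a uniform prior over referents."""
    s1 = pragmatic_speaker(lexicon, alpha)
    n_signals, n_referents = len(lexicon), len(lexicon[0])
    return [normalize([s1[r][s] for r in range(n_referents)])
            for s in range(n_signals)]

# Hypothetical ambiguous lexicon: signal 0 fits both referents,
# signal 1 fits only referent 1.
lexicon = [[1, 1],
           [0, 1]]

l0 = literal_listener(lexicon)
l1 = pragmatic_listener(lexicon)
print("literal listener, signal 0:", l0[0])
print("pragmatic listener, signal 0:", l1[0])
```

In this toy case the literal listener is undecided about the ambiguous signal, whereas the pragmatic listener favours referent 0, reasoning that a speaker who meant referent 1 would more likely have used the unambiguous signal.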

This framework guides the integration of intuitive theories from the subprojects in BQ3 into a unified, formal theoretical framework, which is instrumental to BQ3’s interdisciplinary goal. Moreover, the project is innovative on multiple fronts: a novel simulation methodology based on interacting agents, accessibility, and open science (tutorials are available online).

The project and its team members have proven to be highly successful in translating difficult computational notions to non-expert collaborators. Through focus sessions, it has been the foundation of BQ3 internal collaboration, giving the team members a common language to speak. The full paper is available online as a preprint.

Highlight 2: Multimodal and pragmatic alignment in dialogue

Team members: Rasenberg, Özyürek, and Dingemanse

We aim to understand how people reach mutual understanding, for example when talking about novel objects or abstract ideas. Previous work has shown that people often repeat each other’s words and gestures, something referred to as “alignment”. However, little is known about exactly when and why people align with each other. Furthermore, we do not yet have a complete understanding of the relationship between different types of alignment (e.g., alignment of words vs. alignment of gestures). Our research sheds new light on these questions and enables us to unify different theoretical perspectives.

An initial step has been to review the diverging theoretical interpretations and empirical operationalizations of lexical and gestural alignment. To capture the multidimensional nature of the phenomenon of ‘alignment’, we identified five key dimensions to formalize the relationship between any pair of behaviors: sequence, time, semantics, form and modality.

The integrative framework proposed here (in Figure 2) draws upon work from a range of disciplines, and introduces a novel perspective on alignment by considering it from a multidimensional as well as a multimodal perspective. It also functions as a stepping stone for future empirical and theoretical work within BQ3, and contributes to the overall goal of Language in Interaction to understand the dynamics of language in social interaction.

Figure 2. Conceptual framework for understanding and investigating alignment.

The review is a direct result of BQ3’s endeavor to investigate various kinds of alignment (both in terms of behaviour as well as conceptual representations), which highlighted the need for an integrative framework. The combination of expertise within BQ3 – from psycholinguistics to gesture studies and from joint action to computational cognitive science – consequently shaped the development of a framework that is both applicable and relevant across a wide range of disciplines.

Highlight 3: Alignment of pitch and articulation rate

Team members: Eijk, Ernestus, and Schriefers

Previous studies have shown that speakers align their speech with each other at multiple linguistic levels. Conflicting results were reported about whether speakers adapt their speech to the directly preceding stretch of speech or to the general speech characteristics of their interlocutor. This study investigates whether alignment is mostly the result of priming from the immediately preceding speech materials, focussing on pitch and articulation rate (AR).

Native Dutch speakers completed sentences, first by themselves (pre-test), then in alternation with Confederate 1 (Round 1), with Confederate 2 (Round 2), with Confederate 1 again (Round 3), and lastly by themselves again (post-test). Results indicate that participants aligned to the confederates and that this alignment persisted into the post-test, as visible in Figures 3A and 3B.

Furthermore, the confederates’ directly preceding sentences were not good predictors of the participants’ pitch and AR. These results contribute to the main question of Big Question 3 by showing that, at the phonetic level, alignment seems to be a global effect rather than a local priming effect.

Figure 3A. Participants’ median F0 over pre-test, Rounds 1, 2 and 3 and post-test; lines were fitted using lm. Points represent Confederates’ means.

Figure 3B. Participants’ AR over pre-test, Rounds 1, 2 and 3 and post-test; lines were fitted using lm. Points represent Confederates’ means.

Highlight 4: Creating shared (neural) representations

Team members: Bögels, Milivojevic, Arnese, and Toni

Empirical paradigm: The empirical part of BQ3 (see Figure 4A for an overview) is a study of about 70 pairs of participants engaged in face-to-face communicative interactions. Each pair needs to find their own way of uniquely identifying novel objects (“Fribbles”, see Barry et al., 2014). Through repeated interactions about each Fribble, pair-specific labels emerge. We consider several elements of the face-to-face interactions in each pair: for example, speech is transcribed, and co-speech gestures and pragmatic devices are identified. These multimodal measurements are meant to identify regularities in how pairs achieve mutual understanding. We focus on linguistic alignment (e.g., syntactic, phonological/phonetic, lexical, and semantic alignment); gestural alignment; and pragmatic devices (e.g., backchannels and repair). Before and after each pair engaged in the face-to-face interactions, we measured the participants’ individual representations of the Fribbles, using both fMRI and behavioural metrics. We hypothesise that the individual representations of the Fribbles will change as a function of the level of alignment achieved during the face-to-face interaction. We aim to uncover which variables measured during the interaction are the major contributors to the emergence of conceptual alignment during communication.

Figure 4A. Overview of the empirical paradigm.

One aim of this project is to investigate how the conceptual representations of two communicators change and become more similar as a result of communication. To investigate this, we measure participants’ individual representations of the Fribbles (novel objects) both before and after a series of communicative interactions about these objects (see empirical paradigm above), using both neural and behavioural metrics. The communicative interactions are structured in a ‘director-matcher’ task (e.g., Clark & Wilkes-Gibbs, 1986) in which each member of a pair takes turns describing the Fribbles to the other.

In the naming task, participants name each Fribble, using one to three words. We compare the names given by the members of a pair to the same Fribble, before and after their communicative interactions, using similarity scores between the vector-based models of those words (Mandera, Keuleers, & Brysbaert, 2017). The similarity score reflects the co-occurrences of those words in large text corpora. A preliminary analysis of 51 pairs shows increased similarity scores after those pairs engaged in face-to-face communicative interactions about those Fribbles (real pairs, Figure 4B, left panel). The increased similarity score is driven by the communicative interactions of each pair: there is no change in the similarity scores of random pairs, i.e., pairs of participants that performed the same tasks and experienced the same interaction, but not with each other (Figure 4B, right panel).

Figure 4B. Preliminary results (51 pairs) for naming similarity before (pre) and after (post) the interaction, comparing names for the same Fribbles from real pairs that interacted with each other (left) and random pairs that did not (right).
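The name comparison described above can be sketched as a cosine similarity between word vectors. The three-dimensional embeddings below are invented for illustration, and averaging the vectors of a multi-word name is an assumption; the source does not specify how names of two or three words are combined, nor which similarity measure is applied to the Mandera et al. vectors.

```python
import math

def mean_vector(words, embeddings):
    """Average the vectors of the words in a name (assumed combination rule)."""
    vectors = [embeddings[w] for w in words]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def name_similarity(name_a, name_b, embeddings):
    """Similarity between two names, each given as a list of one to three words."""
    return cosine_similarity(mean_vector(name_a, embeddings),
                             mean_vector(name_b, embeddings))

# Toy embeddings, invented for illustration only.
embeddings = {
    "robot":   [0.9, 0.1, 0.0],
    "machine": [0.8, 0.2, 0.1],
    "flower":  [0.0, 0.9, 0.3],
}

# Hypothetical names given by two pair members to the same Fribble.
pre = name_similarity(["robot"], ["flower"], embeddings)    # before interaction
post = name_similarity(["robot"], ["machine"], embeddings)  # after interaction
print("pre:", round(pre, 3), "post:", round(post, 3))
```

An increase from `pre` to `post` for real pairs, with no corresponding change for random pairs, is the pattern the naming analysis looks for.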

Similar observations emerge from an independent behavioural measure of participants’ mental representations of the Fribbles. In the features task, participants rate each Fribble across a number of features based on more visual (e.g., “How rounded is this Fribble?”) and more abstract (e.g., “How human is this Fribble?”) properties of the objects (based on Binder et al., 2016). For each Fribble, we correlated the scores obtained across 29 features between the two members of a pair (real pairs, Figure 4C, left panel) as well as between members of random pairs (Figure 4C, right panel). Preliminary results suggest that there is indeed an increase in feature-based similarity for real pairs but not random pairs after the interaction.

Figure 4C. Preliminary results (51 pairs) for feature correlations before (pre) and after (post) the interaction, between feature ratings for the same Fribbles from real pairs (left) and random pairs (right).
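The feature-based comparison can be sketched as a per-Fribble Pearson correlation between two participants' rating profiles. The ratings below are invented and use 4 features instead of 29. The way random pairs are built here (every participant combined with the partner from each other real pair) is one possible construction, assumed for illustration rather than taken from the project.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def pair_similarity(ratings_a, ratings_b):
    """Mean over Fribbles of the correlation between two feature profiles."""
    per_fribble = [pearson(ratings_a[f], ratings_b[f]) for f in ratings_a]
    return sum(per_fribble) / len(per_fribble)

def random_pairs(real_pairs):
    """Combine each first member with the second member of every other pair
    (an assumed way of building pairs that never interacted)."""
    return [(a, other_b)
            for i, (a, _) in enumerate(real_pairs)
            for j, (_, other_b) in enumerate(real_pairs) if i != j]

# Invented ratings: participant -> Fribble -> feature scores.
ratings = {
    "p1": {"fribble1": [1, 2, 3, 4]},
    "p2": {"fribble1": [2, 3, 4, 5]},
    "p3": {"fribble1": [4, 3, 2, 1]},
    "p4": {"fribble1": [5, 4, 2, 1]},
}
real = [("p1", "p2"), ("p3", "p4")]

def mean_similarity(pairs):
    scores = [pair_similarity(ratings[a], ratings[b]) for a, b in pairs]
    return sum(scores) / len(scores)

real_mean = mean_similarity(real)
random_mean = mean_similarity(random_pairs(real))
print("real pairs:", round(real_mean, 3), "random pairs:", round(random_mean, 3))
```

Running the same computation on pre- and post-interaction ratings would give the four conditions shown in Figure 4C.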

This project aims to characterize the neural mechanisms that lead to the increased similarity in communicators’ mental representations of the Fribbles. We do that by using fMRI to measure neurovascular responses to visual presentations of the Fribbles, one by one, before and after each participant is engaged in the communicative interactions. By using Representational Similarity Analysis (RSA, Kriegeskorte, Mur, & Bandettini, 2008) – a type of analysis which uses correlations of fMRI activity patterns as a proxy for similarity of neural responses to different Fribbles – we can quantify how the relations between the Fribbles change within as well as between members of real pairs and random pairs. More precisely, we first measure the activation pattern elicited by each Fribble in a particular brain region of each participant. Second, we correlate the activation patterns for different Fribbles in the same participant, leading to an RSA matrix of correlations that is specific to that participant (see Figure 4D). These matrices will then be correlated between participants to see how similar their representations of the Fribbles are. We hypothesise that the representations of the Fribbles will become more similar following an interaction in real pairs, but not in random pairs, and that this effect will vary as a function of the level of alignment achieved during the face-to-face interaction. This project also aims to understand how these changes in neural representations across different brain regions are influenced by particular metrics of the communicative interaction.

Figure 4D. Overview of the planned fMRI analysis using representational similarity analysis (RSA) to compare brain activation patterns of two participants.
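The two RSA steps described above (a within-participant matrix of correlations between Fribbles, followed by a between-participant correlation of those matrices) can be sketched as follows. The activation patterns are invented, each with 4 "voxels", and comparing only the upper triangles of the matrices is a common convention assumed here; the source does not state how the matrices are compared.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def similarity_matrix(patterns):
    """Step 1: correlate the activation patterns of all Fribbles within one
    participant. `patterns` holds one voxel vector per Fribble."""
    n = len(patterns)
    return [[pearson(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]

def upper_triangle(matrix):
    """Off-diagonal entries above the diagonal (the matrix is symmetric)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def between_participant_similarity(patterns_a, patterns_b):
    """Step 2: correlate the two participants' similarity matrices."""
    return pearson(upper_triangle(similarity_matrix(patterns_a)),
                   upper_triangle(similarity_matrix(patterns_b)))

# Invented activation patterns: 4 Fribbles x 4 voxels for participant A.
patterns_a = [[1, 2, 3, 4],
              [1, 2, 4, 3],
              [4, 3, 2, 1],
              [2, 1, 4, 3]]
# Participant B has different raw activations but the same relations
# between Fribbles (each pattern is scaled and shifted).
patterns_b = [[2 * x + 1 for x in pattern] for pattern in patterns_a]

matrix_a = similarity_matrix(patterns_a)
shared = between_participant_similarity(patterns_a, patterns_b)
print("between-participant similarity:", round(shared, 3))
```

Because the comparison operates on the relations between Fribbles rather than on raw activations, two participants can be compared without their voxels having to correspond one to one.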

References Highlight 4:
Binder, J. R., Conant, L. L., Humphries, C. J., Fernandino, L., Simons, S. B., Aguilar, M., & Desai, R. H. (2016). Toward a brain-based componential semantic representation. Cognitive Neuropsychology, 33(3–4), 130–174. https://doi.org/10.1080/02643294.2016.1147426
Clark, H. H., & Wilkes-Gibbs, D. (1986). Referring as a collaborative process. Cognition, 22(1), 1–39. https://doi.org/10.1016/0010-0277(86)90010-7
Kriegeskorte, N., Mur, M., & Bandettini, P. A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2. https://doi.org/10.3389/neuro.06.004.2008
Mandera, P., Keuleers, E., & Brysbaert, M. (2017). Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language, 92, 57–78. https://doi.org/10.1016/j.jml.2016.04.001
 
 

Synergy with other Big Questions

Synergies between BQ3 and BQ5 are anticipated, given a shared interest in understanding how agents navigate and organize conceptual spaces. Methodological tools are sharpened by interacting with the LiI Toolkit work package for developing automatic analysis of the co-speech gestures acquired during communicative interactions in BQ3.

Collaboration between BQ3-SP4 (Blokpoel and van Rooij) and BQ5-SP3 (Martin) has started. The goal for BQ5-SP3 is to develop a formal theory and implement it in a computational model with the purpose of understanding the cognitive neural transformations that underlie structure generation in language and in action planning. This is a natural extension of the computational-level, cognitive and interactional focus of the theory development in BQ3-SP4, and vice versa. The two projects are expected to inform and constrain each other, further refining their respective theories; taken together, the theories aim to explain communication spanning the neural/linguistic and cognitive/interactional levels.

People involved

Coordinator

Ivan Toni