
PROBRAL Workshops and Symposium, 21-22 May and 1-2 July 2024: Methodological aspects of analyzing motion and video-mediated interaction


Organised by the Chair of Romance Linguistics and Cultural Studies - Prof. Dr. Anna Ladilova

This event is part of the PROBRAL CAPES/DAAD project “The multimodal coordination of intercultural video-mediated interaction” involving the Federal University of Minas Gerais (UFMG), the Federal University of Pará (UFPA), the Federal University of Lavras (UFLA), the University of Potsdam, the University of Duisburg-Essen, and the Justus Liebig University (JLU) Gießen. With particular focus on the underexplored realm of intercultural communication in video-mediated interactions, the project addresses challenges posed by the transition from face-to-face encounters to fragmented visual and auditory environments. Investigating how interlocutors establish intercultural connections in multilingual interactions, the project employs quantitative and qualitative methods to explore the co-construction of meaning across verbal, prosodic, and gestural dimensions. These aspects of communication, often overlooked compared to written data, are approached through an interdisciplinary lens drawing from interactional linguistics, cognitive-cultural studies, and multimodal analysis. Methodologically, data are collected from teletandem activities, foreign language classes, and institutional as well as private global videoconferences, using techniques such as multimodal conversation analysis, gesture studies, and acoustic analysis. Building upon the work of the Research Center Intercultural Communication in Multimodal Interactions (ICMI) and related projects, the research aims to deepen our understanding of intercultural communication in the digital age.

Motion analysis

Presentations, May 21st, Phil II, Karl-Glöckner-Str. 21G, room G108 (and online)

14:00-15:30: Hani Camille Yehia (UFMG): “Motion analysis techniques during speech communication: dimensionality reduction and alignment of coordinate systems”

Audiovisual speech communication consists of both acoustic and visual components. The acoustic component is the speech signal, which is a one-dimensional pressure signal. To ensure accurate reproduction or computational analysis, the speech signal is typically sampled at ten or twenty thousand samples per second. For computational analysis, it is usually segmented into twenty-millisecond windows, each characterized by parameters such as formant frequencies or LPC coefficients. These parameters are represented as vectors within the acoustic space. In turn, the visual component comprises facial expressions, head movements, and body gestures. These movements are commonly represented by the coordinates of various points tracked in the speaker's image, forming vectors within the visual space.
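The windowing step described above can be sketched in a few lines of Python. This is an illustrative example on a synthetic signal, not the speakers' actual pipeline; the sampling rate, window length, and the choice of RMS as the per-window parameter are assumptions for the sketch:

```python
import numpy as np

fs = 10_000                    # sampling rate in Hz (the abstract mentions 10-20 kHz)
win = int(0.020 * fs)          # 20 ms analysis window -> 200 samples

# Synthetic stand-in for a recorded speech signal: 1 s of a 440 Hz tone
# with a slowly varying amplitude envelope.
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 2 * t))

# Segment into non-overlapping 20 ms frames and compute one parameter
# (here simply the RMS energy) per frame, yielding one value per window;
# formants or LPC coefficients would give a vector per window instead.
n_frames = len(signal) // win
frames = signal[:n_frames * win].reshape(n_frames, win)
rms = np.sqrt(np.mean(frames**2, axis=1))

print(rms.shape)               # 50 windows for 1 s of audio at 20 ms each
```

In a real analysis each window would yield a multi-dimensional parameter vector, and the sequence of such vectors forms the trajectory through the acoustic space that the talk refers to.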

In this presentation, we will demonstrate how to represent both speech acoustics and speech gestures in coordinate systems that align with the inherent degrees of freedom in speech communication. Additionally, we will explore methods for aligning acoustic and visual coordinate systems to investigate their relationship. These techniques are applicable for analyzing both individual speakers and dialogues.
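As a minimal illustration of the dimensionality-reduction step, principal component analysis can recover a low-dimensional coordinate system aligned with the underlying degrees of freedom. The data below are synthetic, and PCA here stands in for whatever reduction the speakers actually use:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "visual space": 500 frames, each a 20-dimensional vector of tracked-point
# coordinates (e.g. 10 points in 2D), generated from 2 underlying degrees of freedom.
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 20))
visual = latent @ mixing + 0.01 * rng.normal(size=(500, 20))

# PCA via SVD of the mean-centered data: the right singular vectors define a
# coordinate system whose first axes capture the dominant degrees of freedom.
centered = visual - visual.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = S**2 / np.sum(S**2)   # fraction of variance per component

reduced = centered @ Vt[:2].T     # 500 frames in a 2D aligned coordinate system
print(reduced.shape)
```

Applying the same reduction to acoustic parameter vectors yields a second low-dimensional coordinate system, and the relationship between the two spaces can then be studied, for example with a linear mapping or canonical correlation.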


15:30-17:00: Adriano Vilela Barbosa (UFMG): “Using optical flow and correlation maps to assess coordination during communicative interaction”

This talk presents a methodology for assessing the coupling between different modalities of speech (acoustic, visual) during spoken communication. The techniques presented can be equally applied to both intra- and inter-speaker scenarios. The talk focuses on two main points: i) the use of the FlowAnalyzer software to measure motion from video and ii) the use of CMA (Correlation Map Analysis) as a means of assessing the time-varying coupling between two domains of interest. We first show how to use FlowAnalyzer to extract motion signals from pre-recorded video sequences. We demonstrate the utility of the tool by showing how it allows the experimenter to rely only on ordinary video cameras to measure motion in a completely non-invasive way during speech production experiments. We then show how to use the CMA technique to quantify the time-varying coupling between the visual and acoustic components of speech. In our demonstration, the visual domain will be represented by the motion signals produced by FlowAnalyzer, whereas the acoustic domain will be represented by some parameterization of the acoustic waveform, such as the root mean square (RMS) value or Line Spectrum Pair (LSP) coefficients. We finish by discussing possible directions for future development of the tools and techniques presented.
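A simplified version of the correlation-map idea can be sketched as a sliding-window Pearson correlation evaluated over a range of lags. The published CMA algorithm differs in its details (e.g. recursive windowing), and the coupled signals below are synthetic:

```python
import numpy as np

def correlation_map(x, y, win=50, max_lag=20):
    """Sliding-window Pearson correlation between x and y over a range of lags.

    Returns an array of shape (n_lags, n_times): correlation as a function of
    both time and lag. A simplified, illustrative stand-in for CMA.
    """
    lags = np.arange(-max_lag, max_lag + 1)
    n_times = len(x) - win
    cmap = np.full((len(lags), n_times), np.nan)
    for i, lag in enumerate(lags):
        for t in range(n_times):
            t2 = t + lag
            if 0 <= t2 and t2 + win <= len(y):
                cmap[i, t] = np.corrcoef(x[t:t + win], y[t2:t2 + win])[0, 1]
    return cmap

# Two coupled toy signals: y is a copy of x delayed by 10 samples, plus noise.
rng = np.random.default_rng(1)
base = np.convolve(rng.normal(size=620), np.ones(5) / 5, mode="same")
x, y = base[10:], base[:-10] + 0.05 * rng.normal(size=610)

cmap = correlation_map(x, y)
lags = np.arange(-20, 21)
best_lag = lags[np.nanargmax(np.nanmean(cmap, axis=1))]
print(best_lag)   # recovers the imposed 10-sample delay
```

In the talk's setting, x could be a motion signal produced by FlowAnalyzer and y an acoustic parameter such as the RMS amplitude, with the map revealing when, and at what delay, the two domains are coupled.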


Workshop, May 22nd, Phil II, Karl-Glöckner-Str. 21G, room G108 (and online)

Hani Camille Yehia & Adriano Vilela Barbosa (UFMG)

14:00-16:00: Part 1: “Motion extraction from video using optical flow analysis”
16:30-18:30: Part 2: “Coordination analysis during communicative interaction using correlation maps”

In this mini-course we will show how to use tools and techniques outlined in our talk "Using optical flow and correlation maps to assess coordination during communicative interaction" to analyze data acquired during actual interactive communication sessions. More specifically, we will show how to:

- install FlowAnalyzer
- use FlowAnalyzer to extract motion data from regions of interest in a video file
- plot the extracted motion signals
- export the extracted motion signals to different file formats so that they can be imported into other frameworks for later analysis
- import the motion files saved by FlowAnalyzer into Python and Matlab
- apply Correlation Map Analysis (CMA) to the motion signals extracted by FlowAnalyzer in order to quantify intra- and inter-speaker coordination during communicative interaction
- extract parameters of interest from the correlation maps with the aim of summarizing the main aspects of the coordination
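As a sketch of the import step listed above, assuming the motion signals were exported to a plain CSV file with one column per region of interest (the column names here are hypothetical and the actual FlowAnalyzer export format may differ):

```python
import numpy as np
from io import StringIO

# Stand-in for an exported motion file: one time column plus one motion
# signal per region of interest (column names are hypothetical).
csv_export = StringIO(
    "time,roi_head,roi_hands\n"
    "0.00,0.1,0.0\n"
    "0.04,0.3,0.2\n"
    "0.08,0.2,0.5\n"
)

# names=True reads the header row and yields a structured array,
# so each region of interest can be addressed by name.
data = np.genfromtxt(csv_export, delimiter=",", names=True)
head = data["roi_head"]            # motion signal for one region of interest
hands = data["roi_hands"]

# Simple summary parameters of the kind one might extract before running CMA.
print(round(float(head.mean()), 3), round(float(hands.max()), 3))
```

The arrays loaded this way can be passed directly to a correlation-map analysis to quantify intra- and inter-speaker coordination.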

We also look forward to discussing with colleagues how these tools could be applied to their particular research efforts and how they could be improved in order to make them more useful to researchers.


May 23rd & 24th: Individual data sessions


Symposium "Methodological aspects of analyzing video-mediated interaction"

In this symposium, researchers will convene to discuss methodological challenges in analyzing multimodal aspects of video-mediated interactions.

July 1st and 2nd, Gustav-Krüger-Saal (R. 105), Ludwigstraße 23, 35390 Gießen (and online)



July 1st

9:00-13:30 (CET)


Ulrike Schröder (Universidade Federal de Minas Gerais) & Flavia Fidelis (Universidade Federal de Minas Gerais): „The transcription of facial gestures in video-mediated interaction”


Taiane Malabarba (Universität Potsdam): „Multiactivity in video-mediated L2 tutoring” (online)

Paloma Batista Cardoso (Universidade Federal de Sergipe / Europa-Universität Viadrina Frankfurt/Oder): „Methodological aspects of the multimodal description of negation in Brazilian Portuguese”

Sineide Gonçalves (Universidade Federal de Minas Gerais) & Ulrike Schröder (Universidade Federal de Minas Gerais): „Interacting ‘through’: Identifying attention division in interactions through mirrors compared to interactions through screens”

10:30-11:00 coffee break

Flavia Fidelis (Universidade Federal de Minas Gerais) & Milene Oliveira (Universität Potsdam): „Affiliation in opening sequences on VMI”

Fernanda Roque Amendoeira (Universidade Federal de Minas Gerais): „Storytelling and affiliation in video-mediated interactions”

13:30-15:00 (CET)


July 2nd

12:00-14:00 (CET)
Ana Luíza Washington & Ulrike Schröder (Universidade Federal de Minas Gerais): „Challenges for multimodal video-mediated transcription in nonnative talk-in-interaction”



Jan Gorisch (Leibniz-Institut für Deutsche Sprache, Mannheim): „Accessing spoken data using automatic speech recognition”

Thiago Nascimento (UFLA) & Aline Drumond (UFLA): „Embodied talk in interaction: a multimodal analysis of a storytelling sequence”

16:00-16:30 coffee break

Anna Ladilova (JLU Gießen): „Analysing categorizations in co-speech gesture”

Maria José D'Alessandro Nogueira (Universidade Federal de Minas Gerais): „Whisper as a tool for AST: a video-mediated sequence analysis of talk in ELF”



We welcome researchers from all disciplines to join the discussion! To sign up, please email Charlotte Müller, stating which of the presentations/workshops you would like to attend and in which format (in person on campus or online).