Search | arXiv e-print repository

AI-Based Feedback in Counselling Competence Training of Prospective Teachers

Authors: Tobias Hallmen, Kathrin Gietl, Karoline Hillesheim, Moritz Bauermann, Annemarie Friedrich, Elisabeth André

Abstract: This study explores the use of AI-based feedback to enhance the counselling competence of prospective teachers. An iterative block seminar was designed, incorporating theoretical foundations, practical applications, and AI tools for analysing verbal, paraverbal, and nonverbal communication. The seminar included recorded simulated teacher-parent conversations, followed by AI-based feedback and qual… ▽ More This study explores the use of AI-based feedback to enhance the counselling competence of prospective teachers. An iterative block seminar was designed, incorporating theoretical foundations, practical applications, and AI tools for analysing verbal, paraverbal, and nonverbal communication. The seminar included recorded simulated teacher-parent conversations, followed by AI-based feedback and qualitative interviews with students. The study investigated correlations between communication characteristics and conversation quality, student perceptions of AI-based feedback, and the training of AI models to identify conversation phases and techniques. Results indicated significant correlations between nonverbal and paraverbal features and conversation quality, and students positively perceived the AI feedback. The findings suggest that AI-based feedback can provide objective, actionable insights to improve teacher training programs. Future work will focus on refining verbal skill annotations, expanding the dataset, and exploring additional features to enhance the feedback system. △ Less

Submitted 6 May, 2025; originally announced May 2025.

arXiv:2504.11460 [pdf, other]

Semantic Matters: Multimodal Features for Affective Analysis

Authors: Tobias Hallmen, Robin-Nico Kampa, Fabian Deuser, Norbert Oswald, Elisabeth André

Abstract: In this study, we present our methodology for two tasks: the Emotional Mimicry Intensity (EMI) Estimation Challenge and the Behavioural Ambivalence/Hesitancy (BAH) Recognition Challenge, both conducted as part of the 8th Workshop and Competition on Affective & Behavior Analysis in-the-wild. We utilize a Wav2Vec 2.0 model pre-trained on a large podcast dataset to extract various audio features, cap… ▽ More In this study, we present our methodology for two tasks: the Emotional Mimicry Intensity (EMI) Estimation Challenge and the Behavioural Ambivalence/Hesitancy (BAH) Recognition Challenge, both conducted as part of the 8th Workshop and Competition on Affective & Behavior Analysis in-the-wild. We utilize a Wav2Vec 2.0 model pre-trained on a large podcast dataset to extract various audio features, capturing both linguistic and paralinguistic information. Our approach incorporates a valence-arousal-dominance (VAD) module derived from Wav2Vec 2.0, a BERT text encoder, and a vision transformer (ViT) with predictions subsequently processed through a long short-term memory (LSTM) architecture or a convolution-like method for temporal modeling. We integrate the textual and visual modality into our analysis, recognizing that semantic content provides valuable contextual cues and underscoring that the meaning of speech often conveys more critical insights than its acoustic counterpart alone. Fusing in the vision modality helps in some cases to interpret the textual modality more precisely. This combined approach results in significant performance improvements, achieving in EMI $ρ_{\text{TEST}} = 0.706$ and in BAH $F1_{\text{TEST}} = 0.702$, securing first place in the EMI challenge and second place in the BAH challenge. △ Less

Submitted 18 April, 2025; v1 submitted 16 March, 2025; originally announced April 2025.

arXiv:2407.13408 [pdf, other]

DISCOVER: A Data-driven Interactive System for Comprehensive Observation, Visualization, and ExploRation of Human Behaviour

Authors: Dominik Schiller, Tobias Hallmen, Daksitha Withanage Don, Elisabeth André, Tobias Baur

Abstract: Understanding human behavior is a fundamental goal of social sciences, yet its analysis presents significant challenges. Conventional methodologies employed for the study of behavior, characterized by labor-intensive data collection processes and intricate analyses, frequently hinder comprehensive exploration due to their time and resource demands. In response to these challenges, computational mo… ▽ More Understanding human behavior is a fundamental goal of social sciences, yet its analysis presents significant challenges. Conventional methodologies employed for the study of behavior, characterized by labor-intensive data collection processes and intricate analyses, frequently hinder comprehensive exploration due to their time and resource demands. In response to these challenges, computational models have proven to be promising tools that help researchers analyze large amounts of data by automatically identifying important behavioral indicators, such as social signals. However, the widespread adoption of such state-of-the-art computational models is impeded by their inherent complexity and the substantial computational resources necessary to run them, thereby constraining accessibility for researchers without technical expertise and adequate equipment. To address these barriers, we introduce DISCOVER -- a modular and flexible, yet user-friendly software framework specifically developed to streamline computational-driven data exploration for human behavior analysis. Our primary objective is to democratize access to advanced computational methodologies, thereby enabling researchers across disciplines to engage in detailed behavioral analysis without the need for extensive technical proficiency. In this paper, we demonstrate the capabilities of DISCOVER using four exemplary data exploration workflows that build on each other: Interactive Semantic Content Exploration, Visual Inspection, Aided Annotation, and Multimodal Scene Search. By illustrating these workflows, we aim to emphasize the versatility and accessibility of DISCOVER as a comprehensive framework and propose a set of blueprints that can serve as a general starting point for exploratory data analysis. △ Less

Submitted 18 July, 2024; originally announced July 2024.

ACM Class: J.4

arXiv:2403.11879 [pdf, other]

Unimodal Multi-Task Fusion for Emotional Mimicry Intensity Prediction

Authors: Tobias Hallmen, Fabian Deuser, Norbert Oswald, Elisabeth André

Abstract: In this research, we introduce a novel methodology for assessing Emotional Mimicry Intensity (EMI) as part of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild. Our methodology utilises the Wav2Vec 2.0 architecture, which has been pre-trained on an extensive podcast dataset, to capture a wide array of audio features that include both linguistic and paralinguistic componen… ▽ More In this research, we introduce a novel methodology for assessing Emotional Mimicry Intensity (EMI) as part of the 6th Workshop and Competition on Affective Behavior Analysis in-the-wild. Our methodology utilises the Wav2Vec 2.0 architecture, which has been pre-trained on an extensive podcast dataset, to capture a wide array of audio features that include both linguistic and paralinguistic components. We refine our feature extraction process by employing a fusion technique that combines individual features with a global mean vector, thereby embedding a broader contextual understanding into our analysis. A key aspect of our approach is the multi-task fusion strategy that not only leverages these features but also incorporates a pre-trained Valence-Arousal-Dominance (VAD) model. This integration is designed to refine emotion intensity prediction by concurrently processing multiple emotional dimensions, thereby embedding a richer contextual understanding into our framework. For the temporal analysis of audio data, our feature fusion process utilises a Long Short-Term Memory (LSTM) network. This approach, which relies solely on the provided audio data, shows marked advancements over the existing baseline, offering a more comprehensive understanding of emotional mimicry in naturalistic settings, achieving the second place in the EMI challenge. △ Less

Submitted 16 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 4657-4665

arXiv:2209.13914 [pdf, other]

An Efficient Multitask Learning Architecture for Affective Vocal Burst Analysis

Authors: Tobias Hallmen, Silvan Mertes, Dominik Schiller, Elisabeth André

Abstract: Affective speech analysis is an ongoing topic of research. A relatively new problem in this field is the analysis of vocal bursts, which are nonverbal vocalisations such as laughs or sighs. Current state-of-the-art approaches to address affective vocal burst analysis are mostly based on wav2vec2 or HuBERT features. In this paper, we investigate the use of the wav2vec successor data2vec in combinat… ▽ More Affective speech analysis is an ongoing topic of research. A relatively new problem in this field is the analysis of vocal bursts, which are nonverbal vocalisations such as laughs or sighs. Current state-of-the-art approaches to address affective vocal burst analysis are mostly based on wav2vec2 or HuBERT features. In this paper, we investigate the use of the wav2vec successor data2vec in combination with a multitask learning pipeline to tackle different analysis problems at once. To assess the performance of our efficient multitask learning architecture, we participate in the 2022 ACII Affective Vocal Burst Challenge, showing that our approach substantially outperforms the baseline established there in three different subtasks. △ Less

Submitted 28 September, 2022; originally announced September 2022.

Showing 1–5 of 5 results for author: Hallmen, T