Showing 1–32 of 32 results for author: Amiriparian, S

  1. arXiv:2409.08907  [pdf, other]

    cs.AI cs.CL cs.CY

    Affective Computing Has Changed: The Foundation Model Disruption

    Authors: Björn Schuller, Adria Mallol-Ragolta, Alejandro Peña Almansa, Iosif Tsangko, Mostafa M. Amin, Anastasia Semertzidou, Lukas Christ, Shahin Amiriparian

    Abstract: The dawn of Foundation Models has on the one hand revolutionised a wide range of research problems, and, on the other hand, democratised the access and use of AI-based tools by the general public. We even observe an incursion of these models into disciplines related to human psychology, such as the Affective Computing domain, suggesting their affective, emerging capabilities. In this work, we aim…

    Submitted 13 September, 2024; originally announced September 2024.

  2. arXiv:2407.11012  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Exploring Gender-Specific Speech Patterns in Automatic Suicide Risk Assessment

    Authors: Maurice Gerczuk, Shahin Amiriparian, Justina Lutz, Wolfgang Strube, Irina Papazova, Alkomiet Hasan, Björn W. Schuller

    Abstract: In emergency medicine, timely intervention for patients at risk of suicide is often hindered by delayed access to specialised psychiatric care. To bridge this gap, we introduce a speech-based approach for automatic suicide risk assessment. Our study involves a novel dataset comprising speech recordings of 20 patients who read neutral texts. We extract four speech representations encompassing inter…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: accepted at INTERSPEECH 2024

    MSC Class: 68T10 ACM Class: J.3

  3. arXiv:2406.17667  [pdf, other]

    cs.SD cs.CL eess.AS

    This Paper Had the Smartest Reviewers -- Flattery Detection Utilising an Audio-Textual Transformer-Based Approach

    Authors: Lukas Christ, Shahin Amiriparian, Friederike Hawighorst, Ann-Kathrin Schill, Angelo Boutalikakis, Lorenz Graf-Vlachy, Andreas König, Björn W. Schuller

    Abstract: Flattery is an important aspect of human communication that facilitates social bonding, shapes perceptions, and influences behavior through strategic compliments and praise, leveraging the power of speech to build rapport effectively. Its automatic detection can thus enhance the naturalness of human-AI interactions. To meet this need, we present a novel audio textual dataset comprising 20 hours of…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  4. arXiv:2406.10275  [pdf, other]

    cs.CL

    ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets

    Authors: Shahin Amiriparian, Filip Packań, Maurice Gerczuk, Björn W. Schuller

    Abstract: Foundation models have shown great promise in speech emotion recognition (SER) by leveraging their pre-trained representations to capture emotion patterns in speech signals. To further enhance SER performance across various languages and domains, we propose a novel twofold approach. First, we gather EmoSet++, a comprehensive multi-lingual, multi-cultural speech emotion corpus with 37 datasets, 150…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: accepted at INTERSPEECH 2024

    MSC Class: 68T10 ACM Class: I.2

  5. arXiv:2406.07753  [pdf, ps, other]

    cs.AI cs.CL

    The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition

    Authors: Shahin Amiriparian, Lukas Christ, Alexander Kathan, Maurice Gerczuk, Niklas Müller, Steffen Klug, Lukas Stappen, Andreas König, Erik Cambria, Björn Schuller, Simone Eulitz

    Abstract: The Multimodal Sentiment Analysis Challenge (MuSe) 2024 addresses two contemporary multimodal affect and sentiment analysis problems: In the Social Perception Sub-Challenge (MuSe-Perception), participants will predict 16 different social attributes of individuals such as assertiveness, dominance, likability, and sincerity based on the provided audio-visual data. The Cross-Cultural Humor Detection…

    Submitted 11 June, 2024; originally announced June 2024.

    MSC Class: 68T10 ACM Class: I.2

  6. Sustained Vowels for Pre- vs Post-Treatment COPD Classification

    Authors: Andreas Triantafyllopoulos, Anton Batliner, Wolfgang Mayr, Markus Fendler, Florian Pokorny, Maurice Gerczuk, Shahin Amiriparian, Thomas Berghaus, Björn Schuller

    Abstract: Chronic obstructive pulmonary disease (COPD) is a serious inflammatory lung disease affecting millions of people around the world. Due to an obstructed airflow from the lungs, it also becomes manifest in patients' vocal behaviour. Of particular importance is the detection of an exacerbation episode, which marks an acute phase and often requires hospitalisation and treatment. Previous work has show…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  7. arXiv:2406.02251  [pdf, other]

    cs.CL cs.AI

    Modeling Emotional Trajectories in Written Stories Utilizing Transformers and Weakly-Supervised Learning

    Authors: Lukas Christ, Shahin Amiriparian, Manuel Milling, Ilhan Aslan, Björn W. Schuller

    Abstract: Telling stories is an integral part of human communication which can evoke emotions and influence the affective states of the audience. Automatically modeling emotional trajectories in stories has thus attracted considerable scholarly interest. However, as most existing works have been limited to unsupervised dictionary-based approaches, there is no benchmark for this task. We address this gap by…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings. arXiv admin note: text overlap with arXiv:2212.11382

  8. arXiv:2404.12132  [pdf, other]

    cs.SD cs.CL eess.AS

    Non-Invasive Suicide Risk Prediction Through Speech Analysis

    Authors: Shahin Amiriparian, Maurice Gerczuk, Justina Lutz, Wolfgang Strube, Irina Papazova, Alkomiet Hasan, Alexander Kathan, Björn W. Schuller

    Abstract: The delayed access to specialized psychiatric assessments and care for patients at risk of suicidal tendencies in emergency departments creates a notable gap in timely intervention, hindering the provision of adequate mental health support during critical situations. To address this, we present a non-invasive, speech-based approach for automatic suicide risk assessment. For our study, we collected…

    Submitted 30 October, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    ACM Class: I.2

  9. arXiv:2305.09485   

    econ.GN cs.LG

    Executive Voiced Laughter and Social Approval: An Explorative Machine Learning Study

    Authors: Niklas Mueller, Steffen Klug, Andreas Koenig, Alexander Kathan, Lukas Christ, Bjoern Schuller, Shahin Amiriparian

    Abstract: We study voiced laughter in executive communication and its effect on social approval. Integrating research on laughter, affect-as-information, and infomediaries' social evaluations of firms, we hypothesize that voiced laughter in executive communication positively affects social approval, defined as audience perceptions of affinity towards an organization. We surmise that the effect of laughter i…

    Submitted 20 May, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Method section needs to be updated

  10. arXiv:2305.03369  [pdf, other]

    cs.LG cs.AI cs.CL cs.MM

    The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation

    Authors: Lukas Christ, Shahin Amiriparian, Alice Baird, Alexander Kathan, Niklas Müller, Steffen Klug, Chris Gagne, Panagiotis Tzirakis, Eva-Maria Meßner, Andreas König, Alan Cowen, Erik Cambria, Björn W. Schuller

    Abstract: The MuSe 2023 is a set of shared tasks addressing three different contemporary multimodal affect and sentiment analysis problems: In the Mimicked Emotions Sub-Challenge (MuSe-Mimic), participants predict three continuous emotion targets. This sub-challenge utilises the Hume-Vidmimic dataset comprising of user-generated videos. For the Cross-Cultural Humour Detection Sub-Challenge (MuSe-Humour), an…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: Baseline paper for the 4th Multimodal Sentiment Analysis Challenge (MuSe) 2023, a workshop at ACM Multimedia 2023

  11. arXiv:2304.14882  [pdf, other]

    cs.SD cs.LG eess.AS

    The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests

    Authors: Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Alexander Barnhill, Maurice Gerczuk, Andreas Triantafyllopoulos, Alice Baird, Panagiotis Tzirakis, Chris Gagne, Alan S. Cowen, Nikola Lackovic, Marie-José Caraty, Claude Montacié

    Abstract: The ACM Multimedia 2023 Computational Paralinguistics Challenge addresses two different problems for the first time in a research competition under well-defined conditions: In the Emotion Share Sub-Challenge, a regression on speech has to be made; and in the Requests Sub-Challenges, requests and complaints need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classi…

    Submitted 1 May, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

    Comments: 5 pages, part of the ACM Multimedia 2023 Grand Challenge "The ACM Multimedia 2023 Computational Paralinguistics Challenge (ComParE 2023)". arXiv admin note: text overlap with arXiv:2205.06799

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  12. arXiv:2301.10477  [pdf, other]

    cs.SD cs.CY eess.AS

    HEAR4Health: A blueprint for making computer audition a staple of modern healthcare

    Authors: Andreas Triantafyllopoulos, Alexander Kathan, Alice Baird, Lukas Christ, Alexander Gebhard, Maurice Gerczuk, Vincent Karas, Tobias Hübner, Xin Jing, Shuo Liu, Adria Mallol-Ragolta, Manuel Milling, Sandra Ottl, Anastasia Semertzidou, Srividya Tirunellai Rajamani, Tianhao Yan, Zijiang Yang, Judith Dineley, Shahin Amiriparian, Katrin D. Bartl-Pokorny, Anton Batliner, Florian B. Pokorny, Björn W. Schuller

    Abstract: Recent years have seen a rapid increase in digital medicine research in an attempt to transform traditional healthcare systems to their modern, intelligent, and versatile equivalents that are adequately equipped to tackle contemporary challenges. This has led to a wave of applications that utilise AI technologies; first and foremost in the fields of medical imaging, but also in the use of wearable…

    Submitted 25 January, 2023; originally announced January 2023.

  13. arXiv:2301.00142  [pdf, other]

    cs.HC cs.AI cs.CV cs.LG cs.SD eess.AS

    Computational Charisma -- A Brick by Brick Blueprint for Building Charismatic Artificial Intelligence

    Authors: Björn W. Schuller, Shahin Amiriparian, Anton Batliner, Alexander Gebhard, Maurice Gerczuk, Vincent Karas, Alexander Kathan, Lennart Seizer, Johanna Löchner

    Abstract: Charisma is considered as one's ability to attract and potentially also influence others. Clearly, there can be considerable interest from an artificial intelligence's (AI) perspective to provide it with such skill. Beyond, a plethora of use cases opens up for computational measurement of human charisma, such as for tutoring humans in the acquisition of charisma, mediating human-to-human conversat…

    Submitted 31 December, 2022; originally announced January 2023.

    ACM Class: A.1

  14. arXiv:2212.11382  [pdf, other]

    cs.CL

    Automatic Emotion Modelling in Written Stories

    Authors: Lukas Christ, Shahin Amiriparian, Manuel Milling, Ilhan Aslan, Björn W. Schuller

    Abstract: Telling stories is an integral part of human communication which can evoke emotions and influence the affective states of the audience. Automatically modelling emotional trajectories in stories has thus attracted considerable scholarly interest. However, as most existing works have been limited to unsupervised dictionary-based approaches, there is no labelled benchmark for this task. We address th…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: This work has been submitted to the IEEE for possible publication

  15. arXiv:2212.10690  [pdf, other]

    cs.CV cs.CL cs.LG

    METEOR Guided Divergence for Video Captioning

    Authors: Daniel Lukas Rothenpieler, Shahin Amiriparian

    Abstract: Automatic video captioning aims for a holistic visual scene understanding. It requires a mechanism for capturing temporal context in video frames and the ability to comprehend the actions and associations of objects in a given timeframe. Such a system should additionally learn to abstract video sequences into sensible representations as well as to generate natural written language. While the major…

    Submitted 20 December, 2022; originally announced December 2022.

    ACM Class: I.2.10

  16. arXiv:2209.14272  [pdf, other]

    cs.LG cs.CL cs.CV cs.MM cs.SD eess.AS

    Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results

    Authors: Lukas Christ, Shahin Amiriparian, Alexander Kathan, Niklas Müller, Andreas König, Björn W. Schuller

    Abstract: Humor is a substantial element of human social behavior, affect, and cognition. Its automatic understanding can facilitate a more naturalistic human-AI interaction. Current methods of humor detection have been exclusively based on staged data, making them inadequate for "real-world" applications. We contribute to addressing this deficiency by introducing the novel Passau-Spontaneous Football Coach…

    Submitted 8 July, 2024; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: This work has been submitted to the IEEE for possible publication (Major Revision)

  17. Distinguishing between pre- and post-treatment in the speech of patients with chronic obstructive pulmonary disease

    Authors: Andreas Triantafyllopoulos, Markus Fendler, Anton Batliner, Maurice Gerczuk, Shahin Amiriparian, Thomas M. Berghaus, Björn W. Schuller

    Abstract: Chronic obstructive pulmonary disease (COPD) causes lung inflammation and airflow blockage leading to a variety of respiratory symptoms; it is also a leading cause of death and affects millions of individuals around the world. Patients often require treatment and hospitalisation, while no cure is currently available. As COPD predominantly affects the respiratory system, speech and non-linguistic v…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted in INTERSPEECH 2022

    Journal ref: Proc. Interspeech 2022, 3623-3627

  18. arXiv:2207.05691  [pdf, other]

    cs.LG cs.AI cs.CL cs.MM eess.AS

    The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress

    Authors: Lukas Christ, Shahin Amiriparian, Alice Baird, Panagiotis Tzirakis, Alexander Kathan, Niklas Müller, Lukas Stappen, Eva-Maria Meßner, Andreas König, Alan Cowen, Erik Cambria, Björn W. Schuller

    Abstract: The Multimodal Sentiment Analysis Challenge (MuSe) 2022 is dedicated to multimodal sentiment and emotion recognition. For this year's challenge, we feature three datasets: (i) the Passau Spontaneous Football Coach Humor (Passau-SFCH) dataset that contains audio-visual recordings of German football coaches, labelled for the presence of humour; (ii) the Hume-Reaction dataset in which reactions of in…

    Submitted 21 October, 2022; v1 submitted 23 June, 2022; originally announced July 2022.

    Comments: Baseline paper for the 3rd Multimodal Sentiment Analysis Challenge (MuSe) 2022, a full-day workshop at ACM Multimedia 2022

  19. arXiv:2206.05833  [pdf, other]

    cs.CV cs.HC cs.MM

    COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition

    Authors: Mani Kumar Tellamekala, Shahin Amiriparian, Björn W. Schuller, Elisabeth André, Timo Giesbrecht, Michel Valstar

    Abstract: Automatically recognising apparent emotions from face and voice is hard, in part because of various sources of uncertainty, including in the input data and the labels used in a machine learning framework. This paper introduces an uncertainty-aware audiovisual fusion approach that quantifies modality-wise uncertainty towards emotion prediction. To this end, we propose a novel fusion framework in wh…

    Submitted 16 October, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence

  20. arXiv:2205.06799  [pdf, other]

    cs.SD cs.LG eess.AS

    The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

    Authors: Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Christian Bergler, Maurice Gerczuk, Natalie Holz, Pauline Larrouy-Maestri, Sebastian P. Bayerl, Korbinian Riedhammer, Adria Mallol-Ragolta, Maria Pateraki, Harry Coppock, Ivan Kiskin, Marianne Sinka, Stephen Roberts

    Abstract: The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch senso…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: 5 pages, part of the ACM Multimedia 2022 Grand Challenge "The ACM Multimedia 2022 Computational Paralinguistics Challenge (ComParE 2022)"

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  21. arXiv:2205.04343  [pdf, other]

    cs.SD cs.LG eess.AS

    Fatigue Prediction in Outdoor Running Conditions using Audio Data

    Authors: Andreas Triantafyllopoulos, Sandra Ottl, Alexander Gebhard, Esther Rituerto-González, Mirko Jaumann, Steffen Hüttner, Valerie Dieter, Patrick Schneeweiß, Inga Krauß, Maurice Gerczuk, Shahin Amiriparian, Björn W. Schuller

    Abstract: Although running is a common leisure activity and a core training regiment for several athletes, between $29\%$ and $79\%$ of runners sustain an overuse injury each year. These injuries are linked to excessive fatigue, which alters how someone runs. In this work, we explore the feasibility of modelling the Borg received perception of exertion (RPE) scale (range: $[6-20]$), a well-validated subject…

    Submitted 9 May, 2022; originally announced May 2022.

    Comments: Paper accepted at IEEE EMBC 2022. Rights remain with IEEE

  22. arXiv:2202.08981  [pdf, other]

    cs.SD cs.LG eess.AS

    A Summary of the ComParE COVID-19 Challenges

    Authors: Harry Coppock, Alican Akman, Christian Bergler, Maurice Gerczuk, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Jing Han, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Panagiotis Tzirakis, Anton Batliner, Cecilia Mascolo, Björn W. Schuller

    Abstract: The COVID-19 pandemic has caused massive humanitarian and economic damage. Teams of scientists from a broad range of disciplines have searched for methods to help governments and communities combat the disease. One avenue from the machine learning field which has been explored is the prospect of a digital mass test which can detect COVID-19 from infected individuals' respiratory sounds. We present…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: 18 pages, 13 figures

  23. arXiv:2109.12662  [pdf, other]

    cs.CL cs.IR cs.LG

    Improving Question Answering Performance Using Knowledge Distillation and Active Learning

    Authors: Yasaman Boreshban, Seyed Morteza Mirbostani, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Shahin Amiriparian

    Abstract: Contemporary question answering (QA) systems, including transformer-based architectures, suffer from increasing computational and model complexity which render them inefficient for real-world applications with limited resources. Further, training or even fine-tuning such models requires a vast amount of labeled data which is often not available for the task at hand. In this manuscript, we conduct…

    Submitted 26 September, 2021; originally announced September 2021.

  24. arXiv:2109.08049  [pdf, other]

    cs.LG cs.AI cs.CV

    A Machine Learning Framework for Automatic Prediction of Human Semen Motility

    Authors: Sandra Ottl, Shahin Amiriparian, Maurice Gerczuk, Björn Schuller

    Abstract: In this paper, human semen samples from the visem dataset collected by the Simula Research Laboratory are automatically assessed with machine learning methods for their quality in respect to sperm motility. Several regression models are trained to automatically predict the percentage (0 to 100) of progressive, non-progressive, and immotile spermatozoa in a given sample. The video samples are adopt…

    Submitted 17 September, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

    ACM Class: I.2.0; I.4.10

  25. arXiv:2104.11629  [pdf, other]

    cs.SD cs.LG eess.AS

    DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing from Decentralised Data

    Authors: Shahin Amiriparian, Tobias Hübner, Maurice Gerczuk, Sandra Ottl, Björn W. Schuller

    Abstract: Deep neural speech and audio processing systems have a large number of trainable parameters, a relatively complex architecture, and require a vast amount of training data and computational power. These constraints make it more challenging to integrate such systems into embedded devices and utilise them for real-time, real-world applications. We tackle these limitations by introducing DeepSpectrumL…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure

  26. arXiv:2104.10121  [pdf, other]

    cs.SD cs.CL eess.AS

    On the Impact of Word Error Rate on Acoustic-Linguistic Speech Emotion Recognition: An Update for the Deep Learning Era

    Authors: Shahin Amiriparian, Artem Sokolov, Ilhan Aslan, Lukas Christ, Maurice Gerczuk, Tobias Hübner, Dmitry Lamanov, Manuel Milling, Sandra Ottl, Ilya Poduremennykh, Evgeniy Shuranov, Björn W. Schuller

    Abstract: Text encodings from automatic speech recognition (ASR) transcripts and audio representations have shown promise in speech emotion recognition (SER) ever since. Yet, it is challenging to explain the effect of each information stream on the SER systems. Further, more clarification is required for analysing the impact of ASR's word error rate (WER) on linguistic emotion recognition per se and in the…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure

    ACM Class: I.2.7; I.5.0

  27. arXiv:2103.08310  [pdf, other]

    cs.SD cs.LG eess.AS

    EmoNet: A Transfer Learning Framework for Multi-Corpus Speech Emotion Recognition

    Authors: Maurice Gerczuk, Shahin Amiriparian, Sandra Ottl, Björn Schuller

    Abstract: In this manuscript, the topic of multi-corpus Speech Emotion Recognition (SER) is approached from a deep transfer learning perspective. A large corpus of emotional speech data, EmoSet, is assembled from a number of existing SER corpora. In total, EmoSet contains 84181 audio recordings from 26 SER corpora with a total duration of over 65 hours. The corpus is then utilised to create a novel framewor…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 18 pages, 7 figures

  28. arXiv:2102.13468  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates

    Authors: Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp

    Abstract: The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation SubChallenge, a three-way assessment of the level of es…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: 5 pages

    MSC Class: 68 ACM Class: I.2.7; I.5.0; J.3

  29. arXiv:2012.09478  [pdf, other]

    cs.SD cs.CL eess.AS

    The voice of COVID-19: Acoustic correlates of infection

    Authors: Katrin D. Bartl-Pokorny, Florian B. Pokorny, Anton Batliner, Shahin Amiriparian, Anastasia Semertzidou, Florian Eyben, Elena Kramer, Florian Schmidt, Rainer Schönweiler, Markus Wehler, Björn W. Schuller

    Abstract: COVID-19 is a global health crisis that has been affecting many aspects of our daily lives throughout the past year. The symptomatology of COVID-19 is heterogeneous with a severity continuum. A considerable proportion of symptoms are related to pathological changes in the vocal system, leading to the assumption that COVID-19 may also affect voice production. For the very first time, the present st…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 8 pages

    MSC Class: 68T01 ACM Class: J.3

  30. arXiv:2005.08722  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    A Novel Fusion of Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech

    Authors: Shahin Amiriparian, Pawel Winokurow, Vincent Karas, Sandra Ottl, Maurice Gerczuk, Björn W. Schuller

    Abstract: Motivated by the attention mechanism of the human visual system and recent developments in the field of machine translation, we introduce our attention-based and recurrent sequence to sequence autoencoders for fully unsupervised representation learning from audio files. In particular, we test the efficacy of our novel approach on the task of speech-based sleepiness recognition. We evaluate the lea…

    Submitted 19 May, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

    Comments: 5 pages, 2 figures, submitted to INTERSPEECH 2020

  31. arXiv:1907.11510  [pdf, ps, other]

    cs.HC cs.CV cs.IR cs.LG stat.ML

    AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition

    Authors: Fabien Ringeval, Björn Schuller, Michel Valstar, Nicholas Cummins, Roddy Cowie, Leili Tavabi, Maximilian Schmitt, Sina Alisamir, Shahin Amiriparian, Eva-Maria Messner, Siyang Song, Shuo Liu, Ziping Zhao, Adria Mallol-Ragolta, Zhao Ren, Mohammad Soleymani, Maja Pantic

    Abstract: The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions. The goal of the Challen…

    Submitted 10 July, 2019; originally announced July 2019.

  32. arXiv:1712.04382  [pdf, other]

    cs.SD eess.AS

    auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks

    Authors: Michael Freitag, Shahin Amiriparian, Sergey Pugachevskiy, Nicholas Cummins, Björn Schuller

    Abstract: auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data. It is based on a recurrent sequence to sequence autoencoder approach which can learn representations of time series data by taking into account their temporal dynamics. We provide an extensive command line interface in addition to a Python API for users and developers, both of which are comprehensively doc…

    Submitted 22 December, 2017; v1 submitted 12 December, 2017; originally announced December 2017.