Showing 1–5 of 5 results for author: Klumpp, P

  1. arXiv:2407.03132

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Speaker- and Text-Independent Estimation of Articulatory Movements and Phoneme Alignments from Speech

    Authors: Tobias Weise, Philipp Klumpp, Kubilay Can Demir, Paula Andrea Pérez-Toro, Maria Schuster, Elmar Noeth, Bjoern Heismann, Andreas Maier, Seung Hee Yang

    Abstract: This paper introduces a novel combination of two tasks, previously treated separately: acoustic-to-articulatory speech inversion (AAI) and phoneme-to-articulatory (PTA) motion estimation. We refer to this joint task as acoustic phoneme-to-articulatory speech inversion (APTAI) and explore two different approaches, both working speaker- and text-independently during inference. We use a multi-task le…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: to be published in Interspeech 2024 proceedings

  2. arXiv:2303.00802

    cs.CL cs.SD eess.AS

    Synthetic Cross-accent Data Augmentation for Automatic Speech Recognition

    Authors: Philipp Klumpp, Pooja Chitkara, Leda Sarı, Prashant Serai, Jilong Wu, Irina-Elena Veliche, Rongqing Huang, Qing He

    Abstract: The awareness for biased ASR datasets or models has increased notably in recent years. Even for English, despite a vast amount of available training data, systems perform worse for non-native speakers. In this work, we improve an accent-conversion model (ACM) which transforms native US-English speech into accented pronunciation. We include phonetic knowledge in the ACM training to provide accurate…

    Submitted 1 March, 2023; originally announced March 2023.

  3. arXiv:2204.04016

    eess.AS cs.CL cs.LG cs.SD q-bio.QM

    Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment

    Authors: Tobias Weise, Philipp Klumpp, Kubilay Can Demir, Andreas Maier, Elmar Noeth, Bjoern Heismann, Maria Schuster, Seung Hee Yang

    Abstract: Speech intelligibility assessment plays an important role in the therapy of patients suffering from pathological speech disorders. Automatic and objective measures are desirable to assist therapists in their traditionally subjective and labor-intensive assessments. In this work, we investigate a novel approach for obtaining such a measure using the divergence in disentangled latent speech represen…

    Submitted 27 June, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: Submitted and Accepted at INTERSPEECH2022

  4. arXiv:2201.05912

    eess.AS cs.LG cs.SD

    Common Phone: A Multilingual Dataset for Robust Acoustic Modelling

    Authors: Philipp Klumpp, Tomás Arias-Vergara, Paula Andrea Pérez-Toro, Elmar Nöth, Juan Rafael Orozco-Arroyave

    Abstract: Current state of the art acoustic models can easily comprise more than 100 million parameters. This growing complexity demands larger training datasets to maintain a decent generalization of the final decision function. An ideal dataset is not necessarily large in size, but large with respect to the amount of unique speakers, utilized hardware and varying recording conditions. This enables a machi…

    Submitted 31 January, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: Pre-print submitted to LREC 2022. Link to Common Phone: https://zenodo.org/record/5846137

  5. arXiv:2112.11514

    eess.AS cs.AI cs.LG

    The Phonetic Footprint of Parkinson's Disease

    Authors: Philipp Klumpp, Tomás Arias-Vergara, Juan Camilo Vásquez-Correa, Paula Andrea Pérez-Toro, Juan Rafael Orozco-Arroyave, Anton Batliner, Elmar Nöth

    Abstract: As one of the most prevalent neurodegenerative disorders, Parkinson's disease (PD) has a significant impact on the fine motor skills of patients. The complex interplay of different articulators during speech production and realization of required muscle tension become increasingly difficult, thus leading to a dysarthric speech. Characteristic patterns such as vowel instability, slurred pronunciati…

    Submitted 21 December, 2021; originally announced December 2021.

    Comments: https://www.sciencedirect.com/science/article/abs/pii/S0885230821001169

    Journal ref: Elsevier Computer Speech and Language, Volume 72, March 2022