Showing 1–2 of 2 results for author: Yaruss, J S

  1. Inclusive ASR for Disfluent Speech: Cascaded Large-Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation

    Authors: Dena Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Caryn Herring, Jia Bin

    Abstract: Automatic speech recognition (ASR) systems often falter while processing stuttering-related disfluencies -- such as involuntary blocks and word repetitions -- yielding inaccurate transcripts. A critical barrier to progress is the scarcity of large, annotated disfluent speech datasets. Therefore, we present an inclusive ASR design approach, leveraging large-scale self-supervised learning on standar…

    Submitted 1 October, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Included in 2024 Proceedings of INTERSPEECH

    ACM Class: I.2; K.4

  2. arXiv:2405.06150

    cs.CL cs.CY eess.AS

    Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech

    Authors: Dena Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Hope Gerlach-Houck, Caryn Herring, Jia Bin

    Abstract: Automatic speech recognition (ASR) systems, increasingly prevalent in education, healthcare, employment, and mobile technology, face significant challenges in inclusivity, particularly for the 80 million-strong global community of people who stutter. These systems often fail to accurately interpret speech patterns deviating from typical fluency, leading to critical usability issues and misinterpre…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to NAACL 2024