Showing 1–3 of 3 results for author: Chetlur, M

  1. arXiv:2506.14434 [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Unifying Streaming and Non-streaming Zipformer-based ASR

    Authors: Bidisha Sharma, Karthik Pandia Durai, Shankar Venkatesan, Jeena J Prakash, Shashi Kumar, Malolan Chetlur, Andreas Stolcke

    Abstract: There has been increasing interest in unifying streaming and non-streaming automatic speech recognition (ASR) models to reduce development, training, and deployment costs. We present a unified framework that trains a single end-to-end ASR model for both streaming and non-streaming applications, leveraging future context information. We propose to use dynamic right-context through the chunked atten…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted at ACL 2025 Industry Track

  2. arXiv:2506.11089 [pdf, ps, other]

    eess.AS cs.AI cs.CL

    Better Pseudo-labeling with Multi-ASR Fusion and Error Correction by SpeechLLM

    Authors: Jeena Prakash, Blessingh Kumar, Kadri Hacioglu, Bidisha Sharma, Sindhuja Gopalan, Malolan Chetlur, Shankar Venkatesan, Andreas Stolcke

    Abstract: Automatic speech recognition (ASR) models rely on high-quality transcribed data for effective training. Generating pseudo-labels for large unlabeled audio datasets often relies on complex pipelines that combine multiple ASR outputs through multi-stage processing, leading to error propagation, information loss and disjoint optimization. We propose a unified multi-ASR prompt-driven framework using p…

    Submitted 5 June, 2025; originally announced June 2025.

  3. arXiv:2305.12540 [pdf, other]

    eess.AS cs.AI cs.SD

    On the Efficacy and Noise-Robustness of Jointly Learned Speech Emotion and Automatic Speech Recognition

    Authors: Lokesh Bansal, S. Pavankumar Dubagunta, Malolan Chetlur, Pushpak Jagtap, Aravind Ganapathiraju

    Abstract: New-age conversational agent systems perform both speech emotion recognition (SER) and automatic speech recognition (ASR) using two separate and often independent approaches for real-world application in noisy environments. In this paper, we investigate a joint ASR-SER multitask learning approach in a low-resource setting and show that improvements are observed not only in SER, but also in ASR. We…

    Submitted 25 May, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted at INTERSPEECH 2023