Showing 1–3 of 3 results for author: Tulsiani, H

Searching in archive eess.

  1. arXiv:2409.10515  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    An Efficient Self-Learning Framework For Interactive Spoken Dialog Systems

    Authors: Hitesh Tulsiani, David M. Chan, Shalini Ghosh, Garima Lalwani, Prabhat Pandey, Ankish Bansal, Sri Garimella, Ariya Rastrow, Björn Hoffmeister

    Abstract: Dialog systems, such as voice assistants, are expected to engage with users in complex, evolving conversations. Unfortunately, traditional automatic speech recognition (ASR) systems deployed in such applications are usually trained to recognize each turn independently and lack the ability to adapt to the conversational context or incorporate user feedback. In this work, we introduce a general fram…

    Submitted 16 September, 2024; originally announced September 2024.

    Comments: Presented at ICML 2024

  2. arXiv:2401.02417  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition

    Authors: David M. Chan, Shalini Ghosh, Hitesh Tulsiani, Ariya Rastrow, Björn Hoffmeister

    Abstract: While word error rates of automatic speech recognition (ASR) systems have consistently fallen, natural language understanding (NLU) applications built on top of ASR systems still attribute significant numbers of failures to low-quality speech recognition results. Existing assistant systems collect large numbers of these unsuccessful interactions, but these systems usually fail to learn from these…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: To appear in ICASSP 2024

  3. arXiv:2008.03923  [pdf, other]

    cs.CL eess.AS

    Knowledge Distillation and Data Selection for Semi-Supervised Learning in CTC Acoustic Models

    Authors: Prakhar Swarup, Debmalya Chakrabarty, Ashtosh Sapru, Hitesh Tulsiani, Harish Arsikere, Sri Garimella

    Abstract: Semi-supervised learning (SSL) is an active area of research which aims to utilize unlabelled data in order to improve the accuracy of speech recognition systems. The current study proposes a methodology for integration of two key ideas: 1) SSL using connectionist temporal classification (CTC) objective and teacher-student based learning 2) Designing effective data-selection mechanisms for leverag…

    Submitted 10 August, 2020; originally announced August 2020.