
Showing 1–6 of 6 results for author: Sukhadia, V N

  1. arXiv:2407.09823  [pdf, ps, other]

    cs.CL cs.AI

    NativQA: Multilingual Culturally-Aligned Natural Query for LLMs

    Authors: Md. Arid Hasan, Maram Hasanain, Fatema Ahmad, Sahinur Rahman Laskar, Sunaya Upadhyay, Vrunda N Sukhadia, Mucahid Kutlu, Shammur Absar Chowdhury, Firoj Alam

    Abstract: Natural Question Answering (QA) datasets play a crucial role in evaluating the capabilities of large language models (LLMs), ensuring their effectiveness in real-world applications. Despite the numerous QA datasets that have been developed and some work has been done in parallel, there is a notable lack of a framework and large scale region-specific datasets queried by native users in their own la…

    Submitted 30 May, 2025; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: LLMs, Native, Multilingual, Language Diversity, Contextual Understanding, Minority Languages, Culturally Informed, Foundation Models, Large Language Models

    MSC Class: 68T50 ACM Class: F.2.2; I.2.7

  2. arXiv:2406.13431  [pdf, other]

    cs.CL cs.SD eess.AS

    Children's Speech Recognition through Discrete Token Enhancement

    Authors: Vrunda N. Sukhadia, Shammur Absar Chowdhury

    Abstract: Children's speech recognition is considered a low-resource task mainly due to the lack of publicly available data. There are several reasons for such data scarcity, including expensive data collection and annotation processes, and data privacy, among others. Transforming speech signals into discrete tokens that do not carry sensitive information but capture both linguistic and acoustic information…

    Submitted 24 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  3. arXiv:2305.19584  [pdf, other]

    cs.CL eess.AS

    The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR

    Authors: Kaousheik Jayakumar, Vrunda N. Sukhadia, A Arunkumar, S. Umesh

    Abstract: Building a multilingual Automated Speech Recognition (ASR) system in a linguistically diverse country like India can be a challenging task due to the differences in scripts and the limited availability of speech data. This problem can be solved by exploiting the fact that many of these languages are phonetically similar. These languages can be converted into a Common Label Set (CLS) by mapping sim…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 5 pages, 5 figures, submitted to INTERSPEECH 2023

  4. arXiv:2211.01669  [pdf, other]

    eess.AS cs.SD eess.SP

    Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR

    Authors: Vrunda N. Sukhadia, A. Arunkumar, S. Umesh

    Abstract: This paper proposes a novel technique to obtain better downstream ASR performance from a joint encoder-decoder self-supervised model when trained with speech pooled from two different channels (narrow and wide band). The joint encoder-decoder self-supervised model extends the HuBERT model with a Transformer decoder. HuBERT performs clustering of features and predicts the class of every input frame…

    Submitted 3 June, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: 5 pages, 5 figures

  5. Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition

    Authors: A Arunkumar, Vrunda N Sukhadia, S. Umesh

    Abstract: Self-supervised learning (SSL) based models have been shown to generate powerful representations that can be used to improve the performance of downstream speech tasks. Several state-of-the-art SSL models are available, and each of these models optimizes a different loss which gives rise to the possibility of their features being complementary. This paper proposes using an ensemble of such SSL rep…

    Submitted 11 June, 2022; originally announced June 2022.

    Comments: 4 pages, 2 figures, submitted to Interspeech 2022

  6. Domain Adaptation of low-resource Target-Domain models using well-trained ASR Conformer Models

    Authors: Vrunda N. Sukhadia, S. Umesh

    Abstract: In this paper, we investigate domain adaptation for low-resource Automatic Speech Recognition (ASR) of target-domain data, when a well-trained ASR model trained with a large dataset is available. We argue that in the encoder-decoder framework, the decoder of the well-trained ASR model is largely tuned towards the source-domain, hurting the performance of target-domain models in vanilla transfer-le…

    Submitted 29 May, 2023; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: 5 pages, 2 figures