Showing 1–4 of 4 results for author: Sarmah, P

  1. arXiv:2506.03606  [pdf, ps, other]

    eess.AS cs.AI cs.CL eess.SP

    Tone recognition in low-resource languages of North-East India: peeling the layers of SSL-based speech models

    Authors: Parismita Gogoi, Sishir Kalita, Wendy Lalhminghlui, Viyazonuo Terhiija, Moakala Tzudir, Priyankoo Sarmah, S. R. M. Prasanna

    Abstract: This study explores the use of self-supervised learning (SSL) models for tone recognition in three low-resource languages from North Eastern India: Angami, Ao, and Mizo. We evaluate four Wav2vec2.0 base models that were pre-trained on both tonal and non-tonal languages. We analyze tone-wise performance across the layers for all three languages and compare the different models. Our results show tha…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025

  2. arXiv:2506.00861  [pdf, ps, other]

    eess.AS cs.SD

    Leveraging AM and FM Rhythm Spectrograms for Dementia Classification and Assessment

    Authors: Parismita Gogoi, Vishwanath Pratap Singh, Seema Khadirnaikar, Soma Siddhartha, Sishir Kalita, Jagabandhu Mishra, Md Sahidullah, Priyankoo Sarmah, S. R. M. Prasanna

    Abstract: This study explores the potential of Rhythm Formant Analysis (RFA) to capture long-term temporal modulations in dementia speech. Specifically, we introduce RFA-derived rhythm spectrograms as novel features for dementia classification and regression tasks. We propose two methodologies: (1) handcrafted features derived from rhythm spectrograms, and (2) a data-driven fusion approach, integrating prop…

    Submitted 14 June, 2025; v1 submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech. All code is available in the GitHub repo https://github.com/seemark11/DhiNirnayaAMFM

  3. arXiv:2410.20095  [pdf, other]

    eess.AS

    Analyzing long-term rhythm variations in Mising and Assamese using frequency domain correlates

    Authors: Parismita Gogoi, Priyankoo Sarmah, S. R. M. Prasanna

    Abstract: The current work explores long-term speech rhythm variations to classify Mising and Assamese, two low-resourced languages from Assam, Northeast India. We study the temporal information of speech rhythm embedded in low-frequency (LF) spectrograms derived from amplitude (AM) and frequency modulation (FM) envelopes. This quantitative frequency domain analysis of rhythm is supported by the idea of rhy…

    Submitted 26 October, 2024; originally announced October 2024.

    Comments: Submitted to International Journal of Asian Language Processing (IJALP)

  4. arXiv:2410.05724  [pdf, other]

    eess.AS eess.SP

    Exploring rhythm formant analysis for Indic language classification

    Authors: Parismita Gogoi, Sishir Kalita, Priyankoo Sarmah, S. R Mahadeva Prasanna

    Abstract: This paper reports a preliminary study on quantitative frequency domain rhythm cues for classifying five Indian languages: Bengali, Kannada, Malayalam, Marathi, and Tamil. We employ rhythm formant (R-formants) analysis, a technique introduced by Gibbon that utilizes low-frequency spectral analysis of amplitude modulation and frequency modulation envelopes to characterize speech rhythm. Various mea…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: Submitted to ICASSP 2025

    ACM Class: I.2.7