Showing 1–1 of 1 results for author: Giri, H K

  1. arXiv:2410.16505 [pdf, other]

    cs.SD cs.LG eess.AS

    Do Audio-Language Models Understand Linguistic Variations?

    Authors: Ramaneswaran Selvakumar, Sonal Kumar, Hemant Kumar Giri, Nishit Anand, Ashish Seth, Sreyan Ghosh, Dinesh Manocha

    Abstract: Open-vocabulary audio language models (ALMs), like Contrastive Language Audio Pretraining (CLAP), represent a promising new paradigm for audio-text retrieval using natural language queries. In this paper, for the first time, we perform controlled experiments on various benchmarks to show that existing ALMs struggle to generalize to linguistic variations in textual queries. To address this issue, w…

    Submitted 19 February, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: Accepted to NAACL 2025