Showing 1–1 of 1 results for author: Tovstogan, P

  1. arXiv:2311.10057  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

    Authors: Ilaria Manco, Benno Weck, SeungHeon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

    Abstract: We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. The dataset consists of 1.1k human-written natural language descriptions of 706 music recordings, all publicly accessible and released under Creative Commons licenses. To showcase the use of our dataset, we benchmark popular models o…

    Submitted 22 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023 Workshop on Machine Learning for Audio