
Showing 1–11 of 11 results for author: Quinton, E

Searching in archive cs.
  1. arXiv:2505.06224  [pdf, other]

    cs.LG

    Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks

    Authors: Christos Plachouras, Julien Guinot, George Fazekas, Elio Quinton, Emmanouil Benetos, Johan Pauwels

    Abstract: Downstream probing has been the dominant method for evaluating model representations, an important process given the increasing prominence of self-supervised learning and foundation models. However, downstream probing primarily assesses the availability of task-relevant information in the model's latent space, overlooking attributes such as equivariance, invariance, and disentanglement, which cont…

    Submitted 9 May, 2025; originally announced May 2025.

    Comments: Accepted at IJCNN 2025

  2. arXiv:2412.18955  [pdf, other]

    cs.SD eess.AS

    Leave-One-EquiVariant: Alleviating invariance-related information loss in contrastive music representations

    Authors: Julien Guinot, Elio Quinton, György Fazekas

    Abstract: Contrastive learning has proven effective in self-supervised musical representation learning, particularly for Music Information Retrieval (MIR) tasks. However, reliance on augmentation chains for contrastive view generation and the resulting learnt invariances pose challenges when different downstream tasks require sensitivity to certain musical attributes. To address this, we propose the Leave O…

    Submitted 25 December, 2024; originally announced December 2024.

  3. arXiv:2412.03373  [pdf]

    cs.SD eess.AS

    Exploring trends in audio mixes and masters: Insights from a dataset analysis

    Authors: Angeliki Mourgela, Elio Quinton, Spyridon Bissas, Joshua D. Reiss, David Ronan

    Abstract: We present an analysis of a dataset of audio metrics and aesthetic considerations about mixes and masters provided by the web platform MixCheck studio. The platform is designed for educational purposes, primarily targeting amateur music producers, and aimed at analysing their recordings prior to them being released. The analysis focuses on the following data points: integrated loudness, mono compa…

    Submitted 4 December, 2024; originally announced December 2024.

    Comments: 11 pages, 6 figures, Presented at the AES 157th Convention October 2024, New York, USA

  4. arXiv:2408.01337  [pdf, other]

    cs.SD cs.CL cs.LG cs.MM eess.AS

    MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models

    Authors: Benno Weck, Ilaria Manco, Emmanouil Benetos, Elio Quinton, George Fazekas, Dmitry Bogdanov

    Abstract: Multimodal models that jointly process audio and language hold great promise in audio understanding and are increasingly being adopted in the music domain. By allowing users to query via text and obtain information about a given audio input, these models have the potential to enable a variety of music understanding tasks via language-based interfaces. However, their evaluation poses considerable c…

    Submitted 2 August, 2024; originally announced August 2024.

    Comments: Accepted at ISMIR 2024. Data: https://doi.org/10.5281/zenodo.12709974 Code: https://github.com/mulab-mir/muchomusic Supplementary material: https://mulab-mir.github.io/muchomusic

  5. arXiv:2407.21545  [pdf, other]

    cs.SD eess.AS

    Robust Lossy Audio Compression Identification

    Authors: Hendrik Vincent Koops, Gianluca Micchi, Elio Quinton

    Abstract: Previous research contributions on blind lossy compression identification report near perfect performance metrics on their test set, across a variety of codecs and bit rates. However, we show that such results can be deceptive and may not accurately represent the true ability of the system to tackle the task at hand. In this article, we present an investigation into the robustness and generalisation c…

    Submitted 31 July, 2024; originally announced July 2024.

    Comments: Accepted to be published in the Proceedings of the 25th International Society for Music Information Retrieval Conference 2024

  6. arXiv:2311.10057  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation

    Authors: Ilaria Manco, Benno Weck, SeungHeon Doh, Minz Won, Yixiao Zhang, Dmitry Bogdanov, Yusong Wu, Ke Chen, Philip Tovstogan, Emmanouil Benetos, Elio Quinton, György Fazekas, Juhan Nam

    Abstract: We introduce the Song Describer dataset (SDD), a new crowdsourced corpus of high-quality audio-caption pairs, designed for the evaluation of music-and-language models. The dataset consists of 1.1k human-written natural language descriptions of 706 music recordings, all publicly accessible and released under Creative Commons licenses. To showcase the use of our dataset, we benchmark popular models o…

    Submitted 22 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023 Workshop on Machine Learning for Audio

  7. arXiv:2310.11165  [pdf, other]

    cs.SD cs.LG eess.AS

    Serenade: A Model for Human-in-the-loop Automatic Chord Estimation

    Authors: Hendrik Vincent Koops, Gianluca Micchi, Ilaria Manco, Elio Quinton

    Abstract: Computational harmony analysis is important for MIR tasks such as automatic segmentation, corpus analysis and automatic chord label estimation. However, recent research into the ambiguous nature of musical harmony, causing limited inter-rater agreement, has made apparent that there is a glass ceiling for common metrics such as accuracy. Commonly, these issues are addressed either in the training d…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted at MMRP23. 7 pages, 5 figures, 2 tables

  8. arXiv:2209.01478  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Equivariant Self-Supervision for Musical Tempo Estimation

    Authors: Elio Quinton

    Abstract: Self-supervised methods have emerged as a promising avenue for representation learning in recent years since they alleviate the need for labeled datasets, which are scarce and expensive to acquire. Contrastive methods are a popular choice for self-supervision in the audio domain, and typically provide a learning signal by forcing the model to be invariant to some transformations of the input.…

    Submitted 3 September, 2022; originally announced September 2022.

    Comments: Accepted at ISMIR 2022

  9. arXiv:2208.12208  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Contrastive Audio-Language Learning for Music

    Authors: Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas

    Abstract: As one of the most intuitive interfaces known to humans, natural language has the potential to mediate many tasks that involve human-computer interaction, especially in application-focused fields like Music Information Retrieval. In this work, we explore cross-modal learning in an attempt to bridge audio and language in the music domain. To this end, we propose MusCALL, a framework for Music Contr…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: Accepted to ISMIR 2022

  10. arXiv:2112.04214  [pdf, other]

    cs.SD cs.CL cs.IR cs.LG eess.AS

    Learning music audio representations via weak language supervision

    Authors: Ilaria Manco, Emmanouil Benetos, Elio Quinton, Gyorgy Fazekas

    Abstract: Audio representations for music information retrieval are typically learned via supervised learning in a task-specific fashion. Although effective at producing state-of-the-art results, this scheme lacks flexibility with respect to the range of applications a model can have and requires extensively annotated datasets. In this work, we pose the question of whether it may be possible to exploit weak…

    Submitted 17 February, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Accepted to ICASSP 2022

  11. arXiv:2104.11984  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    MusCaps: Generating Captions for Music Audio

    Authors: Ilaria Manco, Emmanouil Benetos, Elio Quinton, Gyorgy Fazekas

    Abstract: Content-based music information retrieval has seen rapid progress with the adoption of deep learning. Current approaches to high-level music description typically make use of classification models, such as in auto-tagging or genre and mood classification. In this work, we propose to address music description via audio captioning, defined as the task of generating a natural language description of…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: Accepted to IJCNN 2021 for the Special Session on Representation Learning for Audio, Speech, and Music Processing