Search | arXiv e-print repository

Degradation-Invariant Music Indexing

Abstract: For music indexing that is robust to sound degradations and scalable to large music catalogs, this scientific report presents an approach based on audio descriptors that are relevant to the music content and invariant to sound transformations (e.g., noise addition, distortion, lossy coding, pitch/time transformations, or filtering). To this end, one of the key points of the proposed method is the definition of high-dimensional audio prints, which are intrinsically (by design) robust to some sound degradations. The high dimensionality of this first representation is then used to learn a linear projection to a significantly smaller subspace, which further reduces the sensitivity to sound degradations using a series of discriminant analyses. Finally, anchoring the analysis times on local maxima of a selected onset function, an approximate hashing is performed to provide better tolerance to bit corruptions and, at the same time, to make the method easier to scale.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2305.18996 [pdf, other]

The barycenter in free nilpotent Lie groups and its application to iterated-integrals signatures

Authors: Marianne Clausel, Joscha Diehl, Raphael Mignot, Leonard Schmitz, Nozomi Sugiura, Konstantin Usevich

Abstract: We establish the well-definedness of the barycenter (in the sense of Buser and Karcher) for every integrable measure on the free nilpotent Lie group of step $L$ (over $\mathbb{R}^d$). We provide two algorithms for computing it, using methods from Lie theory (namely, the Baker-Campbell-Hausdorff formula) and from the theory of Gröbner bases of modules. Our main motivation stems from measures induced by iterated-integrals signatures, and we calculate the barycenter for the signature of the Brownian motion.

Submitted 9 January, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: 48 pages, 1 figure

MSC Class: 60L10 (Primary) 22E25; 60J65; 13P10; 15A69 (Secondary)

arXiv:2202.05718 [pdf, other]

Audio Defect Detection in Music with Deep Networks

Authors: Daniel Wolff, Rémi Mignot, Axel Roebel

Abstract: With increasing amounts of music being digitally transferred from production to distribution, automatic means of determining media quality are needed. Protection mechanisms in digital audio processing tools have not eliminated the need for production entities located downstream in the distribution chain to assess audio quality and detect defects inserted further upstream. Such analysis often relies on the received audio and scarce meta-data alone. Deliberate use of artefacts such as clicks in popular music, as well as more recent defects stemming from corruption in modern audio encodings, calls for data-centric and context-sensitive solutions for detection. We present a convolutional network architecture following an end-to-end encoder-decoder configuration to develop detectors for two exemplary audio defects. A click detector is trained and compared to a traditional signal processing method, with a discussion on context sensitivity. Additional post-processing is used for data augmentation and workflow simulation. The ability of our models to capture variance is explored in a detector for artefacts from decompression of corrupted MP3-compressed audio. For both tasks we describe the synthetic generation of artefacts for controlled detector training and evaluation. We evaluate our detectors on the large open-source Free Music Archive (FMA) and genre-specific datasets.

Submitted 11 February, 2022; originally announced February 2022.

Comments: 6 pages

Journal ref: Proceedings of the 22nd International Society for Music Information Retrieval Conference, Online, 2021

Showing 1–3 of 3 results for author: Mignot, R