Showing 1–43 of 43 results for author: Hennequin, R

  1. arXiv:2505.02492  [pdf, other]

    cs.IR

    Uncertainty in Repeated Implicit Feedback as a Measure of Reliability

    Authors: Bruno Sguerra, Viet-Anh Tran, Romain Hennequin, Manuel Moussallam

    Abstract: Recommender systems rely heavily on user feedback to learn effective user and item representations. Despite their widespread adoption, limited attention has been given to the uncertainty inherent in the feedback used to train these systems. Both implicit and explicit feedback are prone to noise due to the variability in human interactions, with implicit feedback being particularly challenging. In…

    Submitted 5 May, 2025; originally announced May 2025.

  2. arXiv:2501.12907  [pdf, other]

    cs.SD eess.AS

    S-KEY: Self-supervised Learning of Major and Minor Keys from Audio

    Authors: Yuexuan Kong, Gabriel Meseguer-Brocal, Vincent Lostanlen, Mathieu Lagrange, Romain Hennequin

    Abstract: STONE, the current method in self-supervised learning for tonality estimation in music signals, cannot distinguish relative keys, such as C major versus A minor. In this article, we extend the neural network architecture and learning objective of STONE to perform self-supervised learning of major and minor keys (S-KEY). Our main contribution is an auxiliary pretext task to STONE, formulated using…

    Submitted 1 April, 2025; v1 submitted 22 January, 2025; originally announced January 2025.

  3. arXiv:2501.10111  [pdf, other]

    cs.SD eess.AS

    AI-Generated Music Detection and its Challenges

    Authors: Darius Afchar, Gabriel Meseguer-Brocal, Romain Hennequin

    Abstract: In the face of a new era of generative models, the detection of artificially generated content has become a matter of utmost importance. In particular, the ability to create credible minute-long synthetic music in a few seconds on user-friendly platforms poses a real threat of fraud on streaming services and unfair competition to human artists. This paper demonstrates the possibility (and surprisi…

    Submitted 17 January, 2025; originally announced January 2025.

    Comments: Accepted for IEEE ICASSP 2025. arXiv admin note: substantial text overlap with arXiv:2405.04181

  4. arXiv:2411.05649  [pdf, other]

    cs.IR

    Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation

    Authors: Elena V. Epure, Gabriel Meseguer-Brocal, Darius Afchar, Romain Hennequin

    Abstract: Recommender systems relying on Language Models (LMs) have gained popularity in assisting users to navigate large catalogs. LMs often exploit item high-level descriptors, i.e. categories or consumption contexts, from training data or user preferences. This has been proven effective in domains like movies or products. However, in the music domain, understanding how effectively LMs utilize song descr…

    Submitted 18 November, 2024; v1 submitted 8 November, 2024; originally announced November 2024.

    Journal ref: 3rd Workshop on NLP for Music and Audio collocated with ISMIR 2024

  5. arXiv:2408.16578  [pdf, other]

    cs.IR cs.LG

    Transformers Meet ACT-R: Repeat-Aware and Sequential Listening Session Recommendation

    Authors: Viet-Anh Tran, Guillaume Salha-Galvan, Bruno Sguerra, Romain Hennequin

    Abstract: Music streaming services often leverage sequential recommender systems to predict the best music to showcase to users based on past sequences of listening sessions. Nonetheless, most sequential recommendation methods ignore or insufficiently account for repetitive behaviors. This is a crucial limitation for music recommendation, as repeatedly listening to the same song over time is a common phenom…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 11 pages. Accepted by RecSys'2024, full paper

  6. arXiv:2407.08647  [pdf, other]

    cs.SD cs.IR cs.LG eess.AS

    From Real to Cloned Singer Identification

    Authors: Dorian Desblancs, Gabriel Meseguer-Brocal, Romain Hennequin, Manuel Moussallam

    Abstract: Cloned voices of popular singers sound increasingly realistic and have gained popularity over the past few years. They however pose a threat to the industry due to personality rights concerns. As such, methods to identify the original singer in synthetic voices are needed. In this paper, we investigate how singer identification methods could be used for such a task. We present three embedding mode…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: To be published at ISMIR 2024

  7. arXiv:2407.07408  [pdf, other]

    cs.SD eess.AS

    STONE: Self-supervised Tonality Estimator

    Authors: Yuexuan Kong, Vincent Lostanlen, Gabriel Meseguer-Brocal, Stella Wong, Mathieu Lagrange, Romain Hennequin

    Abstract: Although deep neural networks can estimate the key of a musical piece, their supervision incurs a massive annotation effort. Against this shortcoming, we present STONE, the first self-supervised tonality estimator. The architecture behind STONE, named ChromaNet, is a convnet with octave equivalence which outputs a key signature profile (KSP) of 12 structured logits. First, we train ChromaNet to re…

    Submitted 1 April, 2025; v1 submitted 10 July, 2024; originally announced July 2024.

  8. arXiv:2406.11380  [pdf, other]

    cs.CL

    Evaluating LLMs for Quotation Attribution in Literary Texts: A Case Study of LLaMa3

    Authors: Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara

    Abstract: Large Language Models (LLMs) have shown promising results in a variety of literary tasks, often using complex memorized details of narration and fictional characters. In this work, we evaluate the ability of Llama-3 at attributing utterances of direct-speech to their speaker in novels. The LLM shows impressive results on a corpus of 28 novels, surpassing published results with ChatGPT and encoder-…

    Submitted 26 January, 2025; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: NAACL 2025 Main Conference -- short paper

  9. Improving Quotation Attribution with Fictional Character Embeddings

    Authors: Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara

    Abstract: Humans naturally attribute utterances of direct speech to their speaker in literary works. When attributing quotes, we process contextual information but also access mental representations of characters that we build and revise throughout the narrative. Recent methods to automatically attribute such utterances have explored simulating human logic with deterministic rules or learning new implicit r…

    Submitted 4 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: EMNLP 2024 (Findings)

  10. arXiv:2406.04140  [pdf, other]

    cs.SD eess.AS

    STraDa: A Singer Traits Dataset

    Authors: Yuexuan Kong, Viet-Anh Tran, Romain Hennequin

    Abstract: There is a limited number of large-scale public datasets that contain downloadable music audio files and rich lead singer metadata. To provide such a dataset to benefit research in singing voices, we created the Singer Traits Dataset (STraDa) with two subsets: automatic-strada and annotated-strada. The automatic-strada contains twenty-five thousand tracks across numerous genres and languages of more t…

    Submitted 6 June, 2024; originally announced June 2024.

  11. arXiv:2405.04181  [pdf, other]

    cs.SD cs.LG eess.AS

    Detecting music deepfakes is easy but actually hard

    Authors: Darius Afchar, Gabriel Meseguer-Brocal, Romain Hennequin

    Abstract: In the face of a new era of generative models, the detection of artificially generated content has become a matter of utmost importance. The ability to create credible minute-long music deepfakes in a few seconds on user-friendly platforms poses a real threat of fraud on streaming services and unfair competition to human artists. This paper demonstrates the possibility (and surprising ease) of tra…

    Submitted 22 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: Under review

  12. arXiv:2404.09177  [pdf, other]

    cs.SD cs.LG eess.AS

    An Experimental Comparison Of Multi-view Self-supervised Methods For Music Tagging

    Authors: Gabriel Meseguer-Brocal, Dorian Desblancs, Romain Hennequin

    Abstract: Self-supervised learning has emerged as a powerful way to pre-train generalizable machine learning models on large amounts of unlabeled data. It is particularly compelling in the music domain, where obtaining labeled data is time-consuming, error-prone, and ambiguous. During the self-supervised process, models are trained on pretext tasks, with the primary objective of acquiring robust and informa…

    Submitted 14 April, 2024; originally announced April 2024.

  13. arXiv:2401.16968  [pdf, other]

    cs.CL

    Distinguishing Fictional Voices: a Study of Authorship Verification Models for Quotation Attribution

    Authors: Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara

    Abstract: Recent approaches to automatically detect the speaker of an utterance of direct speech often disregard general information about characters in favor of local information found in the context, such as surrounding mentions of entities. In this work, we explore stylistic representations of characters built by encoding their quotes with off-the-shelf pretrained Authorship Verification models in a larg…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: Accepted at EACL 2024's workshop LaTeCH-CLfL

  14. Ex2Vec: Characterizing Users and Items from the Mere Exposure Effect

    Authors: Bruno Sguerra, Viet-Anh Tran, Romain Hennequin

    Abstract: The traditional recommendation framework seeks to connect user and content, by finding the best match possible based on users' past interactions. However, a good content recommendation is not necessarily similar to what the user has chosen in the past. As humans, users naturally evolve, learn, forget, get bored, they change their perspective of the world and in consequence, of the recommendable cont…

    Submitted 17 November, 2023; originally announced November 2023.

    Journal ref: In Seventeenth ACM Conference on Recommender Systems (RecSys 2023)

  15. arXiv:2308.12767  [pdf, other]

    cs.IR cs.LG stat.ML

    On the Consistency of Average Embeddings for Item Recommendation

    Authors: Walid Bendada, Guillaume Salha-Galvan, Romain Hennequin, Thomas Bouabça, Tristan Cazenave

    Abstract: A prevalent practice in recommender systems consists in averaging item embeddings to represent users or higher-level concepts in the same embedding space. This paper investigates the relevance of such a practice. For this purpose, we propose an expected precision score, designed to measure the consistency of an average embedding relative to the items used for its construction. We subsequently anal…

    Submitted 30 August, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: 17th ACM Conference on Recommender Systems (RecSys 2023)

  16. arXiv:2307.01212  [pdf, other]

    cs.IR cs.LG cs.SD eess.AS

    Of Spiky SVDs and Music Recommendation

    Authors: Darius Afchar, Romain Hennequin, Vincent Guigue

    Abstract: The truncated singular value decomposition is a widely used methodology in music recommendation for direct similar-item retrieval or embedding musical items for downstream tasks. This paper investigates a curious effect that we show naturally occurring on many recommendation datasets: spiking formations in the embedding space. We first propose a metric to quantify this spiking organization's stren…

    Submitted 30 June, 2023; originally announced July 2023.

    Comments: Accepted for RecSys 2023 (Singapore, 18-22 September)

  17. arXiv:2304.08158  [pdf, other]

    cs.IR cs.LG

    Attention Mixtures for Time-Aware Sequential Recommendation

    Authors: Viet-Anh Tran, Guillaume Salha-Galvan, Bruno Sguerra, Romain Hennequin

    Abstract: Transformers emerged as powerful methods for sequential recommendation. However, existing architectures often overlook the complex dependencies between user preferences and the temporal context. In this short paper, we introduce MOJITO, an improved Transformer sequential recommender system that addresses this limitation. MOJITO leverages Gaussian mixtures of attention-based temporal context and it…

    Submitted 3 July, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: SIGIR 2023

  18. arXiv:2303.06944  [pdf, other]

    cs.CL

    A Human Subject Study of Named Entity Recognition (NER) in Conversational Music Recommendation Queries

    Authors: Elena V. Epure, Romain Hennequin

    Abstract: We conducted a human subject study of named entity recognition on a noisy corpus of conversational music recommendation queries, with many irregular and novel named entities. We evaluated the human NER linguistic behaviour in these challenging conditions and compared it with the most common NER systems nowadays, fine-tuned transformers. Our goal was to learn about the task to guide the design of b…

    Submitted 13 March, 2023; originally announced March 2023.

    Journal ref: The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)

  19. arXiv:2211.08972  [pdf, other]

    cs.LG cs.SI stat.ML

    New Frontiers in Graph Autoencoders: Joint Community Detection and Link Prediction

    Authors: Guillaume Salha-Galvan, Johannes F. Lutzeyer, George Dasoulas, Romain Hennequin, Michalis Vazirgiannis

    Abstract: Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction (LP). Their performances are less impressive on community detection (CD), where they are often outperformed by simpler alternatives such as the Louvain method. It is still unclear to what extent one can improve CD with GAE and VGAE, especially in the absence of node features. It is mo…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: This NeurIPS 2022 GLFrontiers workshop paper summarizes results from the following journal article: arXiv:2202.00961. arXiv admin note: text overlap with arXiv:2205.14651

  20. Discovery Dynamics: Leveraging Repeated Exposure for User and Music Characterization

    Authors: Bruno Sguerra, Viet-Anh Tran, Romain Hennequin

    Abstract: Repetition in music consumption is a common phenomenon. It is notably more frequent when compared to the consumption of other media, such as books and movies. In this paper, we show that one particularly interesting repetitive behavior arises when users are consuming new items. Users' interest tends to rise with the first repetitions and attains a peak after which interest will decrease with subse…

    Submitted 28 October, 2022; originally announced October 2022.

    Journal ref: In Sixteenth ACM Conference on Recommender Systems (RecSys 2022)

  21. arXiv:2207.11231  [pdf, other]

    cs.SD cs.LG eess.AS

    Learning Unsupervised Hierarchies of Audio Concepts

    Authors: Darius Afchar, Romain Hennequin, Vincent Guigue

    Abstract: Music signals are difficult to interpret from their low-level features, perhaps even more than images: e.g. highlighting part of a spectrogram or an image is often insufficient to convey high-level ideas that are genuinely relevant to humans. In computer vision, concept learning was therein proposed to adjust explanations to the right abstraction level (e.g. detect clinical concepts from radiograp…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: ISMIR 2022

  22. arXiv:2202.00961  [pdf, other]

    cs.LG cs.SI stat.ML

    Modularity-Aware Graph Autoencoders for Joint Community Detection and Link Prediction

    Authors: Guillaume Salha-Galvan, Johannes F. Lutzeyer, George Dasoulas, Romain Hennequin, Michalis Vazirgiannis

    Abstract: Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction. Their performances are less impressive on community detection problems where, according to recent and concurring experimental evaluations, they are often outperformed by simpler alternatives such as the Louvain method. It is currently still unclear to what extent one can improve com…

    Submitted 20 June, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: Accepted for publication in Elsevier's Neural Networks journal in 2022

  23. Explainability in Music Recommender Systems

    Authors: Darius Afchar, Alessandro B. Melchiorre, Markus Schedl, Romain Hennequin, Elena V. Epure, Manuel Moussallam

    Abstract: The most common way to listen to recorded music nowadays is via streaming platforms which provide access to tens of millions of tracks. To assist users in effectively browsing these large catalogs, the integration of Music Recommender Systems (MRSs) has become essential. Current real-world MRSs are often quite complex and optimized for recommendation accuracy. They combine several building blocks…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: To appear in AI Magazine, Special Topic on Recommender Systems 2022

    Journal ref: AI Magazine 43(2), 190-208, 2022

  24. arXiv:2108.11857  [pdf, other]

    cs.CL

    Probing Pre-trained Auto-regressive Language Models for Named Entity Typing and Recognition

    Authors: Elena V. Epure, Romain Hennequin

    Abstract: Despite impressive results of language models for named entity recognition (NER), their generalization to varied textual genres, a growing entity type set, and new entities remains a challenge. Collecting thousands of annotations in each new case for training or fine-tuning is expensive and time-consuming. In contrast, humans can easily identify named entities given some simple instructions. Inspi…

    Submitted 27 April, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: Accepted for publication in LREC2022

  25. arXiv:2108.04655  [pdf, other]

    cs.IR cs.LG

    Hierarchical Latent Relation Modeling for Collaborative Metric Learning

    Authors: Viet-Anh Tran, Guillaume Salha-Galvan, Romain Hennequin, Manuel Moussallam

    Abstract: Collaborative Metric Learning (CML) recently emerged as a powerful paradigm for recommendation based on implicit feedback collaborative filtering. However, standard CML methods learn fixed user and item representations, which fails to capture the complex interests of users. Existing extensions of CML also either ignore the heterogeneity of user-item relations, i.e. that a user can simultaneously l…

    Submitted 26 July, 2021; originally announced August 2021.

    Comments: 15th ACM Conference on Recommender Systems (RecSys 2021)

  26. arXiv:2108.01053  [pdf, other]

    cs.LG cs.IR cs.SI

    Cold Start Similar Artists Ranking with Gravity-Inspired Graph Autoencoders

    Authors: Guillaume Salha-Galvan, Romain Hennequin, Benjamin Chapus, Viet-Anh Tran, Michalis Vazirgiannis

    Abstract: On an artist's profile page, music streaming services frequently recommend a ranked list of "similar artists" that fans also liked. However, implementing such a feature is challenging for new artists, for which usage data on the service (e.g. streams or likes) is not yet available. In this paper, we model this cold start similar artists ranking problem as a link prediction task in a directed and a…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 15th ACM Conference on Recommender Systems (RecSys 2021)

  27. Singing Language Identification using a Deep Phonotactic Approach

    Authors: Lenny Renault, Andrea Vaglio, Romain Hennequin

    Abstract: Extensive works have tackled Language Identification (LID) in the speech domain; however, their application to the singing voice trails, and performance on Singing Language Identification (SLID) can be improved by leveraging recent progress made in other singing-related tasks. This work presents a modernized phonotactic system for SLID on polyphonic music: phoneme recognition is performed with a Con…

    Submitted 31 May, 2021; originally announced May 2021.

    Comments: 5 pages, 1 figure, ICASSP 2021

    Journal ref: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271-275

  28. arXiv:2104.12437  [pdf, other]

    cs.LG stat.ML

    Towards Rigorous Interpretations: a Formalisation of Feature Attribution

    Authors: Darius Afchar, Romain Hennequin, Vincent Guigue

    Abstract: Feature attribution is often loosely presented as the process of selecting a subset of relevant features as a rationale of a prediction. Task-dependent by nature, precise definitions of "relevance" encountered in the literature are however not always consistent. This lack of clarity stems from the fact that we usually do not have access to any notion of ground-truth attribution and from a more gen…

    Submitted 5 July, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

    Comments: 38th International Conference on Machine Learning (ICML 2021)

    Journal ref: PMLR 139:76-86, 2021

  29. arXiv:2010.06325  [pdf, other]

    cs.CL cs.LG

    Modeling the Music Genre Perception across Language-Bound Cultures

    Authors: Elena V. Epure, Guillaume Salha, Manuel Moussallam, Romain Hennequin

    Abstract: The music genre perception expressed through human annotations of artists or albums varies significantly across language-bound cultures. These variations cannot be modeled as mere translations since we also need to account for cultural differences in the music genre perception. In this work, we study the feasibility of obtaining relevant cross-lingual, culture-specific music genre annotations base…

    Submitted 16 November, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)

  30. arXiv:2009.07755  [pdf, other]

    cs.CL cs.IR cs.LG

    Multilingual Music Genre Embeddings for Effective Cross-Lingual Music Item Annotation

    Authors: Elena V. Epure, Guillaume Salha, Romain Hennequin

    Abstract: Annotating music items with music genres is crucial for music recommendation and information retrieval, yet challenging given that music genres are subjective concepts. Recently, in order to explicitly consider this subjectivity, the annotation of music items was modeled as a translation task: predict for a music item its music genres within a target vocabulary or taxonomy (tag system) from a set…

    Submitted 16 September, 2020; originally announced September 2020.

    Comments: 21st International Society for Music Information Retrieval Conference (ISMIR 2020)

  31. Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction

    Authors: Darius Afchar, Romain Hennequin

    Abstract: Explaining recommendations enables users to understand whether recommended items are relevant to their needs and has been shown to increase their trust in the system. More generally, if designing explainable machine learning models is key to check the sanity and robustness of a decision process and improve their efficiency, it however remains a challenge for complex architectures, especially deep…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: 14th ACM Conference on Recommender Systems (RecSys '20)

  32. arXiv:2002.01910  [pdf, other]

    cs.LG cs.SI stat.ML

    FastGAE: Scalable Graph Autoencoders with Stochastic Subgraph Decoding

    Authors: Guillaume Salha, Romain Hennequin, Jean-Baptiste Remy, Manuel Moussallam, Michalis Vazirgiannis

    Abstract: Graph autoencoders (AE) and variational autoencoders (VAE) are powerful node embedding methods, but suffer from scalability issues. In this paper, we introduce FastGAE, a general framework to scale graph AE and VAE to large graphs with millions of nodes and edges. Our strategy, based on an effective stochastic subgraph decoding scheme, significantly speeds up the training of graph AE and VAE while…

    Submitted 13 April, 2021; v1 submitted 5 February, 2020; originally announced February 2020.

    Comments: Accepted for publication in Elsevier's Neural Networks journal

  33. arXiv:2001.07614  [pdf, other]

    cs.LG cs.SI stat.ML

    Simple and Effective Graph Autoencoders with One-Hop Linear Models

    Authors: Guillaume Salha, Romain Hennequin, Michalis Vazirgiannis

    Abstract: Over the last few years, graph autoencoders (AE) and variational autoencoders (VAE) emerged as powerful node embedding methods, with promising performances on challenging tasks such as link prediction and node clustering. Graph AE, VAE and most of their extensions rely on multi-layer graph convolutional networks (GCN) encoders to learn vector space representations of nodes. In this paper, we show…

    Submitted 17 June, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: Accepted at ECML-PKDD 2020. A preliminary version of this work has previously been presented at the NeurIPS 2019 workshop on Graph Representation Learning: arXiv:1910.00942

  34. arXiv:1910.00942  [pdf, ps, other]

    cs.LG cs.SI stat.ML

    Keep It Simple: Graph Autoencoders Without Graph Convolutional Networks

    Authors: Guillaume Salha, Romain Hennequin, Michalis Vazirgiannis

    Abstract: Graph autoencoders (AE) and variational autoencoders (VAE) recently emerged as powerful node embedding methods, with promising performances on challenging tasks such as link prediction and node clustering. Graph AE, VAE and most of their extensions rely on graph convolutional networks (GCN) to learn vector space representations of nodes. In this paper, we propose to replace the GCN encoder by a si…

    Submitted 2 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019 Graph Representation Learning Workshop

  35. Improving Collaborative Metric Learning with Efficient Negative Sampling

    Authors: Viet-Anh Tran, Romain Hennequin, Jimena Royo-Letelier, Manuel Moussallam

    Abstract: Distance metric learning based on triplet loss has been applied with success in a wide range of applications such as face recognition, image retrieval, speaker change detection and recently recommendation with the CML model. However, as we show in this article, CML requires large batches to work reasonably well because of an overly simplistic uniform negative sampling strategy for selecting triplets.…

    Submitted 24 September, 2019; originally announced September 2019.

    Comments: SIGIR 2019

  36. arXiv:1907.08698  [pdf, other]

    cs.SD cs.IR cs.LG eess.AS stat.ML

    Leveraging Knowledge Bases And Parallel Annotations For Music Genre Translation

    Authors: Elena V. Epure, Anis Khlif, Romain Hennequin

    Abstract: Prevalent efforts have been put in automatically inferring genres of musical items. Yet, the proposed solutions often rely on simplifications and fail to address the diversity and subjectivity of music genres. Accounting for these has, though, many benefits for aligning knowledge sources, integrating data and enriching musical items with tags. Here, we choose a new angle for the genre study by seek…

    Submitted 27 July, 2019; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: Published in ISMIR 2019

  37. Singing voice separation: a study on training data

    Authors: Laure Prétet, Romain Hennequin, Jimena Royo-Letelier, Andrea Vaglio

    Abstract: In recent years, singing voice separation systems showed increased performance due to the use of supervised training. The design of training datasets is known as a crucial factor in the performance of such systems. We investigate how the characteristics of the training dataset impact the separation performance of state-of-the-art singing voice separation algorithms. We show that the separ…

    Submitted 6 June, 2019; originally announced June 2019.

    Journal ref: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

  38. arXiv:1905.09570  [pdf, other]

    cs.LG cs.SI stat.ML

    Gravity-Inspired Graph Autoencoders for Directed Link Prediction

    Authors: Guillaume Salha, Stratis Limnios, Romain Hennequin, Viet Anh Tran, Michalis Vazirgiannis

    Abstract: Graph autoencoders (AE) and variational autoencoders (VAE) recently emerged as powerful node embedding methods. In particular, graph AE and VAE were successfully leveraged to tackle the challenging link prediction problem, aiming at figuring out whether some pairs of nodes from a graph are connected by unobserved edges. However, these models focus on undirected graphs and therefore ignore the pote…

    Submitted 6 June, 2022; v1 submitted 23 May, 2019; originally announced May 2019.

    Comments: ACM International Conference on Information and Knowledge Management (CIKM 2019)

  39. arXiv:1902.08813  [pdf, other]

    cs.LG cs.SI stat.ML

    A Degeneracy Framework for Scalable Graph Autoencoders

    Authors: Guillaume Salha, Romain Hennequin, Viet Anh Tran, Michalis Vazirgiannis

    Abstract: In this paper, we present a general framework to scale graph autoencoders (AE) and graph variational autoencoders (VAE). This framework leverages graph degeneracy concepts to train models only from a dense subset of nodes instead of using the entire graph. Together with a simple yet effective propagation mechanism, our approach significantly improves scalability and training speed while preserving…

    Submitted 21 June, 2022; v1 submitted 23 February, 2019; originally announced February 2019.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI 2019)

  40. arXiv:1810.01807  [pdf, other]

    cs.IR cs.AI cs.LG cs.SD stat.ML

    Disambiguating Music Artists at Scale with Audio Metric Learning

    Authors: Jimena Royo-Letelier, Romain Hennequin, Viet-Anh Tran, Manuel Moussallam

    Abstract: We address the problem of disambiguating large scale catalogs through the definition of an unknown artist clustering task. We explore the use of metric learning techniques to learn artist embeddings directly from audio, and using a dedicated homonym artists dataset, we compare our method with a recent approach that learns similar embeddings using artist classifiers. While both systems have the abil…

    Submitted 3 October, 2018; originally announced October 2018.

    Comments: Published in ISMIR 2018

  41. arXiv:1809.07276  [pdf, other]

    cs.IR cs.LG cs.SD stat.ML

    Music Mood Detection Based On Audio And Lyrics With Deep Neural Net

    Authors: Rémi Delbouys, Romain Hennequin, Francesco Piccoli, Jimena Royo-Letelier, Manuel Moussallam

    Abstract: We consider the task of multimodal music mood prediction based on the audio signal and the lyrics of a track. We reproduce the implementation of traditional feature engineering based approaches and propose a new model based on deep learning. We compare the performance of both approaches on a database containing 18,000 tracks with associated valence and arousal values and show that our approach out…

    Submitted 19 September, 2018; originally announced September 2018.

    Comments: Published in ISMIR 2018

  42. arXiv:1809.07256  [pdf, other]

    cs.IR

    Audio Based Disambiguation Of Music Genre Tags

    Authors: Romain Hennequin, Jimena Royo-Letelier, Manuel Moussallam

    Abstract: In this paper, we propose to infer music genre embeddings from audio datasets carrying semantic information about genres. We show that such embeddings can be used for disambiguating genre tags (identification of different labels for the same genre, tag translation from a tag system to another, inference of hierarchical taxonomies on these genre tags). These embeddings are built by training a deep…

    Submitted 19 September, 2018; originally announced September 2018.

    Comments: Published in ISMIR 2018

  43. arXiv:1808.10351  [pdf, other]

    cs.IR cs.MM

    Large-Scale Cover Song Detection in Digital Music Libraries Using Metadata, Lyrics and Audio Features

    Authors: Albin Andrew Correya, Romain Hennequin, Mickaël Arcos

    Abstract: Cover song detection is a very relevant task in Music Information Retrieval (MIR) studies and has been mainly addressed using audio-based systems. Despite its potential impact in industrial contexts, low performances and lack of scalability have prevented such systems from being adopted in practice for large applications. In this work, we investigate whether textual music information (such as meta…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: Music Information Retrieval, Cover Song Identification, Million Song Dataset, Natural Language Processing