
Showing 1–10 of 10 results for author: Herremans, D

Searching in archive stat.
  1. arXiv:2203.03022  [pdf, ps, other]

    cs.SD cs.AI cs.LG eess.AS stat.ML

    HEAR: Holistic Evaluation of Audio Representations

    Authors: Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk

    Abstract: What audio embedding approach generalizes best to a wide range of downstream tasks across a variety of everyday domains without fine-tuning? The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios. HEAR evaluates audio representations using a benchmark suite across a variety of domains, in…

    Submitted 29 May, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

    Comments: to appear in Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

    Journal ref: Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track

  2. arXiv:2007.15474  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling

    Authors: Hao Hao Tan, Dorien Herremans

    Abstract: High-level musical qualities (such as emotion) are often abstract, subjective, and hard to quantify. Given these difficulties, it is not easy to learn good feature representations with supervised learning techniques, either because of the insufficiency of labels, or the subjectiveness (and hence large variance) in human-annotated labels. In this paper, we present a framework that can learn high-le…

    Submitted 29 July, 2020; originally announced July 2020.

    Journal ref: Proc. of 21st International Society of Music Information Retrieval Conference, ISMIR 2020

  3. arXiv:2006.09016  [pdf, other]

    physics.comp-ph cs.LG stat.AP

    Acoustic prediction of flowrate: varying liquid jet stream onto a free surface

    Authors: Balamurali B T, Edwin Jonathan Aslim, Yun Shu Lynn Ng, Tricia Li, Chuen Kuo, Jacob Shihang Chen, Dorien Herremans, Lay Guat Ng, Jer-Ming Chen

    Abstract: Information on liquid jet stream flow is crucial in many real world applications. In a large number of cases, these flows fall directly onto free surfaces (e.g. pools), creating a splash with accompanying splashing sounds. The sound produced is supplied by energy interactions between the liquid jet stream and the passive free surface. In this investigation, we collect the sound of a water jet of v…

    Submitted 16 June, 2020; originally announced June 2020.

    MSC Class: 76-XX; 92C55; 92-XX ACM Class: J.2

    Journal ref: Proceedings of the IEEE International Conference on Signal Processing and Communications (SPCOM), 2020

  4. arXiv:1912.02613  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders

    Authors: Yin-Jyun Luo, Chin-Chen Hsu, Kat Agres, Dorien Herremans

    Abstract: We propose a flexible framework that deals with both singer conversion and singers' vocal technique conversion. The proposed model is trained on non-parallel corpora, accommodates many-to-many conversion, and leverages recent advances of variational autoencoders. It employs separate encoders to learn disentangled latent representations of singer identity and vocal technique separately, with a joint…

    Submitted 24 February, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: Accepted to ICASSP 2020

  5. arXiv:1909.02850  [pdf, other]

    eess.SP cs.LG cs.SD stat.ML

    Doppler Invariant Demodulation for Shallow Water Acoustic Communications Using Deep Belief Networks

    Authors: Abigail Lee-Leon, Chau Yuen, Dorien Herremans

    Abstract: Shallow water environments create a challenging channel for communications. In this paper, we focus on the challenges posed by the frequency-selective signal distortion called the Doppler effect. We explore the design and performance of machine learning (ML) based demodulation methods: (1) Deep Belief Network-feed forward Neural Network (DBN-NN) and (2) Deep Belief Network-Convolutional Neural…

    Submitted 5 September, 2019; originally announced September 2019.

    Journal ref: Proceedings of 16th IEEE Asia Pacific Wireless Communications Symposium (APWCS). 2019. Singapore

  6. arXiv:1906.08152  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders

    Authors: Yin-Jyun Luo, Kat Agres, Dorien Herremans

    Abstract: In this paper, we learn disentangled representations of timbre and pitch for musical instrument sounds. We adapt a framework based on variational autoencoders with Gaussian mixture latent distributions. Specifically, we use two separate encoders to learn distinct latent spaces for timbre and pitch, which form Gaussian mixture components representing instrument identity and pitch, respectively. For…

    Submitted 29 June, 2019; v1 submitted 19 June, 2019; originally announced June 2019.

    Comments: 20th Conference of the International Society for Music Information Retrieval

  7. arXiv:1905.12439  [pdf, other]

    cs.SD cs.CR cs.LG cs.MM stat.ML

    Towards robust audio spoofing detection: a detailed comparison of traditional and learned features

    Authors: Balamurali BT, Kin Wah Edward Lin, Simon Lui, Jer-Ming Chen, Dorien Herremans

    Abstract: Automatic speaker verification, like every other biometric system, is vulnerable to spoofing attacks. Using only a few minutes of recorded voice of a genuine client of a speaker verification system, attackers can develop a variety of spoofing attacks that might trick such systems. Detecting these attacks using the audio cues present in the recordings is an important challenge. Most existing spoofi…

    Submitted 18 June, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Journal ref: IEEE Access. 2019

  8. arXiv:1905.08076  [pdf, other]

    cs.SD cs.IR cs.LG eess.AS stat.ML

    Dance Hit Song Prediction

    Authors: Dorien Herremans, David Martens, Kenneth Sörensen

    Abstract: Record companies invest billions of dollars in new talent around the globe each year. Gaining insight into what actually makes a hit song would provide tremendous benefits for the music industry. In this research we tackle this question by focussing on the dance hit song classification problem. A database of dance hit songs from 1985 until 2013 is built, including basic musical features, as well a…

    Submitted 17 May, 2019; originally announced May 2019.

    Journal ref: Journal of New Music Research. 43:302 (2014)

  9. arXiv:1812.01278  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS stat.ML

    Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy

    Authors: Kin Wah Edward Lin, Balamurali B. T., Enyan Koh, Simon Lui, Dorien Herremans

    Abstract: Separating a singing voice from its music accompaniment remains an important challenge in the field of music information retrieval. We present a unique neural network approach inspired by a technique that has revolutionized the field of vision: pixel-wise image classification, which we combine with cross entropy loss and pretraining of the CNN as an autoencoder on singing voice spectrograms. The p…

    Submitted 4 December, 2018; originally announced December 2018.

    Comments: In Press, Neural Computing and Applications, Springer. 2019

    MSC Class: 68-XX; 68Txx

  10. arXiv:1811.12408  [pdf, other]

    cs.SD cs.IR cs.LG eess.AS stat.ML

    From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec

    Authors: Ching-Hua Chuan, Kat Agres, Dorien Herremans

    Abstract: We explore the potential of a popular distributional semantics vector space model, word2vec, for capturing meaningful relationships in ecological (complex polyphonic) music. More precisely, the skip-gram version of word2vec is used to model slices of music from a large corpus spanning eight musical genres. In this newly learned vector space, a metric based on cosine distance is able to distinguish…

    Submitted 29 November, 2018; originally announced November 2018.

    Comments: Accepted for publication in Neural Computing and Applications, Springer. In Press

    MSC Class: 68Txx; 68Wxx

    Journal ref: Neural Computing and Applications, Springer. 2019