-
A dataset and classification model for Malay, Hindi, Tamil and Chinese music
Authors:
Fajilatun Nahar,
Kat Agres,
Balamurali BT,
Dorien Herremans
Abstract:
In this paper we present a new dataset with musical excerpts from the three main ethnic groups in Singapore: Chinese, Malay and Indian (both Hindi and Tamil). We use this new dataset to train different classification models to distinguish the origin of the music in terms of these ethnic groups. The classification models were optimized by exploring the use of different musical features as the input. Both high-level features, i.e., musically meaningful features, and low-level features, i.e., spectrogram-based features, were extracted from the audio files so as to optimize the performance of the different classification models.
Submitted 15 September, 2020; v1 submitted 9 September, 2020;
originally announced September 2020.
-
Towards robust audio spoofing detection: a detailed comparison of traditional and learned features
Authors:
Balamurali BT,
Kin Wah Edward Lin,
Simon Lui,
Jer-Ming Chen,
Dorien Herremans
Abstract:
Automatic speaker verification, like every other biometric system, is vulnerable to spoofing attacks. Using only a few minutes of recorded voice of a genuine client of a speaker verification system, attackers can develop a variety of spoofing attacks that might trick such systems. Detecting these attacks using the audio cues present in the recordings is an important challenge. Most existing spoofing detection systems depend on knowing the spoofing technique used. With this research, we aim to overcome this limitation by examining robust audio features, both traditional and those learned through an autoencoder, that generalize across different types of replay spoofing. Furthermore, we provide a detailed account of all the steps necessary to set up state-of-the-art audio feature detection, pre- and postprocessing, so that machine learning researchers who are not audio experts can implement such systems. Finally, we evaluate the performance of our robust replay speaker detection system with a wide variety of extracted and machine-learned audio features, and different combinations of them, on the "out in the wild" ASVspoof 2017 dataset. This dataset contains a variety of new spoofing configurations. Since our focus is on examining which features will ensure robustness, we base our system on a traditional Gaussian Mixture Model-Universal Background Model. We then systematically investigate the relative contribution of each feature set. The fused models, based on the known audio features and the machine-learned features respectively, have comparable performance, with an Equal Error Rate (EER) of 12. The final best-performing model, which obtains an EER of 10.8, is a hybrid model that contains both known and machine-learned features, thus revealing the importance of incorporating both types of features when developing a robust spoofing prediction model.
Submitted 18 June, 2019; v1 submitted 28 May, 2019;
originally announced May 2019.
-
D3.2: SPEED-5G enhanced functional and system architecture, scenarios and performance evaluation metrics
Authors:
Shahid Mumtaz,
Kazi Saidul Huq,
Jonathan Rodriguez,
Paulo Marques,
Ayman Radwan,
Keith Briggs,
Michael Fitch (BT),
Andreas Georgakopoulos,
Ioannis-Prodromos Belikaidis,
Panagiotis Vlacheas,
Dimitrios Kelaidonis,
Evangelos Kosmatos,
Serafim Kotrotsos,
Stavroula Vassaki,
Yiouli Kritikou,
Panagiotis Demestichas,
Kostas Tsagkaris,
Evangelia Tzifa,
Aikaterini Demesticha,
Vera Stavroulaki,
Athina Ropodi,
Evangelos Argoudelis,
Marinos Galiatsatos,
Aristotelis Margaris,
George Paitaris,
Dimitrios Kardaris, et al. (14 additional authors not shown)
Abstract:
This deliverable contains a detailed description of the use cases considered in SPEED-5G, which will be used as a basis for demonstration in the project. These use cases are dynamic channel selection, load balancing, and carrier aggregation. This deliverable also explains the SPEED-5G architecture design principles, which are based on software-defined networking and network function virtualisation. The degree of virtualisation is further illustrated by a number of novel contributions from the partners involved. Finally, KPIs for each use case are presented, along with a description of how these KPIs can support the 5G-PPP KPIs.
Submitted 14 November, 2017; v1 submitted 9 November, 2017;
originally announced November 2017.