-
An ambient denoising method based on multi-channel non-negative matrix factorization for wheezing detection
Authors:
Antonio J. Muñoz-Montoro,
Pablo Revuelta-Sanz,
Damian Martínez-Muñoz,
Juan Torre-Cruz,
José Ranilla
Abstract:
In this paper, a parallel computing method is proposed to perform background denoising and wheezing detection on a multi-channel recording captured during auscultation. The proposed system is based on a non-negative matrix factorization (NMF) approach and a detection strategy. The model is initialized using singular value decomposition to avoid dependence on the initial values of the NMF parameters. Additionally, novel update rules have been designed to perform multichannel denoising while preserving an orthogonality constraint that maximizes source separation. The proposed system has been evaluated on the task of wheezing detection, showing a significant improvement over state-of-the-art algorithms when noisy sound sources are present. Moreover, parallel and high-performance techniques have been used to speed up the execution of the proposed system, showing that fast execution times are achievable, which enables its implementation in real-world scenarios.
Submitted 8 November, 2024;
originally announced November 2024.
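As a rough illustration of the factorization this abstract builds on, the following is a minimal sketch of plain single-channel NMF with the standard multiplicative updates for the squared Euclidean cost. It is not the authors' method: the SVD-based initialization, the multichannel model, and the orthogonality constraint described above are all omitted, and a seeded random initialization is assumed instead. The function names (`nmf`, `frob_error`, `matmul`, `transpose`) are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (n x m) as W (n x rank) times H (rank x m).

    Assumption: seeded random initialization, not the SVD-based one from the abstract.
    Returns W, H, and the reconstruction error recorded before and after each iteration.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frob_error(v, w, h)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
        errors.append(frob_error(v, w, h))
    return w, h, errors
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative without any explicit projection step.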
-
Unsupervised detection and classification of heartbeats using the dissimilarity matrix in PCG signals
Authors:
J. Torre-Cruz,
D. Martinez-Munoz,
N. Ruiz-Reyes,
A. J. Munoz-Montoro,
M. Puentes-Chiachio,
F. J. Canadas-Quesada
Abstract:
The proposed system consists of a two-stage cascade. The first stage performs a rough heartbeat detection, while the second stage refines it, improving the temporal localization and classifying the heartbeats into types S1 and S2. The first contribution is a novel approach that combines the dissimilarity matrix with the frame-level spectral divergence to locate heartbeats using the repetitiveness of the heart sounds and the temporal relationships between the intervals defined by the events S1/S2 and non-S1/S2 (systole and diastole). The second contribution is a verification-correction-classification process based on a sliding window that preserves the temporal structure of the cardiac cycle so that it can be used for heart sound classification. The proposed method has been assessed using the open access databases PASCAL and CirCor DigiScope Phonocardiogram, together with an additional sound mixing procedure that considers both Additive White Gaussian Noise (AWGN) and different kinds of clinical ambient noise from a commercial database. The proposed method provides the best detection/classification performance in realistic scenarios where cardiac anomalies and different types of clinical environmental noise are present in the PCG signal. Two results suggest that our proposal is a successful tool for heart sound segmentation: the promising modelling of the temporal structure of the heart sounds provided by the dissimilarity matrix together with the frame-level spectral divergence, and the removal of a significant number of spurious heart events and recovery of missing heart events by the proposed verification-correction-classification algorithm.
Submitted 5 November, 2024;
originally announced November 2024.
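To make the idea of exploiting repetitiveness through a dissimilarity matrix more concrete, here is a minimal sketch under stated assumptions. It uses cosine dissimilarity between per-frame feature vectors and picks the lag whose diagonal has the lowest mean dissimilarity as a crude repetition period. The abstract does not specify the distance measure or this lag search, and the frame-level spectral divergence is not modelled here; `dissimilarity_matrix` and `best_lag` are hypothetical names.

```python
import math


def dissimilarity_matrix(frames):
    """Pairwise cosine dissimilarity (1 - cosine similarity) between frames.

    `frames` is a list of equal-length feature vectors, for example magnitude
    spectra. Assumption: cosine distance stands in for the paper's measure.
    """
    n = len(frames)
    norms = [math.sqrt(sum(x * x for x in f)) for f in frames]
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            denom = norms[i] * norms[j]
            dot = sum(a * b for a, b in zip(frames[i], frames[j]))
            # All-zero frames are treated as maximally dissimilar.
            value = 1.0 - dot / denom if denom > 0 else 1.0
            d[i][j] = d[j][i] = value
    return d


def best_lag(d):
    """Lag (in frames) whose diagonal of `d` has the lowest mean dissimilarity."""
    n = len(d)
    best, best_mean = None, float("inf")
    for lag in range(1, n):
        values = [d[i][i + lag] for i in range(n - lag)]
        mean = sum(values) / len(values)
        if mean < best_mean:
            best, best_mean = lag, mean
    return best
```

A signal that alternates between two distinct spectral shapes yields zero dissimilarity at every second frame, so the estimated lag is 2 frames.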
-
Pre-trained Spatial Priors on Multichannel NMF for Music Source Separation
Authors:
Pablo Cabanas-Molero,
Antonio J. Munoz-Montoro,
Julio Carabias-Orti,
Pedro Vera-Candeas
Abstract:
This paper presents a novel approach to sound source separation that leverages spatial information obtained during the recording setup. Our method trains a spatial mixing filter using solo passages to capture information about the room impulse response and transducer response at each sensor location. This pre-trained filter is then integrated into a multichannel non-negative matrix factorization (MNMF) scheme to better capture the variances of different sound sources. The recording setup used in our experiments is the typical setup for orchestra recordings, with a main microphone and a close "cardioid" or "supercardioid" microphone for each section of the orchestra. This makes the proposed method applicable to many existing recordings. Experiments on polyphonic ensembles demonstrate the effectiveness of the proposed framework in separating individual sound sources, improving performance compared to conventional MNMF methods.
Submitted 9 October, 2023;
originally announced October 2023.
-
Multichannel Singing Voice Separation by Deep Neural Network Informed DOA Constrained CNMF
Authors:
Antonio J. Muñoz-Montoro,
Julio J. Carabias-Orti,
Archontis Politis,
Konstantinos Drossos
Abstract:
This work addresses the problem of multichannel source separation by combining two approaches: multichannel spectral factorization and recent monophonic deep-learning (DL) based spectrum inference. Individual source spectra at different channels are estimated with a Masker-Denoiser Twin Network (MaD TwinNet), which is able to model long-term temporal patterns of a musical piece. The monophonic source spectrograms are used within a spatial covariance mixing model based on Complex Non-Negative Matrix Factorization (CNMF) that predicts the spatial characteristics of each source. The proposed framework is evaluated on the task of singing voice separation with a large multichannel dataset. Experimental results show that our joint DL+CNMF method outperforms both the individual monophonic DL-based separation and the multichannel CNMF baseline methods.
Submitted 2 March, 2020;
originally announced March 2020.
-
A new definition of the distortion matrix for an audio-to-score alignment system
Authors:
A. J. Muñoz-Montoro,
P. Vera-Candeas,
D. Suarez-Dou,
R. Cortina
Abstract:
In this paper we present a new definition of the distortion matrix for a score following framework based on DTW. The proposal consists of arranging the score information in a sequence of note combinations and learning a spectral pattern for each combination using instrument models. Then, the distortion matrix is computed using these spectral patterns and a novel decomposition of the input signal.
Submitted 29 May, 2019;
originally announced May 2019.
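For readers unfamiliar with how a distortion matrix feeds a DTW-based alignment, the sketch below shows the generic technique only. It assumes the distortion between a score state and an audio frame is the squared Euclidean distance between a spectral pattern and the frame's spectrum, which is not the new definition proposed in the paper. It then runs textbook offline DTW with unit steps. The names `distortion_matrix` and `dtw_path` are hypothetical.

```python
def distortion_matrix(patterns, frames):
    """cost[i][j] = squared Euclidean distance between pattern i and frame j.

    Assumption: a plain distance stands in for the paper's distortion measure.
    """
    return [[sum((p - f) ** 2 for p, f in zip(pattern, frame)) for frame in frames]
            for pattern in patterns]


def dtw_path(cost):
    """Minimum-cost monotonic path from (0, 0) to the last cell of `cost`.

    Allowed steps: advance the score state, advance the frame, or both.
    Returns (total_cost, path), where path is a list of (state, frame) pairs.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]
    acc[0][0] = cost[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = min(
                acc[i - 1][j] if i > 0 else inf,
                acc[i][j - 1] if j > 0 else inf,
                acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            acc[i][j] = cost[i][j] + best
    # Backtrack from the last cell, always moving to the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return acc[n - 1][m - 1], path
```

With two patterns and three frames where the first two frames match the first pattern and the last frame matches the second, the path stays on the first state for two frames before advancing. A score follower would need an online variant of this recursion; the offline form is used here only to keep the sketch short.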