
Showing 1–5 of 5 results for author: Elminshawi, M

  1. arXiv:2303.08702  [pdf, other]

    eess.AS cs.SD

    Beamformer-Guided Target Speaker Extraction

    Authors: Mohamed Elminshawi, Srikanth Raj Chetupalli, Emanuël A. P. Habets

    Abstract: We propose a Beamformer-guided Target Speaker Extraction (BG-TSE) method to extract a target speaker's voice from a multi-channel recording informed by the direction of arrival of the target. The proposed method employs a front-end beamformer steered towards the target speaker to provide an auxiliary signal to a single-channel TSE system. By allowing for time-varying embeddings in the single-chann…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: Submitted to the 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2023)

  2. arXiv:2206.13808  [pdf, other]

    eess.AS cs.SD

    Speaker Verification in Multi-Speaker Environments Using Temporal Feature Fusion

    Authors: Ahmad Aloradi, Wolfgang Mack, Mohamed Elminshawi, Emanuël A. P. Habets

    Abstract: Verifying the identity of a speaker is crucial in modern human-machine interfaces, e.g., to ensure privacy protection or to enable biometric authentication. Classical speaker verification (SV) approaches estimate a fixed-dimensional embedding from a speech utterance that encodes the speaker's voice characteristics. A speaker is verified if his/her voice embedding is sufficiently similar to the emb…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: To be presented at EUSIPCO 2022

  3. arXiv:2202.00733  [pdf, other]

    eess.AS cs.SD

    New Insights on Target Speaker Extraction

    Authors: Mohamed Elminshawi, Wolfgang Mack, Srikanth Raj Chetupalli, Soumitro Chakrabarty, Emanuël A. P. Habets

    Abstract: Speaker extraction (SE) aims to segregate the speech of a target speaker from a mixture of interfering speakers with the help of auxiliary information. Several forms of auxiliary information have been employed in single-channel SE, such as a speech snippet enrolled from the target speaker or visual information corresponding to the spoken utterance. The effectiveness of the auxiliary information in…

    Submitted 15 September, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

  4. arXiv:2011.04569  [pdf, other]

    eess.AS cs.AI cs.SD

    Informed Source Extraction With Application to Acoustic Echo Reduction

    Authors: Mohamed Elminshawi, Wolfgang Mack, Emanuël A. P. Habets

    Abstract: Informed speaker extraction aims to extract a target speech signal from a mixture of sources given prior knowledge about the desired speaker. Recent deep learning-based methods leverage a speaker discriminative model that maps a reference snippet uttered by the target speaker into a single embedding vector that encapsulates the characteristics of the target speaker. However, such modeling delibera…

    Submitted 26 October, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Published at ITG 2021

    Report number: 978-3-8007-5627-8

  5. arXiv:2007.01579  [pdf, ps, other]

    eess.AS cs.SD

    Noise-Robust Adaptation Control for Supervised Acoustic System Identification Exploiting A Noise Dictionary

    Authors: Thomas Haubner, Andreas Brendel, Mohamed Elminshawi, Walter Kellermann

    Abstract: We present a noise-robust adaptation control strategy for block-online supervised acoustic system identification by exploiting a noise dictionary. The proposed algorithm takes advantage of the pronounced spectral structure which characterizes many types of interfering noise signals. We model the noisy observations by a linear Gaussian Discrete Fourier Transform-domain state space model whose param…

    Submitted 3 February, 2021; v1 submitted 3 July, 2020; originally announced July 2020.