Showing 1–6 of 6 results for author: Aroudi, A

  1. arXiv:2408.11346

    cs.HC

    Non-verbal Hands-free Control for Smart Glasses using Teeth Clicks

    Authors: Payal Mohapatra, Ali Aroudi, Anurag Kumar, Morteza Khaleghimeybodi

    Abstract: Smart glasses are emerging as a popular wearable computing platform potentially revolutionizing the next generation of human-computer interaction. The widespread adoption of smart glasses has created a pressing need for discreet and hands-free control methods. Traditional input techniques, such as voice commands or tactile gestures, can be intrusive and non-discreet. Additionally, voice-based cont…

    Submitted 21 August, 2024; originally announced August 2024.

  2. arXiv:2408.06468

    cs.SD cs.MM eess.AS eess.SP

    FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses

    Authors: Zhongweiyang Xu, Ali Aroudi, Ke Tan, Ashutosh Pandey, Jung-Suk Lee, Buye Xu, Francesco Nesta

    Abstract: This paper presents a novel multi-channel speech enhancement approach, FoVNet, that enables highly efficient speech enhancement within a configurable field of view (FoV) of a smart-glasses user without needing specific target-talker(s) directions. It advances over prior works by enhancing all speakers within any given FoV, with a hybrid signal processing and deep learning approach designed with hi…

    Submitted 12 August, 2024; originally announced August 2024.

    Comments: Accepted by INTERSPEECH 2024

  3. arXiv:2110.04047

    eess.AS cs.SD eess.SP

    TRUNet: Transformer-Recurrent-U Network for Multi-channel Reverberant Sound Source Separation

    Authors: Ali Aroudi, Stefan Uhlich, Marc Ferras Font

    Abstract: In recent years, many deep learning techniques for single-channel sound source separation have been proposed using recurrent, convolutional and transformer networks. When multiple microphones are available, spatial diversity between speakers and background noise in addition to spectro-temporal diversity can be exploited by using multi-channel filters for sound source separation. Aiming at end-to-e…

    Submitted 22 August, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

  4. arXiv:2010.11566

    eess.AS cs.AI eess.SP

    DBNET: DOA-driven beamforming network for end-to-end farfield sound source separation

    Authors: Ali Aroudi, Sebastian Braun

    Abstract: Many deep learning techniques are available to perform source separation and reduce background noise. However, designing an end-to-end multi-channel source separation method using deep learning and conventional acoustic signal processing techniques still remains challenging. In this paper we propose a direction-of-arrival-driven beamforming network (DBnet) consisting of direction-of-arrival (DOA)…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: 5 pages, 4 figures

  5. arXiv:2005.04669

    cs.SD eess.AS eess.SP

    Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding

    Authors: Ali Aroudi, Marc Delcroix, Tomohiro Nakatani, Keisuke Kinoshita, Shoko Araki, Simon Doclo

    Abstract: The performance of speech enhancement algorithms in a multi-speaker scenario depends on correctly identifying the target speaker to be enhanced. Auditory attention decoding (AAD) methods allow to identify the target speaker which the listener is attending to from single-trial EEG recordings. Aiming at enhancing the target speaker and suppressing interfering speakers, reverberation and ambient nois…

    Submitted 10 May, 2020; originally announced May 2020.

  6. arXiv:2004.00910

    eess.AS cs.LG cs.SD eess.SP

    Improving auditory attention decoding performance of linear and non-linear methods using state-space model

    Authors: Ali Aroudi, Tobias de Taillez, Simon Doclo

    Abstract: Identifying the target speaker in hearing aid applications is crucial to improve speech understanding. Recent advances in electroencephalography (EEG) have shown that it is possible to identify the target speaker from single-trial EEG recordings using auditory attention decoding (AAD) methods. AAD methods reconstruct the attended speech envelope from EEG recordings, based on a linear least-squares…

    Submitted 2 April, 2020; originally announced April 2020.