Showing 1–3 of 3 results for author: Kienegger, J

  1. arXiv:2507.02791  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance

    Authors: Jakob Kienegger, Alina Mannanova, Huajian Fang, Timo Gerkmann

    Abstract: Recent works on deep non-linear spatially selective filters demonstrate exceptional enhancement performance with computationally lightweight architectures for stationary speakers of known directions. However, to maintain this performance in dynamic scenarios, resource-intensive data-driven tracking algorithms become necessary to provide precise spatial guidance conditioned on the initial direction…

    Submitted 5 July, 2025; v1 submitted 3 July, 2025; originally announced July 2025.

    Comments: Accepted at IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2025. Video demonstration: https://youtu.be/aSKOSh5JZ3o

  2. arXiv:2505.14517  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Steering Deep Non-Linear Spatially Selective Filters for Weakly Guided Extraction of Moving Speakers in Dynamic Scenarios

    Authors: Jakob Kienegger, Timo Gerkmann

    Abstract: Recent speaker extraction methods using deep non-linear spatial filtering perform exceptionally well when the target direction is known and stationary. However, spatially dynamic scenarios are considerably more challenging due to time-varying spatial features and arising ambiguities, e.g. when moving speakers cross. While in a static scenario it may be easy for a user to point to the target's dire…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: Accepted at Interspeech 2025

  3. arXiv:2410.19595  [pdf, other]

    eess.AS cs.LG cs.SD

    Mask-Weighted Spatial Likelihood Coding for Speaker-Independent Joint Localization and Mask Estimation

    Authors: Jakob Kienegger, Alina Mannanova, Timo Gerkmann

    Abstract: Due to their robustness and flexibility, neural-driven beamformers are a popular choice for speech separation in challenging environments with a varying amount of simultaneous speakers alongside noise and reverberation. Time-frequency masks and relative directions of the speakers regarding a fixed spatial grid can be used to estimate the beamformer's parameters. To some degree, speaker-independenc…

    Submitted 8 January, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

    Comments: ©2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works