Showing 1–30 of 30 results for author: Moeller, S

Searching in archive eess.
  1. arXiv:2505.17584  [pdf, ps, other]

    eess.AS cs.SD

    Private kNN-VC: Interpretable Anonymization of Converted Speech

    Authors: Carlos Franzreb, Arnab Das, Tim Polzehl, Sebastian Möller

    Abstract: Speaker anonymization seeks to conceal a speaker's identity while preserving the utility of their speech. The achieved privacy is commonly evaluated with a speaker recognition model trained on anonymized speech. Although this represents a strong attack, it is unclear which aspects of speech are exploited to identify the speakers. Our research sets out to unveil these aspects. It starts with kNN-VC…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: Accepted by Interspeech 2025

  2. arXiv:2505.13930  [pdf, ps, other]

    cs.SD eess.AS

    BiCrossMamba-ST: Speech Deepfake Detection with Bidirectional Mamba Spectro-Temporal Cross-Attention

    Authors: Yassine El Kheir, Tim Polzehl, Sebastian Möller

    Abstract: We propose BiCrossMamba-ST, a robust framework for speech deepfake detection that leverages a dual-branch spectro-temporal architecture powered by bidirectional Mamba blocks and mutual cross-attention. By processing spectral sub-bands and temporal intervals separately and then integrating their representations, BiCrossMamba-ST effectively captures the subtle cues of synthetic speech. In addition,…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: Accepted at Interspeech 2025

  3. arXiv:2502.03559  [pdf, other]

    eess.AS cs.SD

    Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection

    Authors: Yassine El Kheir, Youness Samih, Suraj Maharjan, Tim Polzehl, Sebastian Möller

    Abstract: This paper conducts a comprehensive layer-wise analysis of self-supervised learning (SSL) models for audio deepfake detection across diverse contexts, including multilingual datasets (English, Chinese, Spanish), partial, song, and scene-based deepfake scenarios. By systematically evaluating the contributions of different transformer layers, we uncover critical insights into model behavior and perf…

    Submitted 7 February, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

    Comments: Accepted to NAACL Findings 2025

  4. arXiv:2204.07923  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Accelerated MRI With Deep Linear Convolutional Transform Learning

    Authors: Hongyi Gu, Burhaneddin Yaman, Steen Moeller, Il Yong Chun, Mehmet Akçakaya

    Abstract: Recent studies show that deep learning (DL) based MRI reconstruction outperforms conventional methods, such as parallel imaging and compressed sensing (CS), in multiple applications. Unlike CS that is typically implemented with pre-determined linear representations for regularization, DL inherently uses a non-linear representation learned from a large database. Another line of work uses transform…

    Submitted 19 August, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

  5. arXiv:2204.01115  [pdf, other]

    cs.SD cs.HC eess.AS

    On incorporating social speaker characteristics in synthetic speech

    Authors: Sai Sirisha Rallabandi, Sebastian Möller

    Abstract: In our previous work, we derived the acoustic features that contribute to the perception of warmth and competence in synthetic speech. As an extension, in our current work, we investigate the impact of the derived vocal features in the generation of the desired characteristics. The acoustic features, spectral flux, F1 mean and F2 mean and their convex combinations were explored for the generation…

    Submitted 3 April, 2022; originally announced April 2022.

    Comments: Submitted to Interspeech 2022

  6. arXiv:2203.16032  [pdf, other]

    cs.SD eess.AS

    ConferencingSpeech 2022 Challenge: Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge for Online Conferencing Applications

    Authors: Gaoxiong Yi, Wei Xiao, Yiming Xiao, Babak Naderi, Sebastian Möller, Wafaa Wardah, Gabriel Mittag, Ross Cutler, Zhuohuang Zhang, Donald S. Williamson, Fei Chen, Fuzheng Yang, Shidong Shang

    Abstract: With the advances in speech communication systems such as online conferencing applications, we can seamlessly work with people regardless of where they are. However, during online meetings, speech quality can be significantly affected by background noise, reverberation, packet loss, network jitter, etc. Because of its nature, speech quality is traditionally assessed in subjective tests in laborato…

    Submitted 31 March, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

  7. arXiv:2112.06219  [pdf, other]

    cs.SD cs.LG eess.AS

    Visualising and Explaining Deep Learning Models for Speech Quality Prediction

    Authors: H. Tilkorn, G. Mittag, S. Möller

    Abstract: Estimating quality of transmitted speech is known to be a non-trivial task. While traditionally test participants are asked to rate the quality of samples, nowadays automated methods are available. These methods can be divided into: 1) intrusive models, which use both the original and the degraded signals, and 2) non-intrusive models, which only require the degraded signal. Recently, non-intrus…

    Submitted 12 December, 2021; originally announced December 2021.

    Comments: 4 pages, 6 figures, In Proceedings of the DAGA 2021 (the annual conference of the German Acoustical Society, DEGA)

    ACM Class: I.2.7

  8. arXiv:2105.05827  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    20-fold Accelerated 7T fMRI Using Referenceless Self-Supervised Deep Learning Reconstruction

    Authors: Omer Burak Demirel, Burhaneddin Yaman, Logan Dowdle, Steen Moeller, Luca Vizioli, Essa Yacoub, John Strupp, Cheryl A. Olman, Kâmil Uğurbil, Mehmet Akçakaya

    Abstract: High spatial and temporal resolution across the whole brain is essential to accurately resolve neural activities in fMRI. Therefore, accelerated imaging techniques target improved coverage with high spatio-temporal resolution. Simultaneous multi-slice (SMS) imaging combined with in-plane acceleration is used in large studies that involve ultrahigh field fMRI, such as the Human Connectome Project.…

    Submitted 12 May, 2021; originally announced May 2021.

  9. arXiv:2105.04532  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Improved Simultaneous Multi-Slice Functional MRI Using Self-supervised Deep Learning

    Authors: Omer Burak Demirel, Burhaneddin Yaman, Logan Dowdle, Steen Moeller, Luca Vizioli, Essa Yacoub, John Strupp, Cheryl A. Olman, Kâmil Uğurbil, Mehmet Akçakaya

    Abstract: Functional MRI (fMRI) is commonly used for interpreting neural activities across the brain. Numerous accelerated fMRI techniques aim to provide improved spatiotemporal resolutions. Among these, simultaneous multi-slice (SMS) imaging has emerged as a powerful strategy, becoming a part of large-scale studies, such as the Human Connectome Project. However, when SMS imaging is combined with in-plane a…

    Submitted 10 May, 2021; originally announced May 2021.

  10. arXiv:2105.00783  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD

    Full-Reference Speech Quality Estimation with Attentional Siamese Neural Networks

    Authors: Gabriel Mittag, Sebastian Möller

    Abstract: In this paper, we present a full-reference speech quality prediction model with a deep learning approach. The model determines a feature representation of the reference and the degraded signal through a siamese recurrent convolutional network that shares the weights for both signals as input. The resulting features are then used to align the signals with an attention mechanism and are finally comb…

    Submitted 3 May, 2021; originally announced May 2021.

    Comments: Late upload, presented at ICASSP 2020

  11. arXiv:2104.11673  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    Deep Learning Based Assessment of Synthetic Speech Naturalness

    Authors: Gabriel Mittag, Sebastian Möller

    Abstract: In this paper, we present a new objective prediction model for synthetic speech naturalness. It can be used to evaluate Text-To-Speech or Voice Conversion systems and works language independently. The model is trained end-to-end and based on a CNN-LSTM network that was previously shown to give good results for speech quality estimation. We trained and tested the model on 16 different datasets, such a…

    Submitted 23 April, 2021; originally announced April 2021.

    Comments: Late upload, presented at Interspeech 2020

  12. arXiv:2104.10217  [pdf, other]

    eess.AS cs.LG cs.SD eess.IV

    Bias-Aware Loss for Training Image and Speech Quality Prediction Models from Multiple Datasets

    Authors: Gabriel Mittag, Saman Zadtootaghaj, Thilo Michael, Babak Naderi, Sebastian Möller

    Abstract: The ground truth used for training image, video, or speech quality prediction models is based on the Mean Opinion Scores (MOS) obtained from subjective experiments. Usually, it is necessary to conduct multiple experiments, mostly with different test participants, to obtain enough data to train quality models based on machine learning. Each of these experiments is subject to an experiment-specific…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: Accepted at QoMEX 2021

  13. arXiv:2104.09494  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD

    NISQA: A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets

    Authors: Gabriel Mittag, Babak Naderi, Assmaa Chehadi, Sebastian Möller

    Abstract: In this paper, we present an update to the NISQA speech quality prediction model that is focused on distortions that occur in communication networks. In contrast to the previous version, the model is trained end-to-end and the time-dependency modelling and time-pooling is achieved through a Self-Attention mechanism. Besides overall speech quality, the model also predicts the four speech quality di…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: Submitted to Interspeech 2021

  14. arXiv:2104.04371  [pdf, other]

    cs.MM eess.AS

    Speech Quality Assessment in Crowdsourcing: Comparison Category Rating Method

    Authors: Babak Naderi, Sebastian Möller, Ross Cutler

    Abstract: Traditionally, Quality of Experience (QoE) for a communication system is evaluated through a subjective test. The most common test method for speech QoE is the Absolute Category Rating (ACR), in which participants listen to a set of stimuli, processed by the underlying test conditions, and rate their perceived quality for each stimulus on a specific scale. The Comparison Category Rating (CCR) is a…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: Accepted for QoMEX 2021

  15. Incorporating Wireless Communication Parameters into the E-Model Algorithm

    Authors: Demóstenes Z. Rodríguez, Dick Carrillo Melgarejo, Miguel A. Ramírez, Pedro H. J. Nardelli, Sebastian Möller

    Abstract: Telecommunication service providers have to guarantee acceptable speech quality during a phone call to avoid a negative impact on the users' quality of experience. Currently, there are different speech quality assessment methods. ITU-T Recommendation G.107 describes the E-model algorithm, which is a computational model developed for network planning purposes focused on narrowband (NB) networks. La…

    Submitted 5 March, 2021; originally announced March 2021.

    Comments: 18 pages

    Journal ref: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 29, 2021

  16. arXiv:2102.13066  [pdf]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    On Instabilities of Conventional Multi-Coil MRI Reconstruction to Small Adverserial Perturbations

    Authors: Chi Zhang, Jinghan Jia, Burhaneddin Yaman, Steen Moeller, Sijia Liu, Mingyi Hong, Mehmet Akçakaya

    Abstract: Although deep learning (DL) has received much attention in accelerated MRI, recent studies suggest small perturbations may lead to instabilities in DL-based reconstructions, leading to concern for their clinical application. However, these works focus on single-coil acquisitions, which is not practical. We investigate instabilities caused by small adversarial attacks for multi-coil acquisitions. O…

    Submitted 25 February, 2021; originally announced February 2021.

    Comments: To appear in Proceedings of the 29th Annual Meeting of ISMRM, 2021

  17. arXiv:2011.09414  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Self-Supervised Physics-Guided Deep Learning Reconstruction For High-Resolution 3D LGE CMR

    Authors: Burhaneddin Yaman, Chetan Shenoy, Zilin Deng, Steen Moeller, Hossam El-Rewaidy, Reza Nezafat, Mehmet Akçakaya

    Abstract: Late gadolinium enhancement (LGE) cardiac MRI (CMR) is the clinical standard for diagnosis of myocardial scar. 3D isotropic LGE CMR provides improved coverage and resolution compared to 2D imaging. However, image acceleration is required due to long scan times and contrast washout. Physics-guided deep learning (PG-DL) approaches have recently emerged as an improved accelerated MRI strategy. Traini…

    Submitted 18 November, 2020; originally announced November 2020.

    Journal ref: Proceedings of IEEE ISBI, 2021

  18. arXiv:2010.13868  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Improved Supervised Training of Physics-Guided Deep Learning Image Reconstruction with Multi-Masking

    Authors: Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Steen Moeller, Mehmet Akçakaya

    Abstract: Physics-guided deep learning (PG-DL) via algorithm unrolling has received significant interest for improved image reconstruction, including MRI applications. These methods unroll an iterative optimization algorithm into a series of regularizer and data consistency units. The unrolled networks are typically trained end-to-end using a supervised approach. Current supervised PG-DL approaches use all…

    Submitted 26 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of IEEE ICASSP, 2021

  19. arXiv:2010.13260  [pdf, ps, other]

    cs.MM cs.SD eess.AS

    Effect of Language Proficiency on Subjective Evaluation of Noise Suppression Algorithms

    Authors: Babak Naderi, Gabriel Mittag, Rafael Zequeira Jiménez, Sebastian Möller

    Abstract: Speech communication systems based on Voice-over-IP technology are frequently used by native as well as non-native speakers of a target language, e.g. in international phone calls or telemeetings. Frequently, such calls also occur in a noisy environment, making noise suppression modules necessary to increase perceived quality of experience. Whereas standard tests for assessing perceived quality ma…

    Submitted 25 October, 2020; originally announced October 2020.

  20. arXiv:2009.07163  [pdf]

    physics.med-ph eess.SY physics.ins-det

    A Self-Decoupled 32 Channel Receive Array for Human Brain Magnetic Resonance Imaging at 10.5T

    Authors: Nader Tavaf, Russell L. Lagore, Steve Jungst, Shajan Gunamony, Jerahmie Radder, Andrea Grant, Steen Moeller, Edward Auerbach, Kamil Ugurbil, Gregor Adriany, Pierre-Francois Van de Moortele

    Abstract: Purpose: Receive array layout, noise mitigation and B0 field strength are crucial contributors to signal-to-noise ratio (SNR) and parallel imaging performance. Here, we investigate SNR and parallel imaging gains at 10.5 Tesla (T) compared to 7T using 32-channel receive arrays at both fields. Methods: A self-decoupled 32-channel receive array for human brain imaging at 10.5T (10.5T-32Rx), consistin…

    Submitted 9 November, 2020; v1 submitted 15 September, 2020; originally announced September 2020.

    Comments: to be published in Magnetic Resonance in Medicine

    Journal ref: Magn Reson Med. 2021 pp 1-14

  21. arXiv:2008.06029  [pdf]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Multi-Mask Self-Supervised Learning for Physics-Guided Neural Networks in Highly Accelerated MRI

    Authors: Burhaneddin Yaman, Hongyi Gu, Seyed Amir Hossein Hosseini, Omer Burak Demirel, Steen Moeller, Jutta Ellermann, Kâmil Uğurbil, Mehmet Akçakaya

    Abstract: Self-supervised learning has shown great promise due to its capability to train deep learning MRI reconstruction methods without fully-sampled data. Current self-supervised learning methods for physics-guided reconstruction networks split acquired undersampled data into two disjoint sets, where one is used for data consistency (DC) in the unrolled network and the other to define the training loss.…

    Submitted 8 June, 2022; v1 submitted 13 August, 2020; originally announced August 2020.

    Journal ref: NMR in Biomedicine, 2022

  22. arXiv:2005.05550  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    High-Fidelity Accelerated MRI Reconstruction by Scan-Specific Fine-Tuning of Physics-Based Neural Networks

    Authors: Seyed Amir Hossein Hosseini, Burhaneddin Yaman, Steen Moeller, Mehmet Akçakaya

    Abstract: Long scan duration remains a challenge for high-resolution MRI. Deep learning has emerged as a powerful means for accelerated MRI reconstruction by providing data-driven regularizers that are directly learned from data. These data-driven priors typically remain unchanged for future data in the testing phase once they are learned during training. In this study, we propose to use a transfer learning…

    Submitted 12 May, 2020; originally announced May 2020.

    Journal ref: Proceedings of IEEE EMBC, 2020

  23. arXiv:2005.00836  [pdf, ps, other]

    cs.MM cs.CV eess.IV

    Towards Deep Learning Methods for Quality Assessment of Computer-Generated Imagery

    Authors: Markus Utke, Saman Zadtootaghaj, Steven Schmidt, Sebastian Möller

    Abstract: Video gaming streaming services are growing rapidly due to new services such as passive video streaming, e.g. Twitch.tv, and cloud gaming, e.g. Nvidia Geforce Now. In contrast to traditional video content, gaming content has special characteristics such as extremely high motion for some games, special motion patterns, synthetic content and repetitive content, which makes the state-of-the-art video…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: 4 pages

  24. arXiv:2005.00400  [pdf, other]

    cs.HC cs.NI cs.SD eess.AS

    Multi-episodic Perceived Quality of an Audio-on-Demand Service

    Authors: Dennis Guse, Oliver Hohlfeld, Anna Wunderlich, Benjamin Weiss, Sebastian Möller

    Abstract: QoE is traditionally evaluated by using short stimuli usually representing parts or single usage episodes. This opens the question on how the overall service perception involving multiple usage episodes can be evaluated, a question of high practical relevance to service operators. Despite initial research on this challenging aspect of multi-episodic perceived quality, the question of the underly…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: To appear at IEEE QoMEX 2020

    ACM Class: H.5.1; H.5.5; C.2.m

  25. arXiv:1912.07669  [pdf]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Self-Supervised Learning of Physics-Guided Reconstruction Neural Networks without Fully-Sampled Reference Data

    Authors: Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Steen Moeller, Jutta Ellermann, Kâmil Uğurbil, Mehmet Akçakaya

    Abstract: Purpose: To develop a strategy for training a physics-guided MRI reconstruction neural network without a database of fully-sampled datasets. Theory and Methods: Self-supervised learning via data under-sampling (SSDU) for physics-guided deep learning (DL) reconstruction partitions available measurements into two disjoint sets, one of which is used in the data consistency units in the unrolled netwo…

    Submitted 14 April, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: This work is an extension of our previous work arXiv:1910.09116

    Journal ref: Magnetic Resonance in Medicine, 2020

  26. arXiv:1912.07197  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Dense Recurrent Neural Networks for Accelerated MRI: History-Cognizant Unrolling of Optimization Algorithms

    Authors: Seyed Amir Hossein Hosseini, Burhaneddin Yaman, Steen Moeller, Mingyi Hong, Mehmet Akçakaya

    Abstract: Inverse problems for accelerated MRI typically incorporate domain-specific knowledge about the forward encoding operator in a regularized reconstruction framework. Recently physics-driven deep learning (DL) methods have been proposed to use neural networks for data-driven regularization. These methods unroll iterative optimization algorithms to solve the inverse problem objective function, by alte…

    Submitted 8 July, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

    Journal ref: IEEE Journal of Selected Topics in Signal Processing, 2020

  27. arXiv:1910.09116  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP physics.med-ph

    Self-Supervised Physics-Based Deep Learning MRI Reconstruction Without Fully-Sampled Data

    Authors: Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Steen Moeller, Jutta Ellermann, Kâmil Uǧurbil, Mehmet Akçakaya

    Abstract: Deep learning (DL) has emerged as a tool for improving accelerated MRI reconstruction. A common strategy among DL methods is the physics-based approach, where a regularized iterative algorithm alternating between data consistency and a regularizer is unrolled for a finite number of iterations. This unrolled network is then trained end-to-end in a supervised manner, using fully-sampled data as grou…

    Submitted 20 October, 2019; originally announced October 2019.

    Comments: 5 Pages, 5 Figures

    Journal ref: Proceedings of IEEE ISBI, 2020

  28. Accelerated Coronary MRI with sRAKI: A Database-Free Self-Consistent Neural Network k-space Reconstruction for Arbitrary Undersampling

    Authors: Seyed Amir Hossein Hosseini, Chi Zhang, Sebastian Weingärtner, Steen Moeller, Matthias Stuber, Kâmil Uǧurbil, Mehmet Akçakaya

    Abstract: This study aims to accelerate coronary MRI using a novel reconstruction algorithm, called self-consistent robust artificial-neural-networks for k-space interpolation (sRAKI). sRAKI performs iterative parallel imaging reconstruction by enforcing coil self-consistency using subject-specific neural networks. This approach extends the linear convolutions in SPIRiT to nonlinear interpolation using conv…

    Submitted 18 July, 2019; originally announced July 2019.

    Comments: This work has been partially presented at ISMRM Workshop on Machine Learning Part 2 (October 2018), SCMR/ISMRM Co-Provided Workshop (February 2019), IEEE International Symposium on Biomedical Imaging (April 2019) and ISMRM 27th Annual Meeting and Exhibition (May 2019)

  29. arXiv:1904.01112  [pdf, other]

    eess.SP cs.CV cs.LG eess.IV

    Deep Learning Methods for Parallel Magnetic Resonance Image Reconstruction

    Authors: Florian Knoll, Kerstin Hammernik, Chi Zhang, Steen Moeller, Thomas Pock, Daniel K. Sodickson, Mehmet Akcakaya

    Abstract: Following the success of deep learning in a wide range of applications, neural network-based machine learning techniques have received interest as a means of accelerating magnetic resonance imaging (MRI). A number of ideas inspired by deep learning techniques from computer vision and image processing have been successfully applied to non-linear image reconstruction in the spirit of compressed sens…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: 14 pages, 7 figures

  30. arXiv:1008.4895  [pdf, other]

    math.OC eess.SY

    LIFO-Backpressure Achieves Near Optimal Utility-Delay Tradeoff

    Authors: Longbo Huang, Scott Moeller, Michael J. Neely, Bhaskar Krishnamachari

    Abstract: There has been considerable recent work developing a new stochastic network utility maximization framework using Backpressure algorithms, also known as MaxWeight. A key open problem has been the development of utility-optimal algorithms that are also delay efficient. In this paper, we show that the Backpressure algorithm, when combined with the LIFO queueing discipline (called LIFO-Backpressure),…

    Submitted 3 April, 2011; v1 submitted 28 August, 2010; originally announced August 2010.