Showing 1–12 of 12 results for author: Theodoridis, S

  1. arXiv:2509.18691  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    An overview of neural architectures for self-supervised audio representation learning from masked spectrograms

    Authors: Sarthak Yadav, Sergios Theodoridis, Zheng-Hua Tan

    Abstract: In recent years, self-supervised learning has amassed significant interest for training deep neural representations without labeled data. One such self-supervised learning approach is masked spectrogram modeling, where the objective is to learn semantically rich contextual representations by predicting removed or hidden portions of the input audio spectrogram. With the Transformer neural architect…

    Submitted 23 September, 2025; originally announced September 2025.

  2. arXiv:2507.10464  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    AudioMAE++: learning better masked audio representations with SwiGLU FFNs

    Authors: Sarthak Yadav, Sergios Theodoridis, Zheng-Hua Tan

    Abstract: Masked Autoencoders (MAEs) trained on audio spectrogram patches have emerged as a prominent approach for learning self-supervised audio representations. While several recent papers have evaluated key aspects of training MAEs on audio data, the majority of these approaches still leverage vanilla transformer building blocks, whereas the transformer community has seen steady integration of newer arch…

    Submitted 14 July, 2025; originally announced July 2025.

    Comments: TO APPEAR AT IEEE MLSP 2025

  3. arXiv:2506.11629  [pdf, ps, other]

    eess.SP

    FieldFormer: Self-supervised Reconstruction of Physical Fields via Tensor Attention Prior

    Authors: Panqi Chen, Siyuan Li, Lei Cheng, Xiao Fu, Yik-Chung Wu, Sergios Theodoridis

    Abstract: Reconstructing physical field tensors from in situ observations, such as radio maps and ocean sound speed fields, is crucial for enabling environment-aware decision making in various applications, e.g., wireless communications and underwater acoustics. Field data reconstruction is often challenging, due to the limited and noisy nature of the observations, necessitating the incorporation o…

    Submitted 13 June, 2025; originally announced June 2025.

  4. AxLSTMs: learning self-supervised audio representations with xLSTMs

    Authors: Sarthak Yadav, Sergios Theodoridis, Zheng-Hua Tan

    Abstract: While the transformer has emerged as the eminent neural architecture, several independent lines of research have emerged to address its limitations. Recurrent neural approaches have observed a lot of renewed interest, including the extended long short-term memory (xLSTM) architecture, which reinvigorates the original LSTM. However, while xLSTMs have shown competitive performance compared to the tr…

    Submitted 19 August, 2025; v1 submitted 29 August, 2024; originally announced August 2024.

    Comments: INTERSPEECH 2025

  5. arXiv:2309.08201  [pdf, other]

    cs.LG eess.SP math.OC

    Sparsity-Aware Distributed Learning for Gaussian Processes with Linear Multiple Kernel

    Authors: Richard Cornelius Suwandi, Zhidi Lin, Feng Yin, Zhiguo Wang, Sergios Theodoridis

    Abstract: Gaussian processes (GPs) stand as crucial tools in machine learning and signal processing, with their effectiveness hinging on kernel design and hyper-parameter optimization. This paper presents a novel GP linear multiple kernel (LMK) and a generic sparsity-aware distributed learning framework to optimize the hyper-parameters. The newly proposed grid spectral mixture product (GSMP) kernel is tailo…

    Submitted 16 January, 2025; v1 submitted 15 September, 2023; originally announced September 2023.

  6. arXiv:2309.01074  [pdf, other]

    cs.LG eess.SP eess.SY

    Towards Efficient Modeling and Inference in Multi-Dimensional Gaussian Process State-Space Models

    Authors: Zhidi Lin, Juan Maroñas, Ying Li, Feng Yin, Sergios Theodoridis

    Abstract: The Gaussian process state-space model (GPSSM) has attracted extensive attention for modeling complex nonlinear dynamical systems. However, the existing GPSSM employs separate Gaussian processes (GPs) for each latent state dimension, leading to escalating computational complexity and parameter proliferation, thus posing challenges for modeling dynamical systems with high-dimensional latent states.…

    Submitted 3 September, 2023; originally announced September 2023.

  7. arXiv:2306.00561  [pdf, other]

    cs.SD cs.AI eess.AS

    Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners

    Authors: Sarthak Yadav, Sergios Theodoridis, Lars Kai Hansen, Zheng-Hua Tan

    Abstract: In this work, we propose a Multi-Window Masked Autoencoder (MW-MAE) fitted with a novel Multi-Window Multi-Head Attention (MW-MHA) module that facilitates the modelling of local-global interactions in every decoder transformer block through attention heads of several distinct local and global windows. Empirical results on ten downstream audio tasks show that MW-MAEs consistently outperform standar…

    Submitted 1 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  8. arXiv:2205.14283  [pdf, other]

    stat.ML cs.LG eess.IV eess.SP

    Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling

    Authors: Lei Cheng, Feng Yin, Sergios Theodoridis, Sotirios Chatzis, Tsung-Hui Chang

    Abstract: Sparse modeling for signal processing and machine learning has been at the focus of scientific research for over two decades. Among others, supervised sparsity-aware learning comprises two major paths paved by: a) discriminative methods and b) generative methods. The latter, more widely known as Bayesian methods, enable uncertainty evaluation w.r.t. the performed predictions. Furthermore, they can…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: 64 pages, 16 figures, 6 tables, 98 references, submitted to IEEE Signal Processing Magazine

  9. arXiv:2009.02472  [pdf, other]

    cs.LG eess.SP stat.ML

    Towards Flexible Sparsity-Aware Modeling: Automatic Tensor Rank Learning Using The Generalized Hyperbolic Prior

    Authors: Lei Cheng, Zhongtao Chen, Qingjiang Shi, Yik-Chung Wu, Sergios Theodoridis

    Abstract: Tensor rank learning for canonical polyadic decomposition (CPD) has long been deemed as an essential yet challenging problem. In particular, since the tensor rank controls the complexity of the CPD model, its inaccurate learning would cause overfitting to noise or underfitting to the signal sources, and even destroy the interpretability of model parameters. However, the optimal determination of a…

    Submitted 29 March, 2022; v1 submitted 5 September, 2020; originally announced September 2020.

  10. arXiv:2005.07134  [pdf, other]

    eess.SP cs.LG q-bio.NC stat.ML

    Early soft and flexible fusion of EEG and fMRI via tensor decompositions

    Authors: Christos Chatzichristos, Eleftherios Kofidis, Lieven De Lathauwer, Sergios Theodoridis, Sabine Van Huffel

    Abstract: Data fusion refers to the joint analysis of multiple datasets which provide complementary views of the same task. In this preprint, the problem of jointly analyzing electroencephalography (EEG) and functional Magnetic Resonance Imaging (fMRI) data is considered. Jointly analyzing EEG and fMRI measurements is highly beneficial for studying brain function because these modalities have complementary…

    Submitted 12 May, 2020; originally announced May 2020.

  11. arXiv:2003.03697  [pdf, other]

    cs.DC cs.LG eess.SP eess.SY stat.AP

    FedLoc: Federated Learning Framework for Data-Driven Cooperative Localization and Location Data Processing

    Authors: Feng Yin, Zhidi Lin, Yue Xu, Qinglei Kong, Deshi Li, Sergios Theodoridis, Shuguang Cui

    Abstract: In this overview paper, data-driven learning model-based cooperative localization and location data processing are considered, in line with the emerging machine learning and big data methods. We first review (1) state-of-the-art algorithms in the context of federated learning, (2) two widely used learning models, namely the deep neural network model and the Gaussian process model, and (3) various…

    Submitted 25 May, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

  12. arXiv:1904.09559  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Linear Multiple Low-Rank Kernel Based Stationary Gaussian Processes Regression for Time Series

    Authors: Feng Yin, Lishuo Pan, Xinwei He, Tianshi Chen, Sergios Theodoridis, Zhi-Quan Luo

    Abstract: Gaussian processes (GP) for machine learning have been studied systematically over the past two decades and they are by now widely used in a number of diverse applications. However, GP kernel design and the associated hyper-parameter optimization are still hard and to a large extent open problems. In this paper, we consider the task of GP regression for time series modeling and analysis. The under…

    Submitted 21 April, 2019; originally announced April 2019.

    Comments: 15 pages, 5 figures, submitted