Mixup Regularization: A Probabilistic Perspective

Authors: Yousef El-Laham, Niccolò Dalmasso, Svitlana Vyetrenko, Vamsi K. Potluru, Manuela Veloso

Abstract: In recent years, mixup regularization has gained popularity as an effective way to improve the generalization performance of deep learning models by training on convex combinations of training data. While many mixup variants have been explored, the proper adoption of the technique to conditional density estimation and probabilistic machine learning remains relatively unexplored. This work introduces a novel framework for mixup regularization based on probabilistic fusion that is better suited for conditional density estimation tasks. For data distributed according to a member of the exponential family, we show that likelihood functions can be analytically fused using log-linear pooling. We further propose an extension of probabilistic mixup, which allows for fusion of inputs at an arbitrary intermediate layer of the neural network. We provide a theoretical analysis comparing our approach to standard mixup variants. Empirical results on synthetic and real datasets demonstrate the benefits of our proposed framework compared to existing mixup variants.

Submitted 13 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

Comments: Accepted at UAI 2025, 28 figures, 9 tables

arXiv:2411.00635

Variational Neural Stochastic Differential Equations with Change Points

Authors: Yousef El-Laham, Zhongchang Sun, Haibei Zhu, Tucker Balch, Svitlana Vyetrenko

Abstract: In this work, we explore modeling change points in time-series data using neural stochastic differential equations (neural SDEs). We propose a novel model formulation and training procedure based on the variational autoencoder (VAE) framework for modeling time-series as a neural SDE. Unlike existing algorithms training neural SDEs as VAEs, our proposed algorithm only necessitates a Gaussian prior of the initial state of the latent stochastic process, rather than a Wiener process prior on the entire latent stochastic process. We develop two methodologies for modeling and estimating change points in time-series data with distribution shifts. Our iterative algorithm alternates between updating neural SDE parameters and updating the change points based on either a maximum likelihood-based approach or a change point detection algorithm using the sequential likelihood ratio test. We provide a theoretical analysis of this proposed change point detection scheme. Finally, we present an empirical evaluation that demonstrates the expressive power of our proposed model, showing that it can effectively model both classical parametric SDEs and some real datasets with distribution shifts.

Submitted 13 June, 2025; v1 submitted 1 November, 2024; originally announced November 2024.

arXiv:2312.13152

Neural Stochastic Differential Equations with Change Points: A Generative Adversarial Approach

Authors: Zhongchang Sun, Yousef El-Laham, Svitlana Vyetrenko

Abstract: Stochastic differential equations (SDEs) have been widely used to model real world random phenomena. Existing works mainly focus on the case where the time series is modeled by a single SDE, which might be restrictive for modeling time series with distributional shift. In this work, we propose a change point detection algorithm for time series modeled as neural SDEs. Given a time series dataset, the proposed method jointly learns the unknown change points and the parameters of distinct neural SDE models corresponding to each change point. Specifically, the SDEs are learned under the framework of generative adversarial networks (GANs) and the change points are detected based on the output of the GAN discriminator in a forward pass. At each step of the proposed algorithm, the change points and the SDE model parameters are updated in an alternating fashion. Numerical results on both synthetic and real datasets are provided to validate the performance of our algorithm in comparison to classical change point detection benchmarks, standard GAN-based neural SDEs, and other state-of-the-art deep generative models for time series data.

Submitted 22 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: accepted paper to be published in the proceedings of ICASSP 2024

arXiv:2307.00868

MADS: Modulated Auto-Decoding SIREN for time series imputation

Authors: Tom Bamford, Elizabeth Fons, Yousef El-Laham, Svitlana Vyetrenko

Abstract: Time series imputation remains a significant challenge across many fields due to the potentially significant variability in the type of data being modelled. Whilst traditional imputation methods often impose strong assumptions on the underlying data generation process, limiting their applicability, researchers have recently begun to investigate the potential of deep learning for this task, inspired by the strong performance shown by these models in both classification and regression problems across a range of applications. In this work we propose MADS, a novel auto-decoding framework for time series imputation, built upon implicit neural representations. Our method leverages the capabilities of SIRENs for high fidelity reconstruction of signals and irregular data, and combines it with a hypernetwork architecture which allows us to generalise by learning a prior over the space of time series. We evaluate our model on two real-world datasets, and show that it outperforms state-of-the-art methods for time series imputation. On the human activity dataset, it improves imputation performance by at least 40%, while on the air quality dataset it is shown to be competitive across all metrics. When evaluated on synthetic data, our model results in the best average rank across different dataset configurations over all baselines.

Submitted 3 July, 2023; originally announced July 2023.

Comments: 8 pages (inc. refs), 1 figure

arXiv:2306.07235

Deep Gaussian Mixture Ensembles

Authors: Yousef El-Laham, Niccolò Dalmasso, Elizabeth Fons, Svitlana Vyetrenko

Abstract: This work introduces a novel probabilistic deep learning technique called deep Gaussian mixture ensembles (DGMEs), which enables accurate quantification of both epistemic and aleatoric uncertainty. By assuming the data generating process follows that of a Gaussian mixture, DGMEs are capable of approximating complex probability distributions, such as heavy-tailed or multimodal distributions. Our contributions include the derivation of an expectation-maximization (EM) algorithm used for learning the model parameters, which results in an upper-bound on the log-likelihood of training data over that of standard deep ensembles. Additionally, the proposed EM training procedure allows for learning of mixture weights, which is not commonly done in ensembles. Our experimental results demonstrate that DGMEs outperform state-of-the-art uncertainty quantifying deep learning models in handling complex predictive densities.

Submitted 12 June, 2023; originally announced June 2023.

Comments: Accepted at Uncertainty in Artificial Intelligence (UAI) 2023 Conference, 7 figures, 11 tables

arXiv:2009.04551

Particle Filtering Under General Regime Switching

Authors: Yousef El-Laham, Liu Yang, Petar M. Djuric, Monica F. Bugallo

Abstract: In this paper, we consider a new framework for particle filtering under model uncertainty that operates beyond the scope of Markovian switching systems. Specifically, we develop a novel particle filtering algorithm that applies to general regime switching systems, where the model index is augmented as an unknown time-varying parameter in the system. The proposed approach does not require the use of multiple filters and can maintain a diverse set of particles for each considered model through appropriate choice of the particle filtering proposal distribution. The flexibility of the proposed approach allows for long-term dependencies between the models, which enables its use to a wider variety of real-world applications. We validate the method on a synthetic data experiment and show that it outperforms state-of-the-art multiple model particle filtering approaches that require the use of multiple filters.

Submitted 9 September, 2020; originally announced September 2020.

Comments: Accepted to EUSIPCO 2020

arXiv:1806.00093

DOI: 10.1109/LSP.2018.2841641

Robust Covariance Adaptation in Adaptive Importance Sampling

Authors: Yousef El-Laham, Victor Elvira, Monica F. Bugallo

Abstract: Importance sampling (IS) is a Monte Carlo methodology that allows for approximation of a target distribution using weighted samples generated from another proposal distribution. Adaptive importance sampling (AIS) implements an iterative version of IS which adapts the parameters of the proposal distribution in order to improve estimation of the target. While the adaptation of the location (mean) of the proposals has been largely studied, an important challenge of AIS relates to the difficulty of adapting the scale parameter (covariance matrix). In the case of weight degeneracy, adapting the covariance matrix using the empirical covariance results in a singular matrix, which leads to poor performance in subsequent iterations of the algorithm. In this paper, we propose a novel scheme which exploits recent advances in the IS literature to prevent the so-called weight degeneracy. The method efficiently adapts the covariance matrix of a population of proposal distributions and achieves a significant performance improvement in high-dimensional scenarios. We validate the new method through computer simulations.

Submitted 31 May, 2018; originally announced June 2018.

Showing 1–7 of 7 results for author: El-Laham, Y