-
Parsimonious Gaussian mixture models with piecewise-constant eigenvalue profiles
Authors:
Tom Szwagier,
Pierre-Alexandre Mattei,
Charles Bouveyron,
Xavier Pennec
Abstract:
Gaussian mixture models (GMMs) are ubiquitous in statistical learning, particularly for unsupervised problems. While full GMMs suffer from the overparameterization of their covariance matrices in high-dimensional spaces, spherical GMMs (with isotropic covariance matrices) certainly lack flexibility to fit certain anisotropic distributions. Connecting these two extremes, we introduce a new family o…
▽ More
Gaussian mixture models (GMMs) are ubiquitous in statistical learning, particularly for unsupervised problems. While full GMMs suffer from the overparameterization of their covariance matrices in high-dimensional spaces, spherical GMMs (with isotropic covariance matrices) certainly lack flexibility to fit certain anisotropic distributions. Connecting these two extremes, we introduce a new family of parsimonious GMMs with piecewise-constant covariance eigenvalue profiles. These extend several low-rank models like the celebrated mixtures of probabilistic principal component analyzers (MPPCA), by enabling any possible sequence of eigenvalue multiplicities. If the latter are prespecified, then we can naturally derive an expectation-maximization (EM) algorithm to learn the mixture parameters. Otherwise, to address the notoriously-challenging issue of jointly learning the mixture parameters and hyperparameters, we propose a componentwise penalized EM algorithm, whose monotonicity is proven. We show the superior likelihood-parsimony tradeoffs achieved by our models on a variety of unsupervised experiments: density fitting, clustering and single-image denoising.
△ Less
Submitted 2 July, 2025;
originally announced July 2025.
-
Eigengap Sparsity for Covariance Parsimony
Authors:
Tom Szwagier,
Guillaume Olikier,
Xavier Pennec
Abstract:
Covariance estimation is a central problem in statistics. An important issue is that there are rarely enough samples $n$ to accurately estimate the $p (p+1) / 2$ coefficients in dimension $p$. Parsimonious covariance models are therefore preferred, but the discrete nature of model selection makes inference computationally challenging. In this paper, we propose a relaxation of covariance parsimony…
▽ More
Covariance estimation is a central problem in statistics. An important issue is that there are rarely enough samples $n$ to accurately estimate the $p (p+1) / 2$ coefficients in dimension $p$. Parsimonious covariance models are therefore preferred, but the discrete nature of model selection makes inference computationally challenging. In this paper, we propose a relaxation of covariance parsimony termed "eigengap sparsity" and motivated by the good accuracy-parsimony tradeoffs of eigenvalue-equalization in covariance matrices. This penalty can be included in a penalized-likelihood framework that we propose to solve with a projected gradient descent on a monotone cone. The algorithm turns out to resemble an isotonic regression of mutually-attracted sample eigenvalues, drawing an interesting link between covariance parsimony and shrinkage.
△ Less
Submitted 11 July, 2025; v1 submitted 14 April, 2025;
originally announced April 2025.
-
Nested subspace learning with flags
Authors:
Tom Szwagier,
Xavier Pennec
Abstract:
Many machine learning methods look for low-dimensional representations of the data. The underlying subspace can be estimated by first choosing a dimension $q$ and then optimizing a certain objective function over the space of $q$-dimensional subspaces (the Grassmannian). Trying different $q$ yields in general non-nested subspaces, which raises an important issue of consistency between the data rep…
▽ More
Many machine learning methods look for low-dimensional representations of the data. The underlying subspace can be estimated by first choosing a dimension $q$ and then optimizing a certain objective function over the space of $q$-dimensional subspaces (the Grassmannian). Trying different $q$ yields in general non-nested subspaces, which raises an important issue of consistency between the data representations. In this paper, we propose a simple trick to enforce nestedness in subspace learning methods. It consists in lifting Grassmannian optimization problems to flag manifolds (the space of nested subspaces of increasing dimension) via nested projectors. We apply the flag trick to several classical machine learning methods and show that it successfully addresses the nestedness issue.
△ Less
Submitted 9 February, 2025;
originally announced February 2025.
-
The curse of isotropy: from principal components to principal subspaces
Authors:
Tom Szwagier,
Xavier Pennec
Abstract:
This paper raises an important issue about the interpretation of principal component analysis. The curse of isotropy states that a covariance matrix with repeated eigenvalues yields rotation-invariant eigenvectors. In other words, principal components associated with equal eigenvalues show large intersample variability and are arbitrary combinations of potentially more interpretable components. Ho…
▽ More
This paper raises an important issue about the interpretation of principal component analysis. The curse of isotropy states that a covariance matrix with repeated eigenvalues yields rotation-invariant eigenvectors. In other words, principal components associated with equal eigenvalues show large intersample variability and are arbitrary combinations of potentially more interpretable components. However, empirical eigenvalues are never exactly equal in practice due to sampling errors. Therefore, most users overlook the problem. In this paper, we propose to identify datasets that are likely to suffer from the curse of isotropy by introducing a generative Gaussian model with repeated eigenvalues and comparing it to traditional models via the principle of parsimony. This yields an explicit criterion to detect the curse of isotropy in practice. We notably argue that in a dataset with 1000 samples, all the eigenvalue pairs with a relative eigengap lower than 21% should be assumed equal. This demonstrates that the curse of isotropy cannot be overlooked. In this context, we propose to transition from fuzzy principal components to much-more-interpretable principal subspaces. The final methodology (principal subspace analysis) is extremely simple and shows promising results on a variety of datasets from different fields.
△ Less
Submitted 7 August, 2024; v1 submitted 28 July, 2023;
originally announced July 2023.
-
Generalizable and Robust Deep Learning Algorithm for Atrial Fibrillation Diagnosis Across Ethnicities, Ages and Sexes
Authors:
Shany Biton,
Mohsin Aldhafeeri,
Erez Marcusohn,
Kenta Tsutsui,
Tom Szwagier,
Adi Elias,
Julien Oster,
Jean Marc Sellal,
Mahmoud Suleiman,
Joachim A. Behar
Abstract:
To drive health innovation that meets the needs of all and democratize healthcare, there is a need to assess the generalization performance of deep learning (DL) algorithms across various distribution shifts to ensure that these algorithms are robust. This retrospective study is, to the best of our knowledge, the first to develop and assess the generalization performance of a deep learning (DL) mo…
▽ More
To drive health innovation that meets the needs of all and democratize healthcare, there is a need to assess the generalization performance of deep learning (DL) algorithms across various distribution shifts to ensure that these algorithms are robust. This retrospective study is, to the best of our knowledge, the first to develop and assess the generalization performance of a deep learning (DL) model for AF events detection from long term beat-to-beat intervals across ethnicities, ages and sexes. The new recurrent DL model, denoted ArNet2, was developed on a large retrospective dataset of 2,147 patients totaling 51,386 hours of continuous electrocardiogram (ECG). The models generalization was evaluated on manually annotated test sets from four centers (USA, Israel, Japan and China) totaling 402 patients. The model was further validated on a retrospective dataset of 1,730 consecutives Holter recordings from the Rambam Hospital Holter clinic, Haifa, Israel. The model outperformed benchmark state-of-the-art models and generalized well across ethnicities, ages and sexes. Performance was higher for female than male and young adults (less than 60 years old) and showed some differences across ethnicities. The main finding explaining these variations was an impairment in performance in groups with a higher prevalence of atrial flutter (AFL). Our findings on the relative performance of ArNet2 across groups may have clinical implications on the choice of the preferred AF examination method to use relative to the group of interest.
△ Less
Submitted 20 July, 2022;
originally announced July 2022.