Search | arXiv e-print repository

E$^2$M: Double Bounded $α$-Divergence Optimization for Tensor-based Discrete Density Estimation

Authors: Kazu Ghalamkari, Jesper Løve Hinrich, Morten Mørup

Abstract: Tensor-based discrete density estimation requires flexible modeling and proper divergence criteria to enable effective learning; however, traditional approaches using $α$-divergence face analytical challenges due to the $α$-power terms in the objective function, which hinder the derivation of closed-form update rules. We present a generalization of the expectation-maximization (EM) algorithm, call… ▽ More Tensor-based discrete density estimation requires flexible modeling and proper divergence criteria to enable effective learning; however, traditional approaches using $α$-divergence face analytical challenges due to the $α$-power terms in the objective function, which hinder the derivation of closed-form update rules. We present a generalization of the expectation-maximization (EM) algorithm, called E$^2$M algorithm. It circumvents this issue by first relaxing the optimization into minimization of a surrogate objective based on the Kullback-Leibler (KL) divergence, which is tractable via the standard EM algorithm, and subsequently applying a tensor many-body approximation in the M-step to enable simultaneous closed-form updates of all parameters. Our approach offers flexible modeling for not only a variety of low-rank structures, including the CP, Tucker, and Tensor Train formats, but also their mixtures, thus allowing us to leverage the strengths of different low-rank structures. We demonstrate the effectiveness of our approach in classification and density estimation tasks. △ Less

Submitted 23 May, 2025; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: 34 pages, 11 figures

MSC Class: 68T01 ACM Class: I.2.6

arXiv:2310.02694 [pdf]

Probabilistic Block Term Decomposition for the Modelling of Higher-Order Arrays

Authors: Jesper Løve Hinrich, Morten Mørup

Abstract: Tensors are ubiquitous in science and engineering and tensor factorization approaches have become important tools for the characterization of higher order structure. Factorizations includes the outer-product rank Canonical Polyadic Decomposition (CPD) as well as the multi-linear rank Tucker decomposition in which the Block-Term Decomposition (BTD) is a structured intermediate interpolating between… ▽ More Tensors are ubiquitous in science and engineering and tensor factorization approaches have become important tools for the characterization of higher order structure. Factorizations includes the outer-product rank Canonical Polyadic Decomposition (CPD) as well as the multi-linear rank Tucker decomposition in which the Block-Term Decomposition (BTD) is a structured intermediate interpolating between these two representations. Whereas CPD, Tucker, and BTD have traditionally relied on maximum-likelihood estimation, Bayesian inference has been use to form probabilistic CPD and Tucker. We propose, an efficient variational Bayesian probabilistic BTD, which uses the von-Mises Fisher matrix distribution to impose orthogonality in the multi-linear Tucker parts forming the BTD. On synthetic and two real datasets, we highlight the Bayesian inference procedure and demonstrate using the proposed pBTD on noisy data and for model order quantification. We find that the probabilistic BTD can quantify suitable multi-linear structures providing a means for robust inference of patterns in multi-linear data. △ Less

Submitted 4 October, 2023; originally announced October 2023.

Comments: 11 pages, preprint of submitted article

arXiv:2205.03501 [pdf, other]

PARAFAC2$\times$N: Coupled Decomposition of Multi-modal Data with Drift in N Modes

Authors: Michael D. Sorochan Armstrong, Jesper Løve Hinrich, A. Paulina de la Mata, James J. Harynuk

Abstract: Reliable analysis of comprehensive two-dimensional gas chromatography - time-of-flight mass spectrometry (GC$\times$GC-TOFMS) data is considered to be a major bottleneck for its widespread application. For multiple samples, GC$\times$GC-TOFMS data for specific chromatographic regions manifests as a 4th order tensor of I mass spectral acquisitions, J mass channels, K modulations, and L samples. Chr… ▽ More Reliable analysis of comprehensive two-dimensional gas chromatography - time-of-flight mass spectrometry (GC$\times$GC-TOFMS) data is considered to be a major bottleneck for its widespread application. For multiple samples, GC$\times$GC-TOFMS data for specific chromatographic regions manifests as a 4th order tensor of I mass spectral acquisitions, J mass channels, K modulations, and L samples. Chromatographic drift is common along both the first-dimension (modulations), and along the second-dimension (mass spectral acquisitions), while drift along the mass channel and sample dimensions is for all practical purposes nonexistent. A number of solutions to handling GC$\times$GC-TOFMS data have been proposed: these involve reshaping the data to make it amenable to either 2nd order decomposition techniques based on Multivariate Curve Resolution (MCR), or 3rd order decomposition techniques such as Parallel Factor Analysis 2 (PARAFAC2). PARAFAC2 has been utilised to model chromatographic drift along one mode, which has enabled its use for robust decomposition of multiple GC-MS experiments. Although extensible, it is not straightforward to implement a PARAFAC2 model that accounts for drift along multiple modes. In this submission, we demonstrate a new approach and a general theory for modelling data with drift along multiple modes, for applications in multidimensional chromatography with multivariate detection. △ Less

Submitted 6 May, 2022; originally announced May 2022.

arXiv:1806.08195 [pdf, other]

Probabilistic PARAFAC2

Authors: Philip J. H. Jørgensen, Søren F. V. Nielsen, Jesper L. Hinrich, Mikkel N. Schmidt, Kristoffer H. Madsen, Morten Mørup

Abstract: The PARAFAC2 is a multimodal factor analysis model suitable for analyzing multi-way data when one of the modes has incomparable observation units, for example because of differences in signal sampling or batch sizes. A fully probabilistic treatment of the PARAFAC2 is desirable in order to improve robustness to noise and provide a well founded principle for determining the number of factors, but ch… ▽ More The PARAFAC2 is a multimodal factor analysis model suitable for analyzing multi-way data when one of the modes has incomparable observation units, for example because of differences in signal sampling or batch sizes. A fully probabilistic treatment of the PARAFAC2 is desirable in order to improve robustness to noise and provide a well founded principle for determining the number of factors, but challenging because the factor loadings are constrained to be orthogonal. We develop two probabilistic formulations of the PARAFAC2 along with variational procedures for inference: In the one approach, the mean values of the factor loadings are orthogonal leading to closed form variational updates, and in the other, the factor loadings themselves are orthogonal using a matrix Von Mises-Fisher distribution. We contrast our probabilistic formulation to the conventional direct fitting algorithm based on maximum likelihood. On simulated data and real fluorescence spectroscopy and gas chromatography-mass spectrometry data, we compare our approach to the conventional PARAFAC2 model estimation and find that the probabilistic formulation is more robust to noise and model order misspecification. The probabilistic PARAFAC2 thus forms a promising framework for modeling multi-way data accounting for uncertainty. △ Less

Submitted 21 June, 2018; originally announced June 2018.

Comments: 16 pages (incl. 4 pages of supplemental material), 5 figures

arXiv:1612.04555 [pdf, ps, other]

Scalable Group Level Probabilistic Sparse Factor Analysis

Authors: Jesper L. Hinrich, Søren F. V. Nielsen, Nicolai A. B. Riis, Casper T. Eriksen, Jacob Frøsig, Marco D. F. Kristensen, Mikkel N. Schmidt, Kristoffer H. Madsen, Morten Mørup

Abstract: Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a group level scalable probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial n… ▽ More Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a group level scalable probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial noise modeling. For task-based and resting state fMRI, we show that the sparsity constraint gives rise to components similar to those obtained by group independent component analysis. The noise modeling shows that noise is reduced in areas typically associated with activation by the experimental design. The psFA model identifies sparse components and the probabilistic setting provides a natural way to handle parameter uncertainties. The variational Bayesian framework easily extends to more complex noise models than the presently considered. △ Less

Submitted 14 December, 2016; originally announced December 2016.

Comments: 10 pages plus 5 pages appendix, Submitted to ICASSP 17

Showing 1–5 of 5 results for author: Hinrich, J L