Showing 1–14 of 14 results for author: Lücke, J

Searching in archive cs.
  1. arXiv:2501.12299  [pdf, other]

    stat.ML cs.CV cs.LG

    Sublinear Variational Optimization of Gaussian Mixture Models with Millions to Billions of Parameters

    Authors: Sebastian Salwig, Till Kahlke, Florian Hirschberger, Dennis Forster, Jörg Lücke

    Abstract: Gaussian Mixture Models (GMMs) range among the most frequently used machine learning models. However, training large, general GMMs becomes computationally prohibitive for datasets with many data points $N$ of high-dimensionality $D$. For GMMs with arbitrary covariances, we here derive a highly efficient variational approximation, which is integrated with mixtures of factor analyzers (MFAs). For GM…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: 22 pages, 6 figures (and 17 pages, 3 figures in Appendix)

  2. arXiv:2501.09022  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.PR math.ST

    Generative Models with ELBOs Converging to Entropy Sums

    Authors: Jan Warnken, Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke

    Abstract: The evidence lower bound (ELBO) is one of the most central objectives for probabilistic unsupervised learning. For the ELBOs of several generative models and model classes, we here prove convergence to entropy sums. As one result, we provide a list of generative models for which entropy convergence has been shown, so far, along with the corresponding expressions for entropy sums. Our consideration…

    Submitted 25 December, 2024; originally announced January 2025.

    Comments: 16 Pages

    MSC Class: 65C20; 68T07; 60-08; 62-08; 62F99; 68T05

    ACM Class: G.3

  3. arXiv:2311.01888  [pdf, other]

    stat.ML cs.LG

    Learning Sparse Codes with Entropy-Based ELBOs

    Authors: Dmytro Velychko, Simon Damm, Asja Fischer, Jörg Lücke

    Abstract: Standard probabilistic sparse coding assumes a Laplace prior, a linear mapping from latents to observables, and Gaussian observable distributions. We here derive a solely entropy-based learning objective for the parameters of standard sparse coding. The novel variational objective has the following features: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilis…

    Submitted 9 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

  4. arXiv:2209.03077  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.PR math.ST

    On the Convergence of the ELBO to Entropy Sums

    Authors: Jörg Lücke, Jan Warnken

    Abstract: The variational lower bound (a.k.a. ELBO or free energy) is the central objective for many established as well as for many novel algorithms for unsupervised learning. Such algorithms usually increase the bound until parameters have converged to values close to a stationary point of the learning dynamics. Here we show that (for a very large class of generative models) the variational lower bound is…

    Submitted 23 December, 2024; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: 38 Pages

    MSC Class: 65C20; 68T07; 60-08; 62-08; 62F99; 68T05

    ACM Class: G.3

  5. arXiv:2012.12294  [pdf, other]

    stat.ML cs.LG cs.NE

    Evolutionary Variational Optimization of Generative Models

    Authors: Jakob Drefs, Enrico Guiraud, Jörg Lücke

    Abstract: We combine two popular optimization approaches to derive learning algorithms for generative models: variational optimization and evolutionary algorithms. The combination is realized for generative models with discrete latents by using truncated posteriors as the family of variational distributions. The variational parameters of truncated posteriors are sets of latent states. By interpreting these…

    Submitted 16 April, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

    MSC Class: 65C20; 68W50

    ACM Class: G.3.0; I.2.6; I.4.0; I.5.1

    Journal ref: Journal of Machine Learning Research 23(21):1-51, 2022

  6. Direct Evolutionary Optimization of Variational Autoencoders With Binary Latents

    Authors: Enrico Guiraud, Jakob Drefs, Jörg Lücke

    Abstract: Discrete latent variables are considered important for real world data, which has motivated research on Variational Autoencoders (VAEs) with discrete latents. However, standard VAE training is not possible in this case, which has motivated different strategies to manipulate discrete distributions in order to train discrete VAEs similarly to conventional ones. Here we ask if it is also possible to…

    Submitted 24 March, 2023; v1 submitted 27 November, 2020; originally announced November 2020.

    MSC Class: 65C20; 68T07

    ACM Class: G.3.0; I.2.6; I.4.0; I.5.1

  7. arXiv:2010.14860  [pdf, other]

    stat.ML cs.LG

    The ELBO of Variational Autoencoders Converges to a Sum of Three Entropies

    Authors: Simon Damm, Dennis Forster, Dmytro Velychko, Zhenwen Dai, Asja Fischer, Jörg Lücke

    Abstract: The central objective function of a variational autoencoder (VAE) is its variational lower bound (the ELBO). Here we show that for standard (i.e., Gaussian) VAEs the ELBO converges to a value given by the sum of three entropies: the (negative) entropy of the prior distribution, the expected (negative) entropy of the observable distribution, and the average entropy of the variational distributions…

    Submitted 20 April, 2023; v1 submitted 28 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR 206:3931-3960, 2023

  8. arXiv:2003.02214  [pdf, other]

    cs.LG stat.ML

    Generic Unsupervised Optimization for a Latent Variable Model With Exponential Family Observables

    Authors: Hamid Mousavi, Jakob Drefs, Florian Hirschberger, Jörg Lücke

    Abstract: Latent variable models (LVMs) represent observed variables by parameterized functions of latent variables. Prominent examples of LVMs for unsupervised learning are probabilistic PCA or probabilistic SC which both assume a weighted linear summation of the latents to determine the mean of a Gaussian distribution for the observables. In many cases, however, observables do not follow a Gaussian distri…

    Submitted 15 December, 2023; v1 submitted 4 March, 2020; originally announced March 2020.

    Journal ref: Journal of Machine Learning Research, 24(285), 1-59 (2023)

  9. arXiv:1908.06843  [pdf, ps, other]

    eess.SP cs.LG stat.ML

    ProSper -- A Python Library for Probabilistic Sparse Coding with Non-Standard Priors and Superpositions

    Authors: Georgios Exarchakis, Jörg Bornschein, Abdul-Saboor Sheikh, Zhenwen Dai, Marc Henniges, Jakob Drefs, Jörg Lücke

    Abstract: ProSper is a Python library containing probabilistic algorithms to learn dictionaries. Given a set of data points, the implemented algorithms seek to learn the elementary components that have generated the data. The library widens the scope of dictionary learning approaches beyond implementations of standard approaches such as ICA, NMF or standard L1 sparse coding. The implemented algorithms are e…

    Submitted 1 August, 2019; originally announced August 2019.

  10. Large Scale Clustering with Variational EM for Gaussian Mixture Models

    Authors: Florian Hirschberger, Dennis Forster, Jörg Lücke

    Abstract: This paper represents a preliminary (pre-reviewing) version of a sublinear variational algorithm for isotropic Gaussian mixture models (GMMs). Further developments of the algorithm for GMMs with diagonal covariance matrices (instead of isotropic clusters) and their corresponding benchmarking results have been published by TPAMI (doi:10.1109/TPAMI.2021.3133763) in the paper "A Variational EM Accele…

    Submitted 21 June, 2022; v1 submitted 1 October, 2018; originally announced October 2018.

  11. arXiv:1702.01997  [pdf, other]

    stat.ML cs.LG

    Truncated Variational EM for Semi-Supervised Neural Simpletrons

    Authors: Dennis Forster, Jörg Lücke

    Abstract: Inference and learning for probabilistic generative networks is often very challenging and typically prevents scalability to as large networks as used for deep discriminative approaches. To obtain efficiently trainable, large-scale and well performing generative networks for semi-supervised learning, we here combine two recent developments: a neural network reformulation of hierarchical Poisson mi…

    Submitted 7 February, 2017; originally announced February 2017.

  12. Neural Simpletrons - Minimalistic Directed Generative Networks for Learning with Few Labels

    Authors: Dennis Forster, Abdul-Saboor Sheikh, Jörg Lücke

    Abstract: Classifiers for the semi-supervised setting often combine strong supervised models with additional learning objectives to make use of unlabeled data. This results in powerful though very complex models that are hard to train and that demand additional labels for optimal parameter tuning, which are often not given when labeled data is very sparse. We here study a minimalistic multi-layer generative…

    Submitted 18 November, 2016; v1 submitted 28 June, 2015; originally announced June 2015.

    Journal ref: Neural Computation Volume 30, Issue 8, August 2018, p.2113-2174

  13. GP-select: Accelerating EM using adaptive subspace preselection

    Authors: Jacquelyn A. Shelton, Jan Gasthaus, Zhenwen Dai, Joerg Luecke, Arthur Gretton

    Abstract: We propose a nonparametric procedure to achieve fast inference in generative graphical models when the number of latent states is very large. The approach is based on iterative latent variable preselection, where we alternate between learning a 'selection function' to reveal the relevant latent variables, and use this to obtain a compact approximation of the posterior distribution for EM; this can…

    Submitted 17 July, 2016; v1 submitted 10 December, 2014; originally announced December 2014.

  14. Autonomous Cleaning of Corrupted Scanned Documents - A Generative Modeling Approach

    Authors: Zhenwen Dai, Jörg Lücke

    Abstract: We study the task of cleaning scanned text documents that are strongly corrupted by dirt such as manual line strokes, spilled ink etc. We aim at autonomously removing dirt from a single letter-size page based only on the information the page contains. Our approach, therefore, has to learn character representations without supervision and requires a mechanism to distinguish learned representations…

    Submitted 2 July, 2012; v1 submitted 12 January, 2012; originally announced January 2012.

    Comments: oral presentation and Google Student Travel Award; IEEE conference on Computer Vision and Pattern Recognition 2012