
Showing 1–50 of 106 results for author: Hennig, P

  1. arXiv:2507.00616

    math.DG cs.LG math.PR math.ST

    Geometric Gaussian Approximations of Probability Distributions

    Authors: Nathaël Da Costa, Bálint Mucsányi, Philipp Hennig

    Abstract: Approximating complex probability distributions, such as Bayesian posterior distributions, is of central interest in many applications. We study the expressivity of geometric Gaussian approximations. These consist of approximations by Gaussian pushforwards through diffeomorphisms or Riemannian exponential maps. We first review these two different kinds of geometric Gaussian approximations. Then we…

    Submitted 1 July, 2025; originally announced July 2025.

  2. arXiv:2506.17366

    stat.ML cs.LG math.NA math.PR math.ST

    Gaussian Processes and Reproducing Kernels: Connections and Equivalences

    Authors: Motonobu Kanagawa, Philipp Hennig, Dino Sejdinovic, Bharath K. Sriperumbudur

    Abstract: This monograph studies the relations between two approaches using positive definite kernels: probabilistic methods using Gaussian processes, and non-probabilistic methods using reproducing kernel Hilbert spaces (RKHS). They are widely studied and used in machine learning, statistics, and numerical analysis. Connections and equivalences between them are reviewed for fundamental topics such as regre…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 172 pages

  3. arXiv:2504.14701

    cs.LG stat.ML

    Connecting Parameter Magnitudes and Hessian Eigenspaces at Scale using Sketched Methods

    Authors: Andres Fernandez, Frank Schneider, Maren Mahsereci, Philipp Hennig

    Abstract: Recently, it has been observed that when training a deep neural net with SGD, the majority of the loss landscape's curvature quickly concentrates in a tiny *top* eigenspace of the loss Hessian, which remains largely stable thereafter. Independently, it has been shown that successful magnitude pruning masks for deep neural nets emerge early in training and remain stable thereafter. In this work, we…

    Submitted 20 April, 2025; originally announced April 2025.

    Comments: Accepted at TMLR 2025

  4. arXiv:2503.08343

    cs.LG math.NA

    Flexible and Efficient Probabilistic PDE Solvers through Gaussian Markov Random Fields

    Authors: Tim Weiland, Marvin Pförtner, Philipp Hennig

    Abstract: Mechanistic knowledge about the physical world is virtually always expressed via partial differential equations (PDEs). Recently, there has been a surge of interest in probabilistic PDE solvers -- Bayesian statistical models mostly based on Gaussian process (GP) priors which seamlessly combine empirical measurements and mechanistic knowledge. As such, they quantify uncertainties arising from e.g.…

    Submitted 11 March, 2025; originally announced March 2025.

  5. arXiv:2502.15015

    cs.LG stat.ML

    Accelerating Neural Network Training: An Analysis of the AlgoPerf Competition

    Authors: Priya Kasimbeg, Frank Schneider, Runa Eschenhagen, Juhan Bae, Chandramouli Shama Sastry, Mark Saroufim, Boyuan Feng, Less Wright, Edward Z. Yang, Zachary Nado, Sourabh Medapati, Philipp Hennig, Michael Rabbat, George E. Dahl

    Abstract: The goal of the AlgoPerf: Training Algorithms competition is to evaluate practical speed-ups in neural network training achieved solely by improving the underlying training algorithms. In the external tuning ruleset, submissions must provide workload-agnostic hyperparameter search spaces, while in the self-tuning ruleset they must be completely hyperparameter-free. In both rulesets, submissions ar…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: ICLR 2025; 23 pages, 5 figures, 8 tables

  6. arXiv:2502.03366

    cs.LG stat.ML

    Rethinking Approximate Gaussian Inference in Classification

    Authors: Bálint Mucsányi, Nathaël Da Costa, Philipp Hennig

    Abstract: In classification tasks, softmax functions are ubiquitously used as output activations to produce predictive probabilities. Such outputs only capture aleatoric uncertainty. To capture epistemic uncertainty, approximate Gaussian inference methods have been proposed, which output Gaussian distributions over the logit space. Predictives are then obtained as the expectations of the Gaussian distributi…

    Submitted 5 February, 2025; originally announced February 2025.

    Comments: 29 pages, 15 figures

  7. arXiv:2501.05279

    cs.LG stat.ML

    Learning convolution operators on compact Abelian groups

    Authors: Emilia Magnani, Ernesto De Vito, Philipp Hennig, Lorenzo Rosasco

    Abstract: We consider the problem of learning convolution operators associated to compact Abelian groups. We study a regularization-based approach and provide corresponding learning guarantees under natural regularity conditions on the convolution kernel. More precisely, we assume the convolution kernel is a function in a translation invariant Hilbert space and analyze a natural ridge regression (RR) estima…

    Submitted 10 April, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

    MSC Class: 68T05; 47A52; 42B10; 62J07 ACM Class: I.2.6; F.2.1; G.3

  8. arXiv:2412.15361

    cs.LG physics.ao-ph

    A Generative Framework for Probabilistic, Spatiotemporally Coherent Downscaling of Climate Simulation

    Authors: Jonathan Schmidt, Luca Schmidt, Felix Strnad, Nicole Ludwig, Philipp Hennig

    Abstract: Local climate information is crucial for impact assessment and decision-making, yet coarse global climate simulations cannot capture small-scale phenomena. Current statistical downscaling methods infer these phenomena as temporally decoupled spatial patches. However, to preserve physical properties, estimating spatio-temporally coherent high-resolution weather dynamics for multiple variables acros…

    Submitted 28 January, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: 15 pages, 6 figures, additional supplementary text and figures

  9. arXiv:2411.01036

    cs.LG stat.ML

    Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference

    Authors: Jonathan Wenger, Kaiwen Wu, Philipp Hennig, Jacob R. Gardner, Geoff Pleiss, John P. Cunningham

    Abstract: Model selection in Gaussian processes scales prohibitively with the size of the training dataset, both in time and memory. While many approximations exist, all incur inevitable approximation error. Recent work accounts for this error in the form of computational uncertainty, which enables -- at the cost of quadratic complexity -- an explicit tradeoff between computation and precision. Here we exte…

    Submitted 7 July, 2025; v1 submitted 1 November, 2024; originally announced November 2024.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2024)

  10. arXiv:2410.14325

    cs.LG stat.ML

    Debiasing Mini-Batch Quadratics for Applications in Deep Learning

    Authors: Lukas Tatzel, Bálint Mucsányi, Osane Hackel, Philipp Hennig

    Abstract: Quadratic approximations form a fundamental building block of machine learning methods. E.g., second-order optimizers try to find the Newton step into the minimum of a local quadratic proxy to the objective function; and the second-order approximation of a network's loss function can be used to quantify the uncertainty of its outputs via the Laplace approximation. When computations on the entire t…

    Submitted 28 January, 2025; v1 submitted 18 October, 2024; originally announced October 2024.

    Comments: Main text (including references): 13 pages, 6 figures; Supplements: 27 pages, 16 figures

  11. arXiv:2410.06800

    cs.LG stat.ML

    Efficient Weight-Space Laplace-Gaussian Filtering and Smoothing for Sequential Deep Learning

    Authors: Joanna Sliwa, Frank Schneider, Nathanael Bosch, Agustinus Kristiadi, Philipp Hennig

    Abstract: Efficiently learning a sequence of related tasks, such as in continual learning, poses a significant challenge for neural nets due to the delicate trade-off between catastrophic forgetting and loss of plasticity. We address this challenge with a grounded framework for sequentially learning related tasks based on Bayesian inference. Specifically, we treat the model's parameters as a nonlinear Gauss…

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: 20 pages, 8 figures

  12. arXiv:2407.13711

    cs.LG cs.AI

    FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning

    Authors: Tristan Cinquin, Marvin Pförtner, Vincent Fortuin, Philipp Hennig, Robert Bamler

    Abstract: Laplace approximations are popular techniques for endowing deep networks with epistemic uncertainty estimates as they can be applied without altering the predictions of the trained network, and they scale to large models and datasets. While the choice of prior strongly affects the resulting posterior distribution, computational tractability and lack of interpretability of the weight space typicall…

    Submitted 31 October, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  13. arXiv:2407.03951

    cs.LG

    Uncertainty-Guided Optimization on Large Language Model Search Trees

    Authors: Julia Grosse, Ruotian Wu, Ahmad Rashid, Philipp Hennig, Pascal Poupart, Agustinus Kristiadi

    Abstract: Tree search algorithms such as greedy and beam search are the standard when it comes to finding sequences of maximum likelihood in the decoding processes of large language models (LLMs). However, they are myopic since they do not take the complete root-to-leaf path into account. Moreover, they are agnostic to prior knowledge available about the process: For example, it does not consider that the o…

    Submitted 9 October, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: 10 pages

  14. arXiv:2406.05072

    cs.LG stat.ML

    Linearization Turns Neural Operators into Function-Valued Gaussian Processes

    Authors: Emilia Magnani, Marvin Pförtner, Tobias Weber, Philipp Hennig

    Abstract: Neural operators generalize neural networks to learn mappings between function spaces from data. They are commonly used to learn solution operators of parametric partial differential equations (PDEs) or propagators of time-dependent PDEs. However, to make them useful in high-stakes simulation scenarios, their inherent predictive error must be quantified reliably. We introduce LUNO, a novel framewo…

    Submitted 31 January, 2025; v1 submitted 7 June, 2024; originally announced June 2024.

    MSC Class: G.1.0; I.2.6; G.3; G.1.8

  15. arXiv:2406.05020

    cs.LG math.NA

    Scaling up Probabilistic PDE Simulators with Structured Volumetric Information

    Authors: Tim Weiland, Marvin Pförtner, Philipp Hennig

    Abstract: Modeling real-world problems with partial differential equations (PDEs) is a prominent topic in scientific machine learning. Classic solvers for this task continue to play a central role, e.g. to generate training data for deep learning analogues. Any such numerical solution is subject to multiple sources of uncertainty, both from limited computational resources and limited data (including unknown…

    Submitted 7 June, 2024; originally announced June 2024.

  16. arXiv:2406.03334

    cs.LG stat.ML

    Reparameterization invariance in approximate Bayesian inference

    Authors: Hrittik Roy, Marco Miani, Carl Henrik Ek, Philipp Hennig, Marvin Pförtner, Lukas Tatzel, Søren Hauberg

    Abstract: Current approximate posteriors in Bayesian neural networks (BNNs) exhibit a crucial limitation: they fail to maintain invariance under reparameterization, i.e. BNNs assign different posterior densities to different parametrizations of identical functions. This creates a fundamental flaw in the application of Bayesian principles as it breaks the correspondence between uncertainty over the parameter…

    Submitted 10 February, 2025; v1 submitted 5 June, 2024; originally announced June 2024.

  17. arXiv:2405.20918

    cs.SI physics.data-an stat.ML

    Flexible inference in heterogeneous and attributed multilayer networks

    Authors: Martina Contisciani, Marius Hobbhahn, Eleanor A. Power, Philipp Hennig, Caterina De Bacco

    Abstract: Networked datasets can be enriched by different types of information about individual nodes or edges. However, most existing methods for analyzing such datasets struggle to handle the complexity of heterogeneous data, often requiring substantial model-specific analysis. In this paper, we develop a probabilistic generative model to perform inference in multilayer networks with arbitrary types of in…

    Submitted 10 January, 2025; v1 submitted 31 May, 2024; originally announced May 2024.

  18. arXiv:2405.08971

    cs.LG math.NA stat.ML

    Computation-Aware Kalman Filtering and Smoothing

    Authors: Marvin Pförtner, Jonathan Wenger, Jon Cockayne, Philipp Hennig

    Abstract: Kalman filtering and smoothing are the foundational mechanisms for efficient inference in Gauss-Markov models. However, their time and memory complexities scale prohibitively with the size of the state space. This is particularly problematic in spatiotemporal regression problems, where the state dimension scales with the number of spatial observations. Existing approximate frameworks leverage low-…

    Submitted 12 March, 2025; v1 submitted 14 May, 2024; originally announced May 2024.

  19. arXiv:2402.12231

    cs.LG

    Diffusion Tempering Improves Parameter Estimation with Probabilistic Integrators for Ordinary Differential Equations

    Authors: Jonas Beck, Nathanael Bosch, Michael Deistler, Kyra L. Kadhim, Jakob H. Macke, Philipp Hennig, Philipp Berens

    Abstract: Ordinary differential equations (ODEs) are widely used to describe dynamical systems in science, but identifying parameters that explain experimental measurements is challenging. In particular, although ODEs are differentiable and would allow for gradient-based parameter optimization, the nonlinear dynamics of ODEs often lead to many local minima and extreme sensitivity to initial conditions. We t…

    Submitted 19 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  20. arXiv:2402.00809

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 6 August, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  21. arXiv:2312.14886

    cs.LG math.PR math.ST stat.ML

    Sample Path Regularity of Gaussian Processes from the Covariance Kernel

    Authors: Nathaël Da Costa, Marvin Pförtner, Lancelot Da Costa, Philipp Hennig

    Abstract: Gaussian processes (GPs) are the most common formalism for defining probability distributions over spaces of functions. While applications of GPs are myriad, a comprehensive understanding of GP sample paths, i.e. the function spaces over which they define a probability measure, is lacking. In practice, GPs are not constructed through a probability measure, but instead through a mean function and a…

    Submitted 16 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  22. arXiv:2311.00636

    cs.LG stat.ML

    Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures

    Authors: Runa Eschenhagen, Alexander Immer, Richard E. Turner, Frank Schneider, Philipp Hennig

    Abstract: The core components of many modern neural network architectures, such as transformers, convolutional, or graph neural networks, can be expressed as linear layers with $\textit{weight-sharing}$. Kronecker-Factored Approximate Curvature (K-FAC), a second-order optimisation method, has shown promise to speed up neural network training and thereby reduce computational costs. However, there is currentl…

    Submitted 11 January, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: NeurIPS 2023

  23. arXiv:2310.20285

    cs.LG stat.ML

    Accelerating Non-Conjugate Gaussian Processes By Trading Off Computation For Uncertainty

    Authors: Lukas Tatzel, Jonathan Wenger, Frank Schneider, Philipp Hennig

    Abstract: Non-conjugate Gaussian processes (NCGPs) define a flexible probabilistic framework to model categorical, ordinal and continuous data, and are widely used in practice. However, exact inference in NCGPs is prohibitively expensive for large datasets, thus requiring approximations in practice. The approximation error adversely impacts the reliability of the model and is not accounted for in the uncert…

    Submitted 17 April, 2025; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Main text: 15 pages, 7 figures; Supplements: 15 pages, 3 figures

  24. arXiv:2310.01145

    math.NA cs.LG stat.ML

    Parallel-in-Time Probabilistic Numerical ODE Solvers

    Authors: Nathanael Bosch, Adrien Corenflos, Fatemeh Yaghoobi, Filip Tronarp, Philipp Hennig, Simo Särkkä

    Abstract: Probabilistic numerical solvers for ordinary differential equations (ODEs) treat the numerical simulation of dynamical systems as problems of Bayesian state estimation. Aside from producing posterior distributions over ODE solutions and thereby quantifying the numerical approximation error of the method itself, one less-often noted advantage of this formalism is the algorithmic flexibility gained…

    Submitted 11 September, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Journal ref: Journal of Machine Learning Research, 2024

  25. arXiv:2306.07774

    stat.ML cs.LG

    The Rank-Reduced Kalman Filter: Approximate Dynamical-Low-Rank Filtering In High Dimensions

    Authors: Jonathan Schmidt, Philipp Hennig, Jörg Nick, Filip Tronarp

    Abstract: Inference and simulation in the context of high-dimensional dynamical systems remain computationally challenging problems. Some form of dimensionality reduction is required to make the problem tractable in general. In this paper, we propose a novel approximate Gaussian filtering and smoothing method which propagates low-rank approximations of the covariance matrices. This is accomplished by projec…

    Submitted 3 January, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 12 pages main text (including references) + 9 pages appendix, 6 figures

  26. arXiv:2306.07179

    cs.LG stat.ML

    Benchmarking Neural Network Training Algorithms

    Authors: George E. Dahl, Frank Schneider, Zachary Nado, Naman Agarwal, Chandramouli Shama Sastry, Philipp Hennig, Sourabh Medapati, Runa Eschenhagen, Priya Kasimbeg, Daniel Suo, Juhan Bae, Justin Gilmer, Abel L. Peirson, Bilal Khan, Rohan Anil, Mike Rabbat, Shankar Krishnan, Daniel Snider, Ehsan Amid, Kongtao Chen, Chris J. Maddison, Rakshith Vasudev, Michal Badura, Ankush Garg, Peter Mattson

    Abstract: Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning rate schedules, or data selection schemes) could save time, save computational resources, and lead to better, more accurate, models. Unfortunately, as a communi…

    Submitted 18 June, 2025; v1 submitted 12 June, 2023; originally announced June 2023.

    Comments: 102 pages, 8 figures, 41 tables

  27. arXiv:2305.14978

    math.NA cs.LG stat.ML

    Probabilistic Exponential Integrators

    Authors: Nathanael Bosch, Philipp Hennig, Filip Tronarp

    Abstract: Probabilistic solvers provide a flexible and efficient framework for simulation, uncertainty quantification, and inference in dynamical systems. However, like standard solvers, they suffer performance penalties for certain stiff systems, where small steps are required not for reasons of numerical accuracy but for the sake of stability. This issue is greatly alleviated in semi-linear problems by th…

    Submitted 19 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  28. arXiv:2305.13290

    cs.LG

    Uncertainty and Structure in Neural Ordinary Differential Equations

    Authors: Katharina Ott, Michael Tiemann, Philipp Hennig

    Abstract: Neural ordinary differential equations (ODEs) are an emerging class of deep learning models for dynamical systems. They are particularly useful for learning an ODE vector field from observed trajectories (i.e., inverse problems). We here consider aspects of these models relevant for their application in science and engineering. Scientific predictions generally require structured uncertainty estima…

    Submitted 22 May, 2023; originally announced May 2023.

  29. arXiv:2305.13248

    stat.ML cs.LG

    Bayesian Numerical Integration with Neural Networks

    Authors: Katharina Ott, Michael Tiemann, Philipp Hennig, François-Xavier Briol

    Abstract: Bayesian probabilistic numerical methods for numerical integration offer significant advantages over their non-Bayesian counterparts: they can encode prior information about the integrand, and can quantify uncertainty over estimates of an integral. However, the most popular algorithm in this class, Bayesian quadrature, is based on Gaussian process models and is therefore associated with a high com…

    Submitted 10 September, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Journal ref: PMLR 216:1606-1617, 2023

  30. arXiv:2302.07384

    cs.LG stat.ML

    The Geometry of Neural Nets' Parameter Spaces Under Reparametrization

    Authors: Agustinus Kristiadi, Felix Dangel, Philipp Hennig

    Abstract: Model reparametrization, which follows the change-of-variable rule of calculus, is a popular way to improve the training of neural nets. But it can also be problematic since it can induce inconsistencies in, e.g., Hessian-based flatness measures, optimization trajectories, and modes of probability densities. This complicates downstream analyses: e.g. one cannot definitively relate flatness with ge…

    Submitted 23 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023

  31. arXiv:2212.12474

    cs.LG math.NA stat.ML

    Physics-Informed Gaussian Process Regression Generalizes Linear PDE Solvers

    Authors: Marvin Pförtner, Ingo Steinwart, Philipp Hennig, Jonathan Wenger

    Abstract: Linear partial differential equations (PDEs) are an important, widely applied class of mechanistic models, describing physical processes such as heat transfer, electromagnetism, and wave propagation. In practice, specialized numerical methods based on discretization are used to solve PDEs. They generally use an estimate of the unknown model parameters and, if available, physical measurements for i…

    Submitted 28 April, 2024; v1 submitted 23 December, 2022; originally announced December 2022.

  32. arXiv:2209.00895

    cs.LG

    Optimistic Optimization of Gaussian Process Samples

    Authors: Julia Grosse, Cheng Zhang, Philipp Hennig

    Abstract: Bayesian optimization is a popular formalism for global optimization, but its computational costs limit it to expensive-to-evaluate functions. A competing, computationally more efficient, global optimization framework is optimistic optimization, which exploits prior knowledge about the geometry of the search space in form of a dissimilarity function. We investigate to which degree the conceptual a…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: 10 pages, 6 figures

  33. arXiv:2208.01565

    cs.LG math.NA

    Approximate Bayesian Neural Operators: Uncertainty Quantification for Parametric PDEs

    Authors: Emilia Magnani, Nicholas Krämer, Runa Eschenhagen, Lorenzo Rosasco, Philipp Hennig

    Abstract: Neural operators are a type of deep architecture that learns to solve (i.e. learns the nonlinear solution operator of) partial differential equations (PDEs). The current state of the art for these models does not provide explicit uncertainty quantification. This is arguably even more of a problem for this kind of tasks than elsewhere in machine learning, because the dynamical systems typically des…

    Submitted 2 August, 2022; originally announced August 2022.

  34. arXiv:2205.15449

    cs.LG math.NA stat.ML

    Posterior and Computational Uncertainty in Gaussian Processes

    Authors: Jonathan Wenger, Geoff Pleiss, Marvin Pförtner, Philipp Hennig, John P. Cunningham

    Abstract: Gaussian processes scale prohibitively with the size of the dataset. In response, many approximation methods have been developed, which inevitably introduce approximation error. This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior. Therefore in practice, GP models are often as much about the approximation method as they are abo…

    Submitted 9 October, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2022)

  35. arXiv:2205.10041

    cs.LG stat.ML

    Posterior Refinement Improves Sample Efficiency in Bayesian Neural Networks

    Authors: Agustinus Kristiadi, Runa Eschenhagen, Philipp Hennig

    Abstract: Monte Carlo (MC) integration is the de facto method for approximating the predictive distribution of Bayesian neural networks (BNNs). But, even with many MC samples, Gaussian-based BNNs could still yield bad predictive performance due to the posterior approximation's error. Meanwhile, alternatives to MC integration tend to be more expensive or biased. In this work, we experimentally show that the…

    Submitted 15 October, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022

  36. arXiv:2205.07531

    cs.LG cs.HC stat.ML

    Wasserstein t-SNE

    Authors: Fynn Bachmann, Philipp Hennig, Dmitry Kobak

    Abstract: Scientific datasets often have hierarchical structure: for example, in surveys, individual participants (samples) might be grouped at a higher level (units) such as their geographical region. In these settings, the interest is often in exploring the structure on the unit level rather than on the sample level. Units can be compared based on the distance between their means, however this ignores the…

    Submitted 23 June, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

    Comments: 16 pages, 10 figures, to be published at ECML/PKDD 2022

    Journal ref: ECML PKDD 2022

  37. arXiv:2203.03353

    stat.ML cs.LG

    Discovering Inductive Bias with Gibbs Priors: A Diagnostic Tool for Approximate Bayesian Inference

    Authors: Luca Rendsburg, Agustinus Kristiadi, Philipp Hennig, Ulrike von Luxburg

    Abstract: Full Bayesian posteriors are rarely analytically tractable, which is why real-world Bayesian inference heavily relies on approximate techniques. Approximations generally differ from the true posterior and require diagnostic tools to assess whether the inference can still be trusted. We investigate a new approach to diagnosing approximate inference: the approximation mismatch is attributed to a cha…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 24 pages, 9 figures, to be published in AISTATS22

  38. arXiv:2202.01287

    cs.LG stat.ML

    Fenrir: Physics-Enhanced Regression for Initial Value Problems

    Authors: Filip Tronarp, Nathanael Bosch, Philipp Hennig

    Abstract: We show how probabilistic numerics can be used to convert an initial value problem into a Gauss--Markov process parametrised by the dynamics of the initial value problem. Consequently, the often difficult problem of parameter estimation in ordinary differential equations is reduced to hyperparameter estimation in Gauss--Markov regression, which tends to be considerably easier. The method's relatio…

    Submitted 24 May, 2023; v1 submitted 2 February, 2022; originally announced February 2022.

  39. arXiv:2112.02100

    cs.MS cs.LG math.NA

    ProbNum: Probabilistic Numerics in Python

    Authors: Jonathan Wenger, Nicholas Krämer, Marvin Pförtner, Jonathan Schmidt, Nathanael Bosch, Nina Effenberger, Johannes Zenn, Alexandra Gessner, Toni Karvonen, François-Xavier Briol, Maren Mahsereci, Philipp Hennig

    Abstract: Probabilistic numerical methods (PNMs) solve numerical problems via probabilistic inference. They have been developed for linear algebra, optimization, integration and differential equation simulation. PNMs naturally incorporate prior information about a problem and quantify uncertainty due to finite computational resources as well as stochastic input. In this paper, we present ProbNum: a Python l…

    Submitted 3 December, 2021; originally announced December 2021.

  40. arXiv:2111.03577

    cs.LG stat.ML

    Mixtures of Laplace Approximations for Improved Post-Hoc Uncertainty in Deep Learning

    Authors: Runa Eschenhagen, Erik Daxberger, Philipp Hennig, Agustinus Kristiadi

    Abstract: Deep neural networks are prone to overconfident predictions on outliers. Bayesian neural networks and deep ensembles have both been shown to mitigate this problem to some extent. In this work, we aim to combine the benefits of the two approaches by proposing to predict with a Gaussian mixture model posterior that consists of a weighted sum of Laplace approximations of independently trained deep ne…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: Bayesian Deep Learning Workshop, NeurIPS 2021

  41. arXiv:2110.11812

    stat.ML cs.LG math.NA

    Probabilistic ODE Solutions in Millions of Dimensions

    Authors: Nicholas Krämer, Nathanael Bosch, Jonathan Schmidt, Philipp Hennig

    Abstract: Probabilistic solvers for ordinary differential equations (ODEs) have emerged as an efficient framework for uncertainty quantification and inference on dynamical systems. In this work, we explain the mathematical assumptions and detailed implementation schemes behind solving high-dimensional ODEs with a probabilistic numerical algorithm. This has not been possible before due to matrix-matrix ope…

    Submitted 22 October, 2021; originally announced October 2021.

  42. arXiv:2110.10770  [pdf, other]

    stat.ML cs.LG math.NA

    Pick-and-Mix Information Operators for Probabilistic ODE Solvers

    Authors: Nathanael Bosch, Filip Tronarp, Philipp Hennig

    Abstract: Probabilistic numerical solvers for ordinary differential equations compute posterior distributions over the solution of an initial value problem via Bayesian inference. In this paper, we leverage their probabilistic formulation to seamlessly include additional information as general likelihood terms. We show that second-order differential equations should be directly provided to the solver, inste…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: 13 pages, 7 figures

  43. arXiv:2107.00243  [pdf, other]

    cs.LG math.NA

    Preconditioning for Scalable Gaussian Process Hyperparameter Optimization

    Authors: Jonathan Wenger, Geoff Pleiss, Philipp Hennig, John P. Cunningham, Jacob R. Gardner

    Abstract: Gaussian process hyperparameter optimization requires linear solves with, and log-determinants of, large kernel matrices. Iterative numerical techniques are becoming popular to scale to larger datasets, relying on the conjugate gradient method (CG) for the linear solves and stochastic trace estimation for the log-determinant. This work introduces new algorithmic and theoretical insights for precon…

    Submitted 18 June, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: International Conference on Machine Learning (ICML)

  44. arXiv:2106.14806  [pdf, other]

    cs.LG stat.ML

    Laplace Redux -- Effortless Bayesian Deep Learning

    Authors: Erik Daxberger, Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Matthias Bauer, Philipp Hennig

    Abstract: Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection. The Laplace approximation (LA) is a classic, and arguably the simplest family of approximations for the intractable posteriors of deep neural networks. Yet, despite its simplicity, the L…

    Submitted 14 March, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 camera-ready version; source code: https://github.com/AlexImmer/Laplace

  45. arXiv:2106.10065  [pdf, other]

    cs.LG stat.ML

    Being a Bit Frequentist Improves Bayesian Neural Networks

    Authors: Agustinus Kristiadi, Matthias Hein, Philipp Hennig

    Abstract: Despite their compelling theoretical properties, Bayesian neural networks (BNNs) tend to perform worse than frequentist methods in classification-based uncertainty quantification (UQ) tasks such as out-of-distribution (OOD) detection. In this paper, based on empirical findings in prior works, we hypothesize that this issue is because even recent Bayesian methods have never considered OOD data in t…

    Submitted 2 February, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: AISTATS 2022

  46. arXiv:2106.08717  [pdf, other]

    cs.LG cs.AI

    Probabilistic DAG Search

    Authors: Julia Grosse, Cheng Zhang, Philipp Hennig

    Abstract: Exciting contemporary machine learning problems have recently been phrased in the classic formalism of tree search -- most famously, the game of Go. Interestingly, the state-space underlying these sequential decision-making problems often possesses a more general latent structure than can be captured by a tree. In this work, we develop a probabilistic framework to exploit a search space's latent stru…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: 10 pages, 8 figures, to be published at the Conference on Uncertainty in Artificial Intelligence (UAI) 2021

  47. arXiv:2106.07761  [pdf, other]

    stat.ML cs.LG math.NA

    Linear-Time Probabilistic Solutions of Boundary Value Problems

    Authors: Nicholas Krämer, Philipp Hennig

    Abstract: We propose a fast algorithm for the probabilistic solution of boundary value problems (BVPs), which are ordinary differential equations subject to boundary conditions. In contrast to previous work, we introduce a Gauss--Markov prior and tailor it specifically to BVPs, which allows computing a posterior distribution over the solution in linear time, at a quality and cost comparable to that of well-…

    Submitted 14 June, 2021; originally announced June 2021.

  48. arXiv:2106.02624  [pdf, other]

    cs.LG stat.ML

    ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

    Authors: Felix Dangel, Lukas Tatzel, Philipp Hennig

    Abstract: Curvature in the form of the Hessian or its generalized Gauss-Newton (GGN) approximation is valuable for algorithms that rely on a local model for the loss to train, compress, or explain deep networks. Existing methods based on implicit multiplication via automatic differentiation or Kronecker-factored block diagonal approximations do not consider noise in the mini-batch. We present ViViT, a curvature…

    Submitted 10 February, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Main text: 10 pages, 6 figures; Supplements: 26 pages, 27 figures, 5 tables

  49. arXiv:2105.06331  [pdf, other]

    cs.LG

    Informed Equation Learning

    Authors: Matthias Werner, Andrej Junginger, Philipp Hennig, Georg Martius

    Abstract: Distilling data into compact and interpretable analytic equations is one of the goals of science. Instead, contemporary supervised machine learning methods mostly produce unstructured and dense maps from input to output. Particularly in deep learning, this property is owed to the generic nature of simple standard link functions. To learn equations rather than maps, standard non-linearities can be…

    Submitted 13 May, 2021; originally announced May 2021.

  50. arXiv:2105.03109  [pdf, other]

    cs.LG stat.ML

    Laplace Matching for fast Approximate Inference in Latent Gaussian Models

    Authors: Marius Hobbhahn, Philipp Hennig

    Abstract: Bayesian inference on non-Gaussian data is often non-analytic and requires computationally expensive approximations such as sampling or variational inference. We propose an approximate inference framework primarily designed to be computationally cheap while still achieving high approximation quality. The concept, which we call Laplace Matching, involves closed-form, approximate, bi-directional tra…

    Submitted 11 October, 2022; v1 submitted 7 May, 2021; originally announced May 2021.

    Comments: Added experiments and clarifications; Currently under review at JMLR

    MSC Class: 68T37; ACM Class: G.3, I.2.0