-
Predictive performance of power posteriors
Authors:
Yann McLatchie,
Edwin Fong,
David T. Frazier,
Jeremias Knoblauch
Abstract:
We analyse the impact of using tempered likelihoods in the production of posterior predictions. While the choice of temperature has an impact on predictive performance in small samples, we formally show that in moderate-to-large samples, tempering does not impact posterior predictions.
Submitted 13 May, 2025; v1 submitted 16 August, 2024;
originally announced August 2024.
-
Exact Sampling of Gibbs Measures with Estimated Losses
Authors:
David T. Frazier,
Jeremias Knoblauch,
Jack Jewson,
Christopher Drovandi
Abstract:
In recent years, the shortcomings of Bayesian posteriors as inferential devices have received increased attention. A popular strategy for fixing them has been to instead target a Gibbs measure based on losses that connect a parameter of interest to observed data. However, existing theory for such inference procedures assumes these losses are analytically available, while in many situations these losses must be stochastically estimated using pseudo-observations. In such cases, we show that when standard Markov Chain Monte Carlo algorithms are used to produce posterior samples, the resulting posterior exhibits strong dependence on the number of pseudo-observations: unless the number of pseudo-observations diverges sufficiently fast, the resulting posterior will concentrate very slowly. However, we show that in many situations it is feasible to alleviate this dependence entirely using a modified piecewise deterministic Markov process (PDMP) sampler, and we formally and empirically show that these samplers produce posterior draws that have no dependence on the number of pseudo-observations used to estimate the loss within the Gibbs measure. We apply our results to three examples that feature intractable likelihoods and model misspecification.
Submitted 22 April, 2025; v1 submitted 24 April, 2024;
originally announced April 2024.
-
A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods
Authors:
Veit David Wild,
Sahra Ghalebikesabi,
Dino Sejdinovic,
Jeremias Knoblauch
Abstract:
We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this is to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation in the space of probability measures. On a technical level, our contribution amounts to studying generalised variational inference through the lens of Wasserstein gradient flows. The result is a unified theory of various seemingly disconnected approaches that are commonly used for uncertainty quantification in deep learning, including deep ensembles and (variational) Bayesian methods. This offers a fresh perspective on the reasons behind the success of deep ensembles over procedures based on parameterised variational inference, and allows the derivation of new ensembling schemes with convergence guarantees. We showcase this by proposing a family of interacting deep ensembles with direct parallels to the interactions of particle systems in thermodynamics, and use our theory to prove the convergence of these algorithms to a well-defined global minimiser on the space of probability measures.
Submitted 22 October, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Robustifying likelihoods by optimistically re-weighting data
Authors:
Miheer Dewaskar,
Christopher Tosh,
Jeremias Knoblauch,
David B. Dunson
Abstract:
Likelihood-based inferences have been remarkably successful in wide-spanning application areas. However, even after due diligence in selecting a good model for the data at hand, there is inevitably some amount of model misspecification: outliers, data contamination or inappropriate parametric assumptions such as Gaussianity mean that most models are at best rough approximations of reality. A significant practical concern is that for certain inferences, even small amounts of model misspecification may have a substantial impact; a problem we refer to as brittleness. This article attempts to address the brittleness problem in likelihood-based inferences by choosing the most model-friendly data-generating process in a distance-based neighborhood of the empirical measure. This leads to a new Optimistically Weighted Likelihood (OWL), which robustifies the original likelihood by formally accounting for a small amount of model misspecification. Focusing on total variation (TV) neighborhoods, we study theoretical properties, develop estimation algorithms and illustrate the methodology in applications to mixture models and regression.
Submitted 10 September, 2024; v1 submitted 18 March, 2023;
originally announced March 2023.
-
Generalised Bayesian Inference for Discrete Intractable Likelihood
Authors:
Takuo Matsubara,
Jeremias Knoblauch,
François-Xavier Briol,
Chris. J. Oates
Abstract:
Discrete state spaces represent a major computational challenge to statistical inference, since the computation of normalisation constants requires summation over large or possibly infinite sets, which can be impractical. This paper addresses this computational challenge through the development of a novel generalised Bayesian inference procedure suitable for discrete intractable likelihood. Inspired by recent methodological advances for continuous data, the main idea is to update beliefs about model parameters using a discrete Fisher divergence, in lieu of the problematic intractable likelihood. The result is a generalised posterior that can be sampled from using standard computational tools, such as Markov chain Monte Carlo, circumventing the intractable normalising constant. The statistical properties of the generalised posterior are analysed, with sufficient conditions for posterior consistency and asymptotic normality established. In addition, a novel and general approach to calibration of generalised posteriors is proposed. Applications are presented on lattice models for discrete spatial data and on multivariate models for count data, where in each case the methodology facilitates generalised Bayesian inference at low computational cost.
Submitted 1 September, 2023; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Robust Generalised Bayesian Inference for Intractable Likelihoods
Authors:
Takuo Matsubara,
Jeremias Knoblauch,
François-Xavier Briol,
Chris. J. Oates
Abstract:
Generalised Bayesian inference updates prior beliefs using a loss function, rather than a likelihood, and can therefore be used to confer robustness against possible mis-specification of the likelihood. Here we consider generalised Bayesian inference with a Stein discrepancy as a loss function, motivated by applications in which the likelihood contains an intractable normalisation constant. In this context, the Stein discrepancy circumvents evaluation of the normalisation constant and produces generalised posteriors that are either closed form or accessible using standard Markov chain Monte Carlo. On a theoretical level, we show consistency, asymptotic normality, and bias-robustness of the generalised posterior, highlighting how these properties are impacted by the choice of Stein discrepancy. Then, we provide numerical experiments on a range of intractable distributions, including applications to kernel-based exponential family models and non-Gaussian graphical models.
Submitted 11 January, 2022; v1 submitted 15 April, 2021;
originally announced April 2021.
-
Frequentist Consistency of Generalized Variational Inference
Authors:
Jeremias Knoblauch
Abstract:
This paper investigates Frequentist consistency properties of the posterior distributions constructed via Generalized Variational Inference (GVI). A number of generic and novel strategies are given for proving consistency, relying on the theory of $\Gamma$-convergence. Specifically, this paper shows that under minimal regularity conditions, the sequence of GVI posteriors is consistent and collapses to a point mass at the population-optimal parameter value as the number of observations goes to infinity. The results extend to the latent variable case without additional assumptions and hold under misspecification. Lastly, the paper explains how to apply the results to a selection of GVI posteriors with especially popular variational families. For example, consistency is established for GVI methods using the mean field normal variational family, normal mixtures, Gaussian process variational families as well as neural networks indexing a normal (mixture) distribution.
Submitted 10 December, 2019;
originally announced December 2019.