-
Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements
Authors:
Antonio Marino,
Esteban Restrepo,
Claudio Pacchierotti,
Paolo Robuffo Giordano
Abstract:
This paper addresses the challenge of allocating heterogeneous resources among multiple agents in a decentralized manner. Our proposed method, LGTC-IPPO, builds upon Independent Proximal Policy Optimization (IPPO) by integrating dynamic cluster consensus, a mechanism that allows agents to form and adapt local sub-teams based on resource demands. This decentralized coordination strategy reduces reliance on global information and enhances scalability. We evaluate LGTC-IPPO against standard multi-agent reinforcement learning baselines and a centralized expert solution across a range of team sizes and resource distributions. Experimental results demonstrate that LGTC-IPPO achieves more stable rewards, better coordination, and robust performance even as the number of agents or resource types increases. Additionally, we illustrate how dynamic clustering enables agents to reallocate resources efficiently, even in scenarios with discharging resources.
Submitted 4 March, 2025;
originally announced March 2025.
-
Sensitivity of MCMC-based analyses to small-data removal
Authors:
Tin D. Nguyen,
Ryan Giordano,
Rachael Meager,
Tamara Broderick
Abstract:
If the conclusion of a data analysis is sensitive to dropping very few data points, that conclusion might hinge on the particular data at hand rather than representing a more broadly applicable truth. How could we check whether this sensitivity holds? One idea is to consider every small subset of data, drop it from the dataset, and re-run our analysis. But running MCMC to approximate a Bayesian posterior is already very expensive; running multiple times is prohibitive, and the number of re-runs needed here is combinatorially large. Recent work proposes a fast and accurate approximation to find the worst-case dropped data subset, but that work was developed for problems based on estimating equations -- and does not directly handle Bayesian posterior approximations using MCMC. We make two principal contributions in the present work. We adapt the existing data-dropping approximation to estimators computed via MCMC. Observing that Monte Carlo errors induce variability in the approximation, we use a variant of the bootstrap to quantify this uncertainty. We demonstrate how to use our approximation in practice to determine whether there is non-robustness in a problem. Empirically, our method is accurate in simple models, such as linear regression. In models with complicated structure, such as hierarchical models, the performance of our method is mixed.
Submitted 10 November, 2024; v1 submitted 13 August, 2024;
originally announced August 2024.
-
Could dropping a few cells change the takeaways from differential expression?
Authors:
Miriam Shiffman,
Ryan Giordano,
Tamara Broderick
Abstract:
Differential expression (DE) plays a fundamental role toward illuminating the molecular mechanisms driving a difference between groups (e.g., due to treatment or disease). While any analysis is run on particular cells/samples, the intent is to generalize to future occurrences of the treatment or disease. Implicitly, this step is justified by assuming that present and future samples are independent and identically distributed from the same population. Though this assumption is always false, we hope that any deviation from the assumption is small enough that A) conclusions of the analysis still hold and B) standard tools like standard error, significance, and power still reflect generalizability. Conversely, we might worry about these deviations, and reliance on standard tools, if conclusions could be substantively changed by dropping a very small fraction of data. While checking every small fraction is computationally intractable, recent work develops an approximation to identify when such an influential subset exists. Building on this work, we develop a metric for dropping-data robustness of DE; namely, we cast the analysis in a form suitable to the approximation, extend the approximation to models with data-dependent hyperparameters, and extend the notion of a data point from a single cell to a pseudobulk observation. We then overcome the inherent non-differentiability of gene set enrichment analysis to develop an additional approximation for the robustness of top gene sets. We assess robustness of DE for published single-cell RNA-seq data and discover that 1000s of genes can have their results flipped by dropping <1% of the data, including 100s that are sensitive to dropping a single cell (0.07%). Surprisingly, this non-robustness extends to high-level takeaways; half of the top 10 gene sets can be changed by dropping 1-2% of cells, and 2/10 can be changed by dropping a single cell.
Submitted 11 December, 2023;
originally announced December 2023.
-
The Bayesian Infinitesimal Jackknife for Variance
Authors:
Ryan Giordano,
Tamara Broderick
Abstract:
The frequentist variability of Bayesian posterior expectations can provide meaningful measures of uncertainty even when models are misspecified. Classical methods to asymptotically approximate the frequentist covariance of Bayesian estimators such as the Laplace approximation and the nonparametric bootstrap can be practically inconvenient, since the Laplace approximation may require an intractable integral to compute the marginal log posterior, and the bootstrap requires computing the posterior for many different bootstrap datasets. We develop and explore the infinitesimal jackknife (IJ), an alternative method for computing asymptotic frequentist covariance of smooth functionals of exchangeable data, which is based on the "influence function" of robust statistics. We show that the influence function for posterior expectations has the form of a simple posterior covariance, and that the IJ covariance estimate is, in turn, easily computed from a single set of posterior samples. Under conditions similar to those required for a Bayesian central limit theorem to apply, we prove that the corresponding IJ covariance estimate is asymptotically equivalent to the Laplace approximation and the bootstrap. In the presence of nuisance parameters that may not obey a central limit theorem, we argue using a von Mises expansion that the IJ covariance is inconsistent, but can remain a good approximation to the limiting frequentist variance. We demonstrate the accuracy and computational benefits of the IJ covariance estimates with simulated and real-world experiments.
Submitted 26 June, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
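The abstract above states that the influence function of a posterior expectation takes the form of a posterior covariance and that the IJ covariance can be computed from a single set of posterior draws. A minimal Python sketch of that idea follows. The toy model (normal mean, unit-variance likelihood, flat prior), the variable names, and the scaling convention `psi_n = N * Cov(g(theta), loglik_n(theta))` are assumptions made for illustration and are not taken from the listing. The data are deliberately generated with standard deviation 2, so the model is misspecified.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 100, 20000                      # data points, posterior draws (illustrative sizes)
x = rng.normal(loc=1.0, scale=2.0, size=N)

# For this toy model the posterior of the mean is exactly N(xbar, 1/N),
# so we sample it directly instead of running MCMC.
theta = rng.normal(loc=x.mean(), scale=1.0 / np.sqrt(N), size=M)

# Per-observation log-likelihood at every draw, shape (M, N).
loglik = -0.5 * (x[None, :] - theta[:, None]) ** 2

# Influence scores: N times the posterior covariance between g(theta) = theta
# and each observation's log-likelihood (assumed scaling, see lead-in).
g_centered = theta - theta.mean()
ll_centered = loglik - loglik.mean(axis=0)
psi = N * (g_centered @ ll_centered) / (M - 1)     # shape (N,)

# IJ estimate of the frequentist variance of the posterior mean.
ij_var = psi.var(ddof=1) / N

# Reference values for this toy problem.
sample_var_over_n = x.var(ddof=1) / N   # frequentist variance of xbar
posterior_var = theta.var(ddof=1)       # what the misspecified model reports

print(ij_var, sample_var_over_n, posterior_var)
```

In this toy setting the influence scores reduce analytically to `x_n - xbar`, so the IJ estimate should track the sample variance divided by `N` rather than the posterior variance of the misspecified model.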
-
Black Box Variational Inference with a Deterministic Objective: Faster, More Accurate, and Even More Black Box
Authors:
Ryan Giordano,
Martin Ingram,
Tamara Broderick
Abstract:
Automatic differentiation variational inference (ADVI) offers fast and easy-to-use posterior approximation in multiple modern probabilistic programming languages. However, its stochastic optimizer lacks clear convergence criteria and requires tuning parameters. Moreover, ADVI inherits the poor posterior uncertainty estimates of mean-field variational Bayes (MFVB). We introduce "deterministic ADVI" (DADVI) to address these issues. DADVI replaces the intractable MFVB objective with a fixed Monte Carlo approximation, a technique known in the stochastic optimization literature as the "sample average approximation" (SAA). By optimizing an approximate but deterministic objective, DADVI can use off-the-shelf second-order optimization, and, unlike standard mean-field ADVI, is amenable to more accurate posterior covariances via linear response (LR). In contrast to existing worst-case theory, we show that, on certain classes of common statistical problems, DADVI and the SAA can perform well with relatively few samples even in very high dimensions, though we also show that such favorable results cannot extend to variational approximations that are too expressive relative to mean-field ADVI. We show on a variety of real-world problems that DADVI reliably finds good solutions with default settings (unlike ADVI) and, together with LR covariances, is typically faster and more accurate than standard ADVI.
Submitted 17 January, 2024; v1 submitted 11 April, 2023;
originally announced April 2023.
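The sample average approximation described in the abstract above can be sketched in a few lines of Python: draw the standard-normal noise once, hold it fixed, and hand the resulting deterministic objective to an off-the-shelf optimizer. The one-dimensional Gaussian target, the names `m`, `s`, `z`, and `neg_elbo_saa`, and the use of SciPy's quasi-Newton BFGS routine are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

m, s = 3.0, 0.5                        # hypothetical target: log p(theta) = -(theta - m)^2 / (2 s^2)
rng = np.random.default_rng(0)
z = rng.normal(size=30)                # noise drawn once, then held fixed

def neg_elbo_saa(params):
    """Negative ELBO of a Gaussian q(theta) = N(mu, sigma^2), using the fixed draws z."""
    mu, log_sigma = params
    theta = mu + np.exp(log_sigma) * z            # reparameterized draws
    log_p = -0.5 * ((theta - m) / s) ** 2         # unnormalized log target
    entropy = log_sigma                           # Gaussian entropy up to a constant
    return -(log_p.mean() + entropy)

# The objective is deterministic, so a standard optimizer with its usual
# convergence criteria applies; no step-size schedule is tuned.
fit = minimize(neg_elbo_saa, x0=np.zeros(2), method="BFGS")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(mu_hat, sigma_hat)
```

For this toy target the fixed-draw optimum has a closed form, `sigma = s / np.std(z)` and `mu = m - sigma * z.mean()`, which makes the dependence of the SAA solution on the particular draws explicit.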
-
Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics (Rejoinder)
Authors:
Ryan Giordano,
Runjing Liu,
Michael I. Jordan,
Tamara Broderick
Abstract:
One can typically form a local robustness metric for a particular problem quite directly, for Markov chain Monte Carlo applications as well as optimization problems such as variational Bayes. However, we argue that simply forming a local robustness metric is not enough: the hard work is showing that it is useful. Computability, interpretability, and the ability of a local robustness metric to extrapolate well, are more important -- and often more difficult to establish -- than mere computation of derivatives.
Submitted 11 March, 2023;
originally announced March 2023.
-
Gaussian processes at the Helm(holtz): A more fluid model for ocean currents
Authors:
Renato Berlinghieri,
Brian L. Trippe,
David R. Burt,
Ryan Giordano,
Kaushik Srinivasan,
Tamay Özgökmen,
Junfei Xia,
Tamara Broderick
Abstract:
Given sparse observations of buoy velocities, oceanographers are interested in reconstructing ocean currents away from the buoys and identifying divergences in a current vector field. As a first and modular step, we focus on the time-stationary case - for instance, by restricting to short time periods. Since we expect current velocity to be a continuous but highly non-linear function of spatial location, Gaussian processes (GPs) offer an attractive model. But we show that applying a GP with a standard stationary kernel directly to buoy data can struggle at both current reconstruction and divergence identification, due to some physically unrealistic prior assumptions. To better reflect known physical properties of currents, we propose to instead put a standard stationary kernel on the divergence and curl-free components of a vector field obtained through a Helmholtz decomposition. We show that, because this decomposition relates to the original vector field just via mixed partial derivatives, we can still perform inference given the original data with only a small constant multiple of additional computational expense. We illustrate the benefits of our method with theory and experiments on synthetic and real ocean data.
Submitted 20 June, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics
Authors:
Ryan Giordano,
Runjing Liu,
Michael I. Jordan,
Tamara Broderick
Abstract:
Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks. Prior specification is, however, relatively difficult for such models, given that their flexibility implies that the consequences of prior choices are often relatively opaque. Moreover, these choices can have a substantial effect on posterior inferences. Thus, considerations of robustness need to go hand in hand with nonparametric modeling. In the current paper, we tackle this challenge by exploiting the fact that variational Bayesian methods, in addition to having computational advantages in fitting complex nonparametric models, also yield sensitivities with respect to parametric and nonparametric aspects of Bayesian models. In particular, we demonstrate how to assess the sensitivity of conclusions to the choice of concentration parameter and stick-breaking distribution for inferences under Dirichlet process mixtures and related mixture models. We provide both theoretical and empirical support for our variational approach to Bayesian sensitivity analysis.
Submitted 25 October, 2021; v1 submitted 7 July, 2021;
originally announced July 2021.
-
An Automatic Finite-Sample Robustness Metric: When Can Dropping a Little Data Make a Big Difference?
Authors:
Tamara Broderick,
Ryan Giordano,
Rachael Meager
Abstract:
Study samples often differ from the target populations of inference and policy decisions in non-random ways. Researchers typically believe that such departures from random sampling -- due to changes in the population over time and space, or difficulties in sampling truly randomly -- are small, and their corresponding impact on the inference should be small as well. We might therefore be concerned if the conclusions of our studies are excessively sensitive to a very small proportion of our sample data. We propose a method to assess the sensitivity of applied econometric conclusions to the removal of a small fraction of the sample. Manually checking the influence of all possible small subsets is computationally infeasible, so we use an approximation to find the most influential subset. Our metric, the "Approximate Maximum Influence Perturbation," is based on the classical influence function, and is automatically computable for common methods including (but not limited to) OLS, IV, MLE, GMM, and variational Bayes. We provide finite-sample error bounds on approximation performance. At minimal extra cost, we provide an exact finite-sample lower bound on sensitivity. We find that sensitivity is driven by a signal-to-noise ratio in the inference problem, is not reflected in standard errors, does not disappear asymptotically, and is not due to misspecification. While some empirical applications are robust, results of several influential economics papers can be overturned by removing less than 1% of the sample.
Submitted 19 July, 2023; v1 submitted 30 November, 2020;
originally announced November 2020.
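The influence-function approximation described in the abstract above can be illustrated for ordinary least squares. The Python sketch below differentiates the weighted OLS estimate with respect to per-observation weights, ranks observations by their effect on the slope, and compares the predicted change from dropping the top `k` with an exact refit. The simulated data, the helper `ols`, and the choice `k = 5` are assumptions for illustration only; this is a first-order sketch of the idea, not the authors' software.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 5                               # sample size and number of points to drop
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])         # intercept and slope
y = 0.1 * x + rng.normal(size=n)

def ols(X, y):
    """Ordinary least squares coefficients via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

beta = ols(X, y)
resid = y - X @ beta

# Derivative of the weighted OLS estimate with respect to each weight w_n,
# evaluated at w = 1: (X'X)^{-1} x_n r_n. Result has shape (n, 2).
infl = np.linalg.solve(X.T @ X, (X * resid[:, None]).T).T
slope_infl = infl[:, 1]

# Dropping a point moves its weight from 1 to 0, so the first-order change in
# the slope is minus its influence. To push the slope down as far as possible,
# drop the k points with the largest influence.
drop = np.argsort(slope_infl)[-k:]
approx_change = -slope_infl[drop].sum()

# Exact change from refitting without those points.
keep = np.setdiff1d(np.arange(n), drop)
exact_change = ols(X[keep], y[keep])[1] - beta[1]

print(approx_change, exact_change)
```

Sorting the influence scores costs one pass over the data, whereas checking every subset of size `k` exactly would require a combinatorial number of refits. The single refit at the end is the cheap sanity check on the linear approximation.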
-
A Higher-Order Swiss Army Infinitesimal Jackknife
Authors:
Ryan Giordano,
Michael I. Jordan,
Tamara Broderick
Abstract:
Cross validation (CV) and the bootstrap are ubiquitous model-agnostic tools for assessing the error or variability of machine learning and statistical estimators. However, these methods require repeatedly re-fitting the model with different weighted versions of the original dataset, which can be prohibitively time-consuming. For sufficiently regular optimization problems the optimum depends smoothly on the data weights, and so the process of repeatedly re-fitting can be approximated with a Taylor series that can often be evaluated relatively quickly. The first-order approximation is known as the "infinitesimal jackknife" in the statistics literature and has been the subject of recent interest in machine learning for approximate CV. In this work, we consider high-order approximations, which we call the "higher-order infinitesimal jackknife" (HOIJ). Under mild regularity conditions, we provide a simple recursive procedure to compute approximations of all orders with finite-sample accuracy bounds. Additionally, we show that the HOIJ can be efficiently computed even in high dimensions using forward-mode automatic differentiation. We show that a linear approximation with bootstrap weights is equivalent to that provided by asymptotic normal approximations. Consequently, the HOIJ opens up the possibility of enjoying higher-order accuracy properties of the bootstrap using local approximations. Consistency of the HOIJ for leave-one-out CV under different asymptotic regimes follows as corollaries from our finite-sample bounds under additional regularity assumptions. The generality of the computation and bounds motivates the name "higher-order Swiss Army infinitesimal jackknife."
Submitted 28 July, 2019;
originally announced July 2019.
-
Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics
Authors:
Ryan Giordano,
Runjing Liu,
Michael I. Jordan,
Tamara Broderick
Abstract:
Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks. However, due to the flexibility of these models, the consequences of prior choices can be opaque. And so prior specification can be relatively difficult. At the same time, prior choice can have a substantial effect on posterior inferences. Thus, considerations of robustness need to go hand in hand with nonparametric modeling. In the current paper, we tackle this challenge by exploiting the fact that variational Bayesian methods, in addition to having computational advantages in fitting complex nonparametric models, also yield sensitivities with respect to parametric and nonparametric aspects of Bayesian models. In particular, we demonstrate how to assess the sensitivity of conclusions to the choice of concentration parameter and stick-breaking distribution for inferences under Dirichlet process mixtures and related mixture models. We provide both theoretical and empirical support for our variational approach to Bayesian sensitivity analysis.
Submitted 25 January, 2022; v1 submitted 15 October, 2018;
originally announced October 2018.
-
A Swiss Army Infinitesimal Jackknife
Authors:
Ryan Giordano,
Will Stephenson,
Runjing Liu,
Michael I. Jordan,
Tamara Broderick
Abstract:
The error or variability of machine learning algorithms is often assessed by repeatedly re-fitting a model with different weighted versions of the observed data. The ubiquitous tools of cross-validation (CV) and the bootstrap are examples of this technique. These methods are powerful in large part due to their model agnosticism but can be slow to run on modern, large data sets due to the need to repeatedly re-fit the model. In this work, we use a linear approximation to the dependence of the fitting procedure on the weights, producing results that can be faster than repeated re-fitting by an order of magnitude. This linear approximation is sometimes known as the "infinitesimal jackknife" in the statistics literature, where it is mostly used as a theoretical tool to prove asymptotic results. We provide explicit finite-sample error bounds for the infinitesimal jackknife in terms of a small number of simple, verifiable assumptions. Our results apply whether the weights and data are stochastic or deterministic, and so can be used as a tool for proving the accuracy of the infinitesimal jackknife on a wide variety of problems. As a corollary, we state mild regularity conditions under which our approximation consistently estimates true leave-$k$-out cross-validation for any fixed $k$. These theoretical results, together with modern automatic differentiation software, support the application of the infinitesimal jackknife to a wide variety of practical problems in machine learning, providing a "Swiss Army infinitesimal jackknife". We demonstrate the accuracy of our methods on a range of simulated and real datasets.
Submitted 7 February, 2020; v1 submitted 1 June, 2018;
originally announced June 2018.
-
Measuring Cluster Stability for Bayesian Nonparametrics Using the Linear Bootstrap
Authors:
Ryan Giordano,
Runjing Liu,
Nelle Varoquaux,
Michael I. Jordan,
Tamara Broderick
Abstract:
Clustering procedures typically estimate which data points are clustered together, a quantity of primary importance in many analyses. Often used as a preliminary step for dimensionality reduction or to facilitate interpretation, finding robust and stable clusters is often crucial for appropriate downstream analysis. In the present work, we consider Bayesian nonparametric (BNP) models, a particularly popular set of Bayesian models for clustering due to their flexibility. Because of its complexity, the Bayesian posterior often cannot be computed exactly, and approximations must be employed. Mean-field variational Bayes (MFVB) forms a posterior approximation by solving an optimization problem and is widely used due to its speed. An exact BNP posterior might vary dramatically when presented with different data. As such, stability and robustness of the clustering should be assessed.
A popular means of assessing stability is to apply the bootstrap by resampling the data and rerunning the clustering for each simulated data set. The time cost is thus often very high, especially for the sort of exploratory analysis where clustering is typically used. We propose to use a fast and automatic approximation to the full bootstrap called the "linear bootstrap", which can be seen as a local data perturbation. In this work, we demonstrate how to apply this idea to a data analysis pipeline, consisting of an MFVB approximation to a BNP clustering posterior of time course gene expression data. We show that using auto-differentiation tools, the necessary calculations can be done automatically, and that the linear bootstrap is a fast but approximate alternative to the bootstrap.
Submitted 4 December, 2017;
originally announced December 2017.
-
Covariances, Robustness, and Variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick,
Michael I. Jordan
Abstract:
Mean-field Variational Bayes (MFVB) is an approximate Bayesian posterior inference technique that is increasingly popular due to its fast runtimes on large-scale datasets. However, even when MFVB provides accurate posterior means for certain parameters, it often mis-estimates variances and covariances. Furthermore, prior robustness measures have remained undeveloped for MFVB. By deriving a simple formula for the effect of infinitesimal model perturbations on MFVB posterior means, we provide both improved covariance estimates and local robustness measures for MFVB, thus greatly expanding the practical usefulness of MFVB posterior approximations. The estimates for MFVB posterior covariances rely on a result from the classical Bayesian robustness literature relating derivatives of posterior expectations to posterior covariances and include the Laplace approximation as a special case. Our key condition is that the MFVB approximation provides good estimates of a select subset of posterior means---an assumption that has been shown to hold in many practical settings. In our experiments, we demonstrate that our methods are simple, general, and fast, providing accurate posterior uncertainty estimates and robustness measures with runtimes that can be an order of magnitude faster than MCMC.
Submitted 17 October, 2018; v1 submitted 8 September, 2017;
originally announced September 2017.
-
Fast Measurements of Robustness to Changing Priors in Variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick,
Michael Jordan
Abstract:
In Bayesian analysis, the posterior follows from the data and a choice of a prior and a likelihood. One hopes that the posterior is robust to reasonable variation in the choice of prior, since this choice is made by the modeler and is often somewhat subjective. A different, equally subjectively plausible choice of prior may result in a substantially different posterior, and so different conclusions drawn from the data. Were this to be the case, our conclusions would not be robust to the choice of prior. To determine whether our model is robust, we must quantify how sensitive our posterior is to perturbations of our prior. Despite the importance of the problem and a considerable body of literature, generic, easy-to-use methods to quantify Bayesian robustness are still lacking.
In this paper, we demonstrate that powerful measures of robustness can be easily calculated from Variational Bayes (VB) approximate posteriors. We begin with local robustness, which measures the effect of infinitesimal changes to the prior on a posterior mean of interest. In particular, we show that the influence function of Gustafson (2012) has a simple, easy-to-calculate closed form expression for VB approximations. We then demonstrate how local robustness measures can be inadequate for non-local prior changes, such as replacing one prior entirely with another. We propose a simple approximate non-local robustness measure and demonstrate its effectiveness on a simulated data set.
Submitted 6 December, 2016; v1 submitted 22 November, 2016;
originally announced November 2016.
-
Learning an Astronomical Catalog of the Visible Universe through Scalable Bayesian Inference
Authors:
Jeffrey Regier,
Kiran Pamnany,
Ryan Giordano,
Rollin Thomas,
David Schlegel,
Jon McAuliffe,
Prabhat
Abstract:
Celeste is a procedure for inferring astronomical catalogs that attains state-of-the-art scientific results. To date, Celeste has been scaled to at most hundreds of megabytes of astronomical images: Bayesian posterior inference is notoriously demanding computationally. In this paper, we report on a scalable, parallel version of Celeste, suitable for learning catalogs from modern large-scale astronomical datasets. Our algorithmic innovations include a fast numerical optimization routine for Bayesian posterior inference and a statistically efficient scheme for decomposing astronomical optimization problems into subproblems.
Our scalable implementation is written entirely in Julia, a new high-level dynamic programming language designed for scientific and numerical computing. We use Julia's high-level constructs for shared and distributed memory parallelism, and demonstrate effective load balancing and efficient scaling on up to 8192 Xeon cores on the NERSC Cori supercomputer.
Submitted 10 November, 2016;
originally announced November 2016.
-
Fast robustness quantification with variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick,
Rachael Meager,
Jonathan Huggins,
Michael Jordan
Abstract:
Bayesian hierarchical models are increasingly popular in economics. When using hierarchical models, it is useful not only to calculate posterior expectations, but also to measure the robustness of these expectations to reasonable alternative prior choices. We use variational Bayes and linear response methods to provide fast, accurate posterior means and robustness measures with an application to measuring the effectiveness of microcredit in the developing world.
Submitted 22 June, 2016;
originally announced June 2016.
-
Robust Inference with Variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick,
Michael Jordan
Abstract:
In Bayesian analysis, the posterior follows from the data and a choice of a prior and a likelihood. One hopes that the posterior is robust to reasonable variation in the choice of prior and likelihood, since this choice is made by the modeler and is necessarily somewhat subjective. Despite the fundamental importance of the problem and a considerable body of literature, the tools of robust Bayes are not commonly used in practice. This is in large part due to the difficulty of calculating robustness measures from MCMC draws. Although methods for computing robustness measures from MCMC draws exist, they lack generality and often require additional coding or computation.
In contrast to MCMC, variational Bayes (VB) techniques are readily amenable to robustness analysis. The derivative of a posterior expectation with respect to a prior or data perturbation is a measure of local robustness to the prior or likelihood. Because VB casts posterior inference as an optimization problem, its methodology is built on the ability to calculate derivatives of posterior quantities with respect to model parameters, even in very complex models. In the present work, we develop local prior robustness measures for mean-field variational Bayes (MFVB), a VB technique which imposes a particular factorization assumption on the variational posterior approximation. We start by outlining existing local prior measures of robustness. Next, we use these results to derive closed-form measures of the sensitivity of the mean-field variational posterior approximation to prior specification. We demonstrate our method on a meta-analysis of randomized controlled interventions in access to microcredit in developing countries.
Submitted 8 December, 2015;
originally announced December 2015.
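The abstract above describes local robustness as the derivative of a posterior expectation with respect to a prior perturbation. The following is a minimal sketch of that idea only, not the paper's method: it assumes a hypothetical conjugate normal model with known noise precision, where the posterior mean has a closed form, and compares the analytic derivative with respect to the prior mean against a finite-difference estimate. The function name, data values, and precisions are all illustrative assumptions.

```python
def posterior_mean(prior_mean, prior_precision, data, noise_precision):
    """Posterior mean of a normal mean under a normal prior (known noise precision)."""
    post_precision = prior_precision + len(data) * noise_precision
    return (prior_precision * prior_mean + noise_precision * sum(data)) / post_precision


# Hypothetical data and hyperparameters, chosen only for illustration.
data = [1.2, 0.7, 1.9, 1.1]
prior_precision = 0.5
noise_precision = 2.0

# Analytic local sensitivity: d E[theta | data] / d prior_mean.
analytic_sensitivity = prior_precision / (prior_precision + len(data) * noise_precision)

# Central finite-difference check of the same derivative around prior_mean = 0.
eps = 1e-6
numeric_sensitivity = (
    posterior_mean(eps, prior_precision, data, noise_precision)
    - posterior_mean(-eps, prior_precision, data, noise_precision)
) / (2 * eps)

print("analytic:", analytic_sensitivity)
print("numeric: ", numeric_sensitivity)
```

In this toy model the sensitivity shrinks as more data arrive, since the prior precision becomes a smaller share of the posterior precision.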
-
Linear Response Methods for Accurate Covariance Estimates from Mean Field Variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick,
Michael Jordan
Abstract:
Mean field variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, it is well known that a major failing of MFVB is that it underestimates the uncertainty of model variables (sometimes severely) and provides no information about model variable covariance.
We generalize linear response methods from statistical physics to deliver accurate uncertainty estimates for model variables---both for individual variables and coherently across variables. We call our method linear response variational Bayes (LRVB). When the MFVB posterior approximation is in the exponential family, LRVB has a simple, analytic form, even for non-conjugate models. Indeed, we make no assumptions about the form of the true posterior. We demonstrate the accuracy and scalability of our method on a range of models for both simulated and real data.
Submitted 23 December, 2015; v1 submitted 12 June, 2015;
originally announced June 2015.
-
Covariance Matrices and Influence Scores for Mean Field Variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick
Abstract:
Mean field variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, it is well known that a major failing of MFVB is that it underestimates the uncertainty of model variables (sometimes severely) and provides no information about model variable covariance. We develop a fast, general methodology for exponential families that augments MFVB to deliver accurate uncertainty estimates for model variables -- both for individual variables and coherently across variables. MFVB for exponential families defines a fixed-point equation in the means of the approximating posterior, and our approach yields a covariance estimate by perturbing this fixed point. Inspired by linear response theory, we call our method linear response variational Bayes (LRVB). We also show how LRVB can be used to quickly calculate a measure of the influence of individual data points on parameter point estimates. We demonstrate the accuracy and scalability of our method by learning Gaussian mixture models for both simulated and real data.
Submitted 26 February, 2015;
originally announced February 2015.
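The fixed-point perturbation described in the abstract above can be illustrated on a toy problem. This is a simplified sketch under stated assumptions, not the authors' implementation: the target is a hypothetical correlated bivariate Gaussian, only the posterior means are treated as variational parameters, and the correction takes the form (I - V H)^{-1} V, where V is the mean-field covariance and H is the sensitivity of each coordinate's update to the other coordinates' means. All variable names are illustrative.

```python
import numpy as np

# Hypothetical toy target: a correlated bivariate Gaussian "posterior".
true_cov = np.array([[1.0, 0.8],
                     [0.8, 1.0]])
precision = np.linalg.inv(true_cov)

# For a Gaussian target, the optimal mean-field factors are Gaussians with
# variance 1 / precision[i, i]; cross-covariances are zero by construction.
mfvb_cov = np.diag(1.0 / np.diag(precision))

# Sensitivity of each coordinate's fixed-point update to the other means:
# minus the off-diagonal part of the precision matrix.
hessian_offdiag = -(precision - np.diag(np.diag(precision)))

# Linear-response correction: differentiate the fixed point with respect to a
# linear perturbation of the log density and solve for d(mean)/d(perturbation).
lrvb_cov = np.linalg.solve(np.eye(2) - mfvb_cov @ hessian_offdiag, mfvb_cov)

print("MFVB variances:", np.diag(mfvb_cov))
print("LRVB covariance:")
print(lrvb_cov)
```

In this Gaussian case the mean-field variances fall below the true variances, while the corrected matrix matches the true covariance, including the off-diagonal term that the mean-field approximation sets to zero.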
-
Covariance Matrices for Mean Field Variational Bayes
Authors:
Ryan Giordano,
Tamara Broderick
Abstract:
Mean Field Variational Bayes (MFVB) is a popular posterior approximation method due to its fast runtime on large-scale data sets. However, it is well known that a major failing of MFVB is its (sometimes severe) underestimates of the uncertainty of model variables and lack of information about model variable covariance. We develop a fast, general methodology for exponential families that augments MFVB to deliver accurate uncertainty estimates for model variables -- both for individual variables and coherently across variables. MFVB for exponential families defines a fixed-point equation in the means of the approximating posterior, and our approach yields a covariance estimate by perturbing this fixed point. Inspired by linear response theory, we call our method linear response variational Bayes (LRVB). We demonstrate the accuracy of our method on simulated data sets.
Submitted 8 December, 2014; v1 submitted 24 October, 2014;
originally announced October 2014.