-
Sacred and Profane: from the Involutive Theory of MCMC to Helpful Hamiltonian Hacks
Authors:
Nathan E. Glatt-Holtz,
Andrew J. Holbrook,
Justin A. Krometis,
Cecilia F. Mondaini,
Ami Sheth
Abstract:
In the first edition of this Handbook, two remarkable chapters consider seemingly distinct yet deeply connected subjects ...
Submitted 29 October, 2024; v1 submitted 22 October, 2024;
originally announced October 2024.
-
A tale of two faults: Statistical reconstruction of the 1820 Flores Sea earthquake using tsunami observations alone
Authors:
T. Paskett,
J. P. Whitehead,
R. A. Harris,
C. Ashcroft,
J. A. Krometis,
I. Sorensen,
R. Wonnacott
Abstract:
Using a Bayesian approach, we compare anecdotal tsunami runup observations from the 29 December 1820 Flores Sea earthquake with close to 200,000 tsunami simulations to determine the most probable earthquake parameters causing the tsunami. Using a dual hypothesis of the source earthquake originating from either the Flores Thrust or the Walanae/Selayar Fault, we found that neither source perfectly matches the observational data, particularly while satisfying seismic constraints of the region. However, there is clear quantitative evidence that a major earthquake on the Walanae/Selayar Fault more closely aligns with historical records of the tsunami and earthquake shaking. The simulated data available from this study suggest the potential for a different source in the region, or the occurrence of an earthquake near where both faults potentially merge and simultaneously rupture, similar to the 2016 Kaikoura, New Zealand event.
Submitted 2 May, 2023;
originally announced May 2023.
-
A Statistical Framework for Domain Shape Estimation in Stokes Flows
Authors:
Jeff Borggaard,
Nathan E. Glatt-Holtz,
Justin A. Krometis
Abstract:
We develop and implement a Bayesian approach for the estimation of the shape of a two-dimensional annular domain enclosing a Stokes flow from sparse and noisy observations of the enclosed fluid. Our setup includes the case of direct observations of the flow field as well as the measurement of concentrations of a solute passively advected by and diffusing within the flow. Adopting a statistical approach provides estimates of uncertainty in the shape due both to the non-invertibility of the forward map and to error in the measurements. When the shape represents a design problem of attempting to match desired target outcomes, this "uncertainty" can be interpreted as identifying remaining degrees of freedom available to the designer. We demonstrate the viability of our framework on three concrete test problems. These problems illustrate the promise of our framework for applications while providing a collection of test cases for recently developed Markov Chain Monte Carlo (MCMC) algorithms designed to resolve infinite-dimensional statistical quantities.
Submitted 6 December, 2022;
originally announced December 2022.
-
Parallel MCMC Algorithms: Theoretical Foundations, Algorithm Design, Case Studies
Authors:
Nathan E. Glatt-Holtz,
Andrew J. Holbrook,
Justin A. Krometis,
Cecilia F. Mondaini
Abstract:
Parallel Markov Chain Monte Carlo (pMCMC) algorithms generate clouds of proposals at each step to efficiently resolve a target probability distribution. We build a rigorous foundational framework for pMCMC algorithms that situates these methods within a unified 'extended phase space' measure-theoretic formalism. Drawing on our recent work that provides a comprehensive theory for reversible single proposal methods, we herein derive general criteria for multiproposal acceptance mechanisms which yield ergodic chains on general state spaces. Our formulation encompasses a variety of methodologies, including proposal cloud resampling and Hamiltonian methods, while providing a basis for the derivation of novel algorithms. In particular, we obtain a top-down picture for a class of methods arising from 'conditionally independent' proposal structures. As an immediate application, we identify several new algorithms including a multiproposal version of the popular preconditioned Crank-Nicolson (pCN) sampler suitable for high- and infinite-dimensional target measures which are absolutely continuous with respect to a Gaussian base measure. To supplement our theoretical results, we carry out a selection of numerical case studies that evaluate the efficacy of these novel algorithms. First, noting that the true potential of pMCMC algorithms arises from their natural parallelizability, we provide a limited parallelization study using TensorFlow and a graphics processing unit to scale pMCMC algorithms that leverage as many as 100k proposals at each step. Second, we use our multiproposal pCN algorithm (mpCN) to resolve a selection of problems in Bayesian statistical inversion for partial differential equations motivated by fluid measurement. These examples provide preliminary evidence of the efficacy of mpCN for high-dimensional target distributions featuring complex geometries and multimodal structures.
Submitted 17 July, 2024; v1 submitted 10 September, 2022;
originally announced September 2022.
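For readers unfamiliar with the preconditioned Crank-Nicolson (pCN) sampler named in the abstract above, the following is a minimal sketch of the standard single-proposal pCN step, not the multiproposal mpCN algorithm of the paper. It assumes a standard Gaussian reference measure N(0, I) in finite dimension and a target proportional to exp(-phi(u)) times that reference. The names `pcn_sample`, `phi`, `beta`, and `misfit` are illustrative and do not come from the paper.

```python
import math
import random


def pcn_sample(phi, dim, n_steps, beta=0.3, seed=0):
    """Standard pCN chain for a target proportional to exp(-phi(u)) * N(0, I).

    phi     : callable taking a list of floats, returning a float (assumed potential)
    beta    : step-size parameter in (0, 1]
    returns : (list of states, acceptance rate)
    """
    rng = random.Random(seed)
    u = [0.0] * dim
    phi_u = phi(u)
    scale = math.sqrt(1.0 - beta * beta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # pCN proposal: v = sqrt(1 - beta^2) * u + beta * xi, with xi ~ N(0, I).
        # This move leaves the Gaussian reference measure invariant.
        v = [scale * ui + beta * rng.gauss(0.0, 1.0) for ui in u]
        phi_v = phi(v)
        # The acceptance probability min(1, exp(phi(u) - phi(v))) involves
        # only the potential, not the Gaussian reference density.
        if math.log(1.0 - rng.random()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        chain.append(list(u))
    return chain, accepted / n_steps


def misfit(u):
    """Hypothetical potential: a single noisy observation of each component at 1.0."""
    return 0.5 * sum((ui - 1.0) ** 2 for ui in u)


if __name__ == "__main__":
    chain, rate = pcn_sample(misfit, dim=1, n_steps=20000, beta=0.5, seed=1)
    mean = sum(state[0] for state in chain) / len(chain)
    print("acceptance rate:", rate, "posterior mean estimate:", mean)
```

Because the acceptance ratio depends only on the change in `phi`, the step remains well defined as the dimension grows, which is the usual motivation for pCN in high- and infinite-dimensional settings.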
-
Embracing Uncertainty in "Small Data" Problems: Estimating Earthquakes from Historical Anecdotes
Authors:
Nathan E. Glatt-Holtz,
Ronald A. Harris,
Andrew J. Holbrook,
Justin A. Krometis,
Yonatan Kurniawan,
Hayden Ringer,
Jared P. Whitehead
Abstract:
Seismic risk estimates will be vastly improved with an increased understanding of historical (and pre-historical) seismic events. However, the only existing data for these events are anecdotal and sparse. To address this, we developed a framework based on Bayesian inference to estimate the location and magnitude of pre-instrumental earthquakes. We present a careful analysis of results obtained from this procedure which justifies the sampling algorithm and its convergence to the resultant posterior distribution, and which yields estimates of uncertainties in the relevant quantities. Using a priori estimates on the posterior and numerical approximations of the Hessian, we demonstrate that the 1852 Banda Sea earthquake and tsunami is indeed well-understood given certain explicit hypotheses. Using the same techniques, we also find that the 1820 south Sulawesi event may best be explained by a dual fault rupture, best attributed to the Kalatoa fault potentially conjoining the Flores thrust and Walanae/Selayar fault.
Submitted 20 January, 2025; v1 submitted 14 June, 2021;
originally announced June 2021.
-
On the accept-reject mechanism for Metropolis-Hastings algorithms
Authors:
Nathan E. Glatt-Holtz,
Justin A. Krometis,
Cecilia F. Mondaini
Abstract:
This work develops a powerful and versatile framework for determining acceptance ratios in Metropolis-Hastings type Markov kernels widely used in statistical sampling problems. Our approach allows us to derive new classes of kernels which unify random walk or diffusion-type sampling methods with more complicated "extended phase space" algorithms based around ideas from Hamiltonian dynamics. Our starting point is an abstract result developed in the generality of measurable state spaces that addresses proposal kernels that possess a certain involution structure. Note that, while this underlying proposal structure suggests a scope which includes Hamiltonian-type kernels, we demonstrate that our abstract result is, in an appropriate sense, equivalent to an earlier general state space setting developed in [Tierney, Annals of Applied Probability, 1998] where the connection to Hamiltonian methods was more obscure. Altogether, the theoretical unity and reach of our main result provides a basis for deriving novel sampling algorithms while laying bare important relationships between existing methods.
Submitted 19 July, 2021; v1 submitted 9 November, 2020;
originally announced November 2020.
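As a rough illustration of the involution structure mentioned in the abstract above, the sketch below writes ordinary random-walk Metropolis as an involutive step on an extended state (q, v). It is a textbook-style toy in one dimension and is not the paper's general measure-theoretic construction. The names `involutive_mh`, `log_target`, and `step` are hypothetical.

```python
import math
import random


def involutive_mh(log_target, n_steps, step=1.0, q0=0.0, seed=0):
    """Random-walk Metropolis expressed through an involution on (q, v).

    Extended target density: pi(q) * N(v; 0, step^2).
    Involution: S(q, v) = (q + v, -v), so S(S(q, v)) = (q, v) and |det DS| = 1.
    """
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_steps):
        # Refresh the auxiliary variable from its Gaussian marginal.
        v = rng.gauss(0.0, step)
        # Apply the involution to obtain the proposed extended state.
        q_new, v_new = q + v, -v
        # Log ratio of the extended density at S(q, v) versus (q, v).
        # The Jacobian factor is 1, and the Gaussian terms cancel because
        # v_new**2 == v**2; they are kept to show the general pattern.
        log_ratio = (log_target(q_new) - 0.5 * (v_new / step) ** 2) - (
            log_target(q) - 0.5 * (v / step) ** 2
        )
        if math.log(1.0 - rng.random()) < log_ratio:
            q = q_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    # Assumed toy target: standard normal, log density up to a constant.
    draws = involutive_mh(lambda q: -0.5 * q * q, n_steps=20000, seed=2)
    print("sample mean:", sum(draws) / len(draws))
```

Replacing `S` with a volume-preserving numerical integrator followed by a momentum flip gives the Hamiltonian-type kernels the abstract alludes to, under the same acceptance pattern.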
-
On Bayesian Consistency for Flows Observed Through a Passive Scalar
Authors:
Jeff Borggaard,
Nathan E. Glatt-Holtz,
Justin A. Krometis
Abstract:
We consider the statistical inverse problem of estimating a background fluid flow field $\mathbf{v}$ from the partial, noisy observations of the concentration $θ$ of a substance passively advected by the fluid, so that $θ$ is governed by the partial differential equation \[ \frac{\partial}{\partial t}θ(t,\mathbf{x}) = -\mathbf{v}(\mathbf{x}) \cdot \nabla θ(t,\mathbf{x}) + κΔθ(t,\mathbf{x}) \quad \text{ , } \quad θ(0,\mathbf{x}) = θ_0(\mathbf{x}) \] for $t \in [0,T], T>0$ and $\mathbf{x} \in \mathbb{T}=[0,1]^2$. The initial condition $θ_0$ and diffusion coefficient $κ$ are assumed to be known and the data consist of point observations of the scalar field $θ$ corrupted by additive, i.i.d. Gaussian noise. We adopt a Bayesian approach to this estimation problem and establish that the inference is consistent, i.e., that the posterior measure identifies the true background flow as the number of scalar observations grows large. Since the inverse map is ill-defined for some classes of problems even for perfect, infinite measurements of $θ$, multiple experiments (initial conditions) are required to resolve the true fluid flow. Under this assumption, suitable conditions on the observation points, and given support and tail conditions on the prior measure, we show that the posterior measure converges to a Dirac measure centered on the true flow as the number of observations goes to infinity.
Submitted 12 September, 2019; v1 submitted 13 September, 2018;
originally announced September 2018.
-
GPU-Accelerated Particle Methods for Evaluation of Sparse Observations for PDE-Constrained Inverse Problems
Authors:
Jeff Borggaard,
Nathan E. Glatt-Holtz,
Justin A. Krometis
Abstract:
We consider the inverse problem of estimating parameters of a driven diffusion (e.g., the underlying fluid flow, diffusion coefficient, or source terms) from point measurements of a passive scalar (e.g., the concentration of a pollutant). We present two particle methods that leverage the structure of the inverse problem to enable efficient computation of the forward map, one for time evolution problems and one for a Dirichlet boundary-value problem. The methods scale in a natural fashion to modern computational architectures, enabling substantial speedup for applications involving sparse observations and high-dimensional unknowns. Numerical examples of applications to Bayesian inference and numerical optimization are provided.
Submitted 30 August, 2018;
originally announced August 2018.
-
A Bayesian Approach to Estimating Background Flows from a Passive Scalar
Authors:
Jeff Borggaard,
Nathan E. Glatt-Holtz,
Justin A. Krometis
Abstract:
We consider the statistical inverse problem of estimating a background flow field (e.g., of air or water) from the partial and noisy observation of a passive scalar (e.g., the concentration of a solute), a common experimental approach to visualizing complex fluid flows. Here the unknown is a vector field that is specified by a large or infinite number of degrees of freedom. Since the inverse problem is ill-posed, i.e., there may be many or no background flows that match a given set of observations, we adopt a Bayesian approach to regularize it. In doing so, we leverage frameworks developed in recent years for infinite-dimensional Bayesian inference. The contributions in this work are threefold. First, we lay out a functional analytic and Bayesian framework for approaching this problem. Second, we define an adjoint method for efficient computation of the gradient of the log likelihood, a key ingredient in many numerical methods. Finally, we identify interesting example problems that exhibit posterior measures with simple and complex structure. We use these examples to conduct a large-scale benchmark of Markov Chain Monte Carlo methods developed in recent years for infinite-dimensional settings. Our results indicate that these methods are capable of resolving complex multimodal posteriors in high dimensions.
Submitted 10 June, 2019; v1 submitted 3 August, 2018;
originally announced August 2018.