-
Localized covariance estimation: A Bayesian perspective
Authors:
Robert J. Webber,
Matthias Morzfeld
Abstract:
A major problem in numerical weather prediction (NWP) is the estimation of high-dimensional covariance matrices from a small number of samples. Maximum likelihood estimators cannot provide reliable estimates when the overall dimension is much larger than the number of samples. Fortunately, NWP practitioners have found ingenious ways to boost the accuracy of their covariance estimators by leveraging the assumption that the correlations decay with spatial distance. In this work, Bayesian statistics is used to provide a new justification and analysis of the practical NWP covariance estimators. The Bayesian framework involves manipulating distributions over symmetric positive definite matrices, and it leads to two main findings: (i) the commonly used "hybrid estimator" for the covariance matrix has a naturally Bayesian interpretation; (ii) the very commonly used "Schur product estimator" is not Bayesian, but it can be studied and understood within the Bayesian framework. As practical implications, the Bayesian framework shows how to reduce the amount of tuning required for covariance estimation, and it suggests that efficient covariance estimation should be rooted in understanding and penalizing conditional correlations, rather than correlations.
Submitted 12 January, 2023;
originally announced January 2023.
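
The "Schur product estimator" named in the abstract above can be illustrated with a minimal sketch: the sample covariance is multiplied elementwise by a taper matrix whose entries decay with distance. The Gaussian taper, the one-dimensional grid of indices, the length scale, and all function names below are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def sample_covariance(ensemble):
    """Unbiased sample covariance of a list of equally long state vectors."""
    n_samples = len(ensemble)
    dim = len(ensemble[0])
    mean = [sum(x[i] for x in ensemble) / n_samples for i in range(dim)]
    return [
        [
            sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in ensemble)
            / (n_samples - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


def gaussian_taper(dim, length_scale):
    """Assumed taper: exp(-d^2 / (2 L^2)), with d = |i - j| on a 1-D grid."""
    return [
        [math.exp(-0.5 * ((i - j) / length_scale) ** 2) for j in range(dim)]
        for i in range(dim)
    ]


def schur_product(a, b):
    """Elementwise (Schur/Hadamard) product of two equally sized matrices."""
    return [[a_ij * b_ij for a_ij, b_ij in zip(ra, rb)] for ra, rb in zip(a, b)]


# Few samples, many variables: the regime in which localization is used.
rng = random.Random(0)
ensemble = [[rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(5)]
raw = sample_covariance(ensemble)
localized = schur_product(raw, gaussian_taper(12, length_scale=2.0))
```

The taper equals one on the diagonal, so the estimated variances are unchanged, while covariances between distant variables are damped toward zero.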
-
Localization in Ensemble Kalman inversion
Authors:
Xin T. Tong,
Matthias Morzfeld
Abstract:
Ensemble Kalman inversion (EKI) is a technique for the numerical solution of inverse problems. A great advantage of the EKI's ensemble approach is that derivatives are not required in its implementation. But theoretically speaking, EKI's ensemble size needs to surpass the dimension of the problem. This is because of EKI's "subspace property", i.e., that the EKI solution is a linear combination of the initial ensemble it starts off with. We show that the ensemble can break out of this initial subspace when "localization" is applied. In essence, localization enforces an assumed correlation structure onto the problem, and is heavily used in ensemble Kalman filtering and data assimilation. We describe and analyze how to apply localization to the EKI, and how localization helps the EKI ensemble break out of the initial subspace. Specifically, we show that the localized EKI (LEKI) ensemble will collapse to a single point (as intended) and that the LEKI ensemble mean will converge to the global optimum at a sublinear rate. Under strict assumptions on the localization procedure and observation process, we further show that the data misfit decays uniformly. We illustrate our ideas and theoretical developments with numerical examples with simplified toy problems, a Lorenz model, and an inversion of electromagnetic data, where some of our mathematical assumptions may only be approximately valid.
Submitted 31 January, 2022; v1 submitted 26 January, 2022;
originally announced January 2022.
-
MALA-within-Gibbs samplers for high-dimensional distributions with sparse conditional structure
Authors:
X. T. Tong,
M. Morzfeld,
Y. M. Marzouk
Abstract:
Markov chain Monte Carlo (MCMC) samplers are numerical methods for drawing samples from a given target probability distribution. We discuss one particular MCMC sampler, the MALA-within-Gibbs sampler, from the theoretical and practical perspectives. We first show that the acceptance ratio and step size of this sampler are independent of the overall problem dimension when (i) the target distribution has sparse conditional structure, and (ii) this structure is reflected in the partial updating strategy of MALA-within-Gibbs. If, in addition, the target density is block-wise log-concave, then the sampler's convergence rate is independent of dimension. From a practical perspective, we expect that MALA-within-Gibbs is useful for solving high-dimensional Bayesian inference problems where the posterior exhibits sparse conditional structure at least approximately. In this context, a partitioning of the state that correctly reflects the sparse conditional structure must be found, and we illustrate this process in two numerical examples. We also discuss trade-offs between the block size used for partial updating and computational requirements that may increase with the number of blocks.
Submitted 18 March, 2020; v1 submitted 25 August, 2019;
originally announced August 2019.
-
Localization for MCMC: sampling high-dimensional posterior distributions with local structure
Authors:
Matthias Morzfeld,
Xin T. Tong,
Youssef M. Marzouk
Abstract:
We investigate how ideas from covariance localization in numerical weather prediction can be used in Markov chain Monte Carlo (MCMC) sampling of high-dimensional posterior distributions arising in Bayesian inverse problems. To localize an inverse problem is to enforce an anticipated "local" structure by (i) neglecting small off-diagonal elements of the prior precision and covariance matrices; and (ii) restricting the influence of observations to their neighborhood. For linear problems we can specify the conditions under which posterior moments of the localized problem are close to those of the original problem. We explain physical interpretations of our assumptions about local structure and discuss the notion of high dimensionality in local problems, which is different from the usual notion of high dimensionality in function space MCMC. The Gibbs sampler is a natural choice of MCMC algorithm for localized inverse problems and we demonstrate that its convergence rate is independent of dimension for localized linear problems. Nonlinear problems can also be tackled efficiently by localization and, as a simple illustration of these ideas, we present a localized Metropolis-within-Gibbs sampler. Several linear and nonlinear numerical examples illustrate localization in the context of MCMC samplers for inverse problems.
Submitted 8 January, 2019; v1 submitted 20 October, 2017;
originally announced October 2017.
-
Symmetrized importance samplers for stochastic differential equations
Authors:
Andrew Leach,
Kevin K. Lin,
Matthias Morzfeld
Abstract:
We study a class of importance sampling methods for stochastic differential equations (SDEs). A small-noise analysis is performed, and the results suggest that a simple symmetrization procedure can significantly improve the performance of our importance sampling schemes when the noise is not too large. We demonstrate that this is indeed the case for a number of linear and nonlinear examples. Potential applications, e.g., data assimilation, are discussed.
Submitted 28 March, 2018; v1 submitted 10 July, 2017;
originally announced July 2017.
-
Iterative importance sampling algorithms for parameter estimation
Authors:
Matthias Morzfeld,
Marcus S. Day,
Ray W. Grout,
George Shu Heng Pau,
Stefan A. Finsterle,
John B. Bell
Abstract:
In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov Chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is a challenging task. Several sampling algorithms have been proposed over the past years that take an iterative approach to constructing a proposal distribution. We investigate the applicability of such algorithms by applying them to two realistic and challenging test problems, one in subsurface flow, and one in combustion modeling. More specifically, we implement importance sampling algorithms that iterate over the mean and covariance matrix of Gaussian or multivariate t-proposal distributions. Our implementation leverages massively parallel computers, and we present strategies to initialize the iterations using "coarse" MCMC runs or Gaussian mixture models.
Submitted 14 November, 2017; v1 submitted 5 August, 2016;
originally announced August 2016.
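
The idea of iterating over the mean and covariance of a Gaussian proposal, as described in the abstract above, can be sketched in one dimension. This is only a toy illustration under stated assumptions: the target density, the starting proposal, the sample counts, and all function names are hypothetical, and the paper's actual algorithms and test problems are not reproduced here.

```python
import math
import random


def log_target(x):
    """Hypothetical unnormalized log-posterior: Gaussian, mean 3, std 0.5."""
    return -0.5 * ((x - 3.0) / 0.5) ** 2


def log_gaussian(x, mean, std):
    """Log-density of a one-dimensional Gaussian proposal."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def iterate_proposal(mean, std, n_samples, n_iterations, rng):
    """Refit the proposal's mean and std to weighted samples, repeatedly.

    Returns the final mean and std, plus the effective sample size (ESS)
    obtained with the proposal used in the last iteration.
    """
    ess = 0.0
    for _ in range(n_iterations):
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        log_w = [log_target(x) - log_gaussian(x, mean, std) for x in samples]
        shift = max(log_w)  # subtract the maximum to avoid overflow in exp
        w = [math.exp(lw - shift) for lw in log_w]
        total = sum(w)
        w = [wi / total for wi in w]  # self-normalized weights
        ess = 1.0 / sum(wi * wi for wi in w)
        mean = sum(wi * x for wi, x in zip(w, samples))
        var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, samples))
        std = max(math.sqrt(var), 1e-6)  # guard against a degenerate proposal
    return mean, std, ess


final_mean, final_std, final_ess = iterate_proposal(
    mean=0.0, std=3.0, n_samples=2000, n_iterations=5, rng=random.Random(0)
)
```

Because the samples within each iteration are drawn independently, the sampling and weighting steps are the part that parallelizes across cores; in this sketch they run serially.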
-
What the collapse of the ensemble Kalman filter tells us about particle filters
Authors:
Matthias Morzfeld,
Daniel Hodyss,
Chris Snyder
Abstract:
The ensemble Kalman filter (EnKF) is a reliable data assimilation tool for high-dimensional meteorological problems. On the other hand, the EnKF can be interpreted as a particle filter, and particle filters collapse in high-dimensional problems. We explain that these seemingly contradictory statements offer insights about how particle filters function in certain high-dimensional problems, and in particular support recent efforts in meteorology to "localize" particle filters, i.e., to restrict the influence of an observation to its neighborhood.
Submitted 31 May, 2016; v1 submitted 11 December, 2015;
originally announced December 2015.
-
Sampling, feasibility, and priors in Bayesian estimation
Authors:
Alexandre J. Chorin,
Fei Lu,
Robert N. Miller,
Matthias Morzfeld,
Xuemin Tu
Abstract:
Importance sampling algorithms are discussed in detail, with an emphasis on implicit sampling, and applied to data assimilation via particle filters. Implicit sampling makes it possible to use the data to find high-probability samples at relatively low cost, making the assimilation more efficient. A new analysis of the feasibility of data assimilation is presented, showing in detail why feasibility depends on the Frobenius norm of the covariance matrix of the noise and not on the number of variables. A discussion of the convergence of particular particle filters follows. A major open problem in numerical data assimilation is the determination of appropriate priors; a progress report on recent work on this problem is given. The analysis highlights the need for careful attention both to the data and to the physics in data assimilation problems.
Submitted 29 May, 2015;
originally announced June 2015.
-
Analysis of the ensemble Kalman filter for marginal and joint posteriors
Authors:
Matthias Morzfeld,
Daniel Hodyss
Abstract:
The ensemble Kalman filter (EnKF) is widely used to sample a probability density function (pdf) generated by a stochastic model conditioned by noisy data. This pdf can be either a joint posterior that describes the evolution of the state of the system in time, conditioned on all the data up to the present, or a particular marginal of this posterior. We show that the EnKF collapses in the same way as, and under even broader conditions than, a particle filter when it samples the joint posterior. However, this does not imply that the EnKF collapses when it samples the marginal posterior. We show that a localized and inflated EnKF can efficiently sample this marginal, and argue that the marginal posterior is often the more useful pdf in geophysics. This explains the wide applicability of the EnKF in this field. We further investigate the typical tuning of the EnKF, in which one attempts to match the mean square error (MSE) to the marginal posterior variance, and show that sampling error may be huge, even if the MSE is moderate.
Submitted 5 August, 2016; v1 submitted 23 March, 2015;
originally announced March 2015.
-
Small-noise analysis and symmetrization of implicit Monte Carlo samplers
Authors:
Jonathan Goodman,
Kevin K. Lin,
Matthias Morzfeld
Abstract:
Implicit samplers are algorithms for producing independent, weighted samples from multivariate probability distributions. These are often applied in Bayesian data assimilation algorithms. We use Laplace asymptotic expansions to analyze two implicit samplers in the small noise regime. Our analysis suggests a symmetrization of the algorithms that leads to improved (implicit) sampling schemes at a relatively small additional cost. Computational experiments confirm the theory and show that symmetrization is effective for small noise sampling problems.
Submitted 22 October, 2014;
originally announced October 2014.
-
Limitations of polynomial chaos expansions in the Bayesian solution of inverse problems
Authors:
Fei Lu,
Matthias Morzfeld,
Xuemin Tu,
Alexandre J. Chorin
Abstract:
Polynomial chaos expansions are used to reduce the computational cost in the Bayesian solutions of inverse problems by creating a surrogate posterior that can be evaluated inexpensively. We show, by analysis and example, that when the data contain significant information beyond what is assumed in the prior, the surrogate posterior can be very different from the posterior, and the resulting estimates become inaccurate. One can improve the accuracy by adaptively increasing the order of the polynomial chaos, but the cost may increase too fast for this to be cost effective compared to Monte Carlo sampling without a surrogate posterior.
Submitted 24 October, 2014; v1 submitted 28 April, 2014;
originally announced April 2014.
-
Conditions for successful data assimilation
Authors:
Alexandre J. Chorin,
Matthias Morzfeld
Abstract:
We show, using idealized models, that numerical data assimilation can be successful only if an effective dimension of the problem is not excessive. This effective dimension depends on the noise in the model and the data, and in physically reasonable problems it can be moderate even when the number of variables is huge. We then analyze several data assimilation algorithms, including particle filters and variational methods. We show that well-designed particle filters can solve most of those data assimilation problems that can be solved in principle, and compare the conditions under which variational methods can succeed to the conditions required of particle filters. We also discuss the limitations of our analysis.
Submitted 22 October, 2014; v1 submitted 11 March, 2013;
originally announced March 2013.