-
What can be estimated? Identifiability, estimability, causal inference and ill-posed inverse problems
Authors:
Oliver J. Maclaren,
Ruanui Nicholson
Abstract:
We consider basic conceptual questions concerning the relationship between statistical estimation and causal inference. Firstly, we show how to translate causal inference problems into an abstract statistical formalism without requiring any structure beyond an arbitrarily-indexed family of probability models. The formalism is simple but can incorporate a variety of causal modelling frameworks, including 'structural causal models', but also models expressed in terms of, e.g., differential equations. We focus primarily on the structural/graphical causal modelling literature, however. Secondly, we consider the extent to which causal and statistical concerns can be cleanly separated, examining the fundamental question: 'What can be estimated from data?'. We call this the problem of estimability. We approach this by analysing a standard formal definition of 'can be estimated' commonly adopted in the causal inference literature -- identifiability -- in our abstract statistical formalism. We use elementary category theory to show that identifiability implies the existence of a Fisher-consistent estimator, but also show that this estimator may be discontinuous, and thus unstable, in general. This difficulty arises because the causal inference problem is, in general, an ill-posed inverse problem. Inverse problems have three conditions which must be satisfied to be considered well-posed: existence, uniqueness, and stability of solutions. Here identifiability corresponds to the question of uniqueness; in contrast, we take estimability to mean satisfaction of all three conditions, i.e. well-posedness. Lack of stability implies that naive translation of a causally identifiable quantity into an achievable statistical estimation target may prove impossible. Our article is primarily expository and aimed at unifying ideas from multiple fields, though we provide new constructions and proofs.
Submitted 20 July, 2020; v1 submitted 4 April, 2019;
originally announced April 2019.
-
Incorporating Posterior-Informed Approximation Errors into a Hierarchical Framework to Facilitate Out-of-the-Box MCMC Sampling for Geothermal Inverse Problems and Uncertainty Quantification
Authors:
Oliver J. Maclaren,
Ruanui Nicholson,
Elvar K. Bjarkason,
John P. O'Sullivan,
Michael J. O'Sullivan
Abstract:
We consider geothermal inverse problems and uncertainty quantification from a Bayesian perspective. Our main goal is to make standard, 'out-of-the-box' Markov chain Monte Carlo (MCMC) sampling more feasible for complex simulation models by using suitable approximations. To do this, we first show how to pose both the inverse and prediction problems in a hierarchical Bayesian framework. We then show how to incorporate so-called posterior-informed model approximation error into this hierarchical framework, using a modified form of the Bayesian approximation error (BAE) approach. This enables the use of a 'coarse', approximate model in place of a finer, more expensive model, while accounting for the additional uncertainty and potential bias that this can introduce. Our method requires only simple probability modelling, a relatively small number of fine model simulations, and only modifies the target posterior -- any standard MCMC sampling algorithm can be used to sample the new posterior. These corrections can also be used in methods that are not based on MCMC sampling. We show that our approach can achieve significant computational speed-ups on two geothermal test problems. We also demonstrate the dangers of naively using coarse, approximate models in place of finer models, without accounting for the induced approximation errors. The naive approach tends to give overly confident and biased posteriors, whereas incorporating BAE into our hierarchical framework corrects for this while maintaining computational efficiency and ease of use.
Submitted 19 December, 2019; v1 submitted 9 October, 2018;
originally announced October 2018.
-
Is profile likelihood a true likelihood? An argument in favor
Authors:
Oliver J. Maclaren
Abstract:
Profile likelihood is the key tool for dealing with nuisance parameters in likelihood theory. It is often asserted, however, that profile likelihood is not a 'true' likelihood. One implication is that likelihood theory lacks the generality of e.g. Bayesian inference, wherein marginalization is the universal tool for dealing with nuisance parameters. Here we argue that profile likelihood has as much claim to being a true likelihood as a marginal probability has to being a true probability distribution. The crucial point we argue is that a likelihood function is naturally interpreted as a maxitive possibility measure: given this, the associated theory of integration with respect to maxitive measures delivers profile likelihood as the direct analogue of marginal probability in additive measure theory. Thus, given a background likelihood function, we argue that profiling over the likelihood function is as natural (or as unnatural, as the case may be) as marginalizing over a background probability measure. The connections to Bayesian inference can also be further clarified with the introduction of a suitable logarithmic distance function, in which case the present theory can be naturally described as 'Tropical Bayes' in the sense of tropical algebra.
Submitted 5 July, 2018; v1 submitted 12 January, 2018;
originally announced January 2018.
-
Randomized Truncated SVD Levenberg-Marquardt Approach to Geothermal Natural State and History Matching
Authors:
Elvar K. Bjarkason,
Oliver J. Maclaren,
John P. O'Sullivan,
Michael J. O'Sullivan
Abstract:
The Levenberg-Marquardt (LM) method is commonly used for inverting models used to describe geothermal, groundwater, or oil and gas reservoirs. In previous studies, LM parameter updates have been made tractable for highly parameterized inverse problems with large data sets by applying matrix factorization methods or iterative linear solvers to approximately solve the update equations.
Some studies have shown that basing model updates on the truncated singular value decomposition (TSVD) of a dimensionless sensitivity matrix achieved using Lanczos iteration can speed up the inversion of reservoir models. Lanczos iterations only require the sensitivity matrix times a vector and its transpose times a vector, which are found efficiently using adjoint and direct simulations without the expense of forming a large sensitivity matrix.
Nevertheless, Lanczos iteration has the drawback of being a serial process, requiring a separate adjoint solve and direct solve every Lanczos iteration. Randomized methods, developed for low-rank matrix approximation of large matrices, are more efficient alternatives to the standard Lanczos method. Here we develop LM variants which use randomized methods to find a TSVD of a dimensionless sensitivity matrix when updating parameters. The randomized approach offers improved efficiency by enabling simultaneous solution of all adjoint and direct problems for a parameter update.
Submitted 3 October, 2017;
originally announced October 2017.