-
Robust Bayesian causal estimation for causal inference in medical diagnosis
Authors:
Tathagata Basu,
Matthias C. M. Troffaes
Abstract:
Causal effect estimation is a critical task in statistical learning that aims to find the causal effect on subjects by identifying causal links between a number of predictor (or, explanatory) variables and the outcome of a treatment. In a regressional framework, we assign a treatment and outcome model to estimate the average causal effect. Additionally, for high dimensional regression problems, variable selection methods are also used to find a subset of predictor variables that maximises the predictive performance of the underlying model for better estimation of the causal effect. In this paper, we propose a different approach. We focus on the variable selection aspects of the high dimensional causal estimation problem. We suggest a cautious Bayesian group LASSO (least absolute shrinkage and selection operator) framework for variable selection using prior sensitivity analysis. We argue that in some cases, abstaining from selecting (or, rejecting) a predictor is beneficial and we should gather more information to obtain a more decisive result. We also show that for problems with very limited information, expert elicited variable selection can give us a more stable causal effect estimation as it avoids overfitting. Lastly, we carry out a comparative study with a synthetic dataset and show the applicability of our method in real-life situations.
Submitted 19 November, 2024;
originally announced November 2024.
-
Iterative importance sampling with Markov chain Monte Carlo sampling in robust Bayesian analysis
Authors:
Ivette Raices Cruz,
Johan Lindström,
Matthias C. M. Troffaes,
Ullrika Sahlin
Abstract:
Bayesian inference under a set of priors, called robust Bayesian analysis, allows for estimation of parameters within a model and quantification of epistemic uncertainty in quantities of interest by bounded (or imprecise) probability. Iterative importance sampling can be used to estimate bounds on the quantity of interest by optimizing over the set of priors. A method for iterative importance sampling when the robust Bayesian inference relies on Markov chain Monte Carlo (MCMC) sampling is proposed. To accommodate the MCMC sampling in iterative importance sampling, a new expression for the effective sample size of the importance sampling is derived, which accounts for the correlation in the MCMC samples. To illustrate the proposed method for robust Bayesian analysis, iterative importance sampling with MCMC sampling is applied to estimate the lower bound of the overall effect in a previously published meta-analysis with a random effects model. The performance of the method compared to a grid search method and under different degrees of prior-data conflict is also explored.
Submitted 17 June, 2022;
originally announced June 2022.
-
A robust Bayesian analysis of variable selection under prior ignorance
Authors:
Tathagata Basu,
Matthias C. M. Troffaes,
Jochen Einbeck
Abstract:
We propose a cautious Bayesian variable selection routine by investigating the sensitivity of a hierarchical model, where the regression coefficients are specified by spike and slab priors. We exploit the use of latent variables to understand the importance of the covariates. These latent variables also allow us to obtain the size of the model space which is an important aspect of high dimensional problems. In our approach, instead of fixing a single prior, we adopt a specific type of robust Bayesian analysis, where we consider a set of priors within the same parametric family to specify the selection probabilities of these latent variables. We achieve that by considering a set of expected prior selection probabilities, which allows us to perform a sensitivity analysis to understand the effect of prior elicitation on the variable selection. The sensitivity analysis provides us with sets of posteriors for the regression coefficients as well as the selection indicators and we show that the posterior odds of the model selection probabilities are monotone with respect to the prior expectations of the selection probabilities. We also analyse synthetic and real life datasets to illustrate our cautious variable selection method and compare it with other well known methods.
Submitted 3 May, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
A robust Bayesian bias-adjusted random effects model for consideration of uncertainty about bias terms in evidence synthesis
Authors:
Ivette Raices Cruz,
Matthias C. M. Troffaes,
Johan Lindström,
Ullrika Sahlin
Abstract:
Meta-analysis is a statistical method used in evidence synthesis for combining, analyzing and summarizing studies that have the same target endpoint and aims to derive a pooled quantitative estimate using fixed and random effects models or network models. Differences among included studies depend on variations in target populations (i.e. heterogeneity) and variations in study quality due to study design and execution (i.e. bias). The risk of bias is usually assessed qualitatively using critical appraisal, and quantitative bias analysis can be used to evaluate the influence of bias on the quantity of interest. We propose a way to consider ignorance or ambiguity in how to quantify bias terms in a bias analysis by characterizing bias with imprecision (as bounds on probability) and use robust Bayesian analysis to estimate the overall effect. Robust Bayesian analysis is here seen as Bayesian updating performed over a set of coherent probability distributions, where the set emerges from a set of bias terms. We show how the set of bias terms can be specified based on judgments on the relative magnitude of biases (i.e., low, unclear and high risk of bias) in one or several domains of the Cochrane's risk of bias table. For illustration, we apply a robust Bayesian bias-adjusted random effects model to an already published meta-analysis on the effect of Rituximab for rheumatoid arthritis from the Cochrane Database of Systematic Reviews.
Submitted 22 April, 2022;
originally announced April 2022.
-
Improving and benchmarking of algorithms for $Γ$-maximin, $Γ$-maximax and interval dominance
Authors:
Nawapon Nakharutai,
Matthias C. M. Troffaes,
Camila C. S. Caiado
Abstract:
$Γ$-maximin, $Γ$-maximax and interval dominance are familiar decision criteria for making decisions under severe uncertainty, when probability distributions can only be partially identified. One can apply these three criteria by solving sequences of linear programs. In this study, we present new algorithms for these criteria and compare their performance to existing standard algorithms. Specifically, we use efficient ways, based on previous work, to find common initial feasible points for these algorithms. Exploiting these initial feasible points, we develop early stopping criteria to determine whether gambles are either $Γ$-maximin, $Γ$-maximax or interval dominant. We observe that the primal-dual interior point method benefits considerably from these improvements. In our simulation, we find that our proposed algorithms outperform the standard algorithms when the size of the domain of lower previsions is less than or equal to the sizes of decisions and outcomes. However, our proposed algorithms do not outperform the standard algorithms in the case that the size of the domain of lower previsions is much larger than the sizes of decisions and outcomes.
Submitted 23 March, 2021;
originally announced March 2021.
-
Robust decision analysis under severe uncertainty and ambiguous tradeoffs: an invasive species case study
Authors:
Ullrika Sahlin,
Matthias C. M. Troffaes,
Lennart Edsman
Abstract:
Bayesian decision analysis is a useful method for risk management decisions, but is limited in its ability to consider severe uncertainty in knowledge, and value ambiguity in management objectives. We study the use of robust Bayesian decision analysis to handle problems where one or both of these issues arise. The robust Bayesian approach models severe uncertainty through bounds on probability distributions, and value ambiguity through bounds on utility functions. To incorporate data, standard Bayesian updating is applied on the entire set of distributions. To elicit our expert's utility representing the value of different management objectives, we use a modified version of the swing weighting procedure that can cope with severe value ambiguity. We demonstrate these methods on an environmental management problem to eradicate an alien invasive marmorkrebs recently discovered in Sweden, which needed a rapid response despite substantial knowledge gaps if the species was still present (i.e. severe uncertainty) and the need for difficult tradeoffs and competing interests (i.e. value ambiguity). We identify that the decision alternatives to drain the system and remove individuals in combination with dredging and sieving with or without a degradable biocide, or increasing pH, are consistently bad under the entire range of probability and utility bounds. This case study shows how robust Bayesian decision analysis provides a transparent methodology for integrating information in risk management problems where little data are available and/or where the tradeoffs are ambiguous.
Submitted 8 March, 2021;
originally announced March 2021.
-
Improving and benchmarking of algorithms for decision making with lower previsions
Authors:
Nawapon Nakharutai,
Matthias C. M. Troffaes,
Camila C. S. Caiado
Abstract:
Maximality, interval dominance, and E-admissibility are three well-known criteria for decision making under severe uncertainty using lower previsions. We present a new fast algorithm for finding maximal gambles. We compare its performance to existing algorithms, one proposed by Troffaes and Hable (2014), and one by Jansen, Augustin, and Schollmeyer (2017). To do so, we develop a new method for generating random decision problems with pre-specified ratios of maximal and interval dominant gambles. Based on earlier work, we present efficient ways to find common feasible starting points in these algorithms. We then exploit these feasible starting points to develop early stopping criteria for the primal-dual interior point method, further improving efficiency. We find that the primal-dual interior point method works best. We also investigate the use of interval dominance to eliminate non-maximal gambles. This can make the problem smaller, and we observe that this benefits Jansen et al.'s algorithm, but perhaps surprisingly, not the other two algorithms. We find that our algorithm, without using interval dominance, outperforms all other algorithms in all scenarios in our benchmarking.
Submitted 28 June, 2019;
originally announced June 2019.
-
Improved linear programming methods for checking avoiding sure loss
Authors:
Nawapon Nakharutai,
Matthias C. M. Troffaes,
Camila C. S. Caiado
Abstract:
We review the simplex method and two interior-point methods (the affine scaling and the primal-dual) for solving linear programming problems for checking avoiding sure loss, and propose novel improvements. We exploit the structure of these problems to reduce their size. We also present an extra stopping criterion, and direct ways to calculate feasible starting points in almost all cases. For benchmarking, we present algorithms for generating random sets of desirable gambles that either avoid or do not avoid sure loss. We test our improvements on these linear programming methods by measuring the computational time on these generated sets. We assess the relative performance of the three methods as a function of the number of desirable gambles and the number of outcomes. Overall, the affine scaling and primal-dual methods benefit from the improvements, and they both outperform the simplex method in most scenarios. We conclude that the simplex method is not a good choice for checking avoiding sure loss. If problems are small, then there is no tangible difference in performance between all methods. For large problems, our improved primal-dual method performs at least three times faster than any of the other methods.
Submitted 9 August, 2018;
originally announced August 2018.
-
Decision making under uncertainty using imprecise probabilities
Authors:
Matthias C. M. Troffaes
Abstract:
Various ways for decision making with imprecise probabilities (admissibility, maximal expected utility, maximality, E-admissibility, $Γ$-maximax, $Γ$-maximin, all of which are well-known from the literature) are discussed and compared. We generalize a well-known sufficient condition for existence of optimal decisions. A simple numerical example shows how these criteria can work in practice, and demonstrates their differences. Finally, we suggest an efficient approach to calculate optimal decisions under these decision criteria.
Submitted 9 July, 2018;
originally announced July 2018.
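As a rough illustration of how several of the criteria named in this abstract can differ, the sketch below evaluates Γ-maximin, Γ-maximax, interval dominance and maximality on a toy problem. It assumes the credal set is the convex hull of a finite list of probability mass functions, so that lower and upper expectations reduce to a minimum and a maximum over that list. The function names, gambles and numbers are hypothetical and not taken from the paper, and E-admissibility is left out because it generally requires solving a linear program.

```python
from typing import Dict, Sequence, Set

Gamble = Sequence[float]  # payoff for each outcome
Pmf = Sequence[float]     # probability of each outcome


def expectation(p: Pmf, f: Gamble) -> float:
    return sum(pi * fi for pi, fi in zip(p, f))


def lower(extremes: Sequence[Pmf], f: Gamble) -> float:
    # Lower expectation: minimum over the extreme points of the credal set.
    return min(expectation(p, f) for p in extremes)


def upper(extremes: Sequence[Pmf], f: Gamble) -> float:
    # Upper expectation: maximum over the extreme points of the credal set.
    return max(expectation(p, f) for p in extremes)


def gamma_maximin(extremes, gambles: Dict[str, Gamble], tol=1e-12) -> Set[str]:
    best = max(lower(extremes, f) for f in gambles.values())
    return {k for k, f in gambles.items() if lower(extremes, f) >= best - tol}


def gamma_maximax(extremes, gambles: Dict[str, Gamble], tol=1e-12) -> Set[str]:
    best = max(upper(extremes, f) for f in gambles.values())
    return {k for k, f in gambles.items() if upper(extremes, f) >= best - tol}


def interval_dominant(extremes, gambles: Dict[str, Gamble], tol=1e-12) -> Set[str]:
    # Keep f unless some gamble's lower expectation exceeds f's upper expectation.
    threshold = max(lower(extremes, f) for f in gambles.values())
    return {k for k, f in gambles.items() if upper(extremes, f) >= threshold - tol}


def maximal(extremes, gambles: Dict[str, Gamble], tol=1e-12) -> Set[str]:
    # f is rejected if some g has a strictly positive lower expectation of g - f.
    def dominated(f: Gamble) -> bool:
        return any(
            lower(extremes, [gi - fi for gi, fi in zip(g, f)]) > tol
            for g in gambles.values()
        )
    return {k for k, f in gambles.items() if not dominated(f)}


# Hypothetical toy problem: two outcomes, P(first outcome) between 0.3 and 0.7.
EXTREMES = [(0.3, 0.7), (0.7, 0.3)]
GAMBLES = {"a": (4.0, 0.0), "b": (0.0, 4.0), "c": (1.5, 1.5), "d": (1.0, 1.0)}

if __name__ == "__main__":
    for name, rule in [("Gamma-maximin", gamma_maximin),
                       ("Gamma-maximax", gamma_maximax),
                       ("interval dominance", interval_dominant),
                       ("maximality", maximal)]:
        print(name, sorted(rule(EXTREMES, GAMBLES)))
```

In this toy problem Γ-maximin keeps only the constant gamble c, Γ-maximax keeps a and b, and both interval dominance and maximality keep a, b and c while discarding d.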
-
Imprecise Monte Carlo simulation and iterative importance sampling for the estimation of lower previsions
Authors:
Matthias C. M. Troffaes
Abstract:
We develop a theoretical framework for studying numerical estimation of lower previsions, generally applicable to two-level Monte Carlo methods, importance sampling methods, and a wide range of other sampling methods one might devise. We link consistency of these estimators to Glivenko-Cantelli classes, and for the sub-Gaussian case we show how the correlation structure of this process can be used to bound the bias and prove consistency. We also propose a new upper estimator, which can be used along with the standard lower estimator, in order to provide a simple confidence interval. As a case study of this framework, we then discuss how importance sampling can be exploited to provide accurate numerical estimates of lower previsions. We propose an iterative importance sampling method to drastically improve the performance of imprecise importance sampling. We demonstrate our results on the imprecise Dirichlet model.
Submitted 27 June, 2018;
originally announced June 2018.
-
A robust Bayesian approach to modelling epistemic uncertainty in common-cause failure models
Authors:
Matthias C. M. Troffaes,
Gero Walter,
Dana Kelly
Abstract:
In a standard Bayesian approach to the alpha-factor model for common-cause failure, a precise Dirichlet prior distribution models epistemic uncertainty in the alpha-factors. This Dirichlet prior is then updated with observed data to obtain a posterior distribution, which forms the basis for further inferences.
In this paper, we adapt the imprecise Dirichlet model of Walley to represent epistemic uncertainty in the alpha-factors. In this approach, epistemic uncertainty is expressed more cautiously via lower and upper expectations for each alpha-factor, along with a learning parameter which determines how quickly the model learns from observed data. For this application, we focus on elicitation of the learning parameter, and find that values in the range of 1 to 10 seem reasonable. The approach is compared with Kelly and Atwood's minimally informative Dirichlet prior for the alpha-factor model, which incorporated precise mean values for the alpha-factors, but which was otherwise quite diffuse.
Next, we explore the use of a set of Gamma priors to model epistemic uncertainty in the marginal failure rate, expressed via a lower and upper expectation for this rate, again along with a learning parameter. As zero counts are generally less of an issue here, we find that the choice of this learning parameter is less crucial.
Finally, we demonstrate how both epistemic uncertainty models can be combined to arrive at lower and upper expectations for all common-cause failure rates. Thereby, we effectively provide a full sensitivity analysis of common-cause failure rates, properly reflecting epistemic uncertainty of the analyst on all levels of the common-cause failure model.
Submitted 3 January, 2013;
originally announced January 2013.
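The role of the learning parameter described in this abstract can be illustrated with the conjugate Dirichlet update. A minimal sketch, assuming multinomial counts and a set of Dirichlet priors whose prior mean for each category ranges over an interval: under prior mean t_i and learning parameter s, the posterior mean of category i is (n_i + s t_i) / (N + s), so per-category bounds follow by plugging in the interval end points. The function name, counts and bounds below are hypothetical, and the sketch assumes each end point is attainable by some prior mean vector in the set.

```python
from typing import List, Sequence, Tuple


def idm_posterior_bounds(
    counts: Sequence[int],
    t_lower: Sequence[float],
    t_upper: Sequence[float],
    s: float,
) -> List[Tuple[float, float]]:
    """Lower and upper posterior expectations for each category.

    counts  : observed count n_i for each category
    t_lower : lower prior expectation for each category (assumed input)
    t_upper : upper prior expectation for each category (assumed input)
    s       : learning parameter; larger values mean slower learning
    """
    if s <= 0:
        raise ValueError("learning parameter s must be positive")
    n_total = sum(counts)
    return [
        ((n + s * lo) / (n_total + s), (n + s * hi) / (n_total + s))
        for n, lo, hi in zip(counts, t_lower, t_upper)
    ]


if __name__ == "__main__":
    counts = [8, 2, 0]          # hypothetical data; note the zero count
    vacuous_lo = [0.0, 0.0, 0.0]
    vacuous_hi = [1.0, 1.0, 1.0]
    for s in (1.0, 10.0):
        print(s, idm_posterior_bounds(counts, vacuous_lo, vacuous_hi, s))
```

With these hypothetical counts, the interval for the zero-count category is [0, 1/11] when s = 1 and widens to [0, 0.5] when s = 10, which shows why the choice of s matters most when zero counts occur.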
-
Finite approximations to coherent choice
Authors:
Matthias C. M. Troffaes
Abstract:
This paper studies and bounds the effects of approximating loss functions and credal sets on choice functions, under very weak assumptions. In particular, the credal set is assumed to be neither convex nor closed. The main result is that the effects of approximation can be bounded, although in general, approximation of the credal set may not always be practically possible. In case of pairwise choice, I demonstrate how the situation can be improved by showing that only approximations of the extreme points of the closure of the convex hull of the credal set need to be taken into account, as expected.
Submitted 5 March, 2012;
originally announced March 2012.
-
Robust detection of exotic infectious diseases in animal herds: A comparative study of three decision methodologies under severe uncertainty
Authors:
Matthias C. M. Troffaes,
John Paul Gosling
Abstract:
When animals are transported and pass through customs, some of them may have dangerous infectious diseases. Typically, due to the cost of testing, not all animals are tested: a reasonable selection must be made. How to test effectively whilst avoiding costly disease outbreaks? First, we extend a model proposed in the literature for the detection of invasive species to suit our purpose. Secondly, we explore and compare three decision methodologies on the problem at hand, namely, Bayesian statistics, info-gap theory and imprecise probability theory, all of which are designed to handle severe uncertainty. We show that, under rather general conditions, every info-gap solution is maximal with respect to a suitably chosen imprecise probability model, and that therefore, perhaps surprisingly, the set of maximal options can be inferred at least partly---and sometimes entirely---from an info-gap analysis.
Submitted 5 March, 2012; v1 submitted 8 December, 2011;
originally announced December 2011.
-
Normal form backward induction for decision trees with coherent lower previsions
Authors:
Nathan Huntley,
Matthias C. M. Troffaes
Abstract:
We examine normal form solutions of decision trees under typical choice functions induced by lower previsions. For large trees, finding such solutions is hard as very many strategies must be considered. In an earlier paper, we extended backward induction to arbitrary choice functions, yielding far more efficient solutions, and we identified simple necessary and sufficient conditions for this to work. In this paper, we show that backward induction works for maximality and E-admissibility, but not for interval dominance and Gamma-maximin. We also show that, in some situations, a computationally cheap approximation of a choice function can be used, even if the approximation violates the conditions for backward induction; for instance, interval dominance with backward induction will yield at least all maximal normal form solutions.
Submitted 23 March, 2012; v1 submitted 1 April, 2011;
originally announced April 2011.
-
Probability boxes on totally preordered spaces for multivariate modelling
Authors:
Matthias C. M. Troffaes,
Sebastien Destercke
Abstract:
A pair of lower and upper cumulative distribution functions, also called probability box or p-box, is among the most popular models used in imprecise probability theory. They arise naturally in expert elicitation, for instance in cases where bounds are specified on the quantiles of a random variable, or when quantiles are specified only at a finite number of points. Many practical and formal results concerning p-boxes already exist in the literature. In this paper, we provide new efficient tools to construct multivariate p-boxes and develop algorithms to draw inferences from them. For this purpose, we formalise and extend the theory of p-boxes using Walley's behavioural theory of imprecise probabilities, and heavily rely on its notion of natural extension and existing results about independence modeling. In particular, we allow p-boxes to be defined on arbitrary totally preordered spaces, hence thereby also admitting multivariate p-boxes via probability bounds over any collection of nested sets. We focus on the cases of independence (using the factorization property), and of unknown dependence (using the Fréchet bounds), and we show that our approach extends the probabilistic arithmetic of Williamson and Downs. Two design problems---a damped oscillator, and a river dike---demonstrate the practical feasibility of our results.
Submitted 29 March, 2011; v1 submitted 9 March, 2011;
originally announced March 2011.
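To make the two dependence assumptions mentioned in this abstract concrete, the sketch below bounds the joint probability of the event {X ≤ x, Y ≤ y} from the marginal p-box values at x and y. It uses only the product rule for independence and the classical Fréchet bounds for unknown dependence. The function names and numbers are hypothetical, and this is not the paper's construction for general totally preordered spaces.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower, upper) cumulative probability


def _check(bounds: Interval) -> None:
    lo, hi = bounds
    if not 0.0 <= lo <= hi <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")


def joint_cdf_independent(fx: Interval, fy: Interval) -> Interval:
    # Factorization: bounds on P(X <= x, Y <= y) are products of the marginal bounds.
    _check(fx)
    _check(fy)
    return (fx[0] * fy[0], fx[1] * fy[1])


def joint_cdf_unknown_dependence(fx: Interval, fy: Interval) -> Interval:
    # Frechet bounds: max(0, a + b - 1) <= P(A and B) <= min(a, b),
    # applied to the lower marginals (for the lower bound)
    # and to the upper marginals (for the upper bound).
    _check(fx)
    _check(fy)
    return (max(0.0, fx[0] + fy[0] - 1.0), min(fx[1], fy[1]))


if __name__ == "__main__":
    fx = (0.6, 0.8)  # hypothetical bounds on P(X <= x)
    fy = (0.5, 0.9)  # hypothetical bounds on P(Y <= y)
    print(joint_cdf_independent(fx, fy))
    print(joint_cdf_unknown_dependence(fx, fy))
```

As expected, the interval obtained under unknown dependence contains the one obtained under independence, since dropping the independence assumption can only loosen the bounds.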