-
On a fundamental difference between Bayesian and frequentist approaches to robustness
Authors:
Philippe Gagnon,
Alain Desgagné
Abstract:
Heavy-tailed models are often used as a way to gain robustness against outliers in Bayesian analyses. On the other side, in frequentist analyses, M-estimators are often employed. In this paper, the two approaches are reconciled by considering M-estimators as maximum likelihood estimators of heavy-tailed models. We realize that, even from this perspective, there is a fundamental difference in that frequentists do not require these heavy-tailed models to be proper. It is shown what the difference between improper and proper heavy-tailed models can be in terms of estimation results through two real-data analyses based on linear regression. The findings of this paper make us ponder on the use of improper heavy-tailed data models in Bayesian analyses, an approach which is seen to fit within the generalized Bayesian framework of Bissiri et al. (2016) when combined with proper prior distributions yielding proper (generalized) posterior distributions.
Submitted 19 August, 2024;
originally announced August 2024.
-
Asymptotics for non-degenerate multivariate $U$-statistics with estimated nuisance parameters under the null and local alternative hypotheses
Authors:
Alain Desgagné,
Christian Genest,
Frédéric Ouimet
Abstract:
The large-sample behavior of non-degenerate multivariate $U$-statistics of arbitrary degree is investigated under the assumption that their kernel depends on parameters that can be estimated consistently. Mild regularity conditions are provided which guarantee that once properly normalized, such statistics are asymptotically multivariate Gaussian both under the null hypothesis and sequences of local alternatives. The work of Randles (1982, Ann. Statist.) is extended in three ways: the data and the kernel values can be multivariate rather than univariate, the limiting behavior under local alternatives is studied for the first time, and the effect of knowing some of the nuisance parameters is quantified. These results can be applied to a broad range of goodness-of-fit testing contexts, as shown in two specific examples.
Submitted 28 November, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
A comprehensive empirical power comparison of univariate goodness-of-fit tests for the Laplace distribution
Authors:
Alain Desgagné,
Pierre Lafaye de Micheaux,
Frédéric Ouimet
Abstract:
In this paper we present the results from an empirical power comparison of 40 goodness-of-fit tests for the univariate Laplace distribution, carried out using Monte Carlo simulations with sample sizes $n = 20, 50, 100, 200$, significance levels $\alpha = 0.01, 0.05, 0.10$, and 400 alternatives consisting of asymmetric and symmetric light/heavy-tailed distributions taken as special cases from 11 models. In addition to the unmatched scope of our study, an interesting contribution is the proposal of an innovative design for the selection of alternatives. The 400 alternatives consist of 20 specific cases of 20 submodels drawn from the main 11 models. For each submodel, the 20 specific cases correspond to parameter values chosen to cover the full power range. An analysis of the results leads to a recommendation of the best tests for five different groupings of the alternative distributions. A real-data example is also presented, where an appropriate test for the goodness-of-fit of the univariate Laplace distribution is applied to weekly log-returns of Amazon stock over a recent four-year period.
Submitted 25 July, 2022; v1 submitted 12 July, 2020;
originally announced July 2020.
-
Efficient and Robust Estimation of Linear Regression with Normal Errors
Authors:
Alain Desgagné
Abstract:
Linear regression with normally distributed errors - including particular cases such as ANOVA, Student's t-test or location-scale inference - is a widely used statistical procedure. In this case the ordinary least squares estimator possesses remarkable properties but is very sensitive to outliers. Several robust alternatives have been proposed, but there is still significant room for improvement. This paper thus proposes an original method of estimation that offers the best efficiency simultaneously in the absence and the presence of outliers, both for the estimation of the regression coefficients and the scale parameter. The approach first consists in broadening the normal assumption of the errors to a mixture of the normal and the filtered-log-Pareto (FLP), an original distribution designed to represent the outliers. The expectation-maximization (EM) algorithm is then adapted and we obtain the N-FLP estimators of the regression coefficients, the scale parameter and the proportion of outliers, along with probabilities of each observation being an outlier. The performance of the N-FLP estimators is compared with the best alternatives in an extensive Monte Carlo simulation. The paper demonstrates that this method of estimation can also be used for a complete robust inference, including confidence intervals, hypothesis testing and model selection.
Submitted 17 September, 2019;
originally announced September 2019.
-
An automatic robust Bayesian approach to principal component regression
Authors:
Philippe Gagnon,
Mylène Bédard,
Alain Desgagné
Abstract:
Principal component regression uses principal components as regressors. It is particularly useful in prediction settings with high-dimensional covariates. The existing literature on Bayesian approaches is relatively sparse. We introduce a Bayesian approach that is robust to outliers in both the dependent variable and the covariates. Outliers can be thought of as observations that are not in line with the general trend. The proposed approach automatically penalises these observations so that their impact on the posterior gradually vanishes as they move further and further away from the general trend, corresponding to a concept in Bayesian statistics called whole robustness. The predictions produced are thus consistent with the bulk of the data. The approach also exploits the geometry of principal components to efficiently identify those that are significant. Individual predictions obtained from the resulting models are consolidated according to model-averaging mechanisms to account for model uncertainty. The approach is evaluated on real data and compared to its nonrobust Bayesian counterpart, the traditional frequentist approach, and a commonly employed robust frequentist method. Detailed guidelines to automate the entire statistical procedure are provided. All required code is made available; see arXiv:1711.06341.
Submitted 24 January, 2020; v1 submitted 16 November, 2017;
originally announced November 2017.
-
A New Bayesian Approach to Robustness Against Outliers in Linear Regression
Authors:
Philippe Gagnon,
Alain Desgagné,
Mylène Bédard
Abstract:
Linear regression is ubiquitous in statistical analysis. It is well understood that conflicting sources of information may contaminate the inference when the classical normality of errors is assumed. The contamination caused by the light normal tails follows from an undesirable effect: the posterior concentrates in an area in between the different sources with a large enough scaling to incorporate them all. The theory of conflict resolution in Bayesian statistics (O'Hagan and Pericchi (2012)) recommends addressing this problem by limiting the impact of outliers to obtain conclusions consistent with the bulk of the data. In this paper, we propose a model with super heavy-tailed errors to achieve this. We prove that it is wholly robust, meaning that the impact of outliers gradually vanishes as they move further and further away from the general trend. The super heavy-tailed density is similar to the normal outside of the tails, which gives rise to an efficient estimation procedure. In addition, estimates are easily computed. This is highlighted via a detailed user guide, where all steps are explained through a simulated case study. The performance is shown using simulation. All required code is given.
Submitted 11 June, 2019; v1 submitted 19 December, 2016;
originally announced December 2016.
-
Weak Convergence and Optimal Tuning of the Reversible Jump Algorithm
Authors:
Philippe Gagnon,
Mylène Bédard,
Alain Desgagné
Abstract:
The reversible jump algorithm is a useful Markov chain Monte Carlo method introduced by Green (1995) that allows switches between subspaces of differing dimensionality, and therefore, model selection. Although this method is now increasingly used in key areas (e.g. biology and finance), it remains challenging to implement. In this paper, we focus on a simple sampling context in order to obtain theoretical results that lead to an optimal tuning procedure for the considered reversible jump algorithm, and consequently, to easy implementation. The key result is the weak convergence of the sequence of stochastic processes engendered by the algorithm. It represents the main contribution of this paper as it is, to our knowledge, the first weak convergence result for the reversible jump algorithm. Because the sampler updates the parameters according to a random walk, this result makes it possible to retrieve the well-known 0.234 rule for finding the optimal scaling. It also leads to an answer to the question: "with what probability should a parameter update be proposed relative to a model switch at each iteration?"
Submitted 17 April, 2019; v1 submitted 16 December, 2016;
originally announced December 2016.
-
Bayesian Robustness to Outliers in Linear Regression and Ratio Estimation
Authors:
Alain Desgagné,
Philippe Gagnon
Abstract:
Whole robustness is a nice property to have for statistical models. It implies that the impact of outliers gradually vanishes as they approach plus or minus infinity. So far, the Bayesian literature provides results that ensure whole robustness for the location-scale model. In this paper, we make two contributions. First, we generalise the results to attain whole robustness in simple linear regression through the origin, which is a necessary step towards results for general linear regression models. We allow the variance of the error term to depend on the explanatory variable. This flexibility leads to the second contribution: we provide a simple Bayesian approach to robustly estimate finite population means and ratios. The strategy to attain whole robustness is simple since it lies in replacing the traditional normal assumption on the error term by a super heavy-tailed distribution assumption. As a result, users can estimate the parameters as usual, using the posterior distribution.
Submitted 12 August, 2018; v1 submitted 15 December, 2016;
originally announced December 2016.