-
Identifying Total Causal Effects in Linear Models under Partial Homoscedasticity
Authors:
David Strieder,
Mathias Drton
Abstract:
A fundamental challenge of scientific research is inferring causal relations based on observed data. One commonly used approach involves utilizing structural causal models that postulate noisy functional relations among interacting variables. A directed graph naturally represents these models and reflects the underlying causal structure. However, classical identifiability results suggest that, without conducting additional experiments, this causal graph can only be identified up to a Markov equivalence class of indistinguishable models. Recent research has shown that focusing on linear relations with equal error variances can enable the identification of the causal structure from mere observational data. Nonetheless, practitioners are often primarily interested in the effects of specific interventions, rendering the complete identification of the causal structure unnecessary. In this work, we investigate the extent to which less restrictive assumptions of partial homoscedasticity are sufficient for identifying the causal effects of interest. Furthermore, we construct mathematically rigorous confidence regions for total causal effects under structure uncertainty and explore the performance gain of relying on stricter error assumptions in a simulation study.
Submitted 12 August, 2024;
originally announced August 2024.
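For orientation, the following is a minimal sketch of what a total causal effect is in a linear structural causal model. It assumes the convention X = BX + ε, where `B[i, j]` is the direct effect of variable j on variable i and the graph is acyclic. The function name `total_effects` and the example coefficients are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def total_effects(B):
    """Return the matrix T = (I - B)^{-1} of total causal effects.

    Assumed convention (not from the paper): X = B X + eps, so B[i, j]
    is the direct effect of variable j on variable i. For an acyclic
    graph, T[i, j] sums the products of edge coefficients over all
    directed paths from j to i.
    """
    B = np.asarray(B, dtype=float)
    return np.linalg.inv(np.eye(B.shape[0]) - B)

# Hypothetical three-variable example: 0 -> 1 (coef 2), 1 -> 2 (coef 3),
# and a direct edge 0 -> 2 (coef 1).
B_example = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [1.0, 3.0, 0.0],
])
T_example = total_effects(B_example)
# Total effect of variable 0 on variable 2: direct path plus path via 1.
effect_0_on_2 = T_example[2, 0]
```

In this example the total effect of variable 0 on variable 2 combines the direct edge with the indirect path through variable 1, which is why it can be of interest even when the full graph is not identified.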
-
Dual Likelihood for Causal Inference under Structure Uncertainty
Authors:
David Strieder,
Mathias Drton
Abstract:
Knowledge of the underlying causal relations is essential for inferring the effect of interventions in complex systems. In a widely studied approach, structural causal models postulate noisy functional relations among interacting variables, where the underlying causal structure is then naturally represented by a directed graph whose edges indicate direct causal dependencies. In the typical application, this underlying causal structure must be learned from data, and thus, the remaining structure uncertainty needs to be incorporated into causal inference in order to draw reliable conclusions. In recent work, test inversions provide an ansatz to account for this data-driven model choice and, therefore, combine structure learning with causal inference. In this article, we propose the use of dual likelihood to greatly simplify the treatment of the involved testing problem. Indeed, dual likelihood leads to a closed-form solution for constructing confidence regions for total causal effects that rigorously capture both sources of uncertainty: causal structure and numerical size of nonzero effects. The proposed confidence regions can be computed with a bottom-up procedure starting from sink nodes. To render the causal structure identifiable, we develop our ideas in the context of linear causal relations with equal error variances.
Submitted 13 February, 2024;
originally announced February 2024.
-
Confidence in Causal Inference under Structure Uncertainty in Linear Causal Models with Equal Variances
Authors:
David Strieder,
Mathias Drton
Abstract:
Inferring the effect of interventions within complex systems is a fundamental problem of statistics. A widely studied approach employs structural causal models that postulate noisy functional relations among a set of interacting variables. The underlying causal structure is then naturally represented by a directed graph whose edges indicate direct causal dependencies. In a recent line of work, additional assumptions on the causal models have been shown to render this causal graph identifiable from observational data alone. One example is the assumption of linear causal relations with equal error variances that we will take up in this work. When the graph structure is known, classical methods may be used for calculating estimates and confidence intervals for causal effects. However, in many applications, expert knowledge that provides an a priori valid causal structure is not available. Lacking alternatives, a commonly used two-step approach first learns a graph and then treats the graph as known in inference. This, however, yields confidence intervals that are overly optimistic and fail to account for the data-driven model choice. We argue that to draw reliable conclusions, it is necessary to incorporate the remaining uncertainty about the underlying causal structure in confidence statements about causal effects. To address this issue, we present a framework based on test inversion that allows us to give confidence regions for total causal effects that capture both sources of uncertainty: causal structure and numerical size of nonzero effects.
Submitted 8 September, 2023;
originally announced September 2023.
-
Discussion of "A note on universal inference" by Timmy Tse and Anthony Davison
Authors:
Mathias Drton,
Hongjian Shi,
David Strieder
Abstract:
Invited discussion for Stat of "A note on universal inference" by Timmy Tse and Anthony Davison (2022)
Submitted 30 March, 2023;
originally announced March 2023.
-
On the choice of the splitting ratio for the split likelihood ratio test
Authors:
David Strieder,
Mathias Drton
Abstract:
The recently introduced framework of universal inference provides a new approach to constructing hypothesis tests and confidence regions that are valid in finite samples and do not rely on any specific regularity assumptions on the underlying statistical model. At the core of the methodology is a split likelihood ratio statistic, which is formed under data splitting and compared to a cleverly selected universal critical value. As this critical value can be very conservative, it is interesting to mitigate the potential loss of power by careful choice of the ratio according to which data are split. Motivated by this problem, we study the split likelihood ratio test under local alternatives and introduce the resulting class of noncentral split chi-square distributions. We investigate the properties of this new class of distributions and use it to numerically examine and propose an optimal choice of the data splitting ratio for tests of composite hypotheses of different dimensions.
Submitted 24 November, 2022; v1 submitted 13 March, 2022;
originally announced March 2022.
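As a rough illustration of the split likelihood ratio statistic described in this abstract, here is a minimal sketch for the simplest case: testing H0: μ = μ0 in a N(μ, 1) model, using the universal critical value 1/α. The function name `split_lrt_rejects`, the parameter `split_ratio` (taken here as the fraction of data on which the statistic is evaluated), and the random shuffling are assumptions made for this example and are not the paper's implementation.

```python
import numpy as np

def split_lrt_rejects(x, mu0=0.0, alpha=0.05, split_ratio=0.5, seed=0):
    """Split likelihood ratio test of H0: mean == mu0 for N(mu, 1) data.

    Illustrative sketch only. `split_ratio` is the (assumed) fraction of
    observations used to evaluate the likelihood ratio; the remaining
    observations are used to estimate the mean under the alternative.
    """
    rng = np.random.default_rng(seed)
    x = rng.permutation(np.asarray(x, dtype=float))
    n0 = int(round(split_ratio * len(x)))
    if not 0 < n0 < len(x):
        raise ValueError("split_ratio must leave both parts non-empty")
    d0, d1 = x[:n0], x[n0:]

    # Estimate the mean on the second part of the data only.
    mu_hat1 = d1.mean()

    # Log split likelihood ratio on the first part: likelihood at the
    # estimate from d1 over the likelihood under the (simple) null.
    log_ratio = 0.5 * (np.sum((d0 - mu0) ** 2) - np.sum((d0 - mu_hat1) ** 2))

    # Reject when the ratio is at least the universal critical value 1/alpha.
    return bool(log_ratio >= np.log(1.0 / alpha))
```

Changing `split_ratio` in this sketch shifts observations between the estimation part and the evaluation part, which is the trade-off the abstract refers to.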
-
Confidence in Causal Discovery with Linear Causal Models
Authors:
David Strieder,
Tobias Freidling,
Stefan Haffner,
Mathias Drton
Abstract:
Structural causal models postulate noisy functional relations among a set of interacting variables. The causal structure underlying each such model is naturally represented by a directed graph whose edges indicate for each variable which other variables it causally depends upon. Under a number of different model assumptions, it has been shown that this causal graph and, thus also, causal effects are identifiable from mere observational data. For these models, practical algorithms have been devised to learn the graph. Moreover, when the graph is known, standard techniques may be used to give estimates and confidence intervals for causal effects. We argue, however, that a two-step method that first learns a graph and then treats the graph as known yields confidence intervals that are overly optimistic and can drastically fail to account for the uncertain causal structure. To address this issue we lay out a framework based on test inversion that allows us to give confidence regions for total causal effects that capture both sources of uncertainty: causal structure and numerical size of nonzero effects. Our ideas are developed in the context of bivariate linear causal models with homoscedastic errors, but as we exemplify they are generalizable to larger systems as well as other settings such as, in particular, linear non-Gaussian models.
Submitted 10 June, 2021;
originally announced June 2021.
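To give a feel for why equal error variances can make the direction identifiable in the bivariate linear setting, here is a small sketch. It relies on a standard consequence of the model: if X1 causes X2 with a nonzero coefficient and both errors have the same variance, then Var(X2) exceeds Var(X1). The function name `guess_direction` and the simulated coefficients are illustrative assumptions. This is a naive point estimate of the direction and not the confidence procedure the paper proposes.

```python
import numpy as np

def guess_direction(x1, x2):
    """Naive direction guess under the equal error variance assumption.

    Assumes a bivariate linear model with homoscedastic errors, in which
    the effect variable has the larger marginal variance. Returns
    "1->2" or "2->1". Illustrative only: it does not quantify uncertainty.
    """
    v1 = np.var(np.asarray(x1, dtype=float))
    v2 = np.var(np.asarray(x2, dtype=float))
    return "1->2" if v1 < v2 else "2->1"

# Hypothetical simulation: X1 -> X2 with coefficient 1.5 and unit error
# variance for both variables.
rng = np.random.default_rng(0)
x1_sim = rng.normal(size=5000)
x2_sim = 1.5 * x1_sim + rng.normal(size=5000)
direction_sim = guess_direction(x1_sim, x2_sim)
```

A rule like this always returns a single direction, even when the two sample variances are nearly equal, which illustrates the kind of structure uncertainty that a two-step approach ignores.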
-
Testing normality in any dimension by Fourier methods in a multivariate Stein equation
Authors:
Bruno Ebner,
Norbert Henze,
David Strieder
Abstract:
We study a novel class of affine invariant and consistent tests for multivariate normality. The tests are based on a characterization of the standard $d$-variate normal distribution by means of the unique solution of an initial value problem connected to a partial differential equation, which is motivated by a multivariate Stein equation. The test criterion is a suitably weighted $L^2$-statistic. We derive the limit distribution of the test statistic under the null hypothesis as well as under contiguous and fixed alternatives to normality. A consistent estimator of the limiting variance under fixed alternatives as well as an asymptotic confidence interval of the distance of an underlying alternative with respect to the multivariate normal law is derived. In simulation studies, we show that the tests are strong in comparison with prominent competitors, and that the empirical coverage rate of the asymptotic confidence interval converges to the nominal level. We present a real data example, and we outline topics for further research.
Submitted 6 July, 2020;
originally announced July 2020.