-
Debiased Ill-Posed Regression
Authors:
AmirEmad Ghassami,
James M. Robins,
Andrea Rotnitzky
Abstract:
In various statistical settings, the goal is to estimate a function which is restricted by the statistical model only through a conditional moment restriction. Prominent examples include the nonparametric instrumental variable framework for estimating the structural function of the outcome variable, and the proximal causal inference framework for estimating the bridge functions. A common strategy in the literature is to find the minimizer of the projected mean squared error. However, this approach can be sensitive to misspecification or slow convergence rate of the estimators of the involved nuisance components. In this work, we propose a debiased estimation strategy based on the influence function of a modification of the projected error and demonstrate its finite-sample convergence rate. Our proposed estimator possesses a second-order bias with respect to the involved nuisance functions and a desirable robustness property with respect to the misspecification of one of the nuisance functions. The proposed estimator involves a hyper-parameter, for which the optimal value depends on potentially unknown features of the underlying data-generating process. Hence, we further propose a hyper-parameter selection approach based on cross-validation and derive an error bound for the resulting estimator. This analysis highlights the potential rate loss due to hyper-parameter selection and underscores the importance and advantages of incorporating debiasing in this setting. We also study the application of our approach to the estimation of regular parameters in a specific parameter class, which are linear functionals of the solutions to the conditional moment restrictions, and provide sufficient conditions for achieving root-n consistency using our debiased estimator.
Submitted 27 May, 2025;
originally announced May 2025.
-
On the asymptotic validity of confidence sets for linear functionals of solutions to integral equations
Authors:
Ezequiel Smucler,
James M. Robins,
Andrea Rotnitzky
Abstract:
This paper examines the construction of confidence sets for parameters defined as linear functionals of a function of W and X whose conditional mean given Z and X equals the conditional mean of another variable Y given Z and X. Many estimands of interest in causal inference can be expressed in this form, including the average treatment effect in proximal causal inference and treatment effect contrasts in instrumental variable models. We derive a necessary condition for a confidence set to be uniformly valid over a model that allows for the dependence between W and Z given X to be arbitrarily weak. Specifically, we show that for any such confidence set, there must exist some laws in the model under which, with high probability, the confidence set has a diameter greater than or equal to the diameter of the parameter's range. In particular, consistent with the weak instruments literature, Wald confidence intervals are not uniformly valid over the aforementioned model. Furthermore, we argue that inverting the score test, a successful approach in that literature, generally fails for the broader class of parameters considered here. We present a method for constructing uniformly valid confidence sets in the special case where all variables, but possibly Y, are binary and discuss its limitations. Finally, we emphasize that developing uniformly valid confidence sets for the class of parameters considered in this paper remains an open problem.
Submitted 1 June, 2025; v1 submitted 23 February, 2025;
originally announced February 2025.
-
Towards a Unified Theory for Semiparametric Data Fusion with Individual-Level Data
Authors:
Ellen Graham,
Marco Carone,
Andrea Rotnitzky
Abstract:
We address the goal of conducting inference about a smooth finite-dimensional parameter by utilizing individual-level data from various independent sources. Recent advancements have led to the development of a comprehensive theory capable of handling scenarios where different data sources align with, possibly distinct subsets of, conditional distributions of a single factorization of the joint target distribution. While this theory proves effective in many significant contexts, it falls short in certain common data fusion problems, such as two-sample instrumental variable analysis, settings that integrate data from epidemiological studies with diverse designs (e.g., prospective cohorts and retrospective case-control studies), and studies with variables prone to measurement error that are supplemented by validation studies. In this paper, we extend the aforementioned comprehensive theory to allow for the fusion of individual-level data from sources aligned with conditional distributions that do not correspond to a single factorization of the target distribution. Assuming conditional and marginal distribution alignments, we provide universal results that characterize the class of all influence functions of regular asymptotically linear estimators and the efficient influence function of any pathwise differentiable parameter, irrespective of the number of data sources, the specific parameter of interest, or the statistical model for the target distribution. This theory paves the way for machine-learning debiased, semiparametric efficient estimation.
Submitted 24 February, 2025; v1 submitted 16 September, 2024;
originally announced September 2024.
-
Investigating symptom duration using current status data: a case study of post-acute COVID-19 syndrome
Authors:
Charles J. Wolock,
Susan Jacob,
Julia C. Bennett,
Anna Elias-Warren,
Jessica O'Hanlon,
Avi Kenny,
Nicholas P. Jewell,
Andrea Rotnitzky,
Stephen R. Cole,
Ana A. Weil,
Helen Y. Chu,
Marco Carone
Abstract:
For infectious diseases, characterizing symptom duration is of clinical and public health importance. Symptom duration may be assessed by surveying infected individuals and querying symptom status at the time of survey response. For example, in a SARS-CoV-2 testing program at the University of Washington, participants were surveyed at least $28$ days after testing positive and asked to report current symptom status. This study design yielded current status data: outcome measurements for each respondent consisted only of the time of survey response and a binary indicator of whether symptoms had resolved by that time. Such study design benefits from limited risk of recall bias, but analyzing the resulting data necessitates tailored statistical tools. Here, we review methods for current status data and describe a novel application of modern nonparametric techniques to this setting. The proposed approach is valid under weaker assumptions compared to existing methods, allows use of flexible machine learning tools, and handles potential survey nonresponse. From the university study, under an assumption that the survey response time is conditionally independent of symptom resolution time within strata of measured covariates, we estimate that 19% of participants experienced ongoing symptoms 30 days after testing positive, decreasing to 7% at 90 days. We assess the sensitivity of these results to deviations from conditional independence, finding the estimates to be more sensitive to assumption violations at 30 days compared to 90 days. Female sex, fatigue during acute infection, and higher viral load were associated with slower symptom resolution.
Submitted 17 March, 2025; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Variable elimination, graph reduction and efficient g-formula
Authors:
F. Richard Guo,
Emilija Perković,
Andrea Rotnitzky
Abstract:
We study efficient estimation of an interventional mean associated with a point exposure treatment under a causal graphical model represented by a directed acyclic graph without hidden variables. Under such a model, it may happen that a subset of the variables are uninformative in that failure to measure them neither precludes identification of the interventional mean nor changes the semiparametric variance bound for regular estimators of it. We develop a set of graphical criteria that are sound and complete for eliminating all the uninformative variables so that the cost of measuring them can be saved without sacrificing estimation efficiency, which could be useful when designing a planned observational or randomized study. Further, we construct a reduced directed acyclic graph on the set of informative variables only. We show that the interventional mean is identified from the marginal law by the g-formula (Robins, 1986) associated with the reduced graph, and the semiparametric variance bounds for estimating the interventional mean under the original and the reduced graphical model agree. This g-formula is an irreducible, efficient identifying formula in the sense that the nonparametric estimator of the formula, under regularity conditions, is asymptotically efficient under the original causal graphical model, and no formula with such property exists that only depends on a strict subset of the variables.
Submitted 2 December, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
A note on efficient minimum cost adjustment sets in causal graphical models
Authors:
Ezequiel Smucler,
Andrea Rotnitzky
Abstract:
We study the selection of adjustment sets for estimating the interventional mean under an individualized treatment rule. We assume a non-parametric causal graphical model with, possibly, hidden variables and at least one adjustment set comprised of observable variables. Moreover, we assume that observable variables have positive costs associated with them. We define the cost of an observable adjustment set as the sum of the costs of the variables that comprise it. We show that in this setting there exist adjustment sets that are minimum cost optimal, in the sense that they yield non-parametric estimators of the interventional mean with the smallest asymptotic variance among those that control for observable adjustment sets that have minimum cost. Our results are based on the construction of a special flow network associated with the original causal graph. We show that a minimum cost optimal adjustment set can be found by computing a maximum flow on the network, and then finding the set of vertices that are reachable from the source by augmenting paths. The optimaladj Python package implements the algorithms introduced in this paper.
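The max-flow step mentioned in this abstract can be illustrated with a generic minimum-cost vertex cut computation. The sketch below is not the paper's flow network and not the optimaladj API. It only shows the general technique on an undirected graph, assuming each vertex is split into an "in" and an "out" copy joined by an arc whose capacity is the vertex cost. The function name `min_cost_vertex_cut`, the toy graph, and the costs are all hypothetical.

```python
from collections import defaultdict, deque

INF = float("inf")


def min_cost_vertex_cut(edges, cost, source, sink):
    """Minimum-cost set of vertices separating source from sink.

    edges: iterable of undirected pairs (u, v)
    cost:  dict mapping every vertex other than source/sink to a positive cost
    Assumes source and sink are not adjacent (otherwise no vertex cut exists).
    """
    cap = defaultdict(lambda: defaultdict(float))  # residual capacities

    def add_arc(u, v, c):
        cap[u][v] += c
        cap[v][u] += 0.0  # make sure the reverse residual arc exists

    vertices = {u for edge in edges for u in edge}
    for v in vertices:
        # Cutting a vertex means saturating its in -> out arc.
        c = INF if v in (source, sink) else cost[v]
        add_arc((v, "in"), (v, "out"), c)
    for u, v in edges:
        # Graph edges themselves can never be cut.
        add_arc((u, "out"), (v, "in"), INF)
        add_arc((v, "out"), (u, "in"), INF)

    s, t = (source, "out"), (sink, "in")

    def bfs_parents():
        """Breadth-first search over arcs with positive residual capacity."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w, c in cap[u].items():
                if c > 0 and w not in parent:
                    parent[w] = u
                    queue.append(w)
        return parent

    flow = 0.0
    while True:
        parent = bfs_parents()
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        bottleneck, w = INF, t
        while parent[w] is not None:
            bottleneck = min(bottleneck, cap[parent[w]][w])
            w = parent[w]
        if bottleneck == INF:
            raise ValueError("source and sink are adjacent; no vertex cut exists")
        w = t
        while parent[w] is not None:
            u = parent[w]
            cap[u][w] -= bottleneck
            cap[w][u] += bottleneck
            w = u
        flow += bottleneck

    # Vertices whose "in" copy is reachable in the final residual graph
    # but whose "out" copy is not form a minimum-cost cut.
    reachable = set(parent)
    cut = {v for v in vertices
           if (v, "in") in reachable and (v, "out") not in reachable}
    return flow, cut


# Toy example: two routes from A to Y, via X1 and via X2 then X3.
example_edges = [("A", "X1"), ("X1", "Y"), ("A", "X2"), ("X2", "X3"), ("X3", "Y")]
example_cost = {"X1": 5, "X2": 3, "X3": 1}
example_flow, example_cut = min_cost_vertex_cut(example_edges, example_cost, "A", "Y")
```

In the toy graph, both routes must be blocked, so the cheapest cut takes X1 on the first route and the cheaper of X2 and X3 on the second.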
Submitted 6 January, 2022;
originally announced January 2022.
-
Double-robust and efficient methods for estimating the causal effects of a binary treatment
Authors:
James Robins,
Mariela Sued,
Quanhong Lei-Gomez,
Andrea Rotnitzky
Abstract:
We consider the problem of estimating the effects of a binary treatment on a continuous outcome of interest from observational data in the absence of confounding by unmeasured factors. We provide a new estimator of the population average treatment effect (ATE) based on the difference of novel double-robust (DR) estimators of the treatment-specific outcome means. We compare our new estimator with previous estimators both theoretically and via simulation. DR-difference estimators may have poor finite sample behavior when the estimated propensity scores in the treated and untreated do not overlap. We therefore propose an alternative approach, which can be used even in this unfavorable setting, based on locally efficient double-robust estimation of a semiparametric regression model for the modification on an additive scale of the magnitude of the treatment effect by the baseline covariates $X$. In contrast with existing methods, our approach simultaneously provides estimates of: i) the average treatment effect in the total study population, ii) the average treatment effect in the random subset of the population with overlapping estimated propensity scores, and iii) the treatment effect at each level of the baseline covariates $X$.
When the covariate vector $X$ is high dimensional, one cannot be certain, owing to lack of power, that given models for the propensity score and for the regression of the outcome on treatment and $X$ used in constructing our DR estimators are nearly correct, even if they pass standard goodness of fit tests. Therefore to select among candidate models, we propose a novel approach to model selection that leverages the DR-nature of our treatment effect estimator and that outperforms cross-validation in a small simulation study.
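For readers unfamiliar with the double-robust idea, the following is a minimal sketch of the standard augmented inverse probability weighted (AIPW) estimator of the ATE as a difference of treatment-specific means. It is not the novel estimator proposed in this paper. The function names, the working models passed in as callables, and the simulated data-generating process are all assumptions made for illustration.

```python
import random


def aipw_mean(data, outcome_model, propensity_model, arm):
    """AIPW estimate of the treatment-specific mean E[Y(arm)].

    data:             list of (x, a, y) tuples with binary treatment a
    outcome_model:    callable (x, arm) -> working value of E[Y | A=arm, X=x]
    propensity_model: callable x -> working value of P(A=1 | X=x)
    """
    total = 0.0
    for x, a, y in data:
        m = outcome_model(x, arm)
        p1 = propensity_model(x)
        p_arm = p1 if arm == 1 else 1.0 - p1
        indicator = 1.0 if a == arm else 0.0
        # Outcome-model prediction plus an inverse-weighted residual correction.
        total += m + indicator / p_arm * (y - m)
    return total / len(data)


def aipw_ate(data, outcome_model, propensity_model):
    """Difference of the two AIPW treatment-specific means."""
    return (aipw_mean(data, outcome_model, propensity_model, 1)
            - aipw_mean(data, outcome_model, propensity_model, 0))


def simulate(n, seed=0):
    """Hypothetical data: true ATE is 2, treatment probability depends on x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        a = 1 if rng.random() < 0.2 + 0.6 * x else 0
        y = 1.0 + 2.0 * a + x + rng.gauss(0.0, 1.0)
        data.append((x, a, y))
    return data


def true_outcome(x, arm):
    return 1.0 + 2.0 * arm + x


def true_propensity(x):
    return 0.2 + 0.6 * x


def wrong_outcome(x, arm):
    return 0.0  # deliberately misspecified


def wrong_propensity(x):
    return 0.5  # deliberately misspecified


sample = simulate(20000)
ate_outcome_right = aipw_ate(sample, true_outcome, wrong_propensity)
ate_propensity_right = aipw_ate(sample, wrong_outcome, true_propensity)
```

In this simulation, either working model being correct is enough for the estimate to land near the true value of 2, which is the double-robustness property the abstract refers to.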
Submitted 2 August, 2020;
originally announced August 2020.
-
Efficient adjustment sets in causal graphical models with hidden variables
Authors:
Ezequiel Smucler,
Facundo Sapienza,
Andrea Rotnitzky
Abstract:
We study the selection of covariate adjustment sets for estimating the value of point exposure dynamic policies, also known as dynamic treatment regimes, assuming a non-parametric causal graphical model with hidden variables, in which at least one adjustment set is fully observable. We show that recently developed criteria, for graphs without hidden variables, to compare the asymptotic variance of non-parametric estimators of static policy values that control for certain adjustment sets, are also valid under dynamic policies and graphs with hidden variables. We show that there exist adjustment sets that are optimal minimal (minimum), in the sense of yielding estimators with the smallest variance among those that control for adjustment sets that are minimal (of minimum cardinality). Moreover, we show that if either no variables are hidden or if all the observable variables are ancestors of either treatment, outcome, or the variables that are used to decide treatment, a globally optimal adjustment set exists. We provide polynomial time algorithms to compute the globally optimal (when it exists), optimal minimal, and optimal minimum adjustment sets. Our results are based on the construction of an undirected graph in which vertex cuts between the treatment and outcome variables correspond to adjustment sets. In this undirected graph, a partial order between minimal vertex cuts can be defined that makes the set of minimal cuts a lattice. This partial order corresponds directly to the ordering of the asymptotic variances of the corresponding non-parametrically adjusted estimators.
Submitted 26 May, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
-
Efficient adjustment sets for population average treatment effect estimation in non-parametric causal graphical models
Authors:
Andrea Rotnitzky,
Ezequiel Smucler
Abstract:
The method of covariate adjustment is often used for estimation of population average treatment effects in observational studies. Graphical rules for determining all valid covariate adjustment sets from an assumed causal graphical model are well known. Restricting attention to causal linear models, a recent article derived two novel graphical criteria: one to compare the asymptotic variance of linear regression treatment effect estimators that control for certain distinct adjustment sets and another to identify the optimal adjustment set that yields the least squares treatment effect estimator with the smallest asymptotic variance among consistent adjusted least squares estimators. In this paper we show that the same graphical criteria can be used in non-parametric causal graphical models when treatment effects are estimated by contrasts involving non-parametrically adjusted estimators of the interventional means. We also provide a graphical criterion for determining the optimal adjustment set among the minimal adjustment sets, which is valid for both linear and non-parametric estimators. We provide a new graphical criterion for comparing time dependent adjustment sets, that is, sets comprised of covariates that adjust for future treatments and that are themselves affected by earlier treatments. We show by example that uniformly optimal time dependent adjustment sets do not always exist. In addition, for point interventions, we provide a sound and complete graphical criterion for determining when a non-parametric optimally adjusted estimator of an interventional mean, or of a contrast of interventional means, is as efficient as an efficient estimator of the same parameter that exploits the information in the conditional independencies encoded in the non-parametric causal graphical model.
Submitted 16 December, 2019; v1 submitted 30 November, 2019;
originally announced December 2019.
-
Efficient estimation of optimal regimes under a no direct effect assumption
Authors:
Lin Liu,
Zach Shahn,
James M. Robins,
Andrea Rotnitzky
Abstract:
We derive new estimators of an optimal joint testing and treatment regime under the no direct effect (NDE) assumption that a given laboratory, diagnostic, or screening test has no effect on a patient's clinical outcomes except through the effect of the test results on the choice of treatment. We model the optimal joint strategy using an optimal regime structural nested mean model (opt-SNMM). The proposed estimators are more efficient than previous estimators of the parameters of an opt-SNMM because they efficiently leverage the 'no direct effect (NDE) of testing' assumption. Our methods will be of importance to decision scientists who either perform cost-benefit analyses or are tasked with the estimation of the 'value of information' supplied by an expensive diagnostic test (such as an MRI to screen for lung cancer).
Submitted 18 January, 2021; v1 submitted 27 August, 2019;
originally announced August 2019.
-
A unifying approach for doubly-robust $\ell_1$ regularized estimation of causal contrasts
Authors:
Ezequiel Smucler,
Andrea Rotnitzky,
James M. Robins
Abstract:
We consider inference about a scalar parameter under a non-parametric model based on a one-step estimator computed as a plug-in estimator plus the empirical mean of an estimator of the parameter's influence function. We focus on a class of parameters that have an influence function that depends on two infinite dimensional nuisance functions and such that the bias of the one-step estimator of the parameter of interest is the expectation of the product of the estimation errors of the two nuisance functions. Our class includes many important treatment effect contrasts of interest in causal inference and econometrics, such as ATE, ATT, an integrated causal contrast with a continuous treatment, and the mean of an outcome missing not at random. We propose estimators of the target parameter that entertain approximately sparse regression models for the nuisance functions allowing for the number of potential confounders to be even larger than the sample size. By employing sample splitting, cross-fitting and $\ell_1$-regularized regression estimators of the nuisance functions based on objective functions whose directional derivatives agree with those of the parameter's influence function, we obtain estimators of the target parameter with two desirable robustness properties: (1) they are rate doubly-robust in that they are root-n consistent and asymptotically normal when both nuisance functions follow approximately sparse models, even if one function has a very non-sparse regression coefficient, so long as the other has a sufficiently sparse regression coefficient, and (2) they are model doubly-robust in that they are root-n consistent and asymptotically normal even if one of the nuisance functions does not follow an approximately sparse model so long as the other nuisance function follows an approximately sparse model with a sufficiently sparse regression coefficient.
Submitted 5 June, 2019; v1 submitted 7 April, 2019;
originally announced April 2019.
-
Characterization of parameters with a mixed bias property
Authors:
Andrea Rotnitzky,
Ezequiel Smucler,
James M. Robins
Abstract:
In this article we study a class of parameters with the so-called 'mixed bias property'. For parameters with this property, the bias of the semiparametric efficient one step estimator is equal to the mean of the product of the estimation errors of two nuisance functions. In non-parametric models, parameters with the mixed bias property admit so-called rate doubly robust estimators, i.e. estimators that are consistent and asymptotically normal when one succeeds in estimating both nuisance functions at sufficiently fast rates, with the possibility of trading off slower rates of convergence for the estimator of one of the nuisance functions with faster rates for the estimator of the other nuisance function. We show that the class of parameters with the mixed bias property strictly includes two recently studied classes of parameters which, in turn, include many parameters of interest in causal inference. We characterize the form of parameters with the mixed bias property and of their influence functions. Furthermore, we derive two functional moment equations, each being solved at one of the two nuisance functions, as well as two functional loss functions, each being minimized at one of the two nuisance functions. These loss functions can be used to derive loss based penalized estimators of the nuisance functions.
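As a worked illustration of the product-form bias described above, consider the familiar example of a treatment-specific mean under no unmeasured confounding. This example and its notation ($a$, $b$, $\hat a$, $\hat b$) are assumptions introduced here and are not taken from the abstract. The nuisance estimates are treated as fixed, for instance because they were fitted on a separate sample.

```latex
% Target and nuisance functions (notation assumed for this illustration)
\psi = E\{b(X)\}, \qquad b(X) = E(Y \mid A = 1, X), \qquad a(X) = \frac{1}{P(A = 1 \mid X)}.

% One-step estimator with fixed nuisance estimates \hat a, \hat b
\hat\psi = \frac{1}{n} \sum_{i=1}^{n} \Big[ \hat b(X_i) + A_i \, \hat a(X_i) \{ Y_i - \hat b(X_i) \} \Big].

% Bias: use E[A \hat a(X) \{Y - \hat b(X)\} \mid X] = \{\hat a(X)/a(X)\} \{ b(X) - \hat b(X) \}
E(\hat\psi) - \psi
  = E\Big[ \{\hat b(X) - b(X)\} \Big\{ 1 - \frac{\hat a(X)}{a(X)} \Big\} \Big]
  = -E\Big[ \frac{1}{a(X)} \{\hat a(X) - a(X)\} \{\hat b(X) - b(X)\} \Big]
  = -E\big[ A \{\hat a(X) - a(X)\} \{\hat b(X) - b(X)\} \big].
```

The last equality uses $E(A \mid X) = 1/a(X)$. The bias is therefore the mean of a product of the two estimation errors, so it is small whenever either error is small, and a slow rate for one nuisance estimator can be offset by a fast rate for the other.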
Submitted 4 May, 2019; v1 submitted 7 April, 2019;
originally announced April 2019.
-
On the multiply robust estimation of the mean of the g-functional
Authors:
Andrea Rotnitzky,
James Robins,
Lucia Babino
Abstract:
We study multiply robust (MR) estimators of the longitudinal g-computation formula of Robins (1986). In the first part of this paper we review and extend the recently proposed parametric multiply robust estimators of Tchetgen-Tchetgen (2009) and Molina, Rotnitzky, Sued and Robins (2017). In the second part of the paper we derive multiply and doubly robust estimators that use non-parametric machine-learning (ML) estimators of nuisance functions in lieu of parametric models. We use sample splitting to avoid the need for Donsker conditions, thereby allowing an analyst to select the ML algorithms of their choosing. We contrast the asymptotic behavior of our non-parametric doubly robust and multiply robust estimators. In particular, we derive formulas for their asymptotic bias. Examining these formulas, we conclude that although, under certain data generating laws, the rate at which the bias of the MR estimator converges to zero can exceed that of the DR estimator, under most laws the biases of the DR and MR estimators converge to zero at the same rate.
Submitted 23 May, 2017;
originally announced May 2017.
-
On the analysis of tuberculosis studies with intermittent missing sputum data
Authors:
Daniel Scharfstein,
Andrea Rotnitzky,
Maria Abraham,
Aidan McDermott,
Richard Chaisson,
Lawrence Geiter
Abstract:
In randomized studies evaluating treatments for tuberculosis (TB), individuals are scheduled to be routinely evaluated for the presence of TB using sputum cultures. One important endpoint in such studies is the time of culture conversion, the first visit at which a patient's sputum culture is negative and remains negative. This article addresses how to draw inference about treatment effects when sputum cultures are intermittently missing on some patients. We discuss inference under a novel benchmark assumption and under a class of assumptions indexed by a treatment-specific sensitivity parameter that quantifies departures from the benchmark assumption. We motivate and illustrate our approach using data from a randomized trial comparing the effectiveness of two treatments for adult TB patients in Brazil.
Submitted 9 February, 2016;
originally announced February 2016.
-
Causal Etiology of the Research of James M. Robins
Authors:
Thomas S. Richardson,
Andrea Rotnitzky
Abstract:
This issue of Statistical Science draws its inspiration from the work of James M. Robins. Jon Wellner, the Editor at the time, asked the two of us to edit a special issue that would highlight the research topics studied by Robins and the breadth and depth of Robins' contributions. Between the two of us, we have collaborated closely with Jamie for nearly 40 years. We agreed to edit this issue because we recognized that we were among the few in a position to relate the trajectory of his research career to date.
Submitted 10 March, 2015;
originally announced March 2015.
-
Comment: Performance of Double-Robust Estimators When "Inverse Probability" Weights Are Highly Variable
Authors:
James Robins,
Mariela Sued,
Quanhong Lei-Gomez,
Andrea Rotnitzky
Abstract:
Comment on "Performance of Double-Robust Estimators When 'Inverse Probability' Weights Are Highly Variable" [arXiv:0804.2958]
Submitted 18 April, 2008;
originally announced April 2008.