-
Augmented balancing weights as linear regression
Authors: David Bruns-Smith, Oliver Dukes, Avi Feller, Elizabeth L. Ogburn
Abstract:
We provide a novel characterization of augmented balancing weights, also known as automatic debiased machine learning (AutoDML). These popular doubly robust or de-biased machine learning estimators combine outcome modeling with balancing weights - weights that achieve covariate balance directly in lieu of estimating and inverting the propensity score. When the outcome and weighting models are both linear in some (possibly infinite) basis, we show that the augmented estimator is equivalent to a single linear model with coefficients that combine the coefficients from the original outcome model and coefficients from an unpenalized ordinary least squares (OLS) fit on the same data. We see that, under certain choices of regularization parameters, the augmented estimator often collapses to the OLS estimator alone; this occurs for example in a re-analysis of the Lalonde 1986 dataset. We then extend these results to specific choices of outcome and weighting models. We first show that the augmented estimator that uses (kernel) ridge regression for both outcome and weighting models is equivalent to a single, undersmoothed (kernel) ridge regression. This holds numerically in finite samples and lays the groundwork for a novel analysis of undersmoothing and asymptotic rates of convergence. When the weighting model is instead lasso-penalized regression, we give closed-form expressions for special cases and demonstrate a "double selection" property. Our framework opens the black box on this increasingly popular class of estimators, bridges the gap between existing results on the semiparametric efficiency of undersmoothed and doubly robust estimators, and provides new insights into the performance of augmented balancing weights.
Submitted 5 June, 2024; v1 submitted 27 April, 2023;
originally announced April 2023.
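A minimal numerical sketch of the equivalence described in the abstract above, under assumptions that are not taken from the paper: the target is a linear functional m'beta for a hypothetical vector of target feature means m, the outcome model is ridge regression with penalty lam, and the balancing weights solve an L2-penalized balancing problem with penalty delta, which has the closed form used below. All variable names and the simulated data are illustrative.

```python
import numpy as np

# Simulated data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
n, d = 200, 5
Phi = rng.normal(size=(n, d))                      # basis features, source sample
Y = Phi @ rng.normal(size=d) + rng.normal(size=n)  # outcomes
m = rng.normal(size=d)                             # hypothetical target feature means
lam, delta = 3.0, 2.0                              # assumed ridge penalties

G = Phi.T @ Phi
I = np.eye(d)

# Outcome models: ridge fit and unpenalized OLS fit on the same data.
beta_ridge = np.linalg.solve(G + lam * I, Phi.T @ Y)
beta_ols = np.linalg.solve(G, Phi.T @ Y)

# L2-penalized balancing weights:
#   argmin_w ||Phi'w - m||^2 + delta * ||w||^2  =  Phi (G + delta I)^{-1} m
w = Phi @ np.linalg.solve(G + delta * I, m)

# Augmented estimator: plug-in prediction plus weighted residual correction.
aug = m @ beta_ridge + w @ (Y - Phi @ beta_ridge)

# Same number written as a single linear model: the feature means actually
# achieved by the weights multiply the OLS coefficients, and the remaining
# imbalance multiplies the ridge coefficients.
m_hat = Phi.T @ w
combined = m_hat @ beta_ols + (m - m_hat) @ beta_ridge

# Exact balance (delta = 0): the estimator reduces to the OLS plug-in.
w_exact = Phi @ np.linalg.solve(G, m)
aug_exact = m @ beta_ridge + w_exact @ (Y - Phi @ beta_ridge)

print(np.isclose(aug, combined), np.isclose(aug_exact, m @ beta_ols))
```

In this sketch the identity holds for any lam and delta; with delta = 0 the weights balance the features exactly, so the ridge coefficients drop out and only the OLS fit remains.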
-
Prospective Learning: Principled Extrapolation to the Future
Authors: Ashwin De Silva, Rahul Ramesh, Lyle Ungar, Marshall Hussain Shuler, Noah J. Cowan, Michael Platt, Chen Li, Leyla Isik, Seung-Eon Roh, Adam Charles, Archana Venkataraman, Brian Caffo, Javier J. How, Justus M Kebschull, John W. Krakauer, Maxim Bichuch, Kaleab Alemayehu Kinfu, Eva Yezerets, Dinesh Jayaraman, Jong M. Shin, Soledad Villar, Ian Phillips, Carey E. Priebe, Thomas Hartung, Michael I. Miller, et al. (18 additional authors not shown)
Abstract:
Learning is a process which can update decision rules, based on past experience, such that future performance improves. Traditionally, machine learning is often evaluated under the assumption that the future will be identical to the past in distribution or change adversarially. But these assumptions can be either too optimistic or pessimistic for many problems in the real world. Real world scenarios evolve over multiple spatiotemporal scales with partially predictable dynamics. Here we reformulate the learning problem to one that centers around this idea of dynamic futures that are partially learnable. We conjecture that certain sequences of tasks are not retrospectively learnable (in which the data distribution is fixed), but are prospectively learnable (in which distributions may be dynamic), suggesting that prospective learning is more difficult in kind than retrospective learning. We argue that prospective learning more accurately characterizes many real world problems that (1) currently stymie existing artificial intelligence solutions and/or (2) lack adequate explanations for how natural intelligences solve them. Thus, studying prospective learning will lead to deeper insights and solutions to currently vexing challenges in both natural and artificial intelligences.
Submitted 13 July, 2023; v1 submitted 18 January, 2022;
originally announced January 2022.
-
Incorporating Contact Network Structure in Cluster Randomized Trials
Authors: Patrick C. Staples, Elizabeth L. Ogburn, Jukka-Pekka Onnela
Abstract:
Whenever possible, the efficacy of a new treatment, such as a drug or behavioral intervention, is investigated by randomly assigning some individuals to a treatment condition and others to a control condition, and comparing the outcomes between the two groups. Often, when the treatment aims to slow an infectious disease, groups or clusters of individuals are assigned en masse to each treatment arm. The structure of interactions within and between clusters can reduce the power of the trial, i.e. the probability of correctly detecting a real treatment effect. We investigate the relationships among power, within-cluster structure, between-cluster mixing, and infectivity by simulating an infectious process on a collection of clusters. We demonstrate that current power calculations may be conservative for low levels of between-cluster mixing, but failing to account for moderate or high amounts can result in severely underpowered studies. Power also depends on within-cluster network structure for certain kinds of infectious spreading. Infections that spread opportunistically through very highly connected individuals have unpredictable infectious breakouts, which makes it harder to distinguish between random variation and real treatment effects. Our approach can be used before conducting a trial to assess power using network information if it is available, and we demonstrate how empirical data can inform the extent of between-cluster mixing.
Submitted 30 April, 2015;
originally announced May 2015.
-
Vaccines, Contagion, and Social Networks
Authors: Elizabeth L. Ogburn, Tyler J. VanderWeele
Abstract:
Consider the causal effect that one individual's treatment may have on another individual's outcome when the outcome is contagious, with specific application to the effect of vaccination on an infectious disease outcome. The effect of one individual's vaccination on another's outcome can be decomposed into two different causal effects, called the "infectiousness" and "contagion" effects. We present identifying assumptions and estimation or testing procedures for infectiousness and contagion effects in two different settings: (1) using data sampled from independent groups of observations, and (2) using data collected from a single interdependent social network. The methods that we propose for social network data require fitting generalized linear models (GLMs). GLMs and other statistical models that require independence across subjects have been used widely to estimate causal effects in social network data, but, because the subjects in networks are presumably not independent, the use of such models is generally invalid, resulting in inference that is expected to be anticonservative. We introduce a way to ensure that GLM residuals are uncorrelated across subjects despite the fact that outcomes are non-independent. This simultaneously demonstrates the possibility of using GLMs and related statistical models for network data and highlights their limitations.
Submitted 5 March, 2014;
originally announced March 2014.