-
Design-Based Estimation and Central Limit Theorems for Local Average Treatment Effects for RCTs
Authors:
Peter Z. Schochet
Abstract:
There is a growing literature on design-based methods to estimate average treatment effects for randomized controlled trials (RCTs) using the underpinnings of experiments. In this article, we build on these methods to consider design-based regression estimators for the local average treatment effect (LATE) estimand for RCTs with treatment noncompliance. We prove new finite-population central limit theorems for a range of designs, including blocked and clustered RCTs, allowing for baseline covariates to improve precision. We discuss consistent variance estimators based on model residuals and conduct simulations that show the estimators yield confidence interval coverage near nominal levels. We demonstrate the methods using data from a private school voucher RCT in New York City, USA.
Submitted 14 January, 2024;
originally announced January 2024.
-
Design-Based RCT Estimators and Central Limit Theorems for Baseline Subgroup and Related Analyses
Authors:
Peter Z. Schochet
Abstract:
There is a growing literature on design-based methods to estimate average treatment effects (ATEs) for randomized controlled trials (RCTs) for full sample analyses. This article extends these methods to estimate ATEs for discrete subgroups defined by pre-treatment variables, with an application to an RCT testing subgroup effects for a school voucher experiment in New York City. We consider ratio estimators for subgroup effects using regression methods, allowing for model covariates to improve precision, and prove a finite population central limit theorem. We discuss extensions to blocked and clustered RCT designs, and to other common estimators with random treatment-control sample sizes (or weights): post-stratification estimators, weighted estimators that adjust for data nonresponse, and estimators for Bernoulli trials. We also develop simple variance estimators that share features with robust estimators. Simulations show that the design-based subgroup estimators yield confidence interval coverage near nominal levels, even for small subgroups.
Submitted 12 October, 2023;
originally announced October 2023.
-
Estimating Complier Average Causal Effects for Clustered RCTs When the Treatment Affects the Service Population
Authors:
Peter Z. Schochet
Abstract:
RCTs sometimes test interventions that aim to improve existing services targeted to a subset of individuals identified after randomization. Accordingly, the treatment could affect the composition of service recipients and the offered services. With such bias, intention-to-treat estimates using data on service recipients and nonrecipients may be difficult to interpret. This article develops causal estimands and inverse probability weighting (IPW) estimators for complier populations in these settings, using a generalized estimating equation approach that adjusts the standard errors for estimation error in the IPW weights. While our focus is on more general clustered RCTs, the methods also apply (reduce) to non-clustered RCTs. Simulations show that the estimators achieve nominal confidence interval coverage under the assumed identification conditions. An empirical application demonstrates the methods using data from a large-scale RCT testing the effects of early childhood services on children's cognitive development scores.
Submitted 17 May, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
Statistical Power for Estimating Treatment Effects Using Difference-in-Differences and Comparative Interrupted Time Series Designs with Variation in Treatment Timing
Authors:
Peter Z. Schochet
Abstract:
This article develops new closed-form variance expressions for power analyses for commonly used difference-in-differences (DID) and comparative interrupted time series (CITS) panel data estimators. The main contribution is to incorporate variation in treatment timing into the analysis. The power formulas also account for other key design features that arise in practice: autocorrelated errors, unequal measurement intervals, and clustering due to the unit of treatment assignment. We consider power formulas for both cross-sectional and longitudinal models and allow for covariates. An illustrative power analysis provides guidance on appropriate sample sizes. The key finding is that accounting for treatment timing increases required sample sizes. Further, DID estimators have considerably more power than standard CITS and ITS estimators. An available Shiny R dashboard performs the sample size calculations for the considered estimators.
Submitted 14 October, 2021; v1 submitted 12 February, 2021;
originally announced February 2021.
-
A Lasso-OLS Hybrid Approach to Covariate Selection and Average Treatment Effect Estimation for Clustered RCTs Using Design-Based Methods
Authors:
Peter Z. Schochet
Abstract:
Statistical power is often a concern for clustered RCTs due to variance inflation from design effects and the high cost of adding study clusters (such as hospitals, schools, or communities). While covariate pre-specification is the preferred approach for improving power to estimate regression-adjusted average treatment effects (ATEs), further precision gains can be achieved through covariate selection once primary outcomes have been collected. This article uses design-based methods underlying clustered RCTs to develop a Lasso-OLS hybrid procedure for the post-hoc selection of covariates and ATE estimation that avoids model overfitting and lack of transparency. In the first stage, lasso estimation is conducted using cluster-level averages, where asymptotic normality is proved using a new central limit theorem for finite population regression estimators. In the second stage, ATEs and design-based standard errors are estimated using weighted least squares with the first stage lasso covariates. This nonparametric approach applies to continuous, binary, and discrete outcomes. Simulation results indicate that Type 1 errors of the second stage ATE estimates are near nominal values and standard errors are near true ones, although somewhat conservative with small samples. The method is demonstrated using data from a large, federally funded clustered RCT testing the effects of school-based programs promoting behavioral health.
Submitted 5 May, 2020;
originally announced May 2020.
-
Design-Based Ratio Estimators and Central Limit Theorems for Clustered, Blocked RCTs
Authors:
Peter Z. Schochet,
Nicole E. Pashley,
Luke W. Miratrix,
Tim Kautz
Abstract:
This article develops design-based ratio estimators for clustered, blocked randomized controlled trials (RCTs), with an application to a federally funded, school-based RCT testing the effects of behavioral health interventions. We consider finite population weighted least squares estimators for average treatment effects (ATEs), allowing for general weighting schemes and covariates. We consider models with block-by-treatment status interactions as well as restricted models with block indicators only. We prove new finite population central limit theorems for each block specification. We also discuss simple variance estimators that share features with commonly used cluster-robust standard error estimators. Simulations show that the design-based ATE estimator yields nominal rejection rates with standard errors near true ones, even with few clusters.
Submitted 25 February, 2021; v1 submitted 4 February, 2020;
originally announced February 2020.