-
A Double Machine Learning Approach to Combining Experimental and Observational Data
Authors:
Harsh Parikh,
Marco Morucci,
Vittorio Orlandi,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only one of these assumptions is violated, we provide semiparametrically efficient treatment effect estimators. However, our no-free-lunch theorem highlights the necessity of accurately identifying the violated assumption for consistent treatment effect estimation. Through comparative analyses, we show our framework's superiority over existing data fusion methods. The practical utility of our approach is further exemplified by three real-world case studies, underscoring its potential for widespread application in empirical research.
Submitted 2 April, 2024; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Matched Machine Learning: A Generalized Framework for Treatment Effect Inference With Learned Metrics
Authors:
Marco Morucci,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
We introduce Matched Machine Learning, a framework that combines the flexibility of machine learning black boxes with the interpretability of matching, a longstanding tool in observational causal inference. Interpretability is paramount in many high-stakes applications of causal inference. Current tools for nonparametric estimation of both average and individualized treatment effects are black boxes that do not allow for human auditing of estimates. Our framework uses machine learning to learn an optimal metric for matching units and estimating outcomes, thus achieving the performance of machine learning black boxes while remaining interpretable. Our general framework encompasses several published works as special cases. We provide asymptotic inference theory for our proposed framework, enabling users to construct approximate confidence intervals around estimates of both individualized and average treatment effects. We show empirically that instances of Matched Machine Learning perform on par with black-box machine learning methods and better than existing matching methods for similar problems. Finally, in our application we show how Matched Machine Learning can be used to perform causal inference even when covariate data are highly complex: we study an image dataset and produce high-quality matches and estimates of treatment effects.
Submitted 3 April, 2023;
originally announced April 2023.
-
Measurement That Matches Theory: Theory-Driven Identification in IRT Models
Authors:
Marco Morucci,
Margaret Foster,
Kaitlyn Webster,
So Jin Lee,
David Siegel
Abstract:
Measurement bridges theory and empirics. Without measures that appropriately capture theoretical concepts, description will fail to represent reality and true causal inference will be impossible. Yet, the social sciences traffic in complex concepts and their measurement is difficult. Item Response Theory (IRT) models reduce variation in multiple variables to continuous variation along one or more latent dimensions intended to capture key theoretical concepts. Unfortunately, those latent dimensions have no intrinsic conceptual meaning. Partial solutions to that problem include limiting the number of dimensions to one or assigning meaning post-analysis, but either can lead to potential bias and a lack of reliability across data sources. We propose, detail, and validate a semi-supervised approach employing Bayesian Item Response Theory on multiple latent dimensions and binary data. Our approach, which we validate on simulated and real data, yields conceptually meaningful latent dimensions that are reliable across different data sources without additional exogenous assumptions.
Submitted 28 May, 2024; v1 submitted 23 November, 2021;
originally announced November 2021.
-
Matching Bounds: How Choice of Matching Algorithm Impacts Treatment Effects Estimates and What to Do about It
Authors:
Marco Morucci,
Cynthia Rudin
Abstract:
Many major works in social science employ matching to draw causal conclusions, but different matches on the same data may produce different treatment effect estimates, even when they achieve similar balance or minimize the same loss function. We discuss the reasons for and consequences of this problem. We present evidence of it by replicating ten papers that use matching, and we find that different popular matching algorithms produce inconsistent results. We introduce Matching Bounds: a finite-sample, nonstochastic method that allows analysts to know whether a matched sample that produces different results with the same levels of balance and overall match quality could be obtained from their data. We apply Matching Bounds to a replication of two studies and show that in one case results are robust to this issue and in another they are not.
Submitted 22 March, 2023; v1 submitted 6 September, 2020;
originally announced September 2020.
-
Adaptive Hyper-box Matching for Interpretable Individualized Treatment Effect Estimation
Authors:
Marco Morucci,
Vittorio Orlandi,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
We propose a matching method for observational data that matches units with others in unit-specific, hyper-box-shaped regions of the covariate space. These regions are large enough that many matches are created for each unit and small enough that the treatment effect is roughly constant throughout. The regions are found as either the solution to a mixed integer program, or using a (fast) approximation algorithm. The result is an interpretable and tailored estimate of a causal effect for each unit.
Submitted 8 August, 2020; v1 submitted 3 March, 2020;
originally announced March 2020.
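The estimation step implied by the hyper-box abstract above can be illustrated with a minimal sketch. It assumes the box for a unit is already given (here it is set by hand; the paper finds it with a mixed integer program or an approximation algorithm, neither of which is reproduced here). The function names `in_box` and `box_effect` and the toy data are hypothetical and are not taken from the authors' software.

```python
# Minimal sketch: estimate a unit's treatment effect from the units that fall
# inside a given hyper-box. The box bounds are ASSUMED to be known already;
# finding them is the hard part of the method and is not shown here.

def in_box(x, lower, upper):
    """Return True if every coordinate of x lies within [lower, upper]."""
    return all(lo <= v <= hi for v, lo, hi in zip(x, lower, upper))

def box_effect(X, treated, y, lower, upper):
    """Difference in mean outcomes (treated minus control) inside the box.

    Returns None if the box lacks either treated or control units.
    """
    t = [yi for xi, ti, yi in zip(X, treated, y)
         if ti == 1 and in_box(xi, lower, upper)]
    c = [yi for xi, ti, yi in zip(X, treated, y)
         if ti == 0 and in_box(xi, lower, upper)]
    if not t or not c:
        return None
    return sum(t) / len(t) - sum(c) / len(c)

# Toy data (hypothetical): two covariates per unit.
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.9)]
treated = [1, 0, 0, 1]
y = [3.0, 1.0, 2.0, 10.0]

# A hand-picked box around the first unit.
effect = box_effect(X, treated, y, lower=(0.0, 0.0), upper=(0.3, 0.3))
```

The box excludes the fourth unit, so the estimate uses one treated and two control outcomes. A box that is too small would return `None`, which mirrors the trade-off the abstract describes between having enough matches and keeping the effect roughly constant.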
-
Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference
Authors:
M. Usaid Awan,
Marco Morucci,
Vittorio Orlandi,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
We propose a matching method that recovers direct treatment effects from randomized experiments where units are connected in an observed network, and units that share edges can potentially influence each other's outcomes. Traditional treatment effect estimators for randomized experiments are biased and error-prone in this setting. Our method matches units almost exactly on counts of unique subgraphs within their neighborhood graphs. The matches that we construct are interpretable and high-quality. Our method can be extended easily to accommodate additional unit-level covariate information. We show empirically that our method performs better than other existing methodologies for this problem, while producing meaningful, interpretable results.
Submitted 2 March, 2020;
originally announced March 2020.
-
Interpretable Almost-Matching-Exactly With Instrumental Variables
Authors:
M. Usaid Awan,
Yameng Liu,
Marco Morucci,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the effects of unmeasured confounding. Existing methods for IV estimation either require strong parametric assumptions, use arbitrary distance metrics, or do not scale well to large datasets. We propose a matching framework for IV in the presence of observed categorical confounders that addresses these weaknesses. Our method first matches units exactly, and then consecutively drops variables to approximately match the remaining units on as many variables as possible. We show that our algorithm constructs better matches than other existing methods on simulated datasets, and we produce interesting results in an application to political canvassing.
Submitted 28 July, 2019; v1 submitted 27 June, 2019;
originally announced June 2019.
-
A robust approach to quantifying uncertainty in matching problems of causal inference
Authors:
Marco Morucci,
Md. Noor-E-Alam,
Cynthia Rudin
Abstract:
Unquantified sources of uncertainty in observational causal analyses can break the integrity of the results. One would never want another analyst to repeat a calculation with the same dataset, using a seemingly identical procedure, only to find a different conclusion. However, as we show in this work, there is a typical source of uncertainty that is essentially never considered in observational causal studies: the choice of match assignment for matched groups, that is, which unit is matched to which other unit before a hypothesis test is conducted. The choice of match assignment is anything but innocuous, and can have a surprisingly large influence on the causal conclusions. Given that a vast number of causal inference studies test hypotheses on treatment effects after treatment cases are matched with similar control cases, we should find a way to quantify how much this extra source of uncertainty impacts results. What we would really like to be able to report is that no matter which match assignment is made, as long as the match is sufficiently good, then the hypothesis test result still holds. In this paper, we provide methodology based on discrete optimization to create robust tests that explicitly account for this possibility. We formulate robust tests for binary and continuous data based on common test statistics as integer linear programs solvable with common methodologies. We study the finite-sample behavior of our test statistic in the discrete-data case. We apply our methods to simulated and real-world datasets and show that they can produce useful results in practical applied settings.
Submitted 11 August, 2022; v1 submitted 5 December, 2018;
originally announced December 2018.
-
FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference
Authors:
Tianyu Wang,
Marco Morucci,
M. Usaid Awan,
Yameng Liu,
Sudeepa Roy,
Cynthia Rudin,
Alexander Volfovsky
Abstract:
A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional categorical datasets. This method, called FLAME (Fast Large-scale Almost Matching Exactly), learns a distance metric for matching using a hold-out training data set. In order to perform matching efficiently for large datasets, FLAME leverages techniques that are natural for query processing in the area of database management, and two implementations of FLAME are provided: the first uses SQL queries and the second uses bit-vector techniques. The algorithm starts by constructing matches of the highest quality (exact matches on all covariates), and successively eliminates variables in order to match exactly on as many variables as possible, while still maintaining interpretable high-quality matches and balance between treatment and control groups. We leverage these high quality matches to estimate conditional average treatment effects (CATEs). Our experiments show that FLAME scales to huge datasets with millions of observations where existing state-of-the-art methods fail, and that it achieves significantly better performance than other matching methods.
Submitted 4 February, 2021; v1 submitted 19 July, 2017;
originally announced July 2017.
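The match-then-drop loop described in the FLAME abstract above can be sketched roughly as follows. This is a simplified illustration, not the authors' SQL or bit-vector implementation: the order in which covariates are dropped is passed in as a fixed list (`drop_order`, a hypothetical parameter), whereas FLAME learns which covariates matter using a hold-out training set. The function names and toy data are also assumptions made for the example.

```python
from collections import defaultdict

def almost_exact_match(units, covariates, drop_order):
    """Match exactly on all covariates, then drop covariates one at a time.

    units: list of dicts holding covariate values plus 'treated' (0/1)
           and 'outcome'.
    drop_order: ASSUMED fixed order for dropping covariates (FLAME itself
                chooses this using hold-out data, which is not shown here).
    Returns a list of (covariates_matched_on, member_indices) tuples.
    """
    unmatched = set(range(len(units)))
    active = list(covariates)
    to_drop = list(drop_order)
    groups = []
    while unmatched and active:
        # Bucket the still-unmatched units by their values on active covariates.
        buckets = defaultdict(list)
        for i in sorted(unmatched):
            buckets[tuple(units[i][c] for c in active)].append(i)
        # A bucket is a valid matched group only if it has both arms.
        for members in buckets.values():
            arms = {units[i]["treated"] for i in members}
            if arms == {0, 1}:
                groups.append((tuple(active), members))
                unmatched -= set(members)
        if not to_drop:
            break
        active.remove(to_drop.pop(0))
    return groups

def group_cate(units, members):
    """Treated-minus-control difference in mean outcomes within one group."""
    t = [units[i]["outcome"] for i in members if units[i]["treated"] == 1]
    c = [units[i]["outcome"] for i in members if units[i]["treated"] == 0]
    return sum(t) / len(t) - sum(c) / len(c)

# Toy categorical data (hypothetical).
units = [
    {"age": "young", "sex": "F", "treated": 1, "outcome": 5.0},
    {"age": "young", "sex": "F", "treated": 0, "outcome": 3.0},
    {"age": "old", "sex": "M", "treated": 1, "outcome": 4.0},
    {"age": "old", "sex": "F", "treated": 0, "outcome": 1.0},
]
groups = almost_exact_match(units, ["age", "sex"], drop_order=["sex"])
cates = [group_cate(units, members) for _, members in groups]
```

In this toy run the first two units match exactly on both covariates, while the last two match only after `sex` is dropped. Each group records which covariates it was matched on, which is what keeps the resulting estimates auditable.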