-
Exact statistical tests using integer programming: Leveraging an overlooked approach for maximizing power for differences between binomial proportions
Authors:
Stef Baas,
Yaron Racah,
Elad Berkman,
Sofía S. Villar
Abstract:
Traditional hypothesis testing methods for differences in binomial proportions can either be too liberal (Wald test) or overly conservative (Fisher's exact test), especially in small samples. Regulators favour conservative approaches for robust type I error control, though excessive conservatism may significantly reduce statistical power. We offer fundamental theoretical contributions that extend an approach proposed in 1969, resulting in the derivation of a family of exact tests designed to maximize a specific type of power. We establish theoretical guarantees for controlling type I error despite the discretization of the null parameter space. This theoretical advancement is supported by a comprehensive series of experiments to empirically quantify the power advantages compared to traditional hypothesis tests. The approach determines the rejection region through a binary decision for each outcome dataset and uses integer programming to find an optimal decision boundary that maximizes power subject to type I error constraints. Our analysis provides new theoretical properties and insights into this approach's comparative advantages. When optimized for average power over all possible parameter configurations under the alternative, the method exhibits remarkable robustness, performing optimally or near-optimally across specific alternatives while maintaining exact type I error control. The method can be further customized for particular prior beliefs by using a weighted average. The findings highlight both the method's practical utility and how techniques from combinatorial optimization can enhance statistical methodology.
Submitted 17 March, 2025;
originally announced March 2025.
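The rejection-region construction described in the abstract above (one binary reject/accept decision per outcome, chosen to maximize average power subject to type I error constraints) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the function names, the parameter grid, and the two-sided null p1 = p2 are assumptions made for illustration, and exhaustive enumeration stands in for a real integer programming solver. Type I error is checked only at the grid points, so this sketch does not carry the paper's guarantees between grid points.

```python
from itertools import product
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def best_rejection_region(n1, n2, alpha, grid):
    """Toy 0-1 program: pick outcomes (x1, x2) to reject on.

    Hypothetical helper for illustration. Maximizes power averaged over
    grid alternatives (p1 != p2) while keeping the rejection probability
    at or below alpha for every null grid point (p1 == p2).
    """
    outcomes = list(product(range(n1 + 1), range(n2 + 1)))

    def prob(outcome, p1, p2):
        x1, x2 = outcome
        return binom_pmf(x1, n1, p1) * binom_pmf(x2, n2, p2)

    # One constraint row per null grid point.
    null_rows = [{o: prob(o, p, p) for o in outcomes} for p in grid]

    # Objective coefficient of each outcome: its average probability
    # over all grid alternatives.
    alternatives = [(p1, p2) for p1 in grid for p2 in grid if p1 != p2]
    weight = {
        o: sum(prob(o, p1, p2) for p1, p2 in alternatives) / len(alternatives)
        for o in outcomes
    }

    # Presolve: an outcome that violates a constraint on its own can
    # never be part of a feasible region.
    candidates = [o for o in outcomes if all(row[o] <= alpha for row in null_rows)]

    # Exhaustive search over the remaining binary decisions.
    # Only viable for tiny problems; a real solver replaces this loop.
    best_region, best_power = [], 0.0
    for decisions in product((0, 1), repeat=len(candidates)):
        region = [o for o, d in zip(candidates, decisions) if d]
        if any(sum(row[o] for o in region) > alpha for row in null_rows):
            continue
        power = sum(weight[o] for o in region)
        if power > best_power:
            best_region, best_power = region, power
    return best_region, best_power


region, avg_power = best_rejection_region(3, 3, 0.05, [i / 10 for i in range(1, 10)])
print(region, avg_power)
```

Replacing the uniform average in `weight` with a weighted average would correspond to the customization for prior beliefs that the abstract mentions.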
-
Robust CATE Estimation Using Novel Ensemble Methods
Authors:
Oshri Machluf,
Tzviel Frostig,
Gal Shoham,
Tomer Milo,
Elad Berkman,
Raviv Pryluk
Abstract:
The estimation of Conditional Average Treatment Effects (CATE) is crucial for understanding the heterogeneity of treatment effects in clinical trials. We evaluate the performance of common methods, including causal forests and various meta-learners, across a diverse set of scenarios, revealing that each of the methods struggles in one or more of the tested scenarios. Given the inherent uncertainty of the data-generating process in real-life scenarios, the robustness of a CATE estimator to various scenarios is critical for its reliability. To address this limitation of existing methods, we propose two new ensemble methods that integrate multiple estimators to enhance prediction stability and performance: Stacked X-Learner, which uses the X-Learner with model stacking for estimating the nuisance functions, and Consensus Based Averaging (CBA), which averages only the models with the highest internal agreement. We show that these models achieve good performance across a wide range of scenarios varying in complexity, sample size, and structure of the underlying mechanism, including a biologically driven model of the PD-L1 inhibition pathway for cancer treatment. Furthermore, we demonstrate that the Stacked X-Learner also improves on other ensemble methods, including R-Stacking, Causal-Stacking and others.
Submitted 11 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
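The idea behind Consensus Based Averaging, as summarized in the abstract above (average only the models with the highest internal agreement), can be sketched in a few lines. The abstract does not say how agreement is measured or how many models are kept, so everything specific here is an assumption: agreement is taken to be low mean squared difference between two models' CATE predictions, the subset size `k` is fixed by the caller, and the model names in the example are placeholders.

```python
from itertools import combinations


def mean_sq_diff(a, b):
    """Mean squared difference between two prediction vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def consensus_average(predictions, k):
    """Average the k models (k >= 2) whose predictions agree most closely.

    `predictions` maps a model name to its list of per-subject CATE
    predictions. Agreement is measured here by average pairwise mean
    squared difference, which is an assumption of this sketch.
    """
    names = sorted(predictions)

    def disagreement(subset):
        pairs = list(combinations(subset, 2))
        total = sum(mean_sq_diff(predictions[a], predictions[b]) for a, b in pairs)
        return total / len(pairs)

    # Exhaustive search over all size-k subsets of models.
    best = min(combinations(names, k), key=disagreement)
    n_subjects = len(predictions[names[0]])
    averaged = [sum(predictions[m][i] for m in best) / k for i in range(n_subjects)]
    return best, averaged


# Hypothetical predictions from three CATE estimators for three subjects.
preds = {
    "t_learner": [1.0, 2.0, 3.0],
    "x_learner": [1.1, 2.1, 2.9],
    "causal_forest": [5.0, -1.0, 0.0],
}
chosen, cate = consensus_average(preds, k=2)
print(chosen, cate)
```

In this example the two meta-learners agree closely and the outlying model is dropped, which is the behaviour the consensus step is meant to provide.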
-
Causal Responder Detection
Authors:
Tzviel Frostig,
Oshri Machluf,
Amitay Kamber,
Elad Berkman,
Raviv Pryluk
Abstract:
We introduce causal responder detection (CARD), a novel method for responder analysis that identifies treated subjects who significantly respond to a treatment. Leveraging recent advances in conformal prediction, CARD employs machine learning techniques to accurately identify responders while controlling the false discovery rate in finite samples. Additionally, we incorporate a propensity score adjustment to mitigate bias arising from non-random treatment allocation, enhancing the robustness of our method in observational settings. Simulation studies demonstrate that CARD effectively detects responders with high power in diverse scenarios.
Submitted 25 June, 2024;
originally announced June 2024.
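The abstract above combines conformal prediction with false discovery rate control. One common recipe for that combination is to compute a conformal p-value for each test subject against a calibration sample and then apply the Benjamini-Hochberg procedure. The sketch below shows that generic recipe only; the abstract does not confirm that CARD uses it. The assumptions are that a higher score means stronger evidence of response, that calibration scores come from subjects known not to respond (for example, controls), and that no propensity adjustment is applied. All function names are hypothetical.

```python
def conformal_p_values(calib_scores, test_scores):
    """Conformal p-value: rank of each test score among calibration scores.

    Assumes larger scores indicate stronger evidence of response and that
    calibration scores come from non-responders.
    """
    n = len(calib_scores)
    return [
        (1 + sum(c >= t for c in calib_scores)) / (n + 1)
        for t in test_scores
    ]


def benjamini_hochberg(p_values, q):
    """Return the indices rejected by the BH step-up procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under the BH threshold.
        if p_values[i] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical scores: 19 calibration subjects, 4 treated subjects.
calib = list(range(19))
treated = [25, 20, 5, 30]
p_vals = conformal_p_values(calib, treated)
responders = benjamini_hochberg(p_vals, q=0.1)
print(p_vals, responders)
```

With 19 calibration scores the smallest attainable p-value is 1/20, so the calibration sample size limits how many discoveries are possible at a given level `q`.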