-
Statistics at a Crossroads; Who is for the Challenge?
Authors:
Xuming He,
David Madigan,
Bin Yu,
Jon Wellner
Abstract:
This project was sponsored by the National Science Foundation and organized by a steering committee and a group of theme leaders. The six-member steering committee, consisting of James Berger, Xuming He, David Madigan, Susan Murphy, Bin Yu, and Jon Wellner, was responsible for the overall planning of the project.
This report is designed to be accessible to the wider audience of key stakeholders in statistics and data science, including academic departments, university administration, and funding agencies. After the role and the value of Statistics and Data Science are discussed in Section 1, the report focuses on the two goals related to emerging research and data-driven challenges in applications. Section 2 identifies emerging research topics from the data challenges arising from scientific and social applications, and Section 3 discusses a number of emerging areas in foundational research. How to engage with those data-driven challenges and foster interdisciplinary collaborations is also summarized in the Executive Summary. The third goal of creating a vibrant research community and maintaining an appropriate balance is addressed in Sections 4 (Professional Culture and Community Responsibilities) and 5 (Doctoral Education).
Submitted 28 March, 2025;
originally announced March 2025.
-
Combining Cox Regressions Across a Heterogeneous Distributed Research Network Facing Small and Zero Counts
Authors:
Martijn J. Schuemie,
Yong Chen,
David Madigan,
Marc A. Suchard
Abstract:
Studies of the effects of medical interventions increasingly take place in distributed research settings using data from multiple clinical data sources including electronic health records and administrative claims. In such settings, privacy concerns typically prohibit sharing of individual patient data, and instead, analyses can only utilize summary statistics from the individual databases. In the specific but very common context of the Cox proportional hazards model, we show that standard meta-analysis methods then lead to substantial bias when outcome counts are small. This bias derives primarily from the normal approximations that the methods utilize. Here we propose and evaluate methods that eschew normal approximations in favor of three more flexible approximations: a skew-normal, a one-dimensional grid, and a custom parametric function that mimics the behavior of the Cox likelihood function. In extensive simulation studies we demonstrate how these approximations impact bias in the context of both fixed-effects and (Bayesian) random-effects models. We then apply these approaches to three real-world studies of the comparative safety of antidepressants, each using data from four observational healthcare databases.
Submitted 5 January, 2021;
originally announced January 2021.
-
Stable discovery of interpretable subgroups via calibration in causal studies
Authors:
Raaz Dwivedi,
Yan Shuo Tan,
Briton Park,
Mian Wei,
Kevin Horgan,
David Madigan,
Bin Yu
Abstract:
Building on Yu and Kumbier's PCS framework and for randomized experiments, we introduce a novel methodology for Stable Discovery of Interpretable Subgroups via Calibration (StaDISC), with large heterogeneous treatment effects. StaDISC was developed during our re-analysis of the 1999-2000 VIGOR study, an 8076-patient randomized controlled trial (RCT) that compared the risk of adverse events from a then newly approved drug, Rofecoxib (Vioxx), to that from an older drug, Naproxen. Vioxx was found to, on average and in comparison to Naproxen, reduce the risk of gastrointestinal (GI) events but increase the risk of thrombotic cardiovascular (CVT) events. Applying StaDISC, we fit 18 popular conditional average treatment effect (CATE) estimators for both outcomes and use calibration to demonstrate their poor global performance. However, they are locally well-calibrated and stable, enabling the identification of patient groups with larger than (estimated) average treatment effects. In fact, StaDISC discovers three clinically interpretable subgroups each for the GI outcome (totaling 29.4% of the study size) and the CVT outcome (totaling 11.0%). Complementary analyses of the found subgroups using the 2001-2004 APPROVe study, a separate independently conducted RCT with 2587 patients, provide further supporting evidence for the promise of StaDISC.
Submitted 28 September, 2020; v1 submitted 23 August, 2020;
originally announced August 2020.
-
Bayesian Posterior Interval Calibration to Improve the Interpretability of Observational Studies
Authors:
Jami J. Mulgrave,
David Madigan,
George Hripcsak
Abstract:
Observational healthcare data offer the potential to estimate causal effects of medical products on a large scale. However, the confidence intervals and p-values produced by observational studies only account for random error and fail to account for systematic error. As a consequence, operating characteristics such as confidence interval coverage and Type I error rates often deviate sharply from their nominal values and render interpretation impossible. While there is longstanding awareness of systematic error in observational studies, analytic approaches to empirically account for systematic error are relatively new. Several authors have proposed approaches using negative controls (also known as "falsification hypotheses") and positive controls. The basic idea is to adjust confidence intervals and p-values in light of the bias (if any) detected in the analyses of the negative and positive controls. In this work, we propose a Bayesian statistical procedure for posterior interval calibration that uses negative and positive controls. We show that the posterior interval calibration procedure restores nominal characteristics, such as 95% coverage of the true effect size by the 95% posterior interval.
Submitted 1 May, 2024; v1 submitted 12 March, 2020;
originally announced March 2020.
-
A systematic approach to improving the reliability and scale of evidence from health care data
Authors:
Martijn J. Schuemie,
Patrick B. Ryan,
George Hripcsak,
David Madigan,
Marc A. Suchard
Abstract:
Concerns over reproducibility in science extend to research using existing healthcare data; many observational studies investigating the same topic produce conflicting results, even when using the same data. To address this problem, we propose a paradigm shift. The current paradigm centers on generating one estimate at a time using a unique study design with unknown reliability and publishing (or not) one estimate at a time. The new paradigm advocates for high-throughput observational studies using consistent and standardized methods, allowing evaluation, calibration, and unbiased dissemination to generate a more reliable and complete evidence base. We demonstrate this new paradigm by comparing all depression treatments for a set of outcomes, producing 17,718 hazard ratios, each using methodology on par with state-of-the-art studies. We furthermore include control hypotheses to evaluate and calibrate our evidence generation process. Results show good transitivity and consistency between databases, and agree with four out of the five findings from clinical trials. The distribution of effect size estimates reported in literature reveals an absence of small or null effects, with a sharp cutoff at p = 0.05. No such phenomena were observed in our results, suggesting more complete and more reliable evidence.
Submitted 28 March, 2018;
originally announced March 2018.
-
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model
Authors:
Benjamin Letham,
Cynthia Rudin,
Tyler H. McCormick,
David Madigan
Abstract:
We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if...then... statements (e.g., if high blood pressure, then stroke) that discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. We introduce a generative model called Bayesian Rule Lists that yields a posterior distribution over possible decision lists. It employs a novel prior structure to encourage sparsity. Our experiments show that Bayesian Rule Lists has predictive accuracy on par with the current top algorithms for prediction in machine learning. Our method is motivated by recent developments in personalized medicine, and can be used to produce highly accurate and interpretable medical scoring systems. We demonstrate this by producing an alternative to the CHADS$_2$ score, actively used in clinical practice for estimating the risk of stroke in patients that have atrial fibrillation. Our model is as interpretable as CHADS$_2$, but more accurate.
Submitted 5 November, 2015;
originally announced November 2015.
-
Massive parallelization of serial inference algorithms for a complex generalized linear model
Authors:
Marc A. Suchard,
Shawn E. Simpson,
Ivan Zorych,
Patrick Ryan,
David Madigan
Abstract:
Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety.
Submitted 4 August, 2012;
originally announced August 2012.
-
Bayesian hierarchical rule modeling for predicting medical conditions
Authors:
Tyler H. McCormick,
Cynthia Rudin,
David Madigan
Abstract:
We propose a statistical modeling technique, called the Hierarchical Association Rule Model (HARM), that predicts a patient's possible future medical conditions given the patient's current and past history of reported conditions. The core of our technique is a Bayesian hierarchical model for selecting predictive association rules (such as "condition 1 and condition 2 $\rightarrow$ condition 3") from a large set of candidate rules. Because this method "borrows strength" using the conditions of many similar patients, it is able to provide predictions specialized to any given patient, even when little information about the patient's history of conditions is available.
Submitted 28 June, 2012;
originally announced June 2012.
-
A flexible Bayesian generalized linear model for dichotomous response data with an application to text categorization
Authors:
Susana Eyheramendy,
David Madigan
Abstract:
We present a class of sparse generalized linear models that include probit and logistic regression as special cases and offer some extra flexibility. We provide an EM algorithm for learning the parameters of these models from data. We apply our method in text classification and in simulated data and show that our method outperforms the logistic and probit models and also the elastic net, in general by a substantial margin.
Submitted 7 August, 2007;
originally announced August 2007.