Search | arXiv e-print repository

Use of Real-World Data and Real-World Evidence in Rare Disease Drug Development: A Statistical Perspective

Authors: Jie Chen, Susan Gruber, Hana Lee, Haitao Chu, Shiowjen Lee, Haijun Tian, Yan Wang, Weili He, Thomas Jemielita, Yang Song, Roy Tamura, Lu Tian, Yihua Zhao, Yong Chen, Mark van der Laan, Lei Nie

Abstract: Real-world data (RWD) and real-world evidence (RWE) have been increasingly used in medical product development and regulatory decision-making, especially for rare diseases. After outlining the challenges and possible strategies to address the challenges in rare disease drug development (see the accompanying paper), the Real-World Evidence (RWE) Scientific Working Group of the American Statistical Association Biopharmaceutical Section reviews the roles of RWD and RWE in clinical trials for drugs treating rare diseases. This paper summarizes relevant guidance documents and frameworks by selected regulatory agencies and the current practice on the use of RWD and RWE in natural history studies and the design, conduct, and analysis of rare disease clinical trials. A targeted learning roadmap for rare disease trials is described, followed by case studies on the use of RWD and RWE to support a natural history study and marketing applications in various settings.

Submitted 9 October, 2024; originally announced October 2024.

arXiv:2410.06585 [pdf]

Challenges and Possible Strategies to Address Them in Rare Disease Drug Development: A Statistical Perspective

Authors: Jie Chen, Lei Nie, Shiowjen Lee, Haitao Chu, Haijun Tian, Yan Wang, Weili He, Thomas Jemielita, Susan Gruber, Yang Song, Roy Tamura, Lu Tian, Yihua Zhao, Yong Chen, Mark van der Laan, Hana Lee

Abstract: Developing drugs for rare diseases presents unique challenges from a statistical perspective. These challenges may include slowly progressive diseases with unmet medical needs, poorly understood natural history, small population size, diversified phenotypes and genotypes within a disorder, and lack of appropriate surrogate endpoints to measure clinical benefits. The Real-World Evidence (RWE) Scientific Working Group of the American Statistical Association Biopharmaceutical Section has assembled a research team to assess the landscape, including challenges and possible strategies to address these challenges and the role of real-world data (RWD) and RWE in rare disease drug development. This paper first reviews the current regulations by regulatory agencies worldwide and then discusses in more detail the challenges from a statistical perspective in the design, conduct, and analysis of rare disease clinical trials. After outlining an overall development pathway for rare disease drugs, corresponding strategies to address the aforementioned challenges are presented. Other considerations are also discussed for generating relevant evidence for regulatory decision-making on drugs for rare diseases. The accompanying paper discusses how RWD and RWE can be used to improve the efficiency of rare disease drug development.

Submitted 9 October, 2024; originally announced October 2024.

arXiv:2402.03447 [pdf, other]

Challenges in Variable Importance Ranking Under Correlation

Authors: Annie Liang, Thomas Jemielita, Andy Liaw, Vladimir Svetnik, Lingkang Huang, Richard Baumgartner, Jason M. Klusowski

Abstract: Variable importance plays a pivotal role in interpretable machine learning as it helps measure the impact of factors on the output of the prediction model. Model-agnostic methods based on the generation of "null" features via permutation (or related approaches) can be applied. Such analysis is often utilized in pharmaceutical applications due to its ability to interpret black-box models, including tree-based ensembles. A major challenge and significant confounder in variable importance estimation, however, is the presence of between-feature correlation. Recently, several adjustments to marginal permutation utilizing feature knockoffs were proposed to address this issue, such as the variable importance measure known as conditional predictive impact (CPI). Assessment and evaluation of such approaches is the focus of our work. We first present a comprehensive simulation study investigating the impact of feature correlation on the assessment of variable importance. We then theoretically prove the limitation that highly correlated features pose for the CPI through the knockoff construction. While we expect that there is always no correlation between knockoff variables and their corresponding predictor variables, we prove that the correlation increases linearly beyond a certain correlation threshold between the predictor variables. Our findings emphasize the absence of free lunch when dealing with high feature correlation, as well as the necessity of understanding the utility and limitations behind methods in variable importance estimation.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2211.16609 [pdf]

Harnessing electronic health records for real-world evidence

Authors: Jue Hou, Rachel Zhao, Jessica Gronsbell, Brett K. Beaulieu-Jones, Griffin Webber, Thomas Jemielita, Shuyan Wan, Chuan Hong, Yucong Lin, Tianrun Cai, Jun Wen, Vidul A. Panickan, Clara-Lea Bonzel, Kai-Li Liaw, Katherine P. Liao, Tianxi Cai

Abstract: While randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data (RWD) has been vital in post-approval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of RWD is electronic health records (EHRs), which contain detailed information on patient care in both structured (e.g., diagnosis codes) and unstructured (e.g., clinical notes, images) form. Despite the granularity of the data available in EHRs, critical variables required to reliably assess the relationship between a treatment and clinical outcome can be challenging to extract. We provide an integrated data curation and modeling pipeline leveraging recent advances in natural language processing, computational phenotyping, and modeling techniques with noisy data to address this fundamental challenge and accelerate the reliable use of EHRs for RWE, as well as the creation of digital twins. The proposed pipeline is highly automated for the task and includes guidance for deployment. Examples are also drawn from existing literature on EHR emulation of RCTs and accompanied by our own studies with Mass General Brigham (MGB) EHR.

Submitted 29 November, 2022; originally announced November 2022.

Comments: 39 pages, 1 figure, 1 table

arXiv:1912.03337 [pdf]

PRISM: Patient Response Identifiers for Stratified Medicine

Authors: Thomas O. Jemielita, Devan V. Mehrotra

Abstract: Pharmaceutical companies continue to seek innovative ways to explore whether a drug under development is likely to be suitable for all or only an identifiable stratum of patients in the target population. The sooner this can be done during the clinical development process, the better it is for the company, and downstream for prescribers, payers, and most importantly, for patients. To help enable this vision of stratified medicine, we describe a powerful statistical framework, Patient Response Identifiers for Stratified Medicine (PRISM), for the discovery of potential predictors of drug response and associated subgroups using machine learning tools. PRISM is highly flexible and can have many "configurations", allowing the incorporation of complementary models or tools for a variety of outcomes and settings. One promising PRISM configuration is to use the observed outcomes for subgroup identification, while using counterfactual within-patient predicted treatment differences for subgroup-specific treatment estimates and associated interpretation. This separates the "subgroup-identification" from the "decision-making" and, to facilitate clinical design planning, is a simple way to obtain unbiased treatment effect sizes in the discovered subgroups. Simulation results, along with data from a real clinical trial, are used to illustrate the utility of the proposed PRISM framework.

Submitted 6 December, 2019; originally announced December 2019.

Comments: 24 pages; 6 figures, 3 tables

Showing 1–5 of 5 results for author: Jemielita, T