
Safe and Interpretable Estimation of Optimal Treatment Regimes

Authors: Harsh Parikh, Quinn Lanners, Zade Akras, Sahar F. Zafar, M. Brandon Westover, Cynthia Rudin, Alexander Volfovsky

Abstract: Recent statistical and reinforcement learning methods have significantly advanced patient care strategies. However, these approaches face substantial challenges in high-stakes contexts, including missing data, inherent stochasticity, and the critical requirements for interpretability and patient safety. Our work operationalizes a safe and interpretable framework to identify optimal treatment regimes. This approach involves matching patients with similar medical and pharmacological characteristics, allowing us to construct an optimal policy via interpolation. We perform a comprehensive simulation study to demonstrate the framework's ability to identify optimal policies even in complex settings. Ultimately, we operationalize our approach to study regimes for treating seizures in critically ill patients. Our findings strongly support personalized treatment strategies based on a patient's medical history and pharmacological features. Notably, we identify that reducing medication doses for patients with mild and brief seizure episodes while adopting aggressive treatment for patients in intensive care unit experiencing intense seizures leads to more favorable outcomes.

Submitted 1 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted for publication in the proceedings of AISTATS 2025

arXiv:2203.04920 [pdf]

doi 10.1016/S2589-7500(23)00088-2

Effects of Epileptiform Activity on Discharge Outcome in Critically Ill Patients

Authors: Harsh Parikh, Kentaro Hoffman, Haoqi Sun, Wendong Ge, Jin Jing, Rajesh Amerineni, Lin Liu, Jimeng Sun, Sahar Zafar, Aaron Struck, Alexander Volfovsky, Cynthia Rudin, M. Brandon Westover

Abstract: Epileptiform activity (EA) is associated with worse outcomes including increased risk of disability and death. However, the effect of EA on the neurologic outcome is confounded by the feedback between treatment with anti-seizure medications (ASM) and EA burden. A randomized clinical trial is challenging due to the sequential nature of EA-ASM feedback, as well as ethical reasons. However, some mechanistic knowledge is available, e.g., how drugs are absorbed. This knowledge together with observational data could provide a more accurate effect estimate using causal inference. We performed a retrospective cross-sectional study with 995 patients with the modified Rankin Scale (mRS) at discharge as the outcome and the EA burden defined as the mean or maximum proportion of time spent with EA in six-hour windows in the first 24 hours of electroencephalography as the exposure. We estimated the change in discharge mRS if everyone in the dataset had experienced a certain EA burden and were untreated. We combined pharmacological modeling with an interpretable matching method to account for confounding and EA-ASM feedback. Our matched groups' quality was validated by the neurologists. Having a maximum EA burden greater than 75% when untreated had a 22% increased chance of a poor outcome (severe disability or death), and mild but long-lasting EA increased the risk of a poor outcome by 14%. The effect sizes were heterogeneous depending on pre-admission profile, e.g., patients with hypoxic-ischemic encephalopathy (HIE) or acquired brain injury (ABI) were more affected. Interventions should put a higher priority on patients with an average EA burden higher than 10%, while treatment should be more conservative when the maximum EA burden is low.

Submitted 11 March, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

Comments: 4 Figures

arXiv:2103.03945 [pdf, other]

SCRIB: Set-classifier with Class-specific Risk Bounds for Blackbox Models

Authors: Zhen Lin, Cao Xiao, Lucas Glass, M. Brandon Westover, Jimeng Sun

Abstract: Despite deep learning (DL) success in classification problems, DL classifiers do not provide a sound mechanism to decide when to refrain from predicting. Recent works tried to control the overall prediction risk with classification with rejection options. However, existing works overlook the different significance of different classes. We introduce Set-classifier with Class-specific RIsk Bounds (SCRIB) to tackle this problem, assigning multiple labels to each example. Given the output of a black-box model on the validation set, SCRIB constructs a set-classifier that controls the class-specific prediction risks with a theoretical guarantee. The key idea is to reject when the set classifier returns more than one label. We validated SCRIB on several medical applications, including sleep staging on electroencephalogram (EEG) data, X-ray COVID image classification, and atrial fibrillation detection based on electrocardiogram (ECG) data. SCRIB obtained desirable class-specific risks, which are 35%-88% closer to the target risks than baseline methods.

Submitted 5 March, 2021; originally announced March 2021.

arXiv:2006.11689 [pdf, other]

Clinically Relevant Mediation Analysis using Controlled Indirect Effect

Authors: Haoqi Sun, Michael J. Leone, Lin Liu, Shabani S. Mukerji, Gregory K. Robbins, M. Brandon Westover

Abstract: Mediation analysis allows one to use observational data to estimate the importance of each potential mediating pathway involved in the causal effect of an exposure on an outcome. However, current approaches to mediation analysis with multiple mediators either involve assumptions not verifiable by experiments, or estimate the effect when mediators are manipulated jointly which precludes the practical design of experiments due to curse of dimensionality, or are difficult to interpret when arbitrary causal dependencies are present. We propose a method for mediation analysis for multiple manipulable mediators with arbitrary causal dependencies. The proposed method is clinically relevant because the decomposition of the total effect does not involve effects under cross-world assumptions and focuses on the effects after manipulating (i.e. treating) one single mediator, which is more relevant in a clinical scenario. We illustrate the approach using simulated data, the "framing" dataset from political science, and the HIV-Brain Age dataset from a clinical retrospective cohort study. Our results provide potential guidance for clinical practitioners to make justified choices to manipulate one of the mediators to optimize the outcome.

Submitted 20 June, 2020; originally announced June 2020.

Comments: 15 pages, 2 figures in main text, 1 figure in supplemental, 4 tables in main text

arXiv:2002.11701 [pdf, other]

CLARA: Clinical Report Auto-completion

Authors: Siddharth Biswal, Cao Xiao, Lucas M. Glass, M. Brandon Westover, Jimeng Sun

Abstract: Generating clinical reports from raw recordings such as X-rays and electroencephalogram (EEG) is an essential and routine task for doctors. However, it is often time-consuming to write accurate and detailed reports. Most existing methods try to generate the whole reports from the raw input with limited success because 1) generated reports often contain errors that need manual review and correction, 2) it does not save time when doctors want to write additional information into the report, and 3) the generated reports are not customized based on individual doctors' preference. We propose CLinicAl Report Auto-completion (CLARA), an interactive method that generates reports in a sentence by sentence fashion based on doctors' anchor words and partially completed sentences. CLARA searches for most relevant sentences from existing reports as the template for the current report. The retrieved sentences are sequentially modified by combining with the input feature representations to create the final report. In our experimental evaluation, CLARA achieved 0.393 CIDEr and 0.248 BLEU-4 on X-ray reports and 0.482 CIDEr and 0.491 BLEU-4 for EEG reports for sentence-level generation, which is up to 35% improvement over the best baseline. Also via our qualitative evaluation, CLARA is shown to produce reports which have a significantly higher level of approval by doctors in a user study (3.74 out of 5 for CLARA vs 2.52 out of 5 for the baseline).

Submitted 4 March, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:1910.06100 [pdf, other]

SLEEPER: interpretable Sleep staging via Prototypes from Expert Rules

Authors: Irfan Al-Hussaini, Cao Xiao, M. Brandon Westover, Jimeng Sun

Abstract: Sleep staging is a crucial task for diagnosing sleep disorders. It is tedious and complex as it can take a trained expert several hours to annotate just one patient's polysomnogram (PSG) from a single night. Although deep learning models have demonstrated state-of-the-art performance in automating sleep staging, interpretability which defines other desiderata, has largely remained unexplored. In this study, we propose Sleep staging via Prototypes from Expert Rules (SLEEPER), which combines deep learning models with expert defined rules using a prototype learning framework to generate simple interpretable models. In particular, SLEEPER utilizes sleep scoring rules and expert defined features to derive prototypes which are embeddings of PSG data fragments via convolutional neural networks. The final models are simple interpretable models like a shallow decision tree defined over those phenotypes. We evaluated SLEEPER using two PSG datasets collected from sleep studies and demonstrated that SLEEPER could provide accurate sleep stage classification comparable to human experts and deep neural networks with about 85% ROC-AUC and .7 kappa.

Submitted 14 October, 2019; originally announced October 2019.

Comments: Machine Learning for Healthcare Conference (MLHC) 2019. Proceedings of Machine Learning Research 106

Journal ref: PMLR 106:721-739, 2019

arXiv:1803.09702 [pdf, other]

HAMLET: Interpretable Human And Machine co-LEarning Technique

Authors: Olivier Deiss, Siddharth Biswal, Jing Jin, Haoqi Sun, M. Brandon Westover, Jimeng Sun

Abstract: Efficient label acquisition processes are key to obtaining robust classifiers. However, data labeling is often challenging and subject to high levels of label noise. This can arise even when classification targets are well defined, if instances to be labeled are more difficult than the prototypes used to define the class, leading to disagreements among the expert community. Here, we enable efficient training of deep neural networks. From low-confidence labels, we iteratively improve their quality by simultaneous learning of machines and experts. We call it Human And Machine co-LEarning Technique (HAMLET). Throughout the process, experts become more consistent, while the algorithm provides them with explainable feedback for confirmation. HAMLET uses a neural embedding function and a memory module filled with diverse reference embeddings from different classes. Its output includes classification labels and highly relevant reference embeddings as explanation. We took the study of brain monitoring at intensive care unit (ICU) as an application of HAMLET on continuous electroencephalography (cEEG) data. Although cEEG monitoring yields large volumes of data, labeling costs and difficulty make it hard to build a classifier. Additionally, while experts agree on the labels of clear-cut examples of cEEG patterns, labeling many real-world cEEG data can be extremely challenging. Thus, a large minority of sequences might be mislabeled. HAMLET has shown significant performance gain against deep learning and other baselines, increasing accuracy from 7.03% to 68.75% on challenging inputs. Besides improved performance, clinical experts confirmed the interpretability of those reference embeddings in helping explaining the classification results by HAMLET.

Submitted 21 August, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

Comments: Removed KDD template

Showing 1–7 of 7 results for author: Westover, M B