Showing 1–40 of 40 results for author: Shah, N H

  1. arXiv:2506.06574  [pdf, ps, other]

    cs.AI cs.MA

    The Optimization Paradox in Clinical AI Multi-Agent Systems

    Authors: Suhana Bedi, Iddah Mlauzi, Daniel Shin, Sanmi Koyejo, Nigam H. Shah

    Abstract: Multi-agent artificial intelligence systems are increasingly deployed in clinical settings, yet the relationship between component-level optimization and system-wide performance remains poorly understood. We evaluated this relationship using 2,400 real patient cases from the MIMIC-CDM dataset across four abdominal pathologies (appendicitis, pancreatitis, cholecystitis, diverticulitis), decomposing…

    Submitted 11 June, 2025; v1 submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2505.23802  [pdf, ps, other]

    cs.CL cs.AI

    MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks

    Authors: Suhana Bedi, Hejie Cui, Miguel Fuentes, Alyssa Unell, Michael Wornow, Juan M. Banda, Nikesh Kotecha, Timothy Keyes, Yifan Mai, Mert Oez, Hao Qiu, Shrey Jain, Leonardo Schettini, Mehr Kashyap, Jason Alan Fries, Akshay Swaminathan, Philip Chung, Fateme Nateghi, Asad Aali, Ashwin Nayak, Shivam Vedak, Sneha S. Jain, Birju Patel, Oluseyi Fayanju, Shreya Shah, et al. (56 additional authors not shown)

    Abstract: While large language models (LLMs) achieve near-perfect scores on medical licensing exams, these evaluations inadequately reflect the complexity and diversity of real-world clinical practice. We introduce MedHELM, an extensible evaluation framework for assessing LLM performance for medical tasks with three key contributions. First, a clinician-validated taxonomy spanning 5 categories, 22 subcatego…

    Submitted 2 June, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

  3. arXiv:2501.00031  [pdf, other]

    cs.CL

    Distilling Large Language Models for Efficient Clinical Information Extraction

    Authors: Karthik S. Vedula, Annika Gupta, Akshay Swaminathan, Ivan Lopez, Suhana Bedi, Nigam H. Shah

    Abstract: Large language models (LLMs) excel at clinical information extraction but their computational demands limit practical deployment. Knowledge distillation--the process of transferring knowledge from larger to smaller models--offers a potential solution. We evaluate the performance of distilled BERT models, which are approximately 1,000 times smaller than modern LLMs, for clinical named entity recogn…

    Submitted 20 December, 2024; originally announced January 2025.

    Comments: 19 pages, 1 figure, 10 tables

    MSC Class: 68T50 ACM Class: I.2.7

  4. arXiv:2412.16178  [pdf, other]

    cs.LG cs.AI cs.CE

    Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs

    Authors: Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez, Ethan Steinberg, Jason Alan Fries, Christopher Re, Sanmi Koyejo, Nigam H. Shah

    Abstract: Foundation Models (FMs) trained on Electronic Health Records (EHRs) have achieved state-of-the-art results on numerous clinical prediction tasks. However, most existing EHR FMs have context windows of <1k tokens. This prevents them from modeling full patient EHRs which can exceed 10k's of events. Recent advancements in subquadratic long-context architectures (e.g., Mamba) offer a promising solutio…

    Submitted 18 March, 2025; v1 submitted 9 December, 2024; originally announced December 2024.

  5. arXiv:2411.09361  [pdf, other]

    cs.CV cs.LG

    Time-to-Event Pretraining for 3D Medical Imaging

    Authors: Zepeng Huo, Jason Alan Fries, Alejandro Lozano, Jeya Maria Jose Valanarasu, Ethan Steinberg, Louis Blankemeier, Akshay S. Chaudhari, Curtis Langlotz, Nigam H. Shah

    Abstract: With the rise of medical foundation models and the growing availability of imaging data, scalable pretraining techniques offer a promising way to identify imaging biomarkers predictive of future disease risk. While current self-supervised methods for 3D medical imaging models capture local structural features like organ morphology, they fail to link pixel biomarkers with long-term health outcomes…

    Submitted 19 March, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: 34 pages, 19 figures

  6. arXiv:2411.03191  [pdf, other]

    eess.SP

    Newtonized Orthogonal Matching Pursuit for High-Resolution Target Detection in Sparse OFDM ISAC Systems

    Authors: Syed Najaf Haider Shah, Sebastian Semper, Aamir Ullah Khan, Christian Schneider, Joerg Robert

    Abstract: Integrated Sensing and Communication (ISAC) is a technology paradigm that combines sensing capabilities with communication functionalities in a single device or system. In vehicle-to-everything (V2X) sidelink, ISAC can provide enhanced safety by allowing vehicles to not only communicate with one another but also sense the surrounding environment by using sidelink signals. In ISAC-capable V2X sidel…

    Submitted 5 November, 2024; originally announced November 2024.

  7. arXiv:2409.09095  [pdf, other]

    cs.LG cs.DB

    meds_reader: A fast and efficient EHR processing library

    Authors: Ethan Steinberg, Michael Wornow, Suhana Bedi, Jason Alan Fries, Matthew B. A. McDermott, Nigam H. Shah

    Abstract: The growing demand for machine learning in healthcare requires processing increasingly large electronic health record (EHR) datasets, but existing pipelines are not computationally efficient or scalable. In this paper, we introduce meds_reader, an optimized Python package for efficient EHR data processing that is designed to take advantage of many intrinsic properties of EHR data for improved spee…

    Submitted 14 November, 2024; v1 submitted 12 September, 2024; originally announced September 2024.

    Comments: Findings paper presented at Machine Learning for Health (ML4H) symposium 2024, December 15-16, 2024, Vancouver, Canada, 8 pages

  8. arXiv:2407.00541  [pdf]

    cs.CL cs.AI cs.IR

    Answering real-world clinical questions using large language model based systems

    Authors: Yen Sia Low, Michael L. Jackson, Rebecca J. Hyde, Robert E. Brown, Neil M. Sanghavi, Julian D. Baldwin, C. William Pike, Jananee Muralidharan, Gavin Hui, Natasha Alexander, Hadeel Hassan, Rahul V. Nene, Morgan Pike, Courtney J. Pokrzywa, Shivam Vedak, Adam Paul Yan, Dong-han Yao, Amy R. Zipursky, Christina Dinh, Philip Ballentine, Dan C. Derieg, Vladimir Polony, Rehan N. Chawdry, Jordan Davies, Brigham B. Hyde, et al. (2 additional authors not shown)

    Abstract: Evidence to guide healthcare decisions is often limited by a lack of relevant and trustworthy literature as well as difficulty in contextualizing existing research for a specific patient. Large language models (LLMs) could potentially address both challenges by either summarizing published literature or generating new studies based on real-world data (RWD). We evaluated the ability of five LLM-bas…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 28 pages (2 figures, 3 tables) inclusive of 8 pages of supplemental materials (4 supplemental figures and 4 supplemental tables)

  9. arXiv:2406.13264  [pdf, other]

    cs.AI cs.LG cs.SE

    WONDERBREAD: A Benchmark for Evaluating Multimodal Foundation Models on Business Process Management Tasks

    Authors: Michael Wornow, Avanika Narayan, Ben Viggiano, Ishan S. Khare, Tathagat Verma, Tibor Thompson, Miguel Angel Fuentes Hernandez, Sudharsan Sundar, Chloe Trujillo, Krrish Chawla, Rongfei Lu, Justin Shen, Divya Nagaraj, Joshua Martinez, Vardhan Agrawal, Althea Hudson, Nigam H. Shah, Christopher Re

    Abstract: Existing ML benchmarks lack the depth and diversity of annotations needed for evaluating models on business process management (BPM) tasks. BPM is the practice of documenting, measuring, improving, and automating enterprise workflows. However, research has focused almost exclusively on one task - full end-to-end automation using agents based on multimodal foundation models (FMs) like GPT-4. This f…

    Submitted 10 October, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.06512  [pdf, other]

    cs.CV cs.AI

    Merlin: A Vision Language Foundation Model for 3D Computed Tomography

    Authors: Louis Blankemeier, Joseph Paul Cohen, Ashwin Kumar, Dave Van Veen, Syed Jamal Safdar Gardezi, Magdalini Paschali, Zhihong Chen, Jean-Benoit Delbrouck, Eduardo Reis, Cesar Truyts, Christian Bluethgen, Malte Engmann Kjeldskov Jensen, Sophie Ostmeier, Maya Varma, Jeya Maria Jose Valanarasu, Zhongnan Fang, Zepeng Huo, Zaid Nabulsi, Diego Ardila, Wei-Hung Weng, Edson Amaro Junior, Neera Ahuja, Jason Fries, Nigam H. Shah, Andrew Johnston, et al. (6 additional authors not shown)

    Abstract: Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current radiologist shortage, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies. Prior state-of-the-art approaches for automated medical image interpretation leverage vision la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures

  11. arXiv:2405.03710  [pdf, other]

    cs.SE cs.AI cs.LG

    Automating the Enterprise with Foundation Models

    Authors: Michael Wornow, Avanika Narayan, Krista Opsahl-Ong, Quinn McIntyre, Nigam H. Shah, Christopher Re

    Abstract: Automating enterprise workflows could unlock $4 trillion/year in productivity gains. Despite being of interest to the data management community for decades, the ultimate vision of end-to-end workflow automation has remained elusive. Current solutions rely on process mining and robotic process automation (RPA), in which a bot is hard-coded to follow a set of predefined rules for completing a workfl…

    Submitted 3 May, 2024; originally announced May 2024.

  12. arXiv:2403.07911  [pdf]

    cs.CY cs.AI

    Standing on FURM ground -- A framework for evaluating Fair, Useful, and Reliable AI Models in healthcare systems

    Authors: Alison Callahan, Duncan McElfresh, Juan M. Banda, Gabrielle Bunney, Danton Char, Jonathan Chen, Conor K. Corbin, Debadutta Dash, Norman L. Downing, Sneha S. Jain, Nikesh Kotecha, Jonathan Masterson, Michelle M. Mello, Keith Morse, Srikar Nallan, Abby Pandya, Anurang Revri, Aditya Sharma, Christopher Sharp, Rahul Thapa, Michael Wornow, Alaa Youssef, Michael A. Pfeffer, Nigam H. Shah

    Abstract: The impact of using artificial intelligence (AI) to guide patient care or operational processes is an interplay of the AI model's output, the decision-making protocol based on that output, and the capacity of the stakeholders involved to take the necessary subsequent action. Estimating the effects of this interplay before deployment, and studying it in real time afterwards, are essential to bridge…

    Submitted 14 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

  13. arXiv:2402.05125  [pdf, other]

    cs.CL cs.AI

    Zero-Shot Clinical Trial Patient Matching with LLMs

    Authors: Michael Wornow, Alejandro Lozano, Dev Dash, Jenelle Jindal, Kenneth W. Mahaffey, Nigam H. Shah

    Abstract: Matching patients to clinical trials is a key unsolved challenge in bringing new drugs to market. Today, identifying patients who meet a trial's eligibility criteria is highly manual, taking up to 1 hour per patient. Automated screening is challenging, however, as it requires understanding unstructured clinical text. Large language models (LLMs) offer a promising solution. In this work, we explore…

    Submitted 10 April, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  14. arXiv:2311.10798  [pdf, other]

    cs.LG cs.AI cs.CV eess.IV

    INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis

    Authors: Shih-Cheng Huang, Zepeng Huo, Ethan Steinberg, Chia-Chun Chiang, Matthew P. Lungren, Curtis P. Langlotz, Serena Yeung, Nigam H. Shah, Jason A. Fries

    Abstract: Synthesizing information from multiple data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of patien…

    Submitted 17 November, 2023; originally announced November 2023.

  15. arXiv:2308.14089  [pdf, other]

    cs.CL cs.AI cs.LG

    MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records

    Authors: Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo P. Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju S. Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott J. Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, et al. (5 additional authors not shown)

    Abstract: The ability of large language models (LLMs) to follow natural language instructions with human-level fluency suggests many opportunities in healthcare to reduce administrative burden and improve quality of care. However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging. Existing question answering datasets for electronic health record (EHR) data fail to capture…

    Submitted 24 December, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

  16. arXiv:2307.02028  [pdf, other]

    cs.LG cs.AI cs.CL

    EHRSHOT: An EHR Benchmark for Few-Shot Evaluation of Foundation Models

    Authors: Michael Wornow, Rahul Thapa, Ethan Steinberg, Jason A. Fries, Nigam H. Shah

    Abstract: While the general machine learning (ML) community has benefited from public datasets, tasks, and models, the progress of ML in healthcare has been hampered by a lack of such shared assets. The success of foundation models creates new challenges for healthcare ML by requiring access to shared pretrained models to validate performance benefits. We help address these challenges through three contribu…

    Submitted 11 December, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  17. arXiv:2305.03219  [pdf]

    cs.LG stat.ME

    All models are local: time to replace external validation with recurrent local validation

    Authors: Alex Youssef, Michael Pencina, Anshul Thakur, Tingting Zhu, David Clifton, Nigam H. Shah

    Abstract: External validation is often recommended to ensure the generalizability of ML models. However, it neither guarantees generalizability nor equates to a model's clinical usefulness (the ultimate goal of any clinical decision-support tool). External validation is misaligned with current healthcare ML needs. First, patient data changes across time, geography, and facilities. These changes create signi…

    Submitted 13 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

  18. arXiv:2304.13714  [pdf]

    cs.AI cs.CL cs.IR

    Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery

    Authors: Debadutta Dash, Rahul Thapa, Juan M. Banda, Akshay Swaminathan, Morgan Cheatham, Mehr Kashyap, Nikesh Kotecha, Jonathan H. Chen, Saurabh Gombar, Lance Downing, Rachel Pedreira, Ethan Goh, Angel Arnaout, Garret Kenn Morris, Honor Magon, Matthew P Lungren, Eric Horvitz, Nigam H. Shah

    Abstract: Despite growing interest in using large language models (LLMs) in healthcare, current explorations do not assess the real-world utility and safety of LLMs in clinical settings. Our objective was to determine whether two LLMs can serve information needs submitted by physicians as questions to an informatics consultation service in a safe and concordant manner. Sixty six questions from an informatic…

    Submitted 30 April, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: 27 pages including supplemental information

  19. arXiv:2303.12961  [pdf]

    cs.LG cs.AI

    The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs

    Authors: Michael Wornow, Yizhe Xu, Rahul Thapa, Birju Patel, Ethan Steinberg, Scott Fleming, Michael A. Pfeffer, Jason Fries, Nigam H. Shah

    Abstract: The successes of foundation models such as ChatGPT and AlphaFold have spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models' capabilities. We review over 80 foundation models trained on non-imaging EMR data (i.e. clinical text…

    Submitted 24 March, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: Reformatted figures, updated contributions

  20. arXiv:2303.06269  [pdf, other]

    cs.LG

    DEPLOYR: A technical framework for deploying custom real-time machine learning models into the electronic medical record

    Authors: Conor K. Corbin, Rob Maclay, Aakash Acharya, Sreedevi Mony, Soumya Punnathanam, Rahul Thapa, Nikesh Kotecha, Nigam H. Shah, Jonathan H. Chen

    Abstract: Machine learning (ML) applications in healthcare are extensively researched, but successful translations to the bedside are scant. Healthcare institutions are establishing frameworks to govern and promote the implementation of accurate, actionable and reliable models that integrate with clinical workflow. Such governance frameworks require an accompanying technical framework to deploy models in a…

    Submitted 10 March, 2023; originally announced March 2023.

  21. arXiv:2211.10828  [pdf, other]

    cs.LG cs.AI

    Instability in clinical risk stratification models using deep learning

    Authors: Daniel Lopez-Martinez, Alex Yakubovich, Martin Seneviratne, Adam D. Lelkes, Akshit Tyagi, Jonas Kemp, Ethan Steinberg, N. Lance Downing, Ron C. Li, Keith E. Morse, Nigam H. Shah, Ming-Jun Chen

    Abstract: While it has been well known in the ML community that deep learning models suffer from instability, the consequences for healthcare deployments are under characterised. We study the stability of different model architectures trained on electronic health records, using a set of outpatient prediction tasks as a case study. We show that repeated training runs of the same deep learning model on the sa…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: Accepted for publication in Machine Learning for Health (ML4H) 2022

  22. arXiv:2209.06985  [pdf, other]

    stat.AP

    Clinical Utility Gains from Incorporating Comorbidity and Geographic Location Information into Risk Estimation Equations for Atherosclerotic Cardiovascular Disease

    Authors: Yizhe Xu, Agata Foryciarz, Ethan Steinberg, Nigam H. Shah

    Abstract: Objective: There are several efforts to re-learn the 2013 ACC/AHA pooled cohort equations (PCE) for patients with specific comorbidities and geographic locations. With over 363 customized risk models in the literature, we aim to evaluate such revised models to determine if the performance improvements translate to gains in clinical utility. Methods: We re-train a baseline PCE using the ACC/AHA P…

    Submitted 17 September, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

  23. arXiv:2202.01906  [pdf, other]

    stat.ML cs.CY cs.LG

    Net benefit, calibration, threshold selection, and training objectives for algorithmic fairness in healthcare

    Authors: Stephen R. Pfohl, Yizhe Xu, Agata Foryciarz, Nikolaos Ignatiadis, Julian Genkins, Nigam H. Shah

    Abstract: A growing body of work uses the paradigm of algorithmic fairness to frame the development of techniques to anticipate and proactively mitigate the introduction or exacerbation of health inequities that may follow from the use of model-guided decision-making. We evaluate the interplay between measures of model performance, fairness, and the expected utility of decision-making to offer practical rec…

    Submitted 3 February, 2022; originally announced February 2022.

  24. arXiv:2108.12250  [pdf, other]

    stat.ML cs.CY cs.LG

    A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

    Authors: Stephen R. Pfohl, Haoran Zhang, Yizhe Xu, Agata Foryciarz, Marzyeh Ghassemi, Nigam H. Shah

    Abstract: Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this…

    Submitted 1 February, 2022; v1 submitted 27 August, 2021; originally announced August 2021.

  25. Ontology-driven weak supervision for clinical entity classification in electronic health records

    Authors: Jason A. Fries, Ethan Steinberg, Saelig Khattar, Scott L. Fleming, Jose Posada, Alison Callahan, Nigam H. Shah

    Abstract: In the electronic health record, using clinical notes to identify entities such as disorders and their temporality (e.g. the order of an event relative to a time index) can inform many important analyses. However, creating training data for clinical entity tasks is time consuming and sharing labeled data is challenging due to privacy concerns. The information needs of the COVID-19 pandemic highlig…

    Submitted 6 April, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

    Journal ref: Nature Communications 12.1 (2021): 1-11

  26. arXiv:2007.10306  [pdf, other]

    stat.ML cs.CY cs.LG stat.AP

    An Empirical Characterization of Fair Machine Learning For Clinical Risk Prediction

    Authors: Stephen R. Pfohl, Agata Foryciarz, Nigam H. Shah

    Abstract: The use of machine learning to guide clinical decision making has the potential to worsen existing health disparities. Several recent works frame the problem as that of algorithmic fairness, a framework that has attracted considerable attention and criticism. However, the appropriateness of this framework is unclear due to both ethical as well as technical considerations, the latter of which inclu…

    Submitted 15 June, 2021; v1 submitted 20 July, 2020; originally announced July 2020.

    Comments: Published in the Journal of Biomedical Informatics (https://doi.org/10.1016/j.jbi.2020.103621). Version 3 updates acknowledgements and fixes typos

    Journal ref: Journal of Biomedical Informatics, Volume 113, January 2021, 103621

  27. arXiv:2006.14102  [pdf, other]

    stat.AP

    Using public clinical trial reports to evaluate observational study methods

    Authors: Ethan Steinberg, Nikolaos Ignatiadis, Steve Yadlowsky, Yizhe Xu, Nigam H. Shah

    Abstract: Observational studies are valuable for estimating the effects of various medical interventions, but are notoriously difficult to evaluate because the methods used in observational studies require many untestable assumptions. This lack of verifiability makes it difficult both to compare different observational study methods and to trust the results of any particular observational study. In this wor…

    Submitted 13 September, 2022; v1 submitted 24 June, 2020; originally announced June 2020.

  28. arXiv:2001.05295  [pdf, other]

    cs.CL cs.LG stat.ML

    Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data

    Authors: Ethan Steinberg, Ken Jung, Jason A. Fries, Conor K. Corbin, Stephen R. Pfohl, Nigam H. Shah

    Abstract: Widespread adoption of electronic health records (EHRs) has fueled the development of using machine learning to build prediction models for various clinical outcomes. This process is often constrained by having a relatively small number of patient records for training the model. We demonstrate that using patient representation schemes inspired from techniques in natural language processing can inc…

    Submitted 12 May, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

  29. arXiv:1907.06260  [pdf, other]

    cs.LG cs.CY stat.ML

    Counterfactual Reasoning for Fair Clinical Risk Prediction

    Authors: Stephen Pfohl, Tony Duan, Daisy Yi Ding, Nigam H. Shah

    Abstract: The use of machine learning systems to support decision making in healthcare raises questions as to what extent these systems may introduce or exacerbate disparities in care for historically underrepresented and mistreated groups, due to biases implicitly embedded in observational data in electronic health records. To address this problem in the context of clinical risk prediction models, we devel…

    Submitted 14 July, 2019; originally announced July 2019.

    Comments: Machine Learning for Healthcare 2019

  30. arXiv:1904.07640  [pdf]

    cs.CY cs.LG

    Medical device surveillance with electronic health records

    Authors: Alison Callahan, Jason A Fries, Christopher Ré, James I Huddleston III, Nicholas J Giori, Scott Delp, Nigam H Shah

    Abstract: Post-market medical device surveillance is a challenge facing manufacturers, regulatory agencies, and health care providers. Electronic health records are valuable sources of real world evidence to assess device safety and track device-related patient outcomes over time. However, distilling this evidence remains challenging, as information is fractured across clinical notes and structured records.…

    Submitted 3 April, 2019; originally announced April 2019.

  31. arXiv:1901.05958  [pdf]

    q-bio.QM cs.LG stat.ML

    A Semi-Supervised Machine Learning Approach to Detecting Recurrent Metastatic Breast Cancer Cases Using Linked Cancer Registry and Electronic Medical Record Data

    Authors: Albee Y. Ling, Allison W. Kurian, Jennifer L. Caswell-Jin, George W. Sledge Jr., Nigam H. Shah, Suzanne R. Tamang

    Abstract: Objectives: Most cancer data sources lack information on metastatic recurrence. Electronic medical records (EMRs) and population-based cancer registries contain complementary information on cancer treatment and outcomes, yet are rarely used synergistically. To enable detection of metastatic breast cancer (MBC), we applied a semi-supervised machine learning framework to linked EMR-California Cancer…

    Submitted 16 January, 2019; originally announced January 2019.

    Journal ref: JAMIA open 2.4 (2019): 528-537

  32. arXiv:1812.00371  [pdf, other]

    cs.LG stat.ML

    Predicting Inpatient Discharge Prioritization With Electronic Health Records

    Authors: Anand Avati, Stephen Pfohl, Chris Lin, Thao Nguyen, Meng Zhang, Philip Hwang, Jessica Wetstone, Kenneth Jung, Andrew Ng, Nigam H. Shah

    Abstract: Identifying patients who will be discharged within 24 hours can improve hospital resource management and quality of care. We studied this problem using eight years of Electronic Health Records (EHR) data from Stanford Hospital. We fit models to predict 24 hour discharge across the entire inpatient population. The best performing models achieved an area under the receiver-operator characteristic cu…

    Submitted 2 December, 2018; originally announced December 2018.

  33. arXiv:1809.04663  [pdf, other]

    cs.LG stat.ML

    Creating Fair Models of Atherosclerotic Cardiovascular Disease Risk

    Authors: Stephen Pfohl, Ben Marafino, Adrien Coulet, Fatima Rodriguez, Latha Palaniappan, Nigam H. Shah

    Abstract: Guidelines for the management of atherosclerotic cardiovascular disease (ASCVD) recommend the use of risk stratification models to identify patients most likely to benefit from cholesterol-lowering and other therapies. These models have differential performance across race and gender groups with inconsistent behavior across studies, potentially resulting in an inequitable distribution of beneficia…

    Submitted 14 June, 2019; v1 submitted 12 September, 2018; originally announced September 2018.

  34. arXiv:1808.03331  [pdf, other]

    stat.ML cs.LG

    The Effectiveness of Multitask Learning for Phenotyping with Electronic Health Records Data

    Authors: Daisy Yi Ding, Chloé Simpson, Stephen Pfohl, Dave C. Kale, Kenneth Jung, Nigam H. Shah

    Abstract: Electronic phenotyping is the task of ascertaining whether an individual has a medical condition of interest by analyzing their medical record and is foundational in clinical informatics. Increasingly, electronic phenotyping is performed via supervised learning. We investigate the effectiveness of multitask learning for phenotyping using electronic health records (EHR) data. Multitask learning aim…

    Submitted 5 January, 2019; v1 submitted 9 August, 2018; originally announced August 2018.

    Comments: Pacific Symposium on Biocomputing (PSB) 2019, Hawaii, https://psb.stanford.edu/psb-online/; 13 pages, 7 figures

  35. arXiv:1806.08324  [pdf, other]

    cs.LG stat.AP stat.ML

    Countdown Regression: Sharp and Calibrated Survival Predictions

    Authors: Anand Avati, Tony Duan, Sharon Zhou, Kenneth Jung, Nigam H. Shah, Andrew Ng

    Abstract: Probabilistic survival predictions from models trained with Maximum Likelihood Estimation (MLE) can have high, and sometimes unacceptably high variance. The field of meteorology, where the paradigm of maximizing sharpness subject to calibration is popular, has addressed this problem by using scoring rules beyond MLE, such as the Continuous Ranked Probability Score (CRPS). In this paper we present…

    Submitted 18 June, 2019; v1 submitted 21 June, 2018; originally announced June 2018.

    Comments: UAI 2019

  36. arXiv:1801.08668  [pdf]

    stat.AP q-bio.QM

    Monitoring physical function in patients with knee osteoarthritis using data from wearable activity monitors

    Authors: Vibhu Agarwal, Matthew Smuck, Nigam H Shah

    Abstract: Currently used clinical assessments for physical function do not objectively quantify daily activities in routine living. Wearable activity monitors enable objective measurement of routine daily activities, but do not map to clinically measured physical performance measures. We represent physical function as a daily activity profile derived from minute-level activity data obtained via a wearable a…

    Submitted 25 January, 2018; originally announced January 2018.

  37. Scalable and accurate deep learning for electronic health records

    Authors: Alvin Rajkomar, Eyal Oren, Kai Chen, Andrew M. Dai, Nissan Hajaj, Peter J. Liu, Xiaobing Liu, Mimi Sun, Patrik Sundberg, Hector Yee, Kun Zhang, Gavin E. Duggan, Gerardo Flores, Michaela Hardt, Jamie Irvine, Quoc Le, Kurt Litsch, Jake Marcus, Alexander Mossin, Justin Tansuwan, De Wang, James Wexler, Jimbo Wilson, Dana Ludwig, Samuel L. Volchenboum, et al. (9 additional authors not shown)

    Abstract: Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient's record. We propose a representation of p…

    Submitted 11 May, 2018; v1 submitted 24 January, 2018; originally announced January 2018.

    Comments: Published version from https://www.nature.com/articles/s41746-018-0029-1

    Journal ref: npj Digital Medicine 1:18 (2018)

  38. arXiv:1711.06402  [pdf, other]

    cs.CY cs.LG stat.ML

    Improving Palliative Care with Deep Learning

    Authors: Anand Avati, Kenneth Jung, Stephanie Harman, Lance Downing, Andrew Ng, Nigam H. Shah

    Abstract: Improving the quality of end-of-life care for hospitalized patients is a priority for healthcare organizations. Studies have shown that physicians tend to over-estimate prognoses, which in combination with treatment inertia results in a mismatch between patients' wishes and actual care at the end of life. We describe a method to address this problem using Deep Learning and Electronic Health Record…

    Submitted 16 November, 2017; originally announced November 2017.

    Comments: IEEE International Conference on Bioinformatics and Biomedicine 2017

  39. arXiv:1707.00102  [pdf, other]

    stat.ML

    Some methods for heterogeneous treatment effect estimation in high-dimensions

    Authors: Scott Powers, Junyang Qian, Kenneth Jung, Alejandro Schuler, Nigam H. Shah, Trevor Hastie, Robert Tibshirani

    Abstract: When devising a course of treatment for a patient, doctors often have little quantitative evidence on which to base their decisions, beyond their medical education and published clinical trials. Stanford Health Care alone has millions of electronic medical records (EMRs) that are only just recently being leveraged to inform better treatment recommendations. These data present a unique challenge be…

    Submitted 1 July, 2017; originally announced July 2017.

  40. arXiv:1507.05408  [pdf, other]

    cs.CY

    Provenance-Centered Dataset of Drug-Drug Interactions

    Authors: Juan M. Banda, Tobias Kuhn, Nigam H. Shah, Michel Dumontier

    Abstract: Over the years several studies have demonstrated the ability to identify potential drug-drug interactions via data mining from the literature (MEDLINE), electronic health records, public databases (Drugbank), etc. While each one of these approaches is properly statistically validated, they do not take into consideration the overlap between them as one of their decision making variables. In this pa…

    Submitted 20 July, 2015; originally announced July 2015.

    Comments: In Proceedings of the 14th International Semantic Web Conference (ISWC) 2015