Showing 1–50 of 71 results for author: Rajpurkar, P

  1. arXiv:2506.04353  [pdf, ps, other]

    cs.CV cs.AI cs.CE cs.CL cs.LG

    ReXVQA: A Large-scale Visual Question Answering Benchmark for Generalist Chest X-ray Understanding

    Authors: Ankit Pal, Jung-Oh Lee, Xiaoman Zhang, Malaikannan Sankarasubbu, Seunghyeon Roh, Won Jung Kim, Meesun Lee, Pranav Rajpurkar

    Abstract: We present ReXVQA, the largest and most comprehensive benchmark for visual question answering (VQA) in chest radiology, comprising approximately 696,000 questions paired with 160,000 chest X-ray studies across training, validation, and test sets. Unlike prior efforts that rely heavily on template-based queries, ReXVQA introduces a diverse and clinically authentic task suite reflecting five core r…

    Submitted 4 June, 2025; originally announced June 2025.

  2. arXiv:2505.00228  [pdf, ps, other]

    eess.IV cs.CV

    ReXGradient-160K: A Large-Scale Publicly Available Dataset of Chest Radiographs with Free-text Reports

    Authors: Xiaoman Zhang, Julián N. Acosta, Josh Miller, Ouwen Huang, Pranav Rajpurkar

    Abstract: We present ReXGradient-160K, representing the largest publicly available chest X-ray dataset to date in terms of the number of patients. This dataset contains 160,000 chest X-ray studies with paired radiological reports from 109,487 unique patients across 3 U.S. health systems (79 medical sites). This comprehensive dataset includes multiple images per study and detailed radiology reports, making i…

    Submitted 10 May, 2025; v1 submitted 30 April, 2025; originally announced May 2025.

  3. arXiv:2504.21336  [pdf, ps, other]

    cs.CV

    UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation

    Authors: Linshan Wu, Yuxiang Nie, Sunan He, Jiaxin Zhuang, Luyang Luo, Neeraj Mahboobani, Varut Vardhanabhuti, Ronald Cheong Kin Chan, Yifan Peng, Pranav Rajpurkar, Hao Chen

    Abstract: The integration of AI-assisted biomedical image analysis into clinical practice demands AI-generated findings that are not only accurate but also interpretable to clinicians. However, existing biomedical AI models generally lack the ability to simultaneously generate diagnostic findings and localize corresponding biomedical objects. This limitation makes it challenging for clinicians to correlate…

    Submitted 29 May, 2025; v1 submitted 30 April, 2025; originally announced April 2025.

    Comments: The first universal foundation model for grounded biomedical image interpretation

  4. arXiv:2502.18519  [pdf, other]

    eess.IV cs.AI cs.CV

    FreeTumor: Large-Scale Generative Tumor Synthesis in Computed Tomography Images for Improving Tumor Recognition

    Authors: Linshan Wu, Jiaxin Zhuang, Yanning Zhou, Sunan He, Jiabo Ma, Luyang Luo, Xi Wang, Xuefeng Ni, Xiaoling Zhong, Mingxiang Wu, Yinghua Zhao, Xiaohui Duan, Varut Vardhanabhuti, Pranav Rajpurkar, Hao Chen

    Abstract: Tumors are a leading cause of death worldwide, with an estimated 10 million deaths attributed to tumor-related diseases every year. AI-driven tumor recognition unlocks new possibilities for more precise and intelligent tumor screening and diagnosis. However, progress is heavily hampered by the scarcity of annotated datasets, which demands extensive annotation efforts by radiologists. To tackle t…

    Submitted 23 February, 2025; originally announced February 2025.

  5. arXiv:2502.06171  [pdf]

    eess.IV cs.CV

    A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation

    Authors: Wenhui Lei, Hanyu Chen, Zitian Zhang, Luyang Luo, Qiong Xiao, Yannian Gu, Peng Gao, Yankai Jiang, Ci Wang, Guangtao Wu, Tongjia Xu, Yingjie Zhang, Xiaofan Zhang, Pranav Rajpurkar, Shaoting Zhang, Zhenning Wang

    Abstract: Artificial intelligence-assisted imaging analysis has made substantial strides in tumor diagnosis and management. Here we present PASTA, a pan-tumor CT foundation model that achieves state-of-the-art performance on 45 of 46 representative oncology tasks -- including lesion segmentation, tumor detection in plain CT, tumor staging, survival prediction, structured report generation, and cross-modalit…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: 57 pages, 7 figures

  6. arXiv:2501.15579  [pdf, other]

    cs.CV cs.CL

    An Explainable Biomedical Foundation Model via Large-Scale Concept-Enhanced Vision-Language Pre-training

    Authors: Yuxiang Nie, Sunan He, Yequan Bie, Yihui Wang, Zhixuan Chen, Shu Yang, Zhiyuan Cai, Hongmei Wang, Xi Wang, Luyang Luo, Mingxiang Wu, Xian Wu, Ronald Cheong Kin Chan, Yuk Ming Lau, Yefeng Zheng, Pranav Rajpurkar, Hao Chen

    Abstract: The clinical adoption of artificial intelligence (AI) in medical imaging requires models that are both diagnostically accurate and interpretable to clinicians. While current multimodal biomedical foundation models prioritize performance, their black-box nature hinders explaining the decision-making process in clinically meaningful concepts. Here, we present ConceptCLIP, the first explainable biome…

    Submitted 26 April, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

  7. arXiv:2412.15264  [pdf, other]

    cs.CL cs.AI

    ReXTrust: A Model for Fine-Grained Hallucination Detection in AI-Generated Radiology Reports

    Authors: Romain Hardy, Sung Eun Kim, Du Hyun Ro, Pranav Rajpurkar

    Abstract: The increasing adoption of AI-generated radiology reports necessitates robust methods for detecting hallucinations--false or unfounded statements that could impact patient care. We present ReXTrust, a novel framework for fine-grained hallucination detection in AI-generated radiology reports. Our approach leverages sequences of hidden states from large vision-language models to produce finding-leve…

    Submitted 30 January, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: Accepted to AIMedHealth; 10 pages, 5 figures

  8. arXiv:2412.12629  [pdf, other]

    eess.IV cs.AI cs.CV

    a2z-1 for Multi-Disease Detection in Abdomen-Pelvis CT: External Validation and Performance Analysis Across 21 Conditions

    Authors: Pranav Rajpurkar, Julian N. Acosta, Siddhant Dogra, Jaehwan Jeong, Deepanshu Jindal, Michael Moritz, Samir Rajpurkar

    Abstract: We present a comprehensive evaluation of a2z-1, an artificial intelligence (AI) model designed to analyze abdomen-pelvis CT scans for 21 time-sensitive and actionable findings. Our study focuses on rigorous assessment of the model's performance and generalizability. Large-scale retrospective analysis demonstrates an average AUC of 0.931 across 21 conditions. External validation across two distinct…

    Submitted 17 December, 2024; originally announced December 2024.

  9. arXiv:2412.12042  [pdf, other]

    cs.HC cs.AI

    The Impact of AI Assistance on Radiology Reporting: A Pilot Study Using Simulated AI Draft Reports

    Authors: Julián N. Acosta, Siddhant Dogra, Subathra Adithan, Kay Wu, Michael Moritz, Stephen Kwak, Pranav Rajpurkar

    Abstract: Radiologists face increasing workload pressures amid growing imaging volumes, creating risks of burnout and delayed reporting times. While artificial intelligence (AI) based automated radiology report generation shows promise for reporting workflow optimization, evidence of its real-world impact on clinical accuracy and efficiency remains limited. This study evaluated the effect of draft reports o…

    Submitted 16 December, 2024; originally announced December 2024.

  10. arXiv:2412.02971  [pdf, other]

    cs.CV

    MedAutoCorrect: Image-Conditioned Autocorrection in Medical Reporting

    Authors: Arnold Caleb Asiimwe, Dídac Surís, Pranav Rajpurkar, Carl Vondrick

    Abstract: In medical reporting, the accuracy of radiological reports, whether generated by humans or machine learning algorithms, is critical. We tackle a new task in this paper: image-conditioned autocorrection of inaccuracies within these reports. Using the MIMIC-CXR dataset, we first intentionally introduce a diverse range of errors into reports. Subsequently, we propose a two-stage framework capable of…

    Submitted 3 December, 2024; originally announced December 2024.

  11. arXiv:2411.18672  [pdf, ps, other]

    cs.CV

    FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

    Authors: Alice Heiman, Xiaoman Zhang, Emma Chen, Sung Eun Kim, Pranav Rajpurkar

    Abstract: Medical vision-language models often struggle with generating accurate quantitative measurements in radiology reports, leading to hallucinations that undermine clinical reliability. We introduce FactCheXcker, a modular framework that de-hallucinates radiology report measurements by leveraging an improved query-code-update paradigm. Specifically, FactCheXcker employs specialized modules and the cod…

    Submitted 2 June, 2025; v1 submitted 27 November, 2024; originally announced November 2024.

    Comments: Accepted to CVPR 2025

  12. arXiv:2411.15122  [pdf, other]

    cs.CV cs.AI cs.CL

    ReXrank: A Public Leaderboard for AI-Powered Radiology Report Generation

    Authors: Xiaoman Zhang, Hong-Yu Zhou, Xiaoli Yang, Oishi Banerjee, Julián N. Acosta, Josh Miller, Ouwen Huang, Pranav Rajpurkar

    Abstract: AI-driven models have demonstrated significant potential in automating radiology report generation for chest X-rays. However, there is no standardized benchmark for objectively evaluating their performance. To address this, we present ReXrank, https://rexrank.ai, a public leaderboard and challenge for assessing AI-powered radiology report generation. Our framework incorporates ReXGradient, the lar…

    Submitted 22 November, 2024; originally announced November 2024.

  13. arXiv:2411.00299  [pdf, other]

    cs.CV cs.LG

    RadFlag: A Black-Box Hallucination Detection Method for Medical Vision Language Models

    Authors: Serena Zhang, Sraavya Sambara, Oishi Banerjee, Julian Acosta, L. John Fahrner, Pranav Rajpurkar

    Abstract: Generating accurate radiology reports from medical images is a clinically important but challenging task. While current Vision Language Models (VLMs) show promise, they are prone to generating hallucinations, potentially compromising patient care. We introduce RadFlag, a black-box method to enhance the accuracy of radiology report generation. Our method uses a sampling-based flagging technique to…

    Submitted 15 November, 2024; v1 submitted 31 October, 2024; originally announced November 2024.

    Comments: 17 pages, 6 figures

  14. arXiv:2411.00024  [pdf, other]

    cs.CL cs.AI

    A Perspective for Adapting Generalist AI to Specialized Medical AI Applications and Their Challenges

    Authors: Zifeng Wang, Hanyin Wang, Benjamin Danek, Ying Li, Christina Mack, Hoifung Poon, Yajuan Wang, Pranav Rajpurkar, Jimeng Sun

    Abstract: The integration of Large Language Models (LLMs) into medical applications has sparked widespread interest across the healthcare industry, from drug discovery and development to clinical decision support, assisting telemedicine, medical devices, and healthcare insurance applications. This perspective paper aims to discuss the inner workings of building LLM-powered medical AI applications and introd…

    Submitted 29 November, 2024; v1 submitted 28 October, 2024; originally announced November 2024.

  15. arXiv:2410.00441  [pdf, other]

    cs.AI eess.IV

    ReXplain: Translating Radiology into Patient-Friendly Video Reports

    Authors: Luyang Luo, Jenanan Vairavamurthy, Xiaoman Zhang, Abhinav Kumar, Ramon R. Ter-Oganesyan, Stuart T. Schroff, Dan Shilo, Rydhwana Hossain, Mike Moritz, Pranav Rajpurkar

    Abstract: Radiology reports, designed for efficient communication between medical experts, often remain incomprehensible to patients. This inaccessibility could potentially lead to anxiety, decreased engagement in treatment decisions, and poorer health outcomes, undermining patient-centered care. We present ReXplain (Radiology eXplanation), an innovative AI-driven system that translates radiology findings i…

    Submitted 17 December, 2024; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: 12 pages. The project page is https://www.rajpurkarlab.hms.harvard.edu/rexplain

  16. arXiv:2409.13038  [pdf, other]

    cs.AI

    HeadCT-ONE: Enabling Granular and Controllable Automated Evaluation of Head CT Radiology Report Generation

    Authors: Julián N. Acosta, Xiaoman Zhang, Siddhant Dogra, Hong-Yu Zhou, Seyedmehdi Payabvash, Guido J. Falcone, Eric K. Oermann, Pranav Rajpurkar

    Abstract: We present Head CT Ontology Normalized Evaluation (HeadCT-ONE), a metric for evaluating head CT report generation through ontology-normalized entity and relation extraction. HeadCT-ONE enhances current information extraction derived metrics (such as RadGraph F1) by implementing entity normalization through domain-specific ontologies, addressing radiological language variability. HeadCT-ONE compare…

    Submitted 19 September, 2024; originally announced September 2024.

  17. arXiv:2409.10829  [pdf, other]

    cs.CL

    ReXErr: Synthesizing Clinically Meaningful Errors in Diagnostic Radiology Reports

    Authors: Vishwanatha M. Rao, Serena Zhang, Julian N. Acosta, Subathra Adithan, Pranav Rajpurkar

    Abstract: Accurately interpreting medical images and writing radiology reports is a critical but challenging task in healthcare. Both human-written and AI-generated reports can contain errors, ranging from clinical inaccuracies to linguistic mistakes. To address this, we introduce ReXErr, a methodology that leverages Large Language Models to generate representative errors within chest X-ray reports. Working…

    Submitted 16 September, 2024; originally announced September 2024.

  18. arXiv:2408.16208  [pdf, other]

    cs.LG cs.CL

    ReXamine-Global: A Framework for Uncovering Inconsistencies in Radiology Report Generation Metrics

    Authors: Oishi Banerjee, Agustina Saenz, Kay Wu, Warren Clements, Adil Zia, Dominic Buensalido, Helen Kavnoudias, Alain S. Abi-Ghanem, Nour El Ghawi, Cibele Luna, Patricia Castillo, Khaled Al-Surimi, Rayyan A. Daghistani, Yuh-Min Chen, Heng-sheng Chao, Lars Heiliger, Moon Kim, Johannes Haubold, Frederic Jonske, Pranav Rajpurkar

    Abstract: Given the rapidly expanding capabilities of generative AI models for radiology, there is a need for robust metrics that can accurately measure the quality of AI-generated radiology reports across diverse hospitals. We develop ReXamine-Global, an LLM-powered, multi-site framework that tests metrics across different writing styles and patient populations, exposing gaps in their generalization. First,…

    Submitted 28 August, 2024; originally announced August 2024.

  19. arXiv:2408.14397  [pdf, other]

    cs.AI cs.CL cs.CV

    Uncovering Knowledge Gaps in Radiology Report Generation Models through Knowledge Graphs

    Authors: Xiaoman Zhang, Julián N. Acosta, Hong-Yu Zhou, Pranav Rajpurkar

    Abstract: Recent advancements in artificial intelligence have significantly improved the automatic generation of radiology reports. However, existing evaluation methods fail to reveal the models' understanding of radiological images and their capacity to achieve human-level granularity in descriptions. To bridge this gap, we introduce a system, named ReXKG, which extracts structured information from process…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: Code is available at: https://github.com/rajpurkarlab/ReXKG

  20. arXiv:2408.12606  [pdf, other]

    cs.CV cs.AI

    A Large Model for Non-invasive and Personalized Management of Breast Cancer from Multiparametric MRI

    Authors: Luyang Luo, Mingxiang Wu, Mei Li, Yi Xin, Qiong Wang, Varut Vardhanabhuti, Winnie CW Chu, Zhenhui Li, Juan Zhou, Pranav Rajpurkar, Hao Chen

    Abstract: Breast Magnetic Resonance Imaging (MRI) demonstrates the highest sensitivity for breast cancer detection among imaging modalities and is standard practice for high-risk women. Interpreting the multi-sequence MRI is time-consuming and prone to subjective variation. We develop a large mixture-of-modality-experts model (MOME) that integrates multiparametric MRI information within a unified structure,…

    Submitted 4 April, 2025; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: Nature Communications 2025

  21. arXiv:2406.06496  [pdf, other]

    cs.LG cs.CL cs.CV

    Direct Preference Optimization for Suppressing Hallucinated Prior Exams in Radiology Report Generation

    Authors: Oishi Banerjee, Hong-Yu Zhou, Subathra Adithan, Stephen Kwak, Kay Wu, Pranav Rajpurkar

    Abstract: Recent advances in generative vision-language models (VLMs) have exciting potential implications for AI in radiology, yet VLMs are also known to produce hallucinations, nonsensical text, and other unwanted behaviors that can waste clinicians' time and cause patient harm. Drawing on recent work on direct preference optimization (DPO), we propose a simple method for modifying the behavior of pretrai…

    Submitted 14 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Added acknowledgements

  22. arXiv:2405.20613  [pdf, other]

    cs.CL

    FineRadScore: A Radiology Report Line-by-Line Evaluation Technique Generating Corrections with Severity Scores

    Authors: Alyssa Huang, Oishi Banerjee, Kay Wu, Eduardo Pontes Reis, Pranav Rajpurkar

    Abstract: The current gold standard for evaluating generated chest x-ray (CXR) reports is through radiologist annotations. However, this process can be extremely time-consuming and costly, especially when evaluating large numbers of reports. In this work, we present FineRadScore, a Large Language Model (LLM)-based automated evaluation metric for generated CXR reports. Given a candidate report and a ground-t…

    Submitted 12 August, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  23. arXiv:2405.09594  [pdf, other]

    eess.IV cs.CV cs.LG

    Learning Generalized Medical Image Representations through Image-Graph Contrastive Pretraining

    Authors: Sameer Khanna, Daniel Michael, Marinka Zitnik, Pranav Rajpurkar

    Abstract: Medical image interpretation using deep learning has shown promise but often requires extensive expert-annotated datasets. To reduce this annotation burden, we develop an Image-Graph Contrastive Learning framework that pairs chest X-rays with structured report knowledge graphs automatically extracted from radiology notes. Our approach uniquely encodes the disconnected graph components via a relati…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted into Machine Learning for Health (ML4H) 2023

  24. arXiv:2405.07988  [pdf, ps, other]

    cs.CV

    MedVersa: A Generalist Foundation Model for Medical Image Interpretation

    Authors: Hong-Yu Zhou, Julián Nicolás Acosta, Subathra Adithan, Suvrankar Datta, Eric J. Topol, Pranav Rajpurkar

    Abstract: Current medical AI systems are often limited to narrow applications, hindering widespread adoption. We present MedVersa, a generalist foundation model trained on tens of millions of compiled medical instances. MedVersa unlocks generalist learning from multimodal inputs and outputs, representing the first example of a generalist model reaching competitive performance with leading specialized soluti…

    Submitted 9 June, 2025; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: Technical study

  25. arXiv:2404.15127  [pdf, other]

    cs.CV cs.CL

    GSCo: Towards Generalizable AI in Medicine via Generalist-Specialist Collaboration

    Authors: Sunan He, Yuxiang Nie, Hongmei Wang, Shu Yang, Yihui Wang, Zhiyuan Cai, Zhixuan Chen, Yingxue Xu, Luyang Luo, Huiling Xiang, Xi Lin, Mingxiang Wu, Yifan Peng, George Shih, Ziyang Xu, Xian Wu, Qiong Wang, Ronald Cheong Kin Chan, Varut Vardhanabhuti, Winnie Chiu Wing Chu, Yefeng Zheng, Pranav Rajpurkar, Kang Zhang, Hao Chen

    Abstract: Generalist foundation models (GFMs) are renowned for their exceptional capability and flexibility in effectively generalizing across diverse tasks and modalities. In the field of medicine, while GFMs exhibit superior generalizability based on their extensive intrinsic knowledge as well as proficiency in instruction following and in-context learning, specialist models excel in precision due to thei…

    Submitted 4 November, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  26. arXiv:2311.09574  [pdf, other]

    cs.LG cs.AI cs.CV

    LymphoML: An interpretable artificial intelligence-based method identifies morphologic features that correlate with lymphoma subtype

    Authors: Vivek Shankar, Xiaoli Yang, Vrishab Krishna, Brent Tan, Oscar Silva, Rebecca Rojansky, Andrew Ng, Fabiola Valvert, Edward Briercheck, David Weinstock, Yasodha Natkunam, Sebastian Fernandez-Pol, Pranav Rajpurkar

    Abstract: The accurate classification of lymphoma subtypes using hematoxylin and eosin (H&E)-stained tissue is complicated by the wide range of morphological features these cancers can exhibit. We present LymphoML - an interpretable machine learning method that identifies morphologic features that correlate with lymphoma subtypes. Our method applies steps to process H&E-stained tissue microarray cores, segm…

    Submitted 19 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: To be published in Proceedings of the 3rd Machine Learning for Health symposium, Proceedings of Machine Learning Research (PMLR)

    ACM Class: I.5.1; I.5.2; I.5.4; J.3

  27. arXiv:2311.05591  [pdf]

    cs.CV cs.AI cs.CL

    Multimodal Foundation Models Exploit Text to Make Medical Image Predictions

    Authors: Thomas Buckley, James A. Diao, Pranav Rajpurkar, Adam Rodman, Arjun K. Manrai

    Abstract: Multimodal foundation models have shown compelling but conflicting performance in medical image interpretation. However, the mechanisms by which these models integrate and prioritize different data modalities, including images and text, remain poorly understood. Here, using a diverse collection of 1014 multimodal medical cases, we evaluate the unimodal and multimodal image interpretation abilities…

    Submitted 25 November, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

  28. arXiv:2311.04937  [pdf, other]

    cs.LG cs.AI

    Multimodal Clinical Benchmark for Emergency Care (MC-BEC): A Comprehensive Benchmark for Evaluating Foundation Models in Emergency Medicine

    Authors: Emma Chen, Aman Kansal, Julie Chen, Boyang Tom Jin, Julia Rachel Reisler, David A Kim, Pranav Rajpurkar

    Abstract: We propose the Multimodal Clinical Benchmark for Emergency Care (MC-BEC), a comprehensive benchmark for evaluating foundation models in Emergency Medicine using a dataset of 100K+ continuously monitored Emergency Department visits from 2020-2022. MC-BEC focuses on clinically relevant prediction tasks at timescales from minutes to days, including predicting patient decompensation, disposition, and…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track

  29. arXiv:2310.17811  [pdf, other]

    cs.AI cs.CL

    Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting

    Authors: Benjamin Yan, Ruochen Liu, David E. Kuo, Subathra Adithan, Eduardo Pontes Reis, Stephen Kwak, Vasantha Kumar Venugopal, Chloe P. O'Connell, Agustina Saenz, Pranav Rajpurkar, Michael Moor

    Abstract: Automatically generated reports from medical images promise to improve the workflow of radiologists. Existing methods consider an image-to-report modeling task by directly generating a fully-fledged report from an image. However, this conflates the content of the report (e.g., findings and their attributes) with its style (e.g., format and choice of words), which can lead to clinically inaccurate…

    Submitted 31 October, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  30. arXiv:2310.14573  [pdf, other]

    cs.CL

    Exploring the Boundaries of GPT-4 in Radiology

    Authors: Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Maria Teodora Wetscherek, Robert Tinn, Harshita Sharma, Fernando Pérez-García, Anton Schwaighofer, Pranav Rajpurkar, Sameer Tajdin Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme, Aditya V. Nori, Matthew P. Lungren, Ozan Oktay, Javier Alvarez-Valle

    Abstract: The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-s…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 main

  31. arXiv:2308.12453  [pdf, other]

    cs.CV cs.AI cs.LG

    Augmenting medical image classifiers with synthetic data from latent diffusion models

    Authors: Luke W. Sagers, James A. Diao, Luke Melas-Kyriazi, Matthew Groh, Pranav Rajpurkar, Adewole S. Adamson, Veronica Rotemberg, Roxana Daneshjou, Arjun K. Manrai

    Abstract: While hundreds of artificial intelligence (AI) algorithms are now approved or cleared by the US Food and Drug Administration (FDA), many studies have shown inconsistent generalization or latent bias, particularly for underrepresented populations. Some have proposed that generative AI could reduce the need for real data, but its utility in model development remains unclear. Skin disease serves as…

    Submitted 23 August, 2023; originally announced August 2023.

  32. arXiv:2308.05046  [pdf, other]

    cs.CL cs.LG

    RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction

    Authors: Sameer Khanna, Adam Dejl, Kibo Yoon, Quoc Hung Truong, Hanh Duong, Agustina Saenz, Pranav Rajpurkar

    Abstract: We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a mo…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Accepted at Machine Learning for Healthcare 2023

  33. arXiv:2307.15189  [pdf, other]

    cs.CV cs.AI

    Med-Flamingo: a Multimodal Medical Few-shot Learner

    Authors: Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec

    Abstract: Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across various modalities. Medical generative vision-language models (VLMs) make a first step in this direction and promise many exciting clinical applications. However, existing models typically have to be fine-tuned on sizeable down-stream datasets, which poses a significant limitation as in many medical…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Preprint

  34. arXiv:2306.08000  [pdf, ps, other]

    physics.med-ph cs.CL cs.CV cs.LG eess.IV

    Improving Zero-Shot Detection of Low Prevalence Chest Pathologies using Domain Pre-trained Language Models

    Authors: Aakash Mishra, Rajat Mittal, Christy Jestin, Kostas Tingos, Pranav Rajpurkar

    Abstract: Recent advances in zero-shot learning have enabled the use of paired image-text data in place of structured labels, removing the need for expert-annotated datasets. Models such as CLIP-based CheXzero utilize these advancements in the domain of chest X-ray interpretation. We hypothesize that domain pre-trained models such as CXR-BERT, BlueBERT, and ClinicalBERT offer the potential to improve the pe…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 3 pages, 1 table, Medical Imaging with Deep Learning, Short Paper

    Report number: Short-Paper-120

  35. arXiv:2304.08486  [pdf, other]

    cs.CV

    BenchMD: A Benchmark for Unified Learning on Medical Images and Sensors

    Authors: Kathryn Wantlin, Chenwei Wu, Shih-Cheng Huang, Oishi Banerjee, Farah Dadabhoy, Veeral Vipin Mehta, Ryan Wonhee Han, Fang Cao, Raja R. Narayan, Errol Colak, Adewole Adamson, Laura Heacock, Geoffrey H. Tison, Alex Tamkin, Pranav Rajpurkar

    Abstract: Medical data poses a daunting challenge for AI algorithms: it exists in many different modalities, experiences frequent distribution shifts, and suffers from a scarcity of examples and labels. Recent advances, including transformers and self-supervised learning, promise a more universal approach that can be applied flexibly across these diverse conditions. To measure and drive progress in this dir…

    Submitted 26 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

  36. arXiv:2304.00546  [pdf, other]

    eess.IV cs.CV cs.LG

    Video Pretraining Advances 3D Deep Learning on Chest CT Tasks

    Authors: Alexander Ke, Shih-Cheng Huang, Chloe P O'Connell, Michal Klimont, Serena Yeung, Pranav Rajpurkar

    Abstract: Pretraining on large natural image classification datasets such as ImageNet has aided model development on data-scarce 2D medical tasks. 3D medical tasks often have much less data than 2D medical tasks, prompting practitioners to rely on pretrained 2D models to featurize slices. However, these 2D models have been surpassed by 3D models on 3D computer vision benchmarks since they do not natively le…

    Submitted 2 April, 2023; originally announced April 2023.

    Comments: Accepted at MIDL 2023

  37. arXiv:2303.17579  [pdf, other]

    cs.CL cs.AI cs.CV

    Multimodal Image-Text Matching Improves Retrieval-based Chest X-Ray Report Generation

    Authors: Jaehwan Jeong, Katherine Tian, Andrew Li, Sina Hartung, Fardad Behzadi, Juan Calle, David Osayande, Michael Pohlen, Subathra Adithan, Pranav Rajpurkar

    Abstract: Automated generation of clinically accurate radiology reports can improve patient care. Previous report generation methods that rely on image captioning models often generate incoherent and incorrect text due to their lack of relevant domain knowledge, while retrieval-based attempts frequently retrieve reports that are irrelevant to the input image. In this work, we propose Contrastive X-Ray REpor…

    Submitted 2 May, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Journal ref: Medical Imaging with Deep Learning 2023

  38. arXiv:2211.13352  [pdf, other]

    eess.IV cs.CV cs.LG

    Improving dermatology classifiers across populations using images generated by large diffusion models

    Authors: Luke W. Sagers, James A. Diao, Matthew Groh, Pranav Rajpurkar, Adewole S. Adamson, Arjun K. Manrai

    Abstract: Dermatological classification algorithms developed without sufficiently diverse training data may generalize poorly across populations. While intentional data collection and annotation offer the best means for improving representation, new computational approaches for generating training data may also aid in mitigating the effects of sampling bias. In this paper, we show that DALL·E 2, a lar…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022 Workshop on Synthetic Data for Empowering ML Research

  39. arXiv:2210.06340  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving Radiology Report Generation Systems by Removing Hallucinated References to Non-existent Priors

    Authors: Vignav Ramesh, Nathan Andrew Chi, Pranav Rajpurkar

    Abstract: Current deep learning models trained to generate radiology reports from chest radiographs are capable of producing clinically accurate, clear, and actionable text that can advance patient care. However, such systems all succumb to the same problem: making hallucinated references to non-existent prior reports. Such hallucinations occur because these models are trained on datasets of real-world pati…

    Submitted 13 October, 2022; v1 submitted 26 September, 2022; originally announced October 2022.

    Comments: 13 pages, 1 figure, 11 tables

  40. arXiv:2201.01449  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep Learning-Based Sparse Whole-Slide Image Analysis for the Diagnosis of Gastric Intestinal Metaplasia

    Authors: Jon Braatz, Pranav Rajpurkar, Stephanie Zhang, Andrew Y. Ng, Jeanne Shen

    Abstract: In recent years, deep learning has successfully been applied to automate a wide variety of tasks in diagnostic histopathology. However, fast and reliable localization of small-scale regions-of-interest (ROI) has remained a key challenge, as discriminative morphologic features often occupy only a small fraction of a gigapixel-scale whole-slide image (WSI). In this paper, we propose a sparse WSI ana…

    Submitted 4 January, 2022; originally announced January 2022.

  41. arXiv:2108.01764  [pdf, other]

    cs.CL cs.AI

    Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management

    Authors: Cécile Logé, Emily Ross, David Yaw Amoah Dadey, Saahil Jain, Adriel Saporta, Andrew Y. Ng, Pranav Rajpurkar

    Abstract: Recent advances in Natural Language Processing (NLP), and specifically automated Question Answering (QA) systems, have demonstrated both impressive linguistic fluency and a pernicious tendency to reflect social biases. In this study, we introduce Q-Pain, a dataset for assessing bias in medical QA in the context of pain management, one of the most challenging forms of clinical decision-making. Alon…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks

  42. arXiv:2106.14463  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

    Authors: Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Yuhao Zhang, Matthew P. Lungren, Andrew Y. Ng, Curtis P. Langlotz, Pranav Rajpurkar

    Abstract: Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a devel…

    Submitted 29 August, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: Accepted to the 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks

  43. arXiv:2106.04452  [pdf, other]

    physics.med-ph cs.LG eess.SP

    3KG: Contrastive Learning of 12-Lead Electrocardiograms using Physiologically-Inspired Augmentations

    Authors: Bryan Gopal, Ryan W. Han, Gautham Raghupathi, Andrew Y. Ng, Geoffrey H. Tison, Pranav Rajpurkar

    Abstract: We propose 3KG, a physiologically-inspired contrastive learning approach that generates views using 3D augmentations of the 12-lead electrocardiogram. We evaluate representation quality by fine-tuning a linear layer for the downstream task of 23-class diagnosis on the PhysioNet 2020 challenge training data and find that 3KG achieves a $9.1\%$ increase in mean AUC over the best self-supervised base…

    Submitted 20 September, 2021; v1 submitted 21 April, 2021; originally announced June 2021.

    Comments: 11 pages, 3 figures, paper revision with new set of experiments and comparison to previous methods

  44. arXiv:2105.03020  [pdf, other]

    eess.IV cs.CV cs.LG

    Structured dataset documentation: a datasheet for CheXpert

    Authors: Christian Garbin, Pranav Rajpurkar, Jeremy Irvin, Matthew P. Lungren, Oge Marques

    Abstract: Billions of X-ray images are taken worldwide each year. Machine learning, and deep learning in particular, has shown potential to help radiologists triage and diagnose images. However, deep learning requires large datasets with reliable labels. The CheXpert dataset was created with the participation of board-certified radiologists, resulting in the strong ground truth needed to train deep learning…

    Submitted 6 May, 2021; originally announced May 2021.

  45. arXiv:2104.00793  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    Effect of Radiology Report Labeler Quality on Deep Learning Models for Chest X-Ray Interpretation

    Authors: Saahil Jain, Akshay Smit, Andrew Y. Ng, Pranav Rajpurkar

    Abstract: Although deep learning models for chest X-ray interpretation are commonly trained on labels generated by automatic radiology report labelers, the impact of improvements in report labeling on the performance of chest X-ray classification models has not been systematically investigated. We first compare the CheXpert, CheXbert, and VisualCheXbert labelers on the task of extracting accurate chest X-ra…

    Submitted 27 November, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: In Neural Information Processing Systems (NeurIPS) Workshop on Data-Centric AI (DCAI)

  46. arXiv:2103.14339  [pdf, other]

    cs.CV cs.AI cs.LG

    MedSelect: Selective Labeling for Medical Image Classification Combining Meta-Learning with Deep Reinforcement Learning

    Authors: Akshay Smit, Damir Vrabac, Yujie He, Andrew Y. Ng, Andrew L. Beam, Pranav Rajpurkar

    Abstract: We propose a selective learning method using meta-learning and deep reinforcement learning for medical image interpretation in the setting of limited labeling resources. Our method, MedSelect, consists of a trainable deep learning selector that uses image embeddings obtained from contrastive pretraining for determining which images to label, and a non-parametric selector that uses cosine similarit…

    Submitted 26 March, 2021; originally announced March 2021.

  47. arXiv:2103.09957  [pdf, other]

    cs.CV cs.AI cs.LG

    CheXbreak: Misclassification Identification for Deep Learning Models Interpreting Chest X-rays

    Authors: Emma Chen, Andy Kim, Rayan Krishnan, Jin Long, Andrew Y. Ng, Pranav Rajpurkar

    Abstract: A major obstacle to the integration of deep learning models for chest x-ray interpretation into clinical settings is the lack of understanding of their failure modes. In this work, we first investigate whether there are patient subgroups that chest x-ray models are likely to misclassify. We find that patient age and the radiographic finding of lung lesion, pneumothorax or support devices are stati…

    Submitted 20 July, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

    Comments: In Proceedings of the 2021 Conference on Machine Learning for Health Care, 2021. In ACM Conference on Health, Inference, and Learning (ACM-CHIL) Workshop 2021

  48. arXiv:2103.04590  [pdf, other]

    cs.CV cs.AI cs.LG

    CheXseen: Unseen Disease Detection for Deep Learning Interpretation of Chest X-rays

    Authors: Siyu Shi, Ishaan Malhi, Kevin Tran, Andrew Y. Ng, Pranav Rajpurkar

    Abstract: We systematically evaluate the performance of deep learning models in the presence of diseases not labeled for or present during training. First, we evaluate whether deep learning models trained on a subset of diseases (seen diseases) can detect the presence of any one of a larger set of diseases. We find that models tend to falsely classify diseases outside of the subset (unseen diseases) as "no…

    Submitted 17 May, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

    Comments: Accepted at MIDL Conference 2021. Previous version accepted at ACM Conference on Health, Inference, and Learning (ACM-CHIL) Workshop 2021

  49. arXiv:2102.11467  [pdf, other]

    eess.IV cs.CV cs.LG

    VisualCheXbert: Addressing the Discrepancy Between Radiology Report Labels and Image Labels

    Authors: Saahil Jain, Akshay Smit, Steven QH Truong, Chanh DT Nguyen, Minh-Thanh Huynh, Mudit Jain, Victoria A. Young, Andrew Y. Ng, Matthew P. Lungren, Pranav Rajpurkar

    Abstract: Automatic extraction of medical conditions from free-text radiology reports is critical for supervising computer vision models to interpret medical images. In this work, we show that radiologists labeling reports significantly disagree with radiologists labeling corresponding chest X-ray images, which reduces the quality of report labels as proxies for image labels. We develop and evaluate methods…

    Submitted 15 March, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: Accepted to ACM Conference on Health, Inference, and Learning (ACM-CHIL) 2021

  50. arXiv:2102.10663  [pdf, other]

    eess.IV cs.CV cs.LG

    MedAug: Contrastive learning leveraging patient metadata improves representations for chest X-ray interpretation

    Authors: Yen Nhi Truong Vu, Richard Wang, Niranjan Balachandar, Can Liu, Andrew Y. Ng, Pranav Rajpurkar

    Abstract: Self-supervised contrastive learning between pairs of multiple views of the same image has been shown to successfully leverage unlabeled data to produce meaningful visual representations for both natural and medical images. However, there has been limited work on determining how to select pairs for medical images, where availability of patient metadata can be leveraged to improve representations.…

    Submitted 17 October, 2021; v1 submitted 21 February, 2021; originally announced February 2021.