-
Health AI Developer Foundations
Authors:
Atilla P. Kiraly,
Sebastien Baur,
Kenneth Philbrick,
Fereshteh Mahvar,
Liron Yatziv,
Tiffany Chen,
Bram Sterling,
Nick George,
Fayaz Jamil,
Jing Tang,
Kai Bailey,
Faruk Ahmed,
Akshay Goel,
Abbi Ward,
Lin Yang,
Andrew Sellergren,
Yossi Matias,
Avinatan Hassidim,
Shravya Shetty,
Daniel Golden,
Shekoofeh Azizi,
David F. Steiner,
Yun Liu,
Tim Thelin,
Rory Pilgrim
, et al. (1 additional authors not shown)
Abstract:
Robust medical Machine Learning (ML) models have the potential to revolutionize healthcare by accelerating clinical research, improving workflows and outcomes, and producing novel insights or capabilities. Developing such ML models from scratch is cost prohibitive and requires substantial compute, data, and time (e.g., expert labeling). To address these challenges, we introduce Health AI Developer…
▽ More
Robust medical Machine Learning (ML) models have the potential to revolutionize healthcare by accelerating clinical research, improving workflows and outcomes, and producing novel insights or capabilities. Developing such ML models from scratch is cost prohibitive and requires substantial compute, data, and time (e.g., expert labeling). To address these challenges, we introduce Health AI Developer Foundations (HAI-DEF), a suite of pre-trained, domain-specific foundation models, tools, and recipes to accelerate building ML for health applications. The models cover various modalities and domains, including radiology (X-rays and computed tomography), histopathology, dermatological imaging, and audio. These models provide domain specific embeddings that facilitate AI development with less labeled data, shorter training times, and reduced computational costs compared to traditional approaches. In addition, we utilize a common interface and style across these models, and prioritize usability to enable developers to integrate HAI-DEF efficiently. We present model evaluations across various tasks and conclude with a discussion of their application and evaluation, covering the importance of ensuring efficacy, fairness, and equity. Finally, while HAI-DEF and specifically the foundation models lower the barrier to entry for ML in healthcare, we emphasize the importance of validation with problem- and population-specific data for each desired usage setting. This technical report will be updated over time as more modalities and features are added.
△ Less
Submitted 26 November, 2024; v1 submitted 22 November, 2024;
originally announced November 2024.
-
Non-Contact Breathing Rate Detection Using Optical Flow
Authors:
Robyn Maxwell,
Timothy Hanley,
Dara Golden,
Adara Andonie,
Joseph Lemley,
Ashkan Parsi
Abstract:
Breathing rate is a vital health metric that is an invaluable indicator of the overall health of a person. In recent years, the non-contact measurement of health signals such as breathing rate has been a huge area of development, with a wide range of applications from telemedicine to driver monitoring systems. This paper presents an investigation into a method of non-contact breathing rate detecti…
▽ More
Breathing rate is a vital health metric that is an invaluable indicator of the overall health of a person. In recent years, the non-contact measurement of health signals such as breathing rate has been a huge area of development, with a wide range of applications from telemedicine to driver monitoring systems. This paper presents an investigation into a method of non-contact breathing rate detection using a motion detection algorithm, optical flow. Optical flow is used to successfully measure breathing rate by tracking the motion of specific points on the body. In this study, the success of optical flow when using different sets of points is evaluated. Testing shows that both chest and facial movement can be used to determine breathing rate but to different degrees of success. The chest generates very accurate signals, with an RMSE of 0.63 on the tested videos. Facial points can also generate reliable signals when there is minimal head movement but are much more vulnerable to noise caused by head/body movements. These findings highlight the potential of optical flow as a non-invasive method for breathing rate detection and emphasize the importance of selecting appropriate points to optimize accuracy.
△ Less
Submitted 13 November, 2023;
originally announced November 2023.
-
ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders
Authors:
Shawn Xu,
Lin Yang,
Christopher Kelly,
Marcin Sieniek,
Timo Kohlberger,
Martin Ma,
Wei-Hung Weng,
Atilla Kiraly,
Sahar Kazemzadeh,
Zakkai Melamed,
Jungyeon Park,
Patricia Strachan,
Yun Liu,
Chuck Lau,
Preeti Singh,
Christina Chen,
Mozziyar Etemadi,
Sreenivasa Raju Kalidindi,
Yossi Matias,
Katherine Chou,
Greg S. Corrado,
Shravya Shetty,
Daniel Tse,
Shruthi Prabhakara,
Daniel Golden
, et al. (3 additional authors not shown)
Abstract:
In this work, we present an approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, that leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of chest X-ray tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR ach…
▽ More
In this work, we present an approach, which we call Embeddings for Language/Image-aligned X-Rays, or ELIXR, that leverages a language-aligned image encoder combined or grafted onto a fixed LLM, PaLM 2, to perform a broad range of chest X-ray tasks. We train this lightweight adapter architecture using images paired with corresponding free-text radiology reports from the MIMIC-CXR dataset. ELIXR achieved state-of-the-art performance on zero-shot chest X-ray (CXR) classification (mean AUC of 0.850 across 13 findings), data-efficient CXR classification (mean AUCs of 0.893 and 0.898 across five findings (atelectasis, cardiomegaly, consolidation, pleural effusion, and pulmonary edema) for 1% (~2,200 images) and 10% (~22,000 images) training data), and semantic search (0.76 normalized discounted cumulative gain (NDCG) across nineteen queries, including perfect retrieval on twelve of them). Compared to existing data-efficient methods including supervised contrastive learning (SupCon), ELIXR required two orders of magnitude less data to reach similar performance. ELIXR also showed promise on CXR vision-language tasks, demonstrating overall accuracies of 58.7% and 62.5% on visual question answering and report quality assurance tasks, respectively. These results suggest that ELIXR is a robust and versatile approach to CXR AI.
△ Less
Submitted 7 September, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.