-
Tx-LLM: A Large Language Model for Therapeutics
Authors:
Juan Manuel Zambrano Chaves,
Eric Wang,
Tao Tu,
Eeshit Dhaval Vaishnav,
Byron Lee,
S. Sara Mahdavi,
Christopher Semturs,
David Fleet,
Vivek Natarajan,
Shekoofeh Azizi
Abstract:
Developing therapeutics is a lengthy and expensive process that requires the satisfaction of many different criteria, and AI models capable of expediting the process would be invaluable. However, the majority of current AI approaches address only a narrowly defined set of tasks, often circumscribed within a particular domain. To bridge this gap, we introduce Tx-LLM, a generalist large language mod…
▽ More
Developing therapeutics is a lengthy and expensive process that requires the satisfaction of many different criteria, and AI models capable of expediting the process would be invaluable. However, the majority of current AI approaches address only a narrowly defined set of tasks, often circumscribed within a particular domain. To bridge this gap, we introduce Tx-LLM, a generalist large language model (LLM) fine-tuned from PaLM-2 which encodes knowledge about diverse therapeutic modalities. Tx-LLM is trained using a collection of 709 datasets that target 66 tasks spanning various stages of the drug discovery pipeline. Using a single set of weights, Tx-LLM simultaneously processes a wide variety of chemical or biological entities(small molecules, proteins, nucleic acids, cell lines, diseases) interleaved with free-text, allowing it to predict a broad range of associated properties, achieving competitive with state-of-the-art (SOTA) performance on 43 out of 66 tasks and exceeding SOTA on 22. Among these, Tx-LLM is particularly powerful and exceeds best-in-class performance on average for tasks combining molecular SMILES representations with text such as cell line names or disease names, likely due to context learned during pretraining. We observe evidence of positive transfer between tasks with diverse drug types (e.g.,tasks involving small molecules and tasks involving proteins), and we study the impact of model size, domain finetuning, and prompting strategies on performance. We believe Tx-LLM represents an important step towards LLMs encoding biochemical knowledge and could have a future role as an end-to-end tool across the drug discovery development pipeline.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Capabilities of Gemini Models in Medicine
Authors:
Khaled Saab,
Tao Tu,
Wei-Hung Weng,
Ryutaro Tanno,
David Stutz,
Ellery Wulczyn,
Fan Zhang,
Tim Strother,
Chunjong Park,
Elahe Vedadi,
Juanma Zambrano Chaves,
Szu-Yeu Hu,
Mike Schaekermann,
Aishwarya Kamath,
Yong Cheng,
David G. T. Barrett,
Cathy Cheung,
Basil Mustafa,
Anil Palepu,
Daniel McDuff,
Le Hou,
Tomer Golany,
Luyang Liu,
Jean-baptiste Alayrac,
Neil Houlsby
, et al. (42 additional authors not shown)
Abstract:
Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-G…
▽ More
Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-Gemini, a family of highly capable multimodal models that are specialized in medicine with the ability to seamlessly use web search, and that can be efficiently tailored to novel modalities using custom encoders. We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin. On the popular MedQA (USMLE) benchmark, our best-performing Med-Gemini model achieves SoTA performance of 91.1% accuracy, using a novel uncertainty-guided search strategy. On 7 multimodal benchmarks including NEJM Image Challenges and MMMU (health & medicine), Med-Gemini improves over GPT-4V by an average relative margin of 44.5%. We demonstrate the effectiveness of Med-Gemini's long-context capabilities through SoTA performance on a needle-in-a-haystack retrieval task from long de-identified health records and medical video question answering, surpassing prior bespoke methods using only in-context learning. Finally, Med-Gemini's performance suggests real-world utility by surpassing human experts on tasks such as medical text summarization, alongside demonstrations of promising potential for multimodal medical dialogue, medical research and education. Taken together, our results offer compelling evidence for Med-Gemini's potential, although further rigorous evaluation will be crucial before real-world deployment in this safety-critical domain.
△ Less
Submitted 1 May, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
VideoSAGE: Video Summarization with Graph Representation Learning
Authors:
Jose M. Rojas Chaves,
Subarna Tripathi
Abstract:
We propose a graph-based representation learning framework for video summarization. First, we convert an input video to a graph where nodes correspond to each of the video frames. Then, we impose sparsity on the graph by connecting only those pairs of nodes that are within a specified temporal distance. We then formulate the video summarization task as a binary node classification problem, precise…
▽ More
We propose a graph-based representation learning framework for video summarization. First, we convert an input video to a graph where nodes correspond to each of the video frames. Then, we impose sparsity on the graph by connecting only those pairs of nodes that are within a specified temporal distance. We then formulate the video summarization task as a binary node classification problem, precisely classifying video frames whether they should belong to the output summary video. A graph constructed this way aims to capture long-range interactions among video frames, and the sparsity ensures the model trains without hitting the memory and compute bottleneck. Experiments on two datasets(SumMe and TVSum) demonstrate the effectiveness of the proposed nimble model compared to existing state-of-the-art summarization approaches while being one order of magnitude more efficient in compute time and memory
△ Less
Submitted 14 April, 2024;
originally announced April 2024.
-
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation
Authors:
Juan Manuel Zambrano Chaves,
Shih-Cheng Huang,
Yanbo Xu,
Hanwen Xu,
Naoto Usuyama,
Sheng Zhang,
Fei Wang,
Yujia Xie,
Mahmoud Khademi,
Ziyi Yang,
Hany Awadalla,
Julia Gong,
Houdong Hu,
Jianwei Yang,
Chunyuan Li,
Jianfeng Gao,
Yu Gu,
Cliff Wong,
Mu Wei,
Tristan Naumann,
Muhao Chen,
Matthew P. Lungren,
Akshay Chaudhari,
Serena Yeung-Levy,
Curtis P. Langlotz
, et al. (2 additional authors not shown)
Abstract:
The scaling laws and extraordinary performance of large foundation models motivate the development and utilization of such models in biomedicine. However, despite early promising results on some biomedical benchmarks, there are still major challenges that need to be addressed before these models can be used in real-world clinics. Frontier general-domain models such as GPT-4V still have significant…
▽ More
The scaling laws and extraordinary performance of large foundation models motivate the development and utilization of such models in biomedicine. However, despite early promising results on some biomedical benchmarks, there are still major challenges that need to be addressed before these models can be used in real-world clinics. Frontier general-domain models such as GPT-4V still have significant performance gaps in multimodal biomedical applications. More importantly, less-acknowledged pragmatic issues, including accessibility, model cost, and tedious manual evaluation make it hard for clinicians to use state-of-the-art large models directly on private patient data. Here, we explore training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. To maximize data efficiency, we adopt a modular approach by incorporating state-of-the-art pre-trained models for image and text modalities, and focusing on training a lightweight adapter to ground each modality to the text embedding space, as exemplified by LLaVA-Med. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. For best practice, we conduct a systematic ablation study on various choices in data engineering and multimodal training. The resulting LlaVA-Rad (7B) model attains state-of-the-art results on standard radiology tasks such as report generation and cross-modal retrieval, even outperforming much larger models such as GPT-4V and Med-PaLM M (84B). The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
△ Less
Submitted 26 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
RadAdapt: Radiology Report Summarization via Lightweight Domain Adaptation of Large Language Models
Authors:
Dave Van Veen,
Cara Van Uden,
Maayane Attias,
Anuj Pareek,
Christian Bluethgen,
Malgorzata Polacin,
Wah Chiu,
Jean-Benoit Delbrouck,
Juan Manuel Zambrano Chaves,
Curtis P. Langlotz,
Akshay S. Chaudhari,
John Pauly
Abstract:
We systematically investigate lightweight strategies to adapt large language models (LLMs) for the task of radiology report summarization (RRS). Specifically, we focus on domain adaptation via pretraining (on natural language, biomedical text, or clinical text) and via discrete prompting or parameter-efficient fine-tuning. Our results consistently achieve best performance by maximally adapting to…
▽ More
We systematically investigate lightweight strategies to adapt large language models (LLMs) for the task of radiology report summarization (RRS). Specifically, we focus on domain adaptation via pretraining (on natural language, biomedical text, or clinical text) and via discrete prompting or parameter-efficient fine-tuning. Our results consistently achieve best performance by maximally adapting to the task via pretraining on clinical text and fine-tuning on RRS examples. Importantly, this method fine-tunes a mere 0.32% of parameters throughout the model, in contrast to end-to-end fine-tuning (100% of parameters). Additionally, we study the effect of in-context examples and out-of-distribution (OOD) training before concluding with a radiologist reader study and qualitative analysis. Our findings highlight the importance of domain adaptation in RRS and provide valuable insights toward developing effective natural language processing solutions for clinical tasks.
△ Less
Submitted 20 July, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Comp2Comp: Open-Source Body Composition Assessment on Computed Tomography
Authors:
Louis Blankemeier,
Arjun Desai,
Juan Manuel Zambrano Chaves,
Andrew Wentland,
Sally Yao,
Eduardo Reis,
Malte Jensen,
Bhanushree Bahl,
Khushboo Arora,
Bhavik N. Patel,
Leon Lenchik,
Marc Willis,
Robert D. Boutin,
Akshay S. Chaudhari
Abstract:
Computed tomography (CT) is routinely used in clinical practice to evaluate a wide variety of medical conditions. While CT scans provide diagnoses, they also offer the ability to extract quantitative body composition metrics to analyze tissue volume and quality. Extracting quantitative body composition measures manually from CT scans is a cumbersome and time-consuming task. Proprietary software ha…
▽ More
Computed tomography (CT) is routinely used in clinical practice to evaluate a wide variety of medical conditions. While CT scans provide diagnoses, they also offer the ability to extract quantitative body composition metrics to analyze tissue volume and quality. Extracting quantitative body composition measures manually from CT scans is a cumbersome and time-consuming task. Proprietary software has been developed recently to automate this process, but the closed-source nature impedes widespread use. There is a growing need for fully automated body composition software that is more accessible and easier to use, especially for clinicians and researchers who are not experts in medical image processing. To this end, we have built Comp2Comp, an open-source Python package for rapid and automated body composition analysis of CT scans. This package offers models, post-processing heuristics, body composition metrics, automated batching, and polychromatic visualizations. Comp2Comp currently computes body composition measures for bone, skeletal muscle, visceral adipose tissue, and subcutaneous adipose tissue on CT scans of the abdomen. We have created two pipelines for this purpose. The first pipeline computes vertebral measures, as well as muscle and adipose tissue measures, at the T12 - L5 vertebral levels from abdominal CT scans. The second pipeline computes muscle and adipose tissue measures on user-specified 2D axial slices. In this guide, we discuss the architecture of the Comp2Comp pipelines, provide usage instructions, and report internal and external validation results to measure the quality of segmentations and body composition measures. Comp2Comp can be found at https://github.com/StanfordMIMI/Comp2Comp.
△ Less
Submitted 13 February, 2023;
originally announced February 2023.
-
RoentGen: Vision-Language Foundation Model for Chest X-ray Generation
Authors:
Pierre Chambon,
Christian Bluethgen,
Jean-Benoit Delbrouck,
Rogier Van der Sluijs,
Małgorzata Połacin,
Juan Manuel Zambrano Chaves,
Tanishq Mathew Abraham,
Shivanshu Purohit,
Curtis P. Langlotz,
Akshay Chaudhari
Abstract:
Multimodal models trained on large natural image-text pair datasets have exhibited astounding abilities in generating high-quality images. Medical imaging data is fundamentally different to natural images, and the language used to succinctly capture relevant details in medical data uses a different, narrow but semantically rich, domain-specific vocabulary. Not surprisingly, multi-modal models trai…
▽ More
Multimodal models trained on large natural image-text pair datasets have exhibited astounding abilities in generating high-quality images. Medical imaging data is fundamentally different to natural images, and the language used to succinctly capture relevant details in medical data uses a different, narrow but semantically rich, domain-specific vocabulary. Not surprisingly, multi-modal models trained on natural image-text pairs do not tend to generalize well to the medical domain. Developing generative imaging models faithfully representing medical concepts while providing compositional diversity could mitigate the existing paucity of high-quality, annotated medical imaging datasets. In this work, we develop a strategy to overcome the large natural-medical distributional shift by adapting a pre-trained latent diffusion model on a corpus of publicly available chest x-rays (CXR) and their corresponding radiology (text) reports. We investigate the model's ability to generate high-fidelity, diverse synthetic CXR conditioned on text prompts. We assess the model outputs quantitatively using image quality metrics, and evaluate image quality and text-image alignment by human domain experts. We present evidence that the resulting model (RoentGen) is able to create visually convincing, diverse synthetic CXR images, and that the output can be controlled to a new extent by using free-form text prompts including radiology-specific language. Fine-tuning this model on a fixed training set and using it as a data augmentation method, we measure a 5% improvement of a classifier trained jointly on synthetic and real images, and a 3% improvement when trained on a larger but purely synthetic training set. Finally, we observe that this fine-tuning distills in-domain knowledge in the text-encoder and can improve its representation capabilities of certain diseases like pneumothorax by 25%.
△ Less
Submitted 23 November, 2022;
originally announced November 2022.
-
OKSP: A Novel Deep Learning Automatic Event Detection Pipeline for Seismic Monitoringin Costa Rica
Authors:
Leonardo van der Laat,
Ronald J. L. Baldares,
Esteban J. Chaves,
Esteban Meneses
Abstract:
Small magnitude earthquakes are the most abundant but the most difficult to locate robustly and well due to their low amplitudes and high frequencies usually obscured by heterogeneous noise sources. They highlight crucial information about the stress state and the spatio-temporal behavior of fault systems during the earthquake cycle, therefore, its full characterization is then crucial for improvi…
▽ More
Small magnitude earthquakes are the most abundant but the most difficult to locate robustly and well due to their low amplitudes and high frequencies usually obscured by heterogeneous noise sources. They highlight crucial information about the stress state and the spatio-temporal behavior of fault systems during the earthquake cycle, therefore, its full characterization is then crucial for improving earthquake hazard assessment. Modern DL algorithms along with the increasing computational power are exploiting the continuously growing seismological databases, allowing scientists to improve the completeness for earthquake catalogs, systematically detecting smaller magnitude earthquakes and reducing the errors introduced mainly by human intervention. In this work, we introduce OKSP, a novel automatic earthquake detection pipeline for seismic monitoring in Costa Rica. Using Kabre supercomputer from the Costa Rica High Technology Center, we applied OKSP to the day before and the first 5 days following the Puerto Armuelles, M6.5, earthquake that occurred on 26 June, 2019, along the Costa Rica-Panama border and found 1100 more earthquakes previously unidentified by the Volcanological and Seismological Observatory of Costa Rica. From these events, a total of 23 earthquakes with magnitudes below 1.0 occurred a day to hours prior to the mainshock, shedding light about the rupture initiation and earthquake interaction leading to the occurrence of this productive seismic sequence. Our observations show that for the study period, the model was 100% exhaustive and 82% precise, resulting in an F1 score of 0.90. This effort represents the very first attempt for automatically detecting earthquakes in Costa Rica using deep learning methods and demonstrates that, in the near future, earthquake monitoring routines will be carried out entirely by AI algorithms.
△ Less
Submitted 6 September, 2021;
originally announced September 2021.
-
Entropy as a measure of attractiveness and socioeconomic complexity in Rio de Janeiro metropolitan area
Authors:
Maxime Lenormand,
Horacio Samaniego,
Julio C. Chaves,
Vinicius F. Vieira,
Moacyr A. H. B. da Silva,
Alexandre G. Evsukoff
Abstract:
Defining and measuring spatial inequalities across the urban environment remains a complex and elusive task that has been facilitated by the increasing availability of large geolocated databases. In this study, we rely on a mobile phone dataset and an entropy-based metric to measure the attractiveness of a location in the Rio de Janeiro Metropolitan Area (Brazil) as the diversity of visitors' loca…
▽ More
Defining and measuring spatial inequalities across the urban environment remains a complex and elusive task that has been facilitated by the increasing availability of large geolocated databases. In this study, we rely on a mobile phone dataset and an entropy-based metric to measure the attractiveness of a location in the Rio de Janeiro Metropolitan Area (Brazil) as the diversity of visitors' location of residence. The results show that the attractiveness of a given location measured by entropy is an important descriptor of the socioeconomic status of the location, and can thus be used as a proxy for complex socioeconomic indicators.
△ Less
Submitted 23 March, 2020;
originally announced March 2020.
-
Large-Scale Query-by-Image Video Retrieval Using Bloom Filters
Authors:
Andre Araujo,
Jason Chaves,
Haricharan Lakshman,
Roland Angst,
Bernd Girod
Abstract:
We consider the problem of using image queries to retrieve videos from a database. Our focus is on large-scale applications, where it is infeasible to index each database video frame independently. Our main contribution is a framework based on Bloom filters, which can be used to index long video segments, enabling efficient image-to-video comparisons. Using this framework, we investigate several r…
▽ More
We consider the problem of using image queries to retrieve videos from a database. Our focus is on large-scale applications, where it is infeasible to index each database video frame independently. Our main contribution is a framework based on Bloom filters, which can be used to index long video segments, enabling efficient image-to-video comparisons. Using this framework, we investigate several retrieval architectures, by considering different types of aggregation and different functions to encode visual information -- these play a crucial role in achieving high performance. Extensive experiments show that the proposed technique improves mean average precision by 24% on a public dataset, while being 4X faster, compared to the previous state-of-the-art.
△ Less
Submitted 12 July, 2016; v1 submitted 27 April, 2016;
originally announced April 2016.
-
Advances in the Biomedical Applications of the EELA Project
Authors:
Vicente Hernández,
Ignacio Blanquer,
Gabriel Aparicio,
Raul Isea,
Juan Luis Chavés,
Álvaro Hernández,
Henry Ricardo Mora,
Manuel Fernández,
Alicia Acero,
Esther Montes,
Rafael Mayo
Abstract:
In the last years an increasing demand for Grid Infrastructures has resulted in several international collaborations. This is the case of the EELA Project, which has brought together collaborating groups of Latin America and Europe. One year ago we presented this e-infrastructure used, among others, by the Biomedical groups for the studies of oncological analysis, neglected diseases, sequence alig…
▽ More
In the last years an increasing demand for Grid Infrastructures has resulted in several international collaborations. This is the case of the EELA Project, which has brought together collaborating groups of Latin America and Europe. One year ago we presented this e-infrastructure used, among others, by the Biomedical groups for the studies of oncological analysis, neglected diseases, sequence alignments and computation phylogenetics. After this period, the achieved advances are summarised in this paper.
△ Less
Submitted 17 December, 2010;
originally announced December 2010.
-
e-Science initiatives in Venezuela
Authors:
J. L. Chaves,
G. Diaz,
V. Hamar,
R. Isea,
F. Rojas,
N. Ruiz,
R. Torrens,
M. Uzcategui,
J. Florez-Lopez,
H. Hoeger,
C. Mendoza,
L. A. Nunez
Abstract:
Within the context of the nascent e-Science infrastructure in Venezuela, we describe several web-based scientific applications developed at the Centro Nacional de Calculo Cientifico Universidad de Los Andes (CeCalCULA), Merida, and at the Instituto Venezolano de Investigaciones Cientificas (IVIC), Caracas. The different strategies that have been followed for implementing quantum chemistry and at…
▽ More
Within the context of the nascent e-Science infrastructure in Venezuela, we describe several web-based scientific applications developed at the Centro Nacional de Calculo Cientifico Universidad de Los Andes (CeCalCULA), Merida, and at the Instituto Venezolano de Investigaciones Cientificas (IVIC), Caracas. The different strategies that have been followed for implementing quantum chemistry and atomic physics applications are presented. We also briefly discuss a damage portal based on dynamic, nonlinear, finite elements of lumped damage mechanics and a biomedical portal developed within the framework of the \textit{E-Infrastructure shared between Europe and Latin America} (EELA) initiative for searching common sequences and inferring their functions in parasitic diseases such as leishmaniasis, chagas and malaria.
△ Less
Submitted 24 July, 2007;
originally announced July 2007.