-
A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models
Authors:
Aviv Brokman,
Xuguang Ai,
Yuhang Jiang,
Shashank Gupta,
Ramakanth Kavuluru
Abstract:
Objective: Zero-shot methodology promises to cut down on costs of dataset annotation and domain expertise needed to make use of NLP. Generative large language models trained to align with human goals have achieved high zero-shot performance across a wide variety of tasks. As of yet, it is unclear how well these models perform on biomedical relation extraction (RE). To address this knowledge gap, we explore patterns in the performance of OpenAI LLMs across a diverse sampling of RE tasks.
Methods: We use OpenAI's GPT-4-turbo and reasoning model o1 to conduct end-to-end RE experiments on seven datasets. We use the JSON generation capabilities of GPT models to generate structured output in two ways: (1) by defining an explicit schema describing the structure of relations, and (2) by using a setting that infers the structure from the prompt language.
Results: Our work is the first to study and compare the performance of GPT-4 and o1 on the end-to-end zero-shot biomedical RE task across a broad array of datasets. We found zero-shot performance to be close to that of fine-tuned methods. The approach has two limitations: it performs poorly on instances containing many relations, and it errs on the boundaries of textual mentions.
Conclusion: Recent large language models exhibit promising zero-shot capabilities in complex biomedical RE tasks, offering competitive performance with reduced dataset curation and NLP modeling needs at the cost of increased computing, which could make these methods more accessible to the medical community. Addressing the limitations we identify could further boost reliability. The code, data, and prompts for all our experiments are publicly available: https://github.com/bionlproc/ZeroShotRE
Submitted 5 April, 2025;
originally announced April 2025.
-
How Important is Domain Specificity in Language Models and Instruction Finetuning for Biomedical Relation Extraction?
Authors:
Aviv Brokman,
Ramakanth Kavuluru
Abstract:
Cutting-edge techniques developed in the general NLP domain are often subsequently applied to the high-value, data-rich biomedical domain. The past few years have seen generative language models (LMs), instruction finetuning, and few-shot learning become foci of NLP research. As such, generative LMs pretrained on biomedical corpora have proliferated and biomedical instruction finetuning has been attempted as well, all with the hope that domain specificity improves performance on downstream tasks. Given the nontrivial effort in training such models, we investigate what, if any, benefits they have in the key biomedical NLP task of relation extraction. Specifically, we address two questions: (1) Do LMs trained on biomedical corpora outperform those trained on general-domain corpora? (2) Do models instruction finetuned on biomedical datasets outperform those finetuned on assorted datasets or those simply pretrained? We tackle these questions using existing LMs, testing across four datasets. In a surprising result, general-domain models typically outperformed biomedical-domain models. However, biomedical instruction finetuning improved performance to a degree similar to general instruction finetuning, despite having orders of magnitude fewer instructions. Our findings suggest it may be more fruitful to focus research effort on larger-scale biomedical instruction finetuning of general LMs over building domain-specific biomedical LMs.
Submitted 20 February, 2024;
originally announced February 2024.
-
A strategy to identify event specific hospitalizations in large health claims database
Authors:
Joshua Lambert,
Harpal Sandhu,
Emily Kean,
Teenu Xavier,
Aviv Brokman,
Zachary Steckler,
Lee Park,
Arnold Stromberg
Abstract:
Health insurance claims data offer a unique opportunity to study disease distribution on a large scale. Challenges arise in the process of accurately analyzing these raw data. One important challenge to overcome is the accurate classification of study outcomes. For example, using claims data, there is no clear way of classifying hospitalizations due to a specific event. This is because of the inherent disjointedness and lack of context that typically come with raw claims data. In this paper, we propose a framework for classifying hospitalizations due to a specific event. We then test this framework in a health insurance claims database with approximately 4 million US adults who tested positive for COVID-19 between March and December 2020. Our claims-specific proportion of COVID-19-related hospitalizations is then compared to nationally reported rates from the Centers for Disease Control by age and sex.
Submitted 25 October, 2021;
originally announced November 2021.
-
Linguistic Inspired Graph Analysis
Authors:
Andrew Broekman,
Linda Marshall
Abstract:
Isomorphisms allow human cognition to transcribe a potentially unsolvable problem from one domain to a different domain where the problem might be more easily addressed. Current approaches only focus on transcribing structural information from the source to target structure, ignoring semantic and pragmatic information. Functional Language Theory presents five subconstructs for the classification and understanding of languages. By deriving a mapping between the metamodels in linguistics and graph theory it will be shown that currently, no constructs exist in canonical graphs for the representation of semantic and pragmatic information. It is found that further work needs to be done to understand how graphs can be enriched to allow for isomorphisms to capture semantic and pragmatic information. This capturing of additional information could lead to understandings of the source structure and enhanced manipulations and interrogations of the contained relationships. Current mathematical graph structures in their general definition do not allow for the expression of higher information levels of a source.
Submitted 13 May, 2021;
originally announced May 2021.