-
OPTIC: Optimizing Patient-Provider Triaging & Improving Communications in Clinical Operations using GPT-4 Data Labeling and Model Distillation
Authors:
Alberto Santamaria-Pang,
Frank Tuan,
Ross Campbell,
Cindy Zhang,
Ankush Jindal,
Roopa Surapur,
Brad Holloman,
Deanna Hanisch,
Rae Buckley,
Carisa Cooney,
Ivan Tarapov,
Kimberly S. Peairs,
Brian Hasselfeld,
Peter Greene
Abstract:
The COVID-19 pandemic has accelerated the adoption of telemedicine and patient messaging through electronic medical portals (patient medical advice requests, or PMARs). While these platforms enhance patient access to healthcare, they have also increased the burden on healthcare providers due to the surge in PMARs. This study seeks to develop an efficient tool for message triaging to reduce physician workload and improve patient-provider communication. We developed OPTIC (Optimizing Patient-Provider Triaging & Improving Communications in Clinical Operations), a powerful message triaging tool that utilizes GPT-4 for data labeling and BERT for model distillation. The study used a dataset of 405,487 patient messaging encounters from Johns Hopkins Medicine between January and June 2020. High-quality labeled data was generated through GPT-4-based prompt engineering, which was then used to train a BERT model to classify messages as "Admin" or "Clinical." The BERT model achieved 88.85% accuracy on the test set validated by GPT-4 labeling, with a sensitivity of 88.29%, specificity of 89.38%, and an F1 score of 0.8842. BERTopic analysis identified 81 distinct topics within the test data, with over 80% accuracy in classifying 58 topics. The system was successfully deployed through Epic's Nebula Cloud Platform, demonstrating its practical effectiveness in healthcare settings.
Submitted 5 February, 2025;
originally announced March 2025.
-
End-to-End Document Classification and Key Information Extraction using Assignment Optimization
Authors:
Ciaran Cooney,
Joana Cavadas,
Liam Madigan,
Bradley Savage,
Rachel Heyburn,
Mairead O'Cuinn
Abstract:
We propose end-to-end document classification and key information extraction (KIE) for automating document processing in forms. Through accurate document classification we harness known information from templates to enhance KIE from forms. We use text and layout encoding with a cosine similarity measure to classify visually-similar documents. We then demonstrate a novel application of mixed integer programming by using assignment optimization to extract key information from documents. Our approach is validated on an in-house dataset of noisy scanned forms. The best-performing document classification approach achieved an F1 score of 0.97. A mean F1 score of 0.94 for the KIE task suggests there is significant potential in applying optimization techniques. Ablation results show that the method relies on document preprocessing techniques to mitigate Type II errors and achieve optimal performance.
Submitted 1 June, 2023;
originally announced June 2023.
-
Unimodal and Multimodal Representation Training for Relation Extraction
Authors:
Ciaran Cooney,
Rachel Heyburn,
Liam Madigan,
Mairead O'Cuinn,
Chloe Thompson,
Joana Cavadas
Abstract:
Multimodal integration of text, layout and visual information has achieved SOTA results in visually rich document understanding (VrDU) tasks, including relation extraction (RE). However, despite its importance, evaluation of the relative predictive capacity of these modalities is less prevalent. Here, we demonstrate the value of shared representations for RE tasks by conducting experiments in which each data type is iteratively excluded during training. In addition, text and layout data are evaluated in isolation. While a bimodal text and layout approach performs best (F1=0.684), we show that text is the most important single predictor of entity relations. Additionally, layout geometry is highly predictive and may even be a feasible unimodal approach. Despite being less effective, we highlight circumstances where visual information can bolster performance. In total, our results demonstrate the efficacy of training joint representations for RE.
Submitted 11 November, 2022;
originally announced November 2022.
-
Cloud Storage Forensic: hubiC as a Case-Study
Authors:
Ben Blakeley,
Chris Cooney,
Ali Dehghantanha,
Rob Aspin
Abstract:
In today's society, where we live in a world of constant connectivity, many people are looking to cloud services to store their files so that they can access them wherever they are. By using cloud services, users can access files anywhere with an internet connection. However, while cloud storage is convenient, it also presents security risks. From a forensics perspective, the increasing popularity of cloud storage platforms makes investigation into such exploits much more difficult, especially since many platforms, such as mobile devices as well as computers, are able to use these services. This paper presents an investigation of hubiC, one of the popular cloud platforms, running on Microsoft Windows 8.1. Remaining artefacts pertaining to different uses of hubiC, namely upload, download, installation and uninstallation on Microsoft Windows 8.1, are presented.
Submitted 26 July, 2018;
originally announced July 2018.