Showing 1–14 of 14 results for author: Savova, G

  1. arXiv:2503.07329  [pdf, other]

    cs.CL cs.AI cs.LG

    Assessing the Macro and Micro Effects of Random Seeds on Fine-Tuning Large Language Models

    Authors: Hao Zhou, Guergana Savova, Lijing Wang

    Abstract: The impact of random seeds in fine-tuning large language models (LLMs) has been largely overlooked despite its potential influence on model performance. In this study, we systematically evaluate the effects of random seeds on LLMs using the GLUE and SuperGLUE benchmarks. We analyze the macro-level impact through traditional metrics like accuracy and F1, calculating their mean and variance to quanti…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 7 pages, 5 tables, 3 figures

  2. arXiv:2502.10388  [pdf, other]

    cs.CL

    Aspect-Oriented Summarization for Psychiatric Short-Term Readmission Prediction

    Authors: WonJin Yoon, Boyu Ren, Spencer Thomas, Chanwhi Kim, Guergana Savova, Mei-Hua Hall, Timothy Miller

    Abstract: Recent progress in large language models (LLMs) has enabled the automated processing of lengthy documents even without supervised training on a task-specific dataset. Yet, their zero-shot performance in complex tasks as opposed to straightforward information extraction tasks remains suboptimal. One feasible approach for tasks with lengthy, complex input is to first summarize the document and then…

    Submitted 14 February, 2025; originally announced February 2025.

  3. arXiv:2410.12774  [pdf, other]

    cs.CL cs.AI

    Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information

    Authors: Yingya Li, Timothy Miller, Steven Bethard, Guergana Savova

    Abstract: The success of multi-task learning can depend heavily on which tasks are grouped together. Naively grouping all tasks or a random set of tasks can result in negative transfer, with the multi-task models performing worse than single-task models. Though many efforts have been made to identify task groupings and to measure the relatedness among different tasks, it remains a challenging research topic…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: main paper 12 pages, Appendix 7 pages, 1 figure, 18 tables

  4. arXiv:2410.12722  [pdf, other]

    cs.CL

    WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation

    Authors: João Matos, Shan Chen, Siena Placino, Yingya Li, Juan Carlos Climent Pardo, Daphna Idan, Takeshi Tohyama, David Restrepo, Luis F. Nakayama, Jose M. M. Pascual-Leone, Guergana Savova, Hugo Aerts, Leo A. Celi, A. Ian Wong, Danielle S. Bitterman, Jack Gallifant

    Abstract: Multimodal/vision language models (VLMs) are increasingly being deployed in healthcare settings worldwide, necessitating robust benchmarks to ensure their safety, efficacy, and fairness. Multiple-choice question and answer (QA) datasets derived from national medical examinations have long served as valuable evaluation tools, but existing datasets are largely text-only and available in a limited su…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: submitted for review, total of 14 pages

  5. arXiv:2410.09937  [pdf]

    cs.CY

    Artificial Intelligence in the Legal Field: Law Students Perspective

    Authors: Daniela Andreeva, Guergana Savova

    Abstract: The Artificial Intelligence field, or AI, experienced a renaissance in the last few years across various fields such as law, medicine, and finance. While there are studies outlining the landscape of AI in the legal field as well as surveys of the current AI efforts of law firms, to our knowledge there has not been an investigation of the intersection of law students and AI. Such research is critic…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: main paper 11 pages, Appendix 5 pages, 1 table

  6. arXiv:2405.09153  [pdf, other]

    cs.CL cs.LG

    Adapting Abstract Meaning Representation Parsing to the Clinical Narrative -- the SPRING THYME parser

    Authors: Jon Z. Cai, Kristin Wright-Bettner, Martha Palmer, Guergana K. Savova, James H. Martin

    Abstract: This paper is dedicated to the design and evaluation of the first AMR parser tailored for clinical notes. Our objective was to facilitate the precise transformation of the clinical notes into structured AMR expressions, thereby enhancing the interpretability and usability of clinical text data at scale. Leveraging the colon cancer dataset from the Temporal Histories of Your Medical Events (THYME)…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted to the 6th Clinical NLP Workshop at NAACL, 2024

  7. arXiv:2310.17703  [pdf]

    cs.CL

    The impact of responding to patient messages with large language model assistance

    Authors: Shan Chen, Marco Guevara, Shalini Moningi, Frank Hoebers, Hesham Elhalawani, Benjamin H. Kann, Fallon E. Chipidza, Jonathan Leeman, Hugo J. W. L. Aerts, Timothy Miller, Guergana K. Savova, Raymond H. Mak, Maryam Lustberg, Majid Afshar, Danielle S. Bitterman

    Abstract: Documentation burden is a major contributor to clinician burnout, which is rising nationally and is an urgent threat to our ability to care for patients. Artificial intelligence (AI) chatbots, such as ChatGPT, could reduce clinician burden by assisting with documentation. Although many hospitals are actively integrating such systems into electronic medical record systems, AI chatbots' utility and i…

    Submitted 29 November, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: 4 figures and tables in main, submitted for review

  8. arXiv:2310.12300  [pdf, other]

    cs.CL

    Measuring Pointwise $\mathcal{V}$-Usable Information In-Context-ly

    Authors: Sheng Lu, Shan Chen, Yingya Li, Danielle Bitterman, Guergana Savova, Iryna Gurevych

    Abstract: In-context learning (ICL) is a new learning paradigm that has gained popularity along with the development of large language models. In this work, we adapt a recently proposed hardness metric, pointwise $\mathcal{V}$-usable information (PVI), to an in-context version (in-context PVI). Compared to the original PVI, in-context PVI is more efficient in that it requires only a few exemplars and does n…

    Submitted 8 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  9. Large Language Models to Identify Social Determinants of Health in Electronic Health Records

    Authors: Marco Guevara, Shan Chen, Spencer Thomas, Tafadzwa L. Chaunzwa, Idalid Franco, Benjamin Kann, Shalini Moningi, Jack Qian, Madeleine Goldstein, Susan Harper, Hugo JWL Aerts, Guergana K. Savova, Raymond H. Mak, Danielle S. Bitterman

    Abstract: Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected from the electronic health records (EHR). This study researched the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented, and explored the role of synthetic clinical text for improving the extraction of these scarcely documente…

    Submitted 5 March, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: Peer-reviewed version published at NPJ Digital Medicine: https://www.nature.com/articles/s41746-023-00970-0

    Journal ref: NPJ Digit Med. 2024 Jan 11;7(1):6

  10. Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification

    Authors: Shan Chen, Yingya Li, Sheng Lu, Hoang Van, Hugo JWL Aerts, Guergana K. Savova, Danielle S. Bitterman

    Abstract: Recent advances in large language models (LLMs) have shown impressive ability in biomedical question-answering, but have not been adequately investigated for more specific biomedical applications. This study investigates the performance of LLMs such as the ChatGPT family of models (GPT-3.5s, GPT-4) in biomedical tasks beyond question-answering. Because no patient data can be passed to the OpenAI A…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 28 pages, 2 tables and 4 figures. Submitting for review

  11. Natural language processing to automatically extract the presence and severity of esophagitis in notes of patients undergoing radiotherapy

    Authors: Shan Chen, Marco Guevara, Nicolas Ramirez, Arpi Murray, Jeremy L. Warner, Hugo JWL Aerts, Timothy A. Miller, Guergana K. Savova, Raymond H. Mak, Danielle S. Bitterman

    Abstract: Radiotherapy (RT) toxicities can impair survival and quality-of-life, yet remain under-studied. Real-world evidence holds potential to improve our understanding of toxicities, but toxicity information is often only in clinical notes. We developed natural language processing (NLP) models to identify the presence and severity of esophagitis from notes of patients treated with thoracic RT. We fine-tu…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: 17 pages, 6 tables, 1 figure, submitting to JCO-CCI for review

  12. arXiv:2006.13737  [pdf]

    stat.AP cs.IR cs.LG

    Diagnosis Prevalence vs. Efficacy in Machine-learning Based Diagnostic Decision Support

    Authors: Gil Alon, Elizabeth Chen, Guergana Savova, Carsten Eickhoff

    Abstract: Many recent studies use machine learning to predict a small number of ICD-9-CM codes. In practice, on the other hand, physicians have to consider a broader range of diagnoses. This study aims to put these previously incongruent evaluation settings on a more equal footing by predicting ICD-9-CM codes based on electronic health record properties and demonstrating the relationship between diagnosis p…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: AMIA Joint Summits in Translational Science, 2020

  13. arXiv:2006.13721  [pdf]

    cs.IR cs.CY cs.DL

    Mining Misdiagnosis Patterns from Biomedical Literature

    Authors: Cindy Li, Elizabeth Chen, Guergana Savova, Hamish Fraser, Carsten Eickhoff

    Abstract: Diagnostic errors can pose a serious threat to patient safety, leading to serious harm and even death. Efforts are being made to develop interventions that allow physicians to reassess for errors and improve diagnostic accuracy. Our study presents an exploration of misdiagnosis patterns mined from PubMed abstracts. Article titles containing certain phrases indicating misdiagnosis were selected and…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: AMIA Joint Summits in Translational Science, 2020

    Journal ref: AMIA Jt Summits Transl Sci Proc. 2020;2020:360-366. Published 2020 May 30

  14. arXiv:1912.12371  [pdf]

    q-bio.OT cs.SE

    Open Source Software Sustainability Models: Initial White Paper from the Informatics Technology for Cancer Research Sustainability and Industry Partnership Work Group

    Authors: Y. Ye, R. D. Boyce, M. K. Davis, K. Elliston, C. Davatzikos, A. Fedorov, J. C. Fillion-Robin, I. Foster, J. Gilbertson, M. Heiskanen, J. Klemm, A. Lasso, J. V. Miller, M. Morgan, S. Pieper, B. Raumann, B. Sarachan, G. Savova, J. C. Silverstein, D. Taylor, J. Zelnis, G. Q. Zhang, M. J. Becich

    Abstract: The Sustainability and Industry Partnership Work Group (SIP-WG) is a part of the National Cancer Institute Informatics Technology for Cancer Research (ITCR) program. The charter of the SIP-WG is to investigate options of long-term sustainability of open source software (OSS) developed by the ITCR, in part by developing a collection of business model archetypes that can serve as sustainability plan…

    Submitted 1 January, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: 21-page main manuscript, 43-page supplemental file