Showing 1–2 of 2 results for author: Mahal, H

  1. arXiv:2505.17047  [pdf]

    cs.CL cs.AI

    Assessing the Quality of AI-Generated Clinical Notes: A Validated Evaluation of a Large Language Model Scribe

    Authors: Erin Palm, Astrit Manikantan, Mark E. Pepin, Herprit Mahal, Srikanth Subramanya Belwadi

    Abstract: In medical practices across the United States, physicians have begun implementing generative artificial intelligence (AI) tools to perform the function of scribes in order to reduce the burden of documenting clinical encounters. Despite their widespread use, no established methods exist to gauge the quality of AI scribes. To address this gap, we developed a blinded study comparing the relative per…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 15 pages, 5 tables, 1 figure. Submitted for peer review 05/15/2025

  2. arXiv:2409.01941  [pdf, other]

    cs.CL cs.LG

    Towards Leveraging Large Language Models for Automated Medical Q&A Evaluation

    Authors: Jack Krolik, Herprit Mahal, Feroz Ahmad, Gaurav Trivedi, Bahador Saket

    Abstract: This paper explores the potential of using Large Language Models (LLMs) to automate the evaluation of responses in medical Question and Answer (Q&A) systems, a crucial form of Natural Language Processing. Traditionally, human evaluation has been indispensable for assessing the quality of these responses. However, manual evaluation by medical professionals is time-consuming and costly. Our study e…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: 10 pages, 3 figures, 3 tables

    ACM Class: I.2.7; J.3