Can large language models be privacy preserving and fair medical coders?

Dadsetan, Ali; Soleymani, Dorsa; Zeng, Xijie; Rudzicz, Frank

arXiv:2412.05533 (cs)

[Submitted on 7 Dec 2024]

Abstract: Protecting patient data privacy is a critical concern when deploying machine learning algorithms in healthcare. Differential privacy (DP) is a common method for preserving privacy in such settings and, in this work, we examine two key trade-offs in applying DP to the NLP task of medical coding (ICD classification). Regarding the privacy-utility trade-off, we observe a significant performance drop in the privacy-preserving models, with more than a 40% reduction in micro F1 scores on the top 50 labels in the MIMIC-III dataset. From the perspective of the privacy-fairness trade-off, we also observe an increase of over 3% in the recall gap between male and female patients in the DP models. Further understanding these trade-offs will help address the challenges of real-world deployment.
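
The two quantities reported in the abstract, micro F1 over a fixed label set and the recall gap between two patient groups, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels, and the choice of micro-averaged recall with an absolute difference for the gap are all assumptions made for illustration, and the paper's exact definitions may differ.

```python
from typing import Sequence

# One row per document, one 0/1 column per ICD code (toy data, not MIMIC-III).
Labels = Sequence[Sequence[int]]


def micro_counts(y_true: Labels, y_pred: Labels) -> tuple[int, int, int]:
    """Pool true positives, false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """Micro-averaged F1: 2*TP / (2*TP + FP + FN) on the pooled counts."""
    tp, fp, fn = micro_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_recall(y_true: Labels, y_pred: Labels) -> float:
    """Micro-averaged recall: TP / (TP + FN) on the pooled counts."""
    tp, _, fn = micro_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def recall_gap(y_true: Labels, y_pred: Labels,
               groups: Sequence[str], a: str, b: str) -> float:
    """Absolute difference in micro recall between groups a and b (assumed definition)."""
    def subset(g: str) -> tuple[list, list]:
        idx = [i for i, x in enumerate(groups) if x == g]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]

    return abs(micro_recall(*subset(a)) - micro_recall(*subset(b)))


# Hypothetical toy example: four documents, three codes, two groups.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 1, 1]]
groups = ["F", "F", "M", "M"]

print("micro F1:", micro_f1(y_true, y_pred))
print("recall gap:", recall_gap(y_true, y_pred, groups, "F", "M"))
```

Comparing these two numbers for a model trained with DP against a non-private baseline gives the kind of privacy-utility and privacy-fairness comparison the abstract describes.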

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as:	arXiv:2412.05533 [cs.LG]
	(or arXiv:2412.05533v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.05533

Submission history

From: Ali Dadsetan
[v1] Sat, 7 Dec 2024 04:27:05 UTC (111 KB)
