Enhancing Diagnostic Accuracy in Rare and Common Fundus Diseases with a Knowledge-Rich Vision-Language Model

Wang, Meng; Lin, Tian; Lin, Aidi; Yu, Kai; Peng, Yuanyuan; Wang, Lianyu; Chen, Cheng; Zou, Ke; Liang, Huiyu; Chen, Man; Yao, Xue; Zhang, Meiqin; Huang, Binwei; Zheng, Chaoxin; Zhang, Peixin; Chen, Wei; Luo, Yilong; Chen, Yifan; Xia, Honghe; Shi, Tingkun; Zhang, Qi; Guo, Jinming; Chen, Xiaolin; Wang, Jingcheng; Tham, Yih Chung; Liu, Dianbo; Wong, Wendy; Thakur, Sahil; Fenner, Beau; Fang, Danqi; Liu, Siying; Liu, Qingyun; Huang, Yuqiang; Zeng, Hongqiang; Meng, Yanda; Zhou, Yukun; Jiang, Zehua; Qiu, Minghui; Zhang, Changqing; Chen, Xinjian; Wang, Sophia Y.; Lee, Cecilia S.; Sobrin, Lucia; Cheung, Carol Y; Pang, Chi Pui; Keane, Pearse A.; Cheng, Ching-Yu; Chen, Haoyu; Fu, Huazhu

arXiv:2406.09317 (eess)

[Submitted on 13 Jun 2024 (v1), last revised 10 Apr 2025 (this version, v3)]

Abstract: Previous foundation models for fundus images were pre-trained on a limited number of disease categories and a limited knowledge base. Here we introduce RetiZero, a knowledge-rich vision-language model that leverages knowledge from more than 400 fundus diseases. For RetiZero's pretraining, we compiled 341,896 fundus images paired with texts, sourced from public datasets, ophthalmic literature, and online resources, encompassing a diverse range of diseases across multiple ethnicities and countries. RetiZero exhibits remarkable performance in several downstream tasks, including zero-shot disease recognition, image-to-image retrieval, AI-assisted clinical diagnosis, few-shot fine-tuning, and internal- and cross-domain disease identification. In zero-shot scenarios, RetiZero achieves Top-5 accuracies of 0.843 for 15 diseases and 0.756 for 52 diseases. For image retrieval, it achieves Top-5 scores of 0.950 and 0.886 for the same sets, respectively. AI-assisted clinical diagnosis results show that RetiZero's Top-3 zero-shot performance surpasses the average of 19 ophthalmologists from Singapore, China, and the United States. RetiZero substantially enhances clinicians' accuracy in diagnosing fundus diseases, particularly rare ones. These findings underscore the value of integrating RetiZero into clinical settings, where various fundus diseases are encountered.

Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.09317 [eess.IV]
	(or arXiv:2406.09317v3 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2406.09317

Submission history

From: Meng Wang
[v1] Thu, 13 Jun 2024 16:53:57 UTC (11,842 KB)
[v2] Sun, 30 Jun 2024 17:32:15 UTC (12,883 KB)
[v3] Thu, 10 Apr 2025 16:51:29 UTC (10,394 KB)
