-
Towards Accountable AI-Assisted Eye Disease Diagnosis: Workflow Design, External Validation, and Continual Learning
Authors:
Qingyu Chen,
Tiarnan D L Keenan,
Elvira Agron,
Alexis Allot,
Emily Guan,
Bryant Duong,
Amr Elsawy,
Benjamin Hou,
Cancan Xue,
Sanjeeb Bhandari,
Geoffrey Broadhead,
Chantal Cousineau-Krieger,
Ellen Davis,
William G Gensheimer,
David Grasic,
Seema Gupta,
Luis Haddock,
Eleni Konstantinou,
Tania Lamba,
Michele Maiberger,
Dimosthenis Mantopoulos,
Mitul C Mehta,
Ayman G Nahri,
Mutaz AL-Nawaflh,
Arnold Oshinsky, et al. (13 additional authors not shown)
Abstract:
Timely disease diagnosis is challenging due to increasing disease burdens and limited clinician availability. AI shows promise in diagnosis accuracy but faces real-world application issues due to insufficient validation in clinical workflows and diverse populations. This study addresses gaps in medical AI downstream accountability through a case study on age-related macular degeneration (AMD) diagnosis and severity classification. We designed and implemented an AI-assisted diagnostic workflow for AMD, comparing diagnostic performance with and without AI assistance among 24 clinicians from 12 institutions with real patient data sampled from the Age-Related Eye Disease Study (AREDS). Additionally, we demonstrated continual enhancement of an existing AI model by incorporating approximately 40,000 additional medical images (named AREDS2 dataset). The improved model was then systematically evaluated using both AREDS and AREDS2 test sets, as well as an external test set from Singapore. AI assistance markedly enhanced diagnostic accuracy and classification for 23 out of 24 clinicians, with the average F1-score increasing by 20% from 37.71 (Manual) to 45.52 (Manual + AI) (P-value < 0.0001), achieving an improvement of over 50% in some cases. In terms of efficiency, AI assistance reduced diagnostic times for 17 out of the 19 clinicians tracked, with time savings of up to 40%. Furthermore, a model equipped with continual learning showed robust performance across three independent datasets, recording a 29% increase in accuracy, and elevating the F1-score from 42 to 54 in the Singapore population.
Submitted 23 September, 2024;
originally announced September 2024.
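The relative gains quoted in the abstract above can be checked with simple arithmetic. The following is a minimal sketch: the helper name `relative_gain_pct` is hypothetical (it does not come from the paper), and the inputs are the F1-scores reported in the abstract.

```python
def relative_gain_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Average clinician F1-score: Manual vs. Manual + AI (values from the abstract).
clinician_gain = relative_gain_pct(37.71, 45.52)

# F1-score on the Singapore test set before and after continual learning.
singapore_gain = relative_gain_pct(42.0, 54.0)

print(f"Clinician F1 relative gain: {clinician_gain:.1f}%")
print(f"Singapore F1 relative gain: {singapore_gain:.1f}%")
```

The first figure (about 20.7%) is consistent with the "20%" stated for the clinicians. The second (about 28.6%) is only the relative change in F1-score; the abstract's 29% refers to accuracy, a separate metric whose underlying values are not given.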
-
Multi-modal, multi-task, multi-attention (M3) deep learning detection of reticular pseudodrusen: towards automated and accessible classification of age-related macular degeneration
Authors:
Qingyu Chen,
Tiarnan D. L. Keenan,
Alexis Allot,
Yifan Peng,
Elvira Agrón,
Amitha Domalpally,
Caroline C. W. Klaver,
Daniel T. Luttikhuizen,
Marcus H. Colyer,
Catherine A. Cukras,
Henry E. Wiley,
M. Teresa Magone,
Chantal Cousineau-Krieger,
Wai T. Wong,
Yingying Zhu,
Emily Y. Chew,
Zhiyong Lu
Abstract:
Objective: Reticular pseudodrusen (RPD), a key feature of age-related macular degeneration (AMD), are poorly detected by human experts on standard color fundus photography (CFP) and typically require advanced imaging modalities such as fundus autofluorescence (FAF). The objective was to develop and evaluate the performance of a novel 'M3' deep learning framework on RPD detection. Materials and Methods: A deep learning framework M3 was developed to detect RPD presence accurately using CFP alone, FAF alone, or both, employing >8000 CFP-FAF image pairs obtained prospectively (Age-Related Eye Disease Study 2). The M3 framework includes multi-modal (detection from single or multiple image modalities), multi-task (training different tasks simultaneously to improve generalizability), and multi-attention (improving ensembled feature representation) operation. Performance on RPD detection was compared with state-of-the-art deep learning models and 13 ophthalmologists; performance on detection of two other AMD features (geographic atrophy and pigmentary abnormalities) was also evaluated. Results: For RPD detection, M3 achieved area under receiver operating characteristic (AUROC) 0.832, 0.931, and 0.933 for CFP alone, FAF alone, and both, respectively. M3 performance on CFP was very substantially superior to human retinal specialists (median F1-score 0.644 versus 0.350). External validation (on Rotterdam Study, Netherlands) demonstrated high accuracy on CFP alone (AUROC 0.965). The M3 framework also accurately detected geographic atrophy and pigmentary abnormalities (AUROC 0.909 and 0.912, respectively), demonstrating its generalizability. Conclusion: This study demonstrates the successful development, robust evaluation, and external validation of a novel deep learning framework that enables accessible, accurate, and automated AMD diagnosis and prognosis.
Submitted 11 November, 2020; v1 submitted 8 November, 2020;
originally announced November 2020.
-
Predicting risk of late age-related macular degeneration using deep learning
Authors:
Yifan Peng,
Tiarnan D. Keenan,
Qingyu Chen,
Elvira Agrón,
Alexis Allot,
Wai T. Wong,
Emily Y. Chew,
Zhiyong Lu
Abstract:
By 2040, age-related macular degeneration (AMD) will affect approximately 288 million people worldwide. Identifying individuals at high risk of progression to late AMD, the sight-threatening stage, is critical for clinical actions, including medical interventions and timely monitoring. Although deep learning has shown promise in diagnosing/screening AMD using color fundus photographs, it remains difficult to predict individuals' risks of late AMD accurately. For both tasks, these initial deep learning attempts have remained largely unvalidated in independent cohorts. Here, we demonstrate how deep learning and survival analysis can predict the probability of progression to late AMD using 3,298 participants (over 80,000 images) from the Age-Related Eye Disease Studies AREDS and AREDS2, the largest longitudinal clinical trials in AMD. When validated against an independent test dataset of 601 participants, our model achieved high prognostic accuracy (five-year C-statistic 86.4 (95% confidence interval 86.2-86.6)) that substantially exceeded that of retinal specialists using two existing clinical standards (81.3 (81.1-81.5) and 82.0 (81.8-82.3), respectively). Interestingly, our approach offers additional strengths over the existing clinical standards in AMD prognosis (e.g., risk ascertainment above 50%) and is likely to be highly generalizable, given the breadth of training data from 82 US retinal specialty clinics. Indeed, during external validation through training on AREDS and testing on AREDS2 as an independent cohort, our model retained substantially higher prognostic accuracy than existing clinical standards. These results highlight the potential of deep learning systems to enhance clinical decision-making in AMD patients.
Submitted 18 July, 2020;
originally announced July 2020.
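The C-statistic reported in the abstract above measures how well predicted risks rank participants by time to progression. The abstract does not say which estimator was used, so the following is only a minimal sketch of one common choice, Harrell's concordance index for right-censored data. The function name `harrell_c` and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    times  -- observed follow-up time per subject
    events -- 1 if progression was observed, 0 if censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored: pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk progressed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration only).
times = [2, 4, 5, 7]
events = [1, 0, 1, 1]
risks = [0.9, 0.3, 0.6, 0.7]

c_index = harrell_c(times, events, risks)
print(f"C-index: {c_index:.2f}")
```

On this toy input, four pairs are comparable and three are ranked correctly, giving 0.75. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking; the abstract reports its C-statistics on a 0 to 100 scale.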