Grader variability and the importance of reference standards for evaluating machine learning models for diabetic retinopathy

Krause, Jonathan; Gulshan, Varun; Rahimy, Ehsan; Karth, Peter; Widner, Kasumi; Corrado, Greg S.; Peng, Lily; Webster, Dale R.

doi:10.1016/j.ophtha.2018.01.034

arXiv:1710.01711 (cs)

[Submitted on 4 Oct 2017 (v1), last revised 3 Jul 2018 (this version, v3)]

Abstract: Diabetic retinopathy (DR) and diabetic macular edema are common complications of diabetes which can lead to vision loss. The grading of DR is a fairly complex process that requires the detection of fine features such as microaneurysms, intraretinal hemorrhages, and intraretinal microvascular abnormalities. Because of this, there can be a fair amount of grader variability. There are different methods of obtaining the reference standard and resolving disagreements between graders, and while it is usually accepted that adjudication until full consensus will yield the best reference standard, the difference between various methods of resolving disagreements has not been examined extensively. In this study, we examine the variability in different methods of grading, definitions of reference standards, and their effects on building deep learning models for the detection of diabetic eye disease. We find that a small set of adjudicated DR grades allows substantial improvements in algorithm performance. The resulting algorithm's performance was on par with that of individual U.S. board-certified ophthalmologists and retinal specialists.
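
To make the idea of resolving grader disagreements concrete, here is a minimal Python sketch, not the authors' code. It assumes DR grades on a five-point integer scale (0 to 4), builds a majority-vote reference standard from several graders, and measures agreement between two graders with a quadratic-weighted Cohen's kappa. The function names, the tie-breaking rule (lower median), and the toy grades are all illustrative assumptions and do not come from the paper.

```python
from collections import Counter
from statistics import median_low


def majority_grade(grades):
    """Return the most frequent grade for one image.

    Ties are broken with the lower median of all grades, which is an
    assumed rule for this sketch, not one taken from the paper.
    """
    counts = Counter(grades)
    top = max(counts.values())
    winners = [g for g, c in counts.items() if c == top]
    if len(winners) == 1:
        return winners[0]
    return median_low(grades)


def quadratic_weighted_kappa(a, b, num_grades=5):
    """Quadratic-weighted Cohen's kappa between two lists of integer grades."""
    n = len(a)
    # Confusion matrix of grader a (rows) against grader b (columns).
    observed = [[0] * num_grades for _ in range(num_grades)]
    for x, y in zip(a, b):
        observed[x][y] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_grades))
              for j in range(num_grades)]
    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(num_grades):
        for j in range(num_grades):
            weight = (i - j) ** 2 / (num_grades - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_a[i] * hist_b[j] / n
    if den == 0:
        # Both graders used one identical grade throughout.
        return 1.0
    return 1.0 - num / den


# Toy grades for five images from three hypothetical graders.
grader_a = [0, 1, 2, 2, 4]
grader_b = [0, 2, 2, 3, 4]
grader_c = [0, 1, 2, 3, 3]

reference = [majority_grade(g) for g in zip(grader_a, grader_b, grader_c)]
kappa_ab = quadratic_weighted_kappa(grader_a, grader_b)
```

Adjudication until full consensus, as described in the abstract, is a discussion among graders and cannot be reduced to a formula like this; the majority vote above is only one of the simpler alternatives it is compared against.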

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1710.01711 [cs.CV]
	(or arXiv:1710.01711v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1710.01711
Journal reference:	Ophthalmology (2018)
Related DOI:	https://doi.org/10.1016/j.ophtha.2018.01.034

Submission history

From: Jonathan Krause
[v1] Wed, 4 Oct 2017 17:29:06 UTC (288 KB)
[v2] Wed, 30 May 2018 23:33:08 UTC (521 KB)
[v3] Tue, 3 Jul 2018 18:02:16 UTC (521 KB)
