CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

Irvin, Jeremy; Rajpurkar, Pranav; Ko, Michael; Yu, Yifan; Ciurea-Ilcus, Silviana; Chute, Chris; Marklund, Henrik; Haghgoo, Behzad; Ball, Robyn; Shpanskaya, Katie; Seekins, Jayne; Mong, David A.; Halabi, Safwan S.; Sandberg, Jesse K.; Jones, Ricky; Larson, David B.; Langlotz, Curtis P.; Patel, Bhavik N.; Lungren, Matthew P.; Ng, Andrew Y.

arXiv:1901.07031 (cs)

[Submitted on 21 Jan 2019]

Abstract: Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models.
The dataset is freely available at this https URL.
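
The abstract mentions several ways of using uncertainty labels during training but does not spell them out. The following is a minimal sketch of what such handling could look like, not the authors' implementation. It assumes an uncertain mention is encoded as -1, and the policy names ("zeros", "ones", "ignore") and both helper functions are hypothetical, introduced only for illustration.

```python
import math
from typing import List, Optional, Sequence

UNCERTAIN = -1  # assumed encoding for an uncertain label; not stated in the abstract


def resolve_uncertain(labels: Sequence[int], policy: str) -> List[Optional[int]]:
    """Replace uncertain labels according to a hypothetical policy.

    "zeros"  -> treat uncertain as negative (0)
    "ones"   -> treat uncertain as positive (1)
    "ignore" -> mark uncertain as None so the loss can skip it
    """
    fills = {"zeros": 0, "ones": 1, "ignore": None}
    if policy not in fills:
        raise ValueError(f"unknown policy: {policy!r}")
    fill = fills[policy]
    return [fill if y == UNCERTAIN else y for y in labels]


def masked_bce(probs: Sequence[float], targets: Sequence[Optional[int]]) -> float:
    """Mean binary cross-entropy over the entries whose target is not None."""
    eps = 1e-12  # clamp probabilities to keep log() finite
    terms = []
    for p, t in zip(probs, targets):
        if t is None:
            continue  # ignored (uncertain) entry contributes nothing
        p = min(max(p, eps), 1.0 - eps)
        terms.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return sum(terms) / len(terms) if terms else 0.0


if __name__ == "__main__":
    raw = [1, 0, UNCERTAIN]
    for policy in ("zeros", "ones", "ignore"):
        targets = resolve_uncertain(raw, policy)
        print(policy, targets, masked_bce([0.5, 0.5, 0.9], targets))
```

Because the policy is applied per label list, a different policy could be chosen for each observation, which is one way to act on the finding that different uncertainty approaches suit different pathologies.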

Comments:	Published in AAAI 2019
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:1901.07031 [cs.CV]
	(or arXiv:1901.07031v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1901.07031

Submission history

From: Jeremy Irvin
[v1] Mon, 21 Jan 2019 18:41:59 UTC (1,392 KB)
