An Instance-Dependent Simulation Framework for Learning with Label Noise

Gu, Keren; Masotto, Xander; Bachani, Vandana; Lakshminarayanan, Balaji; Nikodem, Jack; Yin, Dong

Computer Science > Machine Learning

arXiv:2107.11413 (cs)

[Submitted on 23 Jul 2021 (v1), last revised 17 Oct 2021 (this version, v4)]

Abstract: We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm. We show that the distribution of the synthetic noisy labels generated with our framework is closer to that of human labels than the distributions produced by independent and class-conditional random flipping. Equipped with controllable label noise, we study the negative impact of noisy labels across a few practical settings to understand when label noise is more problematic. We also benchmark several existing algorithms for learning with noisy labels and compare their behavior on our synthetic datasets and on datasets with independent random label noise. Additionally, using the annotator information available from our simulation framework, we propose a new technique, Label Quality Model (LQM), that leverages annotator features to predict and correct noisy labels. We show that adding LQM as a label correction step before applying existing noisy-label techniques further improves the models' performance.
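The three noise models named in the abstract can be contrasted with a small sketch. It is not the authors' framework: the abstract does not describe its internals, so every function name below is hypothetical, and `sample_instance_dependent` is only a simplified stand-in that assumes a per-example predictive distribution (for instance, from some pseudo-labeling model) is already available.

```python
import random


def flip_independent(labels, num_classes, noise_rate, rng):
    """Independent random flipping: with probability noise_rate, replace a
    label with a different class chosen uniformly at random."""
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return noisy


def flip_class_conditional(labels, transition, rng):
    """Class-conditional flipping: transition[y] is the distribution of the
    noisy label given the true class y (each row sums to 1)."""
    classes = list(range(len(transition)))
    return [rng.choices(classes, weights=transition[y])[0] for y in labels]


def sample_instance_dependent(probs, rng):
    """Instance-dependent noise (assumed simplification): draw each noisy
    label from that example's own predictive distribution, so ambiguous
    examples are mislabeled more often than clear ones."""
    return [rng.choices(list(range(len(p))), weights=p)[0] for p in probs]


rng = random.Random(0)
true_labels = [0, 1, 2, 1, 0]

# Same noise rate for every example, regardless of class or content.
independent = flip_independent(true_labels, num_classes=3, noise_rate=0.2, rng=rng)

# Noise depends on the true class only: class 2 is confused with class 1.
transition = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.3, 0.7]]
class_conditional = flip_class_conditional(true_labels, transition, rng)

# Noise depends on the example: the second example is genuinely ambiguous.
probs = [[0.9, 0.1, 0.0], [0.4, 0.5, 0.1], [0.0, 0.0, 1.0]]
instance_dependent = sample_instance_dependent(probs, rng)
```

In the first two models every example of a given class is equally likely to be mislabeled, whereas in the third the noise rate varies per example, which is the property the abstract describes as instance-dependent.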

Comments:	Datasets released at this https URL
Subjects:	Machine Learning (cs.LG); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2107.11413 [cs.LG]
	(or arXiv:2107.11413v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2107.11413

Submission history

From: Dong Yin
[v1] Fri, 23 Jul 2021 18:53:53 UTC (1,551 KB)
[v2] Sun, 29 Aug 2021 04:24:14 UTC (1,551 KB)
[v3] Tue, 28 Sep 2021 18:26:13 UTC (1,552 KB)
[v4] Sun, 17 Oct 2021 21:20:16 UTC (1,552 KB)
