Learning from Noisy Similar and Dissimilar Data

Dan, Soham; Bao, Han; Sugiyama, Masashi

arXiv:2002.00995 (cs)

[Submitted on 3 Feb 2020]

Abstract: With the widespread use of machine learning for classification, it becomes increasingly important to be able to use weaker kinds of supervision for tasks in which it is hard to obtain standard labeled data. One such kind of supervision is pairwise: Similar (S) pairs, in which the two examples belong to the same class, and Dissimilar (D) pairs, in which the two examples belong to different classes. This kind of supervision is realistic in privacy-sensitive domains. Although this problem has been studied recently, it is unclear how to learn from such supervision under label noise, which is very common when the supervision is crowd-sourced. In this paper, we close this gap and demonstrate how to learn a classifier from noisy S and D labeled data. We perform a detailed investigation of this problem under two realistic noise models and propose two algorithms to learn from noisy S-D data. We also show important connections between learning from such pairwise supervision data and learning from ordinary class-labeled data. Finally, we perform experiments on synthetic and real-world datasets and show that our noise-informed algorithms outperform noise-blind baselines in learning from noisy pairwise data.

Comments:	8 pages
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2002.00995 [cs.LG]
	(or arXiv:2002.00995v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2002.00995

Submission history

From: Soham Dan
[v1] Mon, 3 Feb 2020 19:59:16 UTC (8,099 KB)
