Learning to Detect and Retrieve Objects from Unlabeled Videos

Amrani, Elad; Ben-Ari, Rami; Hakim, Tal; Bronstein, Alex

arXiv:1905.11137 (cs)

[Submitted on 27 May 2019 (v1), last revised 19 Oct 2019 (this version, v2)]

Abstract: Learning object detection or retrieval requires a large data set with manual annotations. Such data sets are expensive and time-consuming to create and therefore difficult to obtain on a large scale. In this work, we propose to exploit the natural correlation between narrations and the visual presence of objects in video to learn object detection and retrieval without any manual labeling. We pose the problem as weakly supervised learning with noisy labels and propose a novel object detection paradigm under these constraints. We handle background rejection by using contrastive samples and confront the high level of label noise with a new clustering score. Our evaluation is based on a set of 11 manually annotated objects in over 5000 frames. We compare against a weakly supervised approach as a baseline and provide a strongly labeled upper bound.

Comments:	ICCV 2019 Workshop on Multi-modal Video Analysis and Moments in Time Challenge
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
ACM classes:	I.2.10; I.4; I.5
Cite as:	arXiv:1905.11137 [cs.CV]
	(or arXiv:1905.11137v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1905.11137

Submission history

From: Elad Amrani
[v1] Mon, 27 May 2019 11:36:32 UTC (3,266 KB)
[v2] Sat, 19 Oct 2019 07:26:57 UTC (5,805 KB)
