Factors in Finetuning Deep Model for object detection

Ouyang, Wanli; Wang, Xiaogang; Zhang, Cong; Yang, Xiaokang

arXiv:1601.05150 (cs)

[Submitted on 20 Jan 2016 (v1), last revised 14 Apr 2016 (this version, v2)]

Abstract: Finetuning from a pretrained deep model has been found to yield state-of-the-art performance on many vision tasks. This paper investigates several factors that influence finetuning performance for object detection. In object detection, the number of samples per class follows a long-tailed distribution. Our analysis and empirical results show that classes with more samples have a greater impact on feature learning, and that it is better to make the number of samples more uniform across classes. Generic object detection can be viewed as multiple equally important tasks, where detecting each class is one task. Each class (task) has its own discriminative visual appearance representation. Taking this individuality into account, we cluster objects into visually similar class groups and learn deep representations for these groups separately. We propose a hierarchical feature learning scheme in which the knowledge from a group with a large number of classes is transferred to learn features for its sub-groups. Finetuned from the GoogLeNet model, our approach achieves a 4.7% absolute mAP improvement on the ImageNet object detection dataset without adding much computational cost at test time.
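The abstract's remark about making the number of samples more uniform across classes can be illustrated with a minimal sketch. The abstract does not say how the authors do this, so what follows is only one simple possibility, assuming a per-class cap applied by random subsampling. The function names `cap_samples_per_class` and `class_counts`, the `max_per_class` parameter, and the toy class names are all hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict


def cap_samples_per_class(samples, max_per_class, seed=0):
    """Keep at most `max_per_class` randomly chosen samples for each label.

    `samples` is a list of (item, label) pairs. This is an illustrative
    assumption about how a long-tailed set could be flattened, not the
    procedure used in the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in samples:
        by_class[label].append(item)

    balanced = []
    for label in sorted(by_class):
        items = by_class[label]
        # Head classes are subsampled; tail classes are kept whole.
        if len(items) > max_per_class:
            items = rng.sample(items, max_per_class)
        balanced.extend((item, label) for item in items)
    return balanced


def class_counts(samples):
    """Count how many samples each label has."""
    counts = defaultdict(int)
    for _, label in samples:
        counts[label] += 1
    return dict(counts)


# Toy long-tailed dataset: one head class and two smaller classes.
toy = (
    [(f"person_{i}", "person") for i in range(1000)]
    + [(f"dog_{i}", "dog") for i in range(120)]
    + [(f"hammer_{i}", "hammer") for i in range(15)]
)

balanced = cap_samples_per_class(toy, max_per_class=100)
print(class_counts(balanced))  # {'dog': 100, 'hammer': 15, 'person': 100}
```

In this toy run the head class shrinks from 1000 to 100 samples while the 15-sample tail class is untouched, so the largest-to-smallest ratio drops from roughly 67:1 to roughly 7:1. Capping is only one way to flatten the distribution; oversampling tail classes or reweighting the loss are alternatives, and nothing in the abstract indicates which the authors prefer.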

Comments:	CVPR 2016 camera-ready version. Our ImageNet Large Scale Visual Recognition Challenge (ILSVRC15) object detection results (ranked 3rd with provided data and 2nd with external data) are based on this method. Code available later on this http URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1601.05150 [cs.CV]
	(or arXiv:1601.05150v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1601.05150

Submission history

From: Wanli Ouyang
[v1] Wed, 20 Jan 2016 02:19:48 UTC (701 KB)
[v2] Thu, 14 Apr 2016 01:15:12 UTC (1,183 KB)
