Contextual Action Recognition with R*CNN

Gkioxari, Georgia; Girshick, Ross; Malik, Jitendra

arXiv:1505.01197 (cs)

[Submitted on 5 May 2015 (v1), last revised 25 Mar 2016 (this version, v3)]

Abstract: There are multiple cues in an image which reveal what action a person is performing. For example, a jogger has a pose that is characteristic of jogging, but the scene (e.g., road, trail) and the presence of other joggers can be an additional source of information. In this work, we exploit the simple observation that actions are accompanied by contextual cues to build a strong action recognition system. We adapt RCNN to use more than one region for classification while still maintaining the ability to localize the action. We call our system R*CNN. The action-specific models and the feature maps are trained jointly, allowing for action-specific representations to emerge. R*CNN achieves 90.2% mean AP on the PASCAL VOC Action dataset, outperforming all other approaches in the field by a significant margin. Last, we show that R*CNN is not limited to action recognition. In particular, R*CNN can also be used to tackle fine-grained tasks such as attribute classification. We validate this claim by reporting state-of-the-art performance on the Berkeley Attributes of People dataset.
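The abstract says only that R*CNN uses more than one region for classification; it does not spell out how the regions are combined. As a minimal sketch, assume each action is scored by adding the score of the primary (person) region to the score of the best-scoring secondary (context) region, and that a softmax turns the scores into probabilities. The function names, the plain-list feature vectors, and the toy weights below are all hypothetical and chosen for illustration; they are not taken from the authors' code.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def action_scores(
    primary_feat: Vector,
    secondary_feats: Sequence[Vector],
    w_primary: Sequence[Vector],
    w_secondary: Sequence[Vector],
) -> List[float]:
    """Assumed scoring rule: primary-region score + max over secondary regions.

    w_primary[a] and w_secondary[a] are the (hypothetical) weight vectors
    for action a applied to primary and secondary region features.
    """
    scores = []
    for wp, ws in zip(w_primary, w_secondary):
        primary = dot(wp, primary_feat)
        # Keep only the most informative context region for this action.
        context = max(dot(ws, feat) for feat in secondary_feats)
        scores.append(primary + context)
    return scores


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over action scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: 2-dimensional features, two actions, two candidate context regions.
primary_feat = [1.0, 0.0]
secondary_feats = [[0.0, 1.0], [1.0, 1.0]]
w_primary = [[1.0, 0.0], [0.0, 1.0]]
w_secondary = [[0.0, 2.0], [1.0, 0.0]]

scores = action_scores(primary_feat, secondary_feats, w_primary, w_secondary)
probs = softmax(scores)
```

In this toy setup the first action receives the higher score because both its primary and its context weights respond to the given features. In a real system the feature vectors would come from a convolutional network and the weights would be learned jointly with it, as the abstract describes.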

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1505.01197 [cs.CV]
	(or arXiv:1505.01197v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1505.01197

Submission history

From: Georgia Gkioxari
[v1] Tue, 5 May 2015 21:56:10 UTC (9,425 KB)
[v2] Sat, 26 Sep 2015 20:29:26 UTC (7,161 KB)
[v3] Fri, 25 Mar 2016 01:06:01 UTC (7,073 KB)
