Computer Science > Computer Vision and Pattern Recognition

arXiv:1606.07256 (cs)
[Submitted on 23 Jun 2016]

Title: Saliency Driven Object recognition in egocentric videos with deep CNN

Authors: Philippe Pérez de San Roman, Jenny Benois-Pineau, Jean-Philippe Domenger, Florent Paclet, Daniel Cataert, Aymar de Rugy
Abstract: The problem of object recognition in natural scenes has recently been addressed successfully with Deep Convolutional Neural Networks (Deep CNNs), which have brought a significant breakthrough in recognition scores. The computational efficiency of Deep CNNs, as a function of their depth, allows for their use in real-time applications. One of the key issues here is to reduce the number of windows selected from images and submitted to a Deep CNN. This is usually solved by preliminary segmentation and selection of specific windows that score highly on "objectness" or on other indicators of the possible location of objects. In this paper we propose a Deep CNN approach and a general framework for recognizing objects in a real-time scenario and from an egocentric perspective. Here the window of interest is built from a visual attention map computed over gaze fixations measured by a glasses-worn eye-tracker. The application of this set-up is an interactive, user-friendly environment for upper-limb amputees. Vision has to help the subject control a worn neuro-prosthesis when few muscles remain and EMG control becomes inefficient. The recognition results on a specifically recorded corpus of 151 videos with simple geometrical objects show an mAP of 64.6% and a computational time at the generalization step lower than the duration of a visual fixation on the object of interest.
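The abstract does not say how the visual attention map or the window of interest is computed. As a rough illustration only, the sketch below assumes one isotropic Gaussian per gaze fixation, summed and scaled to a peak of 1, followed by a bounding box around all pixels above a threshold. The function names, `sigma`, and `threshold` are hypothetical and are not taken from the paper.

```python
import math


def gaze_saliency_map(fixations, width, height, sigma):
    """Sum one isotropic Gaussian per (x, y) fixation, scaled to a peak of 1.

    Assumption: a Gaussian-per-fixation model; the paper may use another.
    """
    saliency = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                saliency[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    peak = max(max(row) for row in saliency)
    if peak > 0:
        saliency = [[v / peak for v in row] for row in saliency]
    return saliency


def window_of_interest(saliency, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) around pixels >= threshold.

    Returns None when no pixel reaches the threshold (hypothetical rule).
    """
    xs, ys = [], []
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # A single fixation near the centre of a small 20x16 frame.
    attention = gaze_saliency_map([(10, 8)], width=20, height=16, sigma=2.0)
    print(window_of_interest(attention, threshold=0.5))
```

Under these assumptions only the cropped window, rather than many candidate windows, would be passed to the CNN, which is the reduction the abstract describes.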
Comments: 20 pages, 8 figures, 3 tables, Submitted to the Journal of Computer Vision and Image Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:1606.07256 [cs.CV]
  (or arXiv:1606.07256v1 [cs.CV] for this version)
  https://doi.org/10.48550/arXiv.1606.07256
arXiv-issued DOI via DataCite

Submission history

From: Jenny Benois-Pineau
[v1] Thu, 23 Jun 2016 10:10:31 UTC (2,497 KB)