Pushing the envelope in deep visual recognition for mobile platforms

Alvino, Lorenzo

Abstract:Image classification is the task of assigning to an input image a label from a fixed set of categories. One of its most important applicative fields is that of robotics, in particular the needing of a robot to be aware of what's around and the consequent exploitation of that information as a benefit for its tasks. In this work we consider the problem of a robot that enters a new environment and wants to understand visual data coming from its camera, so to extract knowledge from them. As main novelty we want to overcome the needing of a physical robot, as it could be expensive and unhandy, so to hopefully enhance, speed up and ease the research in this field. That's why we propose to develop an application for a mobile platform that wraps several deep visual recognition tasks. First we deal with a simple Image classification, testing a model obtained from an AlexNet trained on the ILSVRC 2012 dataset. Several photo settings are considered to better understand which factors affect most the quality of classification. For the same purpose we are interested to integrate the classification task with an extra module dealing with segmentation of the object inside the image. In particular we propose a technique for extracting the object shape and moving out all the background, so to focus the classification only on the region occupied by the object. Another significant task that is included is that of object discovery. Its purpose is to simulate the situation in which the robot needs a certain object to complete one of its activities. It starts searching for what it needs by looking around and trying to understand the location of the object by scanning the surrounding environment. Finally we provide a tool for dealing with the creation of customized task-specific databases, meant to better suit to one's needing in a particular vision task.

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:1710.05982 [cs.CV]
	(or arXiv:1710.05982v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1710.05982

Computer Science > Computer Vision and Pattern Recognition

Title:Pushing the envelope in deep visual recognition for mobile platforms

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators