Computer Science > Performance
[Submitted on 14 Jul 2017 (this version), latest version 15 Apr 2018 (v2)]
Title: Towards Efficient Deep Inference for Mobile Applications
Abstract: Mobile applications are benefiting significantly from advances in deep learning, for example by offering new features. Given a trained deep learning model, an application typically performs a series of matrix operations on the input data to infer possible output values. Because of their computational complexity and growing size, trained models are usually hosted in the cloud, so mobile apps that use them must send input data over the network. While cloud-based deep learning can provide reasonable response times for mobile apps, it also restricts the use cases, e.g. the apps must have network access. With mobile-specific deep learning optimizations, it is now possible to run inference on the device. However, because mobile hardware, e.g. GPU and memory size, can be very different from and more limited than its desktop counterpart, it is important to understand the feasibility of this new device-based deep learning inference architecture. In this paper, we empirically evaluate the inference efficiency of three Convolutional Neural Networks using a benchmark Android application we developed. Based on our application-driven analysis, we identify several performance bottlenecks for mobile applications powered by on-device deep learning inference.
Submission history
From: Tian Guo
[v1] Fri, 14 Jul 2017 19:05:50 UTC (1,325 KB)
[v2] Sun, 15 Apr 2018 17:48:20 UTC (1,407 KB)