Towards Efficient Deep Inference for Mobile Applications

Guo, Tian

Computer Science > Performance

arXiv:1707.04610v1 (cs)

[Submitted on 14 Jul 2017 (this version), latest version 15 Apr 2018 (v2)]

Title: Towards Efficient Deep Inference for Mobile Applications

Authors: Tian Guo


Abstract: Mobile applications benefit significantly from advances in deep learning, for example by offering new features. Given a trained deep learning model, an application typically performs a series of matrix operations on the input data to infer possible output values. Because of their computational complexity and growing size, trained models are usually hosted in the cloud, so mobile apps that use them must send input data over the network. While cloud-based deep learning can provide reasonable response times for mobile apps, it also restricts the usage scenarios, e.g. mobile apps need network access. With mobile-specific deep learning optimizations, it is now possible to perform inference on the device. However, because mobile hardware, e.g. GPU and memory size, can be very different from and more limited than its desktop counterparts, it is important to understand the feasibility of this new device-based deep learning inference architecture. In this paper, we empirically evaluate the inference efficiency of three Convolutional Neural Networks using a benchmark Android application we developed. Based on our application-driven analysis, we identify several performance bottlenecks for mobile applications powered by on-device deep learning inference.

Subjects:	Performance (cs.PF); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:1707.04610 [cs.PF]
	(or arXiv:1707.04610v1 [cs.PF] for this version)
	https://doi.org/10.48550/arXiv.1707.04610

Submission history

From: Tian Guo
[v1] Fri, 14 Jul 2017 19:05:50 UTC (1,325 KB)
[v2] Sun, 15 Apr 2018 17:48:20 UTC (1,407 KB)
