Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation

Gunawardena, Nishan; Lui, Gough Yumu; Ginige, Jeewani Anupama; Javadi, Bahman

doi:10.1016/j.iot.2024.101481

arXiv:2408.12463 (cs)

[Submitted on 22 Aug 2024 (v1), last revised 14 Jan 2025 (this version, v2)]

Title: Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation

Authors: Nishan Gunawardena, Gough Yumu Lui, Jeewani Anupama Ginige, Bahman Javadi

Abstract: A significant limitation of current smartphone-based eye-tracking algorithms is their low accuracy on video-type visual stimuli, as they are typically trained on static images. In addition, the growing demand for real-time interactive applications such as games, VR, and AR on smartphones requires overcoming resource constraints such as limited computational power, battery life, and network bandwidth. We therefore developed two new smartphone eye-tracking techniques for video-type visuals by combining Convolutional Neural Networks (CNN) with two different Recurrent Neural Networks (RNN), namely Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Our CNN+LSTM and CNN+GRU models achieved an average Root Mean Square Error of 0.955 cm and 1.091 cm, respectively. To address the computational constraints of smartphones, we developed an edge intelligence architecture to enhance the performance of smartphone-based eye tracking. We applied optimisation methods such as quantisation and pruning to the deep learning models to improve energy, CPU, and memory usage on edge devices, focusing on real-time processing. With model quantisation, the inference time of the CNN+LSTM and CNN+GRU models was reduced by 21.72% and 19.50%, respectively, on edge devices.
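
The abstract reports two kinds of figures: a Root Mean Square Error in centimetres and a percentage reduction in inference time. As a rough illustration of how such figures are commonly computed, here is a minimal Python sketch. It assumes the error is the Euclidean distance between predicted and true on-screen gaze points, which the abstract does not confirm, and the function names `gaze_rmse` and `percent_reduction` are hypothetical. The sample values are invented for illustration and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple

# Assumption: a gaze point is an (x, y) on-screen position in centimetres.
Point = Tuple[float, float]


def gaze_rmse(predicted: Sequence[Point], actual: Sequence[Point]) -> float:
    """Root mean square of the Euclidean distance between paired gaze points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    # Squared Euclidean distance for each (predicted, actual) pair.
    squared = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared) / len(squared))


def percent_reduction(baseline: float, optimised: float) -> float:
    """Relative reduction of `optimised` compared with `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - optimised) / baseline


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    predicted = [(3.0, 4.0), (0.0, 0.0)]
    actual = [(0.0, 0.0), (0.0, 0.0)]
    print(f"RMSE: {gaze_rmse(predicted, actual):.3f} cm")
    # Hypothetical timings: 100 ms before and 80 ms after optimisation.
    print(f"Inference time reduction: {percent_reduction(100.0, 80.0):.2f}%")
```

Under this reading, a reduction of 21.72% would mean the quantised model takes about 78.28% of the original inference time.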

Comments:	Included the three closely related papers as references. Expanded the future work section to provide a more thorough discussion of "varying lighting conditions" and "dynamic user environments". Added a note below Table 4 to clarify the meaning of the abbreviations. Elaborated on the role of the Domain Expert within the presentation layer in Section 4.1.
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Performance (cs.PF)
Cite as:	arXiv:2408.12463 [cs.CV]
	(or arXiv:2408.12463v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2408.12463
Journal reference:	Internet of Things, 2024, Article 101481, Elsevier
Related DOI:	https://doi.org/10.1016/j.iot.2024.101481

Submission history

From: Nishan Gunawardena Mr
[v1] Thu, 22 Aug 2024 15:04:59 UTC (21,435 KB)
[v2] Tue, 14 Jan 2025 01:57:04 UTC (2,363 KB)
