Learning deep representations for video-based intake gesture detection

Rouast, Philipp V.; Adam, Marc T. P.

doi:10.1109/JBHI.2019.2942845

arXiv:1909.10695 (cs)

[Submitted on 24 Sep 2019]

Abstract: Automatic detection of individual intake gestures during eating occasions has the potential to improve dietary monitoring and support dietary recommendations. Existing studies typically make use of on-body solutions such as inertial and audio sensors, while video is used as ground truth. Intake gesture detection directly based on video has rarely been attempted. In this study, we address this gap and show that deep learning architectures can successfully be applied to the problem of video-based detection of intake gestures. For this purpose, we collect and label video data of eating occasions using 360-degree video of 102 participants. Applying state-of-the-art approaches from video action recognition, our results show that (1) the best model achieves an $F_1$ score of 0.858, (2) appearance features contribute more than motion features, and (3) temporal context in the form of multiple video frames is essential for top model performance.
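
For readers unfamiliar with the metric, the $F_1$ score reported in the abstract is the harmonic mean of precision and recall. The following is a minimal sketch, assuming detections have already been matched against ground-truth gestures and counted as true positives, false positives, and false negatives. The function name and the example counts are illustrative and not taken from the paper, and the paper's own matching scheme is not described in this abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.

    tp: detections matched to a ground-truth intake gesture
    fp: detections with no matching ground-truth gesture
    fn: ground-truth gestures that were not detected
    """
    denom = 2 * tp + fp + fn
    # With no detections and no ground-truth gestures the score is undefined;
    # returning 0.0 here is a convention chosen for this sketch.
    return 2 * tp / denom if denom else 0.0


# Hypothetical counts for illustration only; not results from the paper.
example = f1_score(tp=90, fp=15, fn=10)
print(f"F1 = {example:.3f}")
```

Because $F_1$ ignores true negatives, it suits detection tasks like this one, where most video frames contain no intake gesture.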

Comments:	To be published in IEEE Journal of Biomedical and Health Informatics
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:1909.10695 [cs.CV]
	(or arXiv:1909.10695v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1909.10695
Related DOI:	https://doi.org/10.1109/JBHI.2019.2942845

Submission history

From: Philipp V. Rouast
[v1] Tue, 24 Sep 2019 03:29:53 UTC (4,302 KB)

