Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

Fan, Zicong; Ohkawa, Takehiko; Yang, Linlin; Lin, Nie; Zhou, Zhishan; Zhou, Shihao; Liang, Jiajun; Gao, Zhong; Zhang, Xuanyang; Zhang, Xue; Li, Fei; Liu, Zheng; Lu, Feng; Zeid, Karim Abou; Leibe, Bastian; On, Jeongwan; Baek, Seungryul; Prakash, Aditya; Gupta, Saurabh; He, Kun; Sato, Yoichi; Hilliges, Otmar; Chang, Hyung Jin; Yao, Angela

arXiv:2403.16428 (cs)

[Submitted on 25 Mar 2024 (v1), last revised 6 Aug 2024 (this version, v2)]

Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from head movement. To this end, we designed the HANDS23 challenge based on the AssemblyHands and ARCTIC datasets with carefully designed training and testing splits. Based on the results of the top submitted methods and more recent baselines on the leaderboards, we perform a thorough analysis on 3D hand(-object) reconstruction tasks. Our analysis demonstrates the effectiveness of addressing distortion specific to egocentric cameras, adopting high-capacity transformers to learn complex hand-object interactions, and fusing predictions from different views. Our study further reveals challenging scenarios intractable with state-of-the-art methods, such as fast hand motion, object reconstruction from narrow egocentric views, and close contact between two hands and objects. Our efforts will enrich the community's knowledge foundation and facilitate future hand studies on egocentric hand-object interactions.

Comments:	Accepted to ECCV 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.16428 [cs.CV]
	(or arXiv:2403.16428v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.16428

Submission history

From: Linlin Yang
[v1] Mon, 25 Mar 2024 05:12:21 UTC (3,689 KB)
[v2] Tue, 6 Aug 2024 03:44:00 UTC (3,480 KB)
