MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Huang, Junwen; Yu, Hao; Yu, Kuan-Ting; Navab, Nassir; Ilic, Slobodan; Busam, Benjamin

arXiv:2403.01517 (cs)

[Submitted on 3 Mar 2024 (v1), last revised 8 May 2024 (this version, v2)]

Abstract: Recent learning methods for object pose estimation require resource-intensive training for each individual object instance or category, which hampers their scalability in real applications when they are confronted with previously unseen objects. In this paper, we propose MatchU, a Fuse-Describe-Match strategy for 6D pose estimation from RGB-D images. MatchU is a generic approach that fuses 2D texture and 3D geometric cues for 6D pose prediction of unseen objects. We rely on learning geometric 3D descriptors that are rotation-invariant by design. By encoding pose-agnostic geometry, the learned descriptors naturally generalize to unseen objects and capture symmetries. To resolve the ambiguous associations that arise when using 3D geometry only, we fuse additional RGB information into our descriptor. This is achieved through a novel attention-based mechanism that fuses cross-modal information, together with a matching loss that leverages the latent space learned from RGB data to guide the descriptor learning process. Extensive experiments demonstrate the generalizability of the RGB-D fusion strategy and the efficacy of the descriptor. Benefiting from these novel designs, MatchU surpasses all existing methods by a significant margin in terms of both accuracy and speed, even without the requirement of expensive re-training or rendering.
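The abstract's notion of a descriptor that is "rotation-invariant by design" can be illustrated with a toy example. The sketch below is not MatchU's learned descriptor; it is a hand-crafted stand-in (sorted pairwise distances within a small point patch), and the names `rotate`, `toy_descriptor`, and `patch` are hypothetical. It only shows the property itself: a signature built from pose-agnostic quantities stays the same when the patch is rotated.

```python
import math
from itertools import combinations


def rotate(points, axis, angle):
    """Rotate 3D points about a unit axis by `angle` radians (Rodrigues' formula)."""
    ux, uy, uz = axis
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y, z in points:
        dot = ux * x + uy * y + uz * z
        # Cross product axis x point.
        cx, cy, cz = uy * z - uz * y, uz * x - ux * z, ux * y - uy * x
        out.append((
            x * c + cx * s + ux * dot * (1 - c),
            y * c + cy * s + uy * dot * (1 - c),
            z * c + cz * s + uz * dot * (1 - c),
        ))
    return out


def toy_descriptor(points):
    """Sorted pairwise distances: a hand-crafted, rotation-invariant signature.

    Assumption for illustration only; the paper learns its descriptors.
    """
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))


# A small, arbitrary point patch and an arbitrary rotation.
patch = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.5, 0.5, 3.0)]
axis = (1 / math.sqrt(3),) * 3
rotated = rotate(patch, axis, 1.234)

# The coordinates change, but the descriptor agrees up to float tolerance.
same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(toy_descriptor(patch), toy_descriptor(rotated))
)
print(same)
```

A distance-only signature like this one also cannot tell apart regions with identical geometry, which is the kind of ambiguity the abstract says the additional RGB information is meant to resolve.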

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.01517 [cs.CV]
	(or arXiv:2403.01517v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.01517

Submission history

From: Junwen Huang
[v1] Sun, 3 Mar 2024 14:01:03 UTC (17,700 KB)
[v2] Wed, 8 May 2024 11:54:05 UTC (27,554 KB)
