Find Any Part in 3D

Ma, Ziqi; Yue, Yisong; Gkioxari, Georgia

arXiv:2411.13550 (cs)

[Submitted on 20 Nov 2024 (v1), last revised 28 Mar 2025 (this version, v2)]

Abstract: Why don't we have foundation models in 3D yet? A key limitation is data scarcity. For 3D object part segmentation, existing datasets are small in size and lack diversity. We show that it is possible to break this data barrier by building a data engine powered by 2D foundation models. Our data engine automatically annotates any number of object parts: 1755x more unique part types than existing datasets combined. By training on our annotated data with a simple contrastive objective, we obtain an open-world model that generalizes to any part in any object based on any text query. Even when evaluated zero-shot, we outperform existing methods on the datasets they train on. We achieve 260% improvement in mIoU and boost speed by 6x to 300x. Our scaling analysis confirms that this generalization stems from the data scale, which underscores the impact of our data engine. Finally, to advance general-category open-world 3D part segmentation, we release a benchmark covering a wide range of objects and parts. Project website: this https URL
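
The abstract reports its accuracy gain in mIoU (mean intersection over union). For readers unfamiliar with the metric, the following is a minimal sketch of how a per-part mIoU could be computed for a single object. It is illustrative only: the function names, the one-label-per-point representation, and the choice to skip parts absent from both prediction and ground truth are assumptions, not the paper's evaluation protocol.

```python
from typing import Optional, Sequence


def part_iou(pred: Sequence[str], gt: Sequence[str], part: str) -> Optional[float]:
    """IoU for one part label over per-point labels (hypothetical helper).

    Returns None when the part appears in neither prediction nor ground truth.
    """
    intersection = sum(1 for p, g in zip(pred, gt) if p == part and g == part)
    union = sum(1 for p, g in zip(pred, gt) if p == part or g == part)
    return intersection / union if union else None


def mean_iou(pred: Sequence[str], gt: Sequence[str], parts: Sequence[str]) -> float:
    """Average the per-part IoU over the parts that occur (assumed convention)."""
    scores = [part_iou(pred, gt, part) for part in parts]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else 0.0


# Toy example: six points of a chair, each with a predicted and a true part label.
predicted = ["seat", "seat", "leg", "leg", "back", "back"]
ground_truth = ["seat", "seat", "seat", "leg", "back", "leg"]
score = mean_iou(predicted, ground_truth, ["seat", "leg", "back"])
print(f"mIoU = {score:.3f}")
```

In this toy case the per-part IoUs are 2/3 (seat), 1/3 (leg), and 1/2 (back), so the mean is 0.5. Benchmarks differ in how they average over parts, objects, and categories, so the reported numbers should be read against the paper's own definition.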

Comments:	Project website: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2411.13550 [cs.CV]
	(or arXiv:2411.13550v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.13550

Submission history

From: Ziqi Ma
[v1] Wed, 20 Nov 2024 18:59:01 UTC (2,064 KB)
[v2] Fri, 28 Mar 2025 04:36:55 UTC (5,592 KB)
