Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning

Wilcox, Albert; Ghanem, Mohamed; Moghani, Masoud; Barroso, Pierre; Joffe, Benjamin; Garg, Animesh

Computer Science > Computer Vision and Pattern Recognition

arXiv:2503.04877 (cs)

[Submitted on 6 Mar 2025 (v1), last revised 15 May 2025 (this version, v2)]

Abstract: Imitation Learning can train robots to perform complex and diverse manipulation tasks, but learned policies are brittle with observations outside of the training distribution. 3D scene representations that incorporate observations from calibrated RGBD cameras have been proposed as a way to mitigate this, but in our evaluations with unseen embodiments and camera viewpoints they show only modest improvement. To address those challenges, we propose Adapt3R, a general-purpose 3D observation encoder which synthesizes data from calibrated RGBD cameras into a vector that can be used as conditioning for arbitrary IL algorithms. The key idea is to use a pretrained 2D backbone to extract semantic information, using 3D only as a medium to localize this information with respect to the end-effector. We show across 93 simulated and 6 real tasks that when trained end-to-end with a variety of IL algorithms, Adapt3R maintains these algorithms' learning capacity while enabling zero-shot transfer to novel embodiments and camera poses.
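The key idea in the abstract (2D features carry the semantics, 3D is used only to place them relative to the end-effector) can be illustrated with a minimal geometric sketch. This is not the authors' implementation: the function names, the pinhole camera model, the 4x4 pose convention, and the placeholder feature values are all assumptions made for illustration, and the pretrained 2D backbone itself is left out.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat4 = List[List[float]]


def unproject(u: float, v: float, depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Back-project pixel (u, v) with metric depth into the camera frame (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def apply_transform(T: Mat4, p: Vec3) -> Vec3:
    """Apply a 4x4 homogeneous rigid transform to a 3D point."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))


def invert_rigid(T: Mat4) -> Mat4:
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    new_t = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def localize_features(pixels: Sequence[Tuple[float, float, float]],
                      features: Sequence[Sequence[float]],
                      intrinsics: Tuple[float, float, float, float],
                      cam_to_world: Mat4,
                      ee_to_world: Mat4):
    """Pair each 2D feature with its 3D position in the end-effector frame.

    pixels holds (u, v, depth) triples; features holds one vector per pixel,
    standing in for the output of a pretrained 2D backbone (hypothetical here).
    """
    fx, fy, cx, cy = intrinsics
    world_to_ee = invert_rigid(ee_to_world)
    out = []
    for (u, v, d), feat in zip(pixels, features):
        p_cam = unproject(u, v, d, fx, fy, cx, cy)
        p_world = apply_transform(cam_to_world, p_cam)
        p_ee = apply_transform(world_to_ee, p_world)
        out.append((p_ee, list(feat)))
    return out


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    # Hypothetical end-effector pose: 1 m along the world z-axis, no rotation.
    ee_pose = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 0.0, 1.0]]
    print(localize_features([(320.0, 240.0, 2.0)], [[0.5, 0.25]],
                            (600.0, 600.0, 320.0, 240.0), identity, ee_pose))
```

Because the positions are expressed relative to the end-effector rather than to any camera, the same scene point maps to the same coordinates when the camera pose changes, as long as the calibration is correct. That is the geometric property the sketch is meant to show; how Adapt3R turns such localized features into a conditioning vector is not covered here.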

Comments:	Videos, code, and data: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
Cite as:	arXiv:2503.04877 [cs.CV]
	(or arXiv:2503.04877v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.04877

Submission history

From: Albert Wilcox Iii
[v1] Thu, 6 Mar 2025 18:17:09 UTC (37,743 KB)
[v2] Thu, 15 May 2025 20:49:51 UTC (24,407 KB)
