Future Success Prediction in Open-Vocabulary Object Manipulation Tasks Based on End-Effector Trajectories

Kambara, Motonari; Sugiura, Komei

arXiv:2412.19112 (cs)

[Submitted on 26 Dec 2024 (v1), last revised 8 Jan 2025 (this version, v2)]

Title: Future Success Prediction in Open-Vocabulary Object Manipulation Tasks Based on End-Effector Trajectories

Authors: Motonari Kambara, Komei Sugiura

Abstract: This study addresses the task of predicting the future success or failure of open-vocabulary object manipulation. In this task, the model must make predictions based on natural language instructions, egocentric view images taken before manipulation, and the given end-effector trajectories. Conventional methods typically perform success prediction only after the manipulation is executed, which limits their efficiency in executing the entire task sequence. We propose a novel approach that predicts success or failure by aligning the given trajectories and images with natural language instructions. We introduce a Trajectory Encoder that applies learnable weighting to the input trajectories, allowing the model to consider temporal dynamics and interactions between objects and the end effector and thereby improving its ability to predict manipulation outcomes accurately. To evaluate our method, we constructed a dataset based on the RT-1 dataset, a large-scale benchmark for open-vocabulary object manipulation tasks. The experimental results show that our method achieved higher prediction accuracy than baseline approaches.
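
As a rough illustration of what "learnable weighting" over a trajectory can mean, the sketch below pools a sequence of end-effector waypoints into a single vector using softmax-normalized per-timestep scores. It is not the Trajectory Encoder from the paper, whose architecture the abstract does not specify; the function names `softmax` and `weighted_trajectory_pool`, the `scores` parameters, and the toy waypoints are all hypothetical, and in a trained model the scores would be learned rather than set by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_trajectory_pool(trajectory, scores):
    """Pool a trajectory (list of equal-length waypoint vectors) into one vector.

    `scores` stands in for learnable per-timestep parameters (an assumption
    made for illustration); a real model would fit them by gradient descent.
    """
    if len(trajectory) != len(scores):
        raise ValueError("need exactly one score per waypoint")
    weights = softmax(scores)
    dim = len(trajectory[0])
    # Weighted sum over timesteps, computed independently per coordinate.
    return [sum(w * p[d] for w, p in zip(weights, trajectory)) for d in range(dim)]


# Toy end-effector trajectory: three hypothetical (x, y, z) waypoints.
trajectory = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.5], [2.0, 0.0, 1.0]]

# Equal scores give a plain average of the waypoints.
uniform = weighted_trajectory_pool(trajectory, [0.0, 0.0, 0.0])

# A large score on the last timestep makes the pooled vector track that waypoint.
late_focus = weighted_trajectory_pool(trajectory, [0.0, 0.0, 50.0])
```

With equal scores the pooling reduces to a mean over waypoints, while skewed scores let individual timesteps (for example, the moment of contact with an object) dominate the pooled representation.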

Comments:	Accepted for presentation at LangRob @ CoRL 2024
Subjects:	Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2412.19112 [cs.RO]
	(or arXiv:2412.19112v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2412.19112

Submission history

From: Motonari Kambara
[v1] Thu, 26 Dec 2024 08:11:41 UTC (13,242 KB)
[v2] Wed, 8 Jan 2025 06:45:02 UTC (13,242 KB)
