Text-guided Generation of Efficient Personalized Inspection Plans

Sun, Xingpeng; Pan, Zherong; Gao, Xifeng; Wu, Kui; Bera, Aniket

Abstract: We propose a training-free, Vision-Language Model (VLM)-guided approach for efficiently generating trajectories to facilitate target inspection planning based on text descriptions. Unlike existing Vision-and-Language Navigation (VLN) methods designed for general agents in unknown environments, our approach specifically targets the efficient inspection of known scenes, with widespread applications in fields such as medical, marine, and civil engineering. Leveraging VLMs, our method first extracts points of interest (POIs) from the text description, then identifies a set of waypoints from which the POIs are salient and which satisfy the spatial constraints defined in the prompt. Next, we interact with the VLM to iteratively refine the trajectory, preserving the visibility and prominence of the POIs. Further, we solve a Traveling Salesman Problem (TSP) to find the most efficient visitation order that satisfies the order constraint implied in the text description. Finally, we apply trajectory optimization to generate smooth, executable inspection paths for aerial and underwater vehicles. We have evaluated our method across a series of both handcrafted and real-world scanned environments. The results demonstrate that our approach effectively generates inspection trajectories that adhere to user instructions.
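To make the ordering step concrete, the following is a minimal sketch of an order-constrained TSP, not the authors' solver. It assumes a handful of waypoints, straight-line travel cost, and an open path that does not return to the start; the function name `shortest_ordered_tour`, the waypoint labels, and the coordinates are all hypothetical. It simply enumerates every visiting order, discards those that violate a "visit a before b" constraint, and keeps the shortest, which is only practical for a small number of waypoints.

```python
from itertools import permutations
from math import dist


def shortest_ordered_tour(start, waypoints, precedence):
    """Brute-force the shortest visiting order that respects precedence pairs.

    start:      (x, y, z) start position (hypothetical input).
    waypoints:  dict mapping a waypoint name to its (x, y, z) position.
    precedence: iterable of (a, b) pairs meaning "visit a before b".
    Returns (order, length); order is None if no order is feasible.
    """
    best_order, best_len = None, float("inf")
    for order in permutations(waypoints):
        # Position of each waypoint within this candidate order.
        rank = {name: i for i, name in enumerate(order)}
        if any(rank[a] > rank[b] for a, b in precedence):
            continue  # violates an ordering constraint

        # Sum straight-line segment lengths along the open path.
        length, pos = 0.0, start
        for name in order:
            length += dist(pos, waypoints[name])
            pos = waypoints[name]

        if length < best_len:
            best_order, best_len = order, length
    return best_order, best_len


if __name__ == "__main__":
    # Hypothetical scene: three waypoints, with C required before A.
    waypoints = {"A": (1, 0, 0), "B": (0, 2, 0), "C": (3, 0, 0)}
    order, length = shortest_ordered_tour((0, 0, 0), waypoints, {("C", "A")})
    print(order, round(length, 3))
```

In this toy scene the unconstrained optimum would visit B, A, C, but requiring C before A forces a longer route that starts at C. A practical planner would replace the enumeration with a dedicated TSP solver and use collision-free path lengths rather than straight-line distances.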

Comments:	8 pages, 5 figures
Subjects:	Robotics (cs.RO)
Cite as:	arXiv:2506.02917 [cs.RO]
	(or arXiv:2506.02917v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2506.02917
