-
Exploiting the Asymmetric Uncertainty Structure of Pre-trained VLMs on the Unit Hypersphere
Authors:
Li Ju,
Max Andersson,
Stina Fredriksson,
Edward Glöckner,
Andreas Hellander,
Ekta Vats,
Prashant Singh
Abstract:
Vision-language models (VLMs) as foundation models have significantly enhanced performance across a wide range of visual and textual tasks, without requiring large-scale training from scratch for downstream tasks. However, these deterministic VLMs fail to capture the inherent ambiguity and uncertainty in natural language and visual data. Recent probabilistic post-hoc adaptation methods address this by mapping deterministic embeddings onto probability distributions; however, existing approaches do not account for the asymmetric uncertainty structure of the modalities, and the constraint that meaningful deterministic embeddings reside on a unit hypersphere, potentially leading to suboptimal performance. In this paper, we address the asymmetric uncertainty structure inherent in textual and visual data, and propose AsymVLM to build probabilistic embeddings from pre-trained VLMs on the unit hypersphere, enabling uncertainty quantification. We validate the effectiveness of the probabilistic embeddings on established benchmarks, and present comprehensive ablation studies demonstrating the inherent nature of asymmetry in the uncertainty structure of textual and visual data.
Submitted 16 May, 2025;
originally announced May 2025.
-
Multi-Agent Path Finding Using Conflict-Based Search and Structural-Semantic Topometric Maps
Authors:
Scott Fredriksson,
Yifan Bai,
Akshit Saradagi,
George Nikolakopoulos
Abstract:
As industries increasingly adopt large robotic fleets, there is a pressing need for computationally efficient, practical, and optimal conflict-free path planning for multiple robots. Conflict-Based Search (CBS) is a popular method for multi-agent path finding (MAPF) due to its completeness and optimality; however, it is often impractical for real-world applications, as it is computationally intensive to solve and relies on assumptions about agents and operating environments that are difficult to realize. This article proposes a solution to overcome computational challenges and practicality issues of CBS by utilizing structural-semantic topometric maps. Instead of running CBS over large grid-based maps, the proposed solution runs CBS over a sparse topometric map containing structural-semantic cells representing intersections, pathways, and dead ends. This approach significantly accelerates the MAPF process and reduces the number of conflict resolutions handled by CBS while operating in continuous time. In the proposed method, robots are assigned time ranges to move between topometric regions, departing from the traditional CBS assumption that a robot can move to any connected cell in a single time step. The approach is validated through real-world multi-robot path-finding experiments and benchmarking simulations. The results demonstrate that the proposed MAPF method can be applied to real-world non-holonomic robots and yields significant improvement in computational efficiency compared to traditional CBS methods while improving conflict detection and resolution in cases of corridor symmetries.
Submitted 29 January, 2025;
originally announced January 2025.
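The abstract above describes robots being assigned time ranges for moving between topometric regions. A minimal sketch of how a conflict between two such plans might be detected is given below. It is only an illustration of an interval-overlap check, not the authors' CBS implementation; the plan layout `(region, t_start, t_end)` and the name `first_conflict` are assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical plan format: one (region_id, t_start, t_end) entry per region
# the robot occupies, with times in seconds.
Plan = List[Tuple[str, float, float]]


def first_conflict(plan_a: Plan, plan_b: Plan) -> Optional[Tuple[str, float, float]]:
    """Return the first (region, overlap_start, overlap_end) where both
    robots occupy the same region at the same time, or None if there is none."""
    for region_a, start_a, end_a in plan_a:
        for region_b, start_b, end_b in plan_b:
            # Two half-open intervals overlap when each starts before the other ends.
            if region_a == region_b and start_a < end_b and start_b < end_a:
                return (region_a, max(start_a, start_b), min(end_a, end_b))
    return None


plan_a = [("pathway_1", 0.0, 4.0), ("intersection_1", 4.0, 6.0)]
plan_b = [("pathway_2", 0.0, 3.0), ("intersection_1", 5.0, 8.0)]
conflict = first_conflict(plan_a, plan_b)
```

In a CBS-style search, a conflict found this way would typically be resolved by adding a constraint to one of the two robots and replanning; that step is omitted here.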
-
Robotic Exploration through Semantic Topometric Mapping
Authors:
Scott Fredriksson,
Akshit Saradagi,
George Nikolakopoulos
Abstract:
In this article, we introduce a novel strategy for robotic exploration in unknown environments using a semantic topometric map. The semantic topometric map is generated by segmenting the grid map of the currently explored parts of the environment into regions, such as intersections, pathways, dead ends, and unexplored frontiers, which constitute the structural semantics of an environment. The proposed exploration strategy leverages metric information of the frontier, such as distance and angle to the frontier, similar to existing frameworks, with the key difference being the additional utilization of structural semantic information, such as properties of the intersections leading to frontiers. The algorithm for generating the semantic topometric map utilized by the proposed method is lightweight, resulting in the method's online execution being both rapid and computationally efficient. Moreover, the proposed framework can be applied to both structured and unstructured indoor and outdoor environments, which enhances the versatility of the proposed exploration algorithm. We validate our exploration strategy and demonstrate the utility of structural semantics in exploration in two complex indoor environments by utilizing a Turtlebot3 as the robotic agent. Compared to traditional frontier-based methods, our findings indicate that the proposed approach leads to faster exploration and requires less computation time.
Submitted 26 June, 2024;
originally announced June 2024.
-
GRID-FAST: A Grid-based Intersection Detection for Fast Semantic Topometric Mapping
Authors:
Scott Fredriksson,
Akshit Saradagi,
George Nikolakopoulos
Abstract:
This article introduces a novel approach to constructing a topometric map that allows for efficient navigation and decision-making in mobile robotics applications. The method generates the topometric map from a 2D grid-based map. The topometric map segments areas of the input map into different structural-semantic classes: intersections, pathways, dead ends, and pathways leading to unexplored areas. This method is grounded in a new technique for intersection detection that identifies the area and the openings of intersections in a semantically meaningful way. The framework introduces two levels of pre-filtering with minimal computational cost to eliminate small openings and objects from the map which are unimportant in the context of high-level map segmentation and decision making. The topological map generated by GRID-FAST enables fast navigation in large-scale environments, and the structural semantics can aid in mission planning, autonomous exploration, and human-to-robot cooperation. The efficacy of the proposed method is demonstrated through validation on real maps gathered from robotic experiments: 1) a structured indoor environment, 2) an unstructured cave-like subterranean environment, and 3) a large-scale outdoor environment, which comprises pathways, buildings, and scattered objects. Additionally, the proposed framework has been compared with state-of-the-art topological mapping solutions and is able to produce a topometric and topological map with up to 92% fewer nodes than the next best solution.
Submitted 17 June, 2024;
originally announced June 2024.
-
Voxel Map to Occupancy Map Conversion Using Free Space Projection for Efficient Map Representation for Aerial and Ground Robots
Authors:
Scott Fredriksson,
Akshit Saradagi,
George Nikolakopoulos
Abstract:
This article introduces a novel method for converting 3D voxel maps, commonly utilized by robots for localization and navigation, into 2D occupancy maps for both unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). The generated 2D maps can be used for more efficient global navigation for both UAVs and UGVs, enabling algorithms developed for 2D maps to be used in 3D applications and allowing for faster transfer of maps between multiple agents in bandwidth-limited scenarios. The proposed method uses the free space representation in the UFOMap mapping solution to generate 2D occupancy maps. During the 3D to 2D map conversion, the method conducts safety checks and eliminates free spaces in the map with dimensions (in the height axis) lower than the robot's safety margins. This ensures that an aerial or ground robot can navigate safely, relying primarily on the 2D map generated by the method. Additionally, the method extracts the height of navigable free space and a local estimate of the slope of the floor from the 3D voxel map. The height data is utilized in converting paths generated using the 2D map into paths in 3D space for both UAVs and UGVs. The slope data identifies areas too steep for a ground robot to traverse, marking them as occupied, thus enabling a more accurate representation of the terrain for ground robots. The effectiveness of the proposed method in enabling computationally efficient navigation for both aerial and ground robots is validated in two different environments, over both static maps and in online implementation in an exploration mission. The methods proposed within this article have been implemented in the popular robotics framework ROS and are open-sourced. The code is available at: https://github.com/LTU-RAI/Map-Conversion-3D-Voxel-Map-to-2D-Occupancy-Map.
Submitted 21 July, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
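The height safety check described in the abstract above can be illustrated with a small sketch. It assumes the free space is given as a dense boolean NumPy array rather than a UFOMap octree, and it does not reproduce the code in the linked repository; the function name `project_free_space` and its parameters are hypothetical.

```python
import numpy as np


def project_free_space(free: np.ndarray, resolution: float, min_clearance: float) -> np.ndarray:
    """Project a 3D free-space grid onto a 2D traversability grid.

    free: bool array of shape (nx, ny, nz), True where a voxel is free.
    resolution: voxel edge length in metres.
    min_clearance: required vertical free height in metres (robot height plus margin).
    Returns a bool array of shape (nx, ny), True where some vertical run of
    free voxels in that column is at least min_clearance tall.
    """
    nx, ny, nz = free.shape
    run = np.zeros((nx, ny), dtype=int)   # current consecutive free voxels per column
    best = np.zeros((nx, ny), dtype=int)  # longest run seen so far per column
    for z in range(nz):
        # Extend the run where the voxel is free, reset it to zero otherwise.
        run = np.where(free[:, :, z], run + 1, 0)
        best = np.maximum(best, run)
    return best * resolution >= min_clearance


free = np.zeros((2, 2, 4), dtype=bool)
free[0, 0, :] = True        # 4 free voxels: 2.0 m of clearance
free[0, 1, 0:2] = True      # 2 free voxels: 1.0 m of clearance
free[1, 0, [0, 2]] = True   # two isolated free voxels: 0.5 m each
traversable = project_free_space(free, resolution=0.5, min_clearance=1.5)
```

Only the column with 2.0 m of continuous free height passes the 1.5 m check here. Slope estimation and the recovery of floor height for 3D path conversion, both mentioned in the abstract, are left out of this sketch.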
-
Semantic and Topological Mapping using Intersection Identification
Authors:
Scott Fredriksson,
Akshit Saradagi,
George Nikolakopoulos
Abstract:
This article presents a novel approach to identifying and classifying intersections for semantic and topological mapping. More specifically, the proposed approach generates a semantically meaningful map containing intersections, pathways, dead ends, and pathways leading to unexplored frontiers. Furthermore, the resulting semantic map can be used to generate a sparse topological map representation that can be utilized by robots for global navigation. The proposed solution also introduces built-in filtering to handle noise in the environment, to remove openings in the map that the robot cannot pass, and to remove small objects to optimize and simplify the overall mapping results. The efficacy of the proposed semantic and topological mapping method is demonstrated over a map of an indoor structured environment that is built from experimental data. The proposed framework, when compared with similar state-of-the-art topological mapping solutions, is able to produce a map with up to 89% fewer nodes than the next best solution.
Submitted 11 May, 2023;
originally announced May 2023.