-
Learning Smooth State-Dependent Traversability from Dense Point Clouds
Authors:
Zihao Dong,
Alan Papalia,
Leonard Jung,
Alenna Spiro,
Philip R. Osteen,
Christa S. Robison,
Michael Everett
Abstract:
A key open challenge in off-road autonomy is that the traversability of terrain often depends on the vehicle's state. In particular, some obstacles are only traversable from some orientations. However, learning this interaction by encoding the angle of approach as a model input demands a large and diverse training dataset and is computationally inefficient during planning due to repeated model inference. To address these challenges, we present SPARTA, a method for estimating approach-angle-conditioned traversability from point clouds. Specifically, we impose geometric structure on our network by outputting a smooth analytical function over the 1-sphere that predicts the risk distribution for any angle of approach with minimal overhead and can be reused for subsequent queries. The function is composed of Fourier basis functions, which have important advantages for generalization due to their periodic nature and smoothness. We demonstrate SPARTA both in a high-fidelity simulation platform, where our model achieves a 91% success rate crossing a 40 m boulder field (compared to 73% for the baseline), and on hardware, illustrating the generalization ability of the model to real-world settings.
Submitted 4 June, 2025;
originally announced June 2025.
-
Anomalies by Synthesis: Anomaly Detection using Generative Diffusion Models for Off-Road Navigation
Authors:
Siddharth Ancha,
Sunshine Jiang,
Travis Manderson,
Laura Brandt,
Yilun Du,
Philip R. Osteen,
Nicholas Roy
Abstract:
In order to navigate safely and reliably in off-road and unstructured environments, robots must detect anomalies that are out-of-distribution (OOD) with respect to the training data. We present an analysis-by-synthesis approach for pixel-wise anomaly detection without making any assumptions about the nature of OOD data. Given an input image, we use a generative diffusion model to synthesize an edited image that removes anomalies while keeping the remaining image unchanged. Then, we formulate anomaly detection as analyzing which image segments were modified by the diffusion model. We propose a novel inference approach for guided diffusion by analyzing the ideal guidance gradient and deriving a principled approximation that bootstraps the diffusion model to predict guidance gradients. Our editing technique operates purely at test time and can be integrated into existing workflows without retraining or fine-tuning. Finally, we use a combination of vision-language foundation models to compare pixels in a learned feature space and detect semantically meaningful edits, enabling accurate anomaly detection for off-road navigation. Project website: https://siddancha.github.io/anomalies-by-diffusion-synthesis/
Submitted 28 May, 2025;
originally announced May 2025.
-
GO: The Great Outdoors Multimodal Dataset
Authors:
Peng Jiang,
Kasi Viswanath,
Akhil Nagariya,
George Chustz,
Maggie Wigness,
Philip Osteen,
Timothy Overbye,
Christian Ellis,
Long Quang,
Srikanth Saripalli
Abstract:
The Great Outdoors (GO) dataset is a multi-modal annotated data resource aimed at advancing ground robotics research in unstructured environments. This dataset provides the most comprehensive set of data modalities and annotations compared to existing off-road datasets. In total, the GO dataset includes six unique sensor types with high-quality semantic annotations and GPS traces to support tasks such as semantic segmentation, object detection, and SLAM. The diverse environmental conditions represented in the dataset present significant real-world challenges that provide opportunities to develop more robust solutions to support the continued advancement of field robotics, autonomous exploration, and perception systems in natural environments. The dataset can be downloaded at: https://www.unmannedlab.org/the-great-outdoors-dataset/
Submitted 31 January, 2025;
originally announced January 2025.
-
Submodular Optimization for Keyframe Selection & Usage in SLAM
Authors:
David Thorne,
Nathan Chan,
Yanlong Ma,
Christa S. Robison,
Philip R. Osteen,
Brett T. Lopez
Abstract:
Keyframes are LiDAR scans saved for future reference in Simultaneous Localization And Mapping (SLAM), but despite their central importance most algorithms leave choices of which scans to save and how to use them to wasteful heuristics. This work proposes two novel keyframe selection strategies for localization and map summarization, as well as a novel approach to submap generation which selects keyframes that best constrain localization. Our results show that online keyframe selection and submap generation reduce the number of saved keyframes and improve per scan computation time without compromising localization performance. We also present a map summarization feature for quickly capturing environments under strict map size constraints.
Submitted 16 April, 2025; v1 submitted 7 October, 2024;
originally announced October 2024.
-
LiDAR Inertial Odometry And Mapping Using Learned Registration-Relevant Features
Authors:
Zihao Dong,
Jeff Pflueger,
Leonard Jung,
David Thorne,
Philip R. Osteen,
Christa S. Robison,
Brett T. Lopez,
Michael Everett
Abstract:
SLAM is an important capability for many autonomous systems, and modern LiDAR-based methods offer promising performance. However, for long-duration missions, existing works that operate either directly on the full point clouds or on extracted features face key tradeoffs in accuracy and computational efficiency (e.g., memory consumption). To address these issues, this paper presents DFLIOM with several key innovations. Unlike previous methods that rely on handcrafted heuristics and hand-tuned parameters for feature extraction, we propose a learning-based approach that selects points relevant to LiDAR SLAM point cloud registration. Furthermore, we extend our prior work DLIOM with the learned feature extractor and observe that our method enables similar or even better localization performance using only about 20% of the points in the dense point clouds. We demonstrate that DFLIOM performs well on multiple public benchmarks, achieving a 2.4% decrease in localization error and a 57.5% decrease in memory usage compared to state-of-the-art methods (DLIOM). Although extracting features with the proposed network requires extra time, it is offset by the faster processing time downstream, thus maintaining real-time performance using 20 Hz LiDAR on our hardware setup. The effectiveness of our learning-based feature extraction module is further demonstrated through comparison with several handcrafted feature extractors.
Submitted 3 October, 2024;
originally announced October 2024.
-
M2P2: A Multi-Modal Passive Perception Dataset for Off-Road Mobility in Extreme Low-Light Conditions
Authors:
Aniket Datar,
Anuj Pokhrel,
Mohammad Nazeri,
Madhan B. Rao,
Chenhui Pan,
Yufan Zhang,
Andre Harrison,
Maggie Wigness,
Philip R. Osteen,
Jinwei Ye,
Xuesu Xiao
Abstract:
Long-duration, off-road, autonomous missions require robots to continuously perceive their surroundings regardless of the ambient lighting conditions. Most existing autonomy systems heavily rely on active sensing, e.g., LiDAR, RADAR, and Time-of-Flight sensors, or use (stereo) visible light imaging sensors, e.g., color cameras, to perceive environment geometry and semantics. In scenarios where fully passive perception is required and lighting conditions are degraded to an extent that visible light cameras fail to perceive, most downstream mobility tasks such as obstacle avoidance become impossible. To address such a challenge, this paper presents a Multi-Modal Passive Perception dataset, M2P2, to enable off-road mobility in low-light to no-light conditions. We design a multi-modal sensor suite including thermal, event, and stereo RGB cameras, GPS, two Inertial Measurement Units (IMUs), as well as a high-resolution LiDAR for ground truth, with a novel multi-sensor calibration procedure that can efficiently transform multi-modal perceptual streams into a common coordinate system. Our 10-hour, 32 km dataset also includes mobility data such as robot odometry and actions and covers well-lit, low-light, and no-light conditions, along with paved, on-trail, and off-trail terrain. Our results demonstrate that off-road mobility is possible through only passive perception in extreme low-light conditions using end-to-end learning and classical planning. The project website can be found at https://cs.gmu.edu/~xiao/Research/M2P2/
Submitted 22 April, 2025; v1 submitted 1 October, 2024;
originally announced October 2024.
-
PIETRA: Physics-Informed Evidential Learning for Traversing Out-of-Distribution Terrain
Authors:
Xiaoyi Cai,
James Queeney,
Tong Xu,
Aniket Datar,
Chenhui Pan,
Max Miller,
Ashton Flather,
Philip R. Osteen,
Nicholas Roy,
Xuesu Xiao,
Jonathan P. How
Abstract:
Self-supervised learning is a powerful approach for developing traversability models for off-road navigation, but these models often struggle with inputs unseen during training. Existing methods utilize techniques like evidential deep learning to quantify model uncertainty, helping to identify and avoid out-of-distribution terrain. However, always avoiding out-of-distribution terrain can be overly conservative, e.g., when novel terrain can be effectively analyzed using a physics-based model. To overcome this challenge, we introduce Physics-Informed Evidential Traversability (PIETRA), a self-supervised learning framework that integrates physics priors directly into the mathematical formulation of evidential neural networks and introduces physics knowledge implicitly through an uncertainty-aware, physics-informed training loss. Our evidential network seamlessly transitions between learned and physics-based predictions for out-of-distribution inputs. Additionally, the physics-informed loss regularizes the learned model, ensuring better alignment with the physics model. Extensive simulations and hardware experiments demonstrate that PIETRA improves both learning accuracy and navigation performance in environments with significant distribution shifts.
Submitted 23 December, 2024; v1 submitted 4 September, 2024;
originally announced September 2024.
-
EVORA: Deep Evidential Traversability Learning for Risk-Aware Off-Road Autonomy
Authors:
Xiaoyi Cai,
Siddharth Ancha,
Lakshay Sharma,
Philip R. Osteen,
Bernadette Bucher,
Stephen Phillips,
Jiuguang Wang,
Michael Everett,
Nicholas Roy,
Jonathan P. How
Abstract:
Traversing terrain with good traction is crucial for achieving fast off-road navigation. Instead of manually designing costs based on terrain features, existing methods learn terrain properties directly from data via self-supervision to automatically penalize trajectories moving through undesirable terrain, but challenges remain in properly quantifying and mitigating the risk due to uncertainty in learned models. To this end, this work proposes a unified framework to learn an uncertainty-aware traction model and plan risk-aware trajectories. For uncertainty quantification, we efficiently model both aleatoric and epistemic uncertainty by learning discrete traction distributions and probability densities of the traction predictor's latent features. Leveraging evidential deep learning, we parameterize Dirichlet distributions with the network outputs and propose a novel uncertainty-aware squared Earth Mover's distance loss with a closed-form expression that improves learning accuracy and navigation performance. For risk-aware navigation, the proposed planner simulates state trajectories with the worst-case expected traction to handle aleatoric uncertainty, and penalizes trajectories moving through terrain with high epistemic uncertainty. Our approach is extensively validated in simulation and on wheeled and quadruped robots, showing improved navigation performance compared to methods that assume no slip, assume the expected traction, or optimize for the worst-case expected cost.
Submitted 31 March, 2024; v1 submitted 10 November, 2023;
originally announced November 2023.
-
RAMP: A Risk-Aware Mapping and Planning Pipeline for Fast Off-Road Ground Robot Navigation
Authors:
Lakshay Sharma,
Michael Everett,
Donggun Lee,
Xiaoyi Cai,
Philip Osteen,
Jonathan P. How
Abstract:
A key challenge in fast ground robot navigation in 3D terrain is balancing robot speed and safety. Recent work has shown that 2.5D maps (2D representations with additional 3D information) are ideal for real-time safe and fast planning. However, the prevalent approach of generating 2D occupancy grids through raytracing makes the generated map unsafe to plan in, due to inaccurate representation of unknown space. Additionally, existing planners such as MPPI do not consider speeds in known free and unknown space separately, leading to slower overall plans. The RAMP pipeline proposed here solves these issues using new mapping and planning methods. This work first presents ground point inflation with persistent spatial memory as a way to generate accurate occupancy grid maps from classified pointclouds. Then we present an MPPI-based planner with embedded variability in horizon, to maximize speed in known free space while retaining cautionary penetration into unknown space. Finally, we integrate this mapping and planning pipeline with risk constraints arising from 3D terrain, and verify that it enables fast and safe navigation using simulations and hardware demonstrations.
Submitted 10 March, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Probabilistic Traversability Model for Risk-Aware Motion Planning in Off-Road Environments
Authors:
Xiaoyi Cai,
Michael Everett,
Lakshay Sharma,
Philip R. Osteen,
Jonathan P. How
Abstract:
A key challenge in off-road navigation is that even visually similar terrains or ones from the same semantic class may have substantially different traction properties. Existing work typically assumes no wheel slip or uses the expected traction for motion planning, where the predicted trajectories provide a poor indication of the actual performance if the terrain traction has high uncertainty. In contrast, this work proposes to analyze terrain traversability with the empirical distribution of traction parameters in unicycle dynamics, which can be learned by a neural network in a self-supervised fashion. The probabilistic traction model leads to two risk-aware cost formulations that account for the worst-case expected cost and traction. To help the learned model generalize to unseen environments, terrains with features that lead to unreliable predictions are detected via a density estimator fit to the trained network's latent space and avoided via auxiliary penalties during planning. Simulation results demonstrate that the proposed approach outperforms existing work that assumes no slip or uses the expected traction in both navigation success rate and completion time. Furthermore, avoiding terrains with low density-based confidence scores achieves up to 30% improvement in success rate when the learned traction model is used in a novel environment.
Submitted 31 July, 2023; v1 submitted 30 September, 2022;
originally announced October 2022.
-
Semantic keypoint-based pose estimation from single RGB frames
Authors:
Karl Schmeckpeper,
Philip R. Osteen,
Yufu Wang,
Georgios Pavlakos,
Kenneth Chaney,
Wyatt Jordan,
Xiaowei Zhou,
Konstantinos G. Derpanis,
Kostas Daniilidis
Abstract:
This paper presents an approach to estimating the continuous 6-DoF pose of an object from a single RGB image. The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model. Unlike prior investigators, we are agnostic to whether the object is textured or textureless, as the convnet learns the optimal representation from the available training-image data. Furthermore, the approach can be applied to instance- and class-based pose recovery. Additionally, we accompany our main pipeline with a technique for semi-automatic data generation from unlabeled videos. This procedure allows us to train the learnable components of our method with minimal manual intervention in the labeling process. Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background. We apply our approach both to several, existing, large-scale datasets - including PASCAL3D+, LineMOD-Occluded, YCB-Video, and TUD-Light - and, using our labeling pipeline, to a new dataset with novel object classes that we introduce here. Extensive empirical evaluations show that our approach is able to provide pose estimation results comparable to the state of the art.
Submitted 12 April, 2022;
originally announced April 2022.
-
SemCal: Semantic LiDAR-Camera Calibration using Neural Mutual Information Estimator
Authors:
Peng Jiang,
Philip Osteen,
Srikanth Saripalli
Abstract:
This paper proposes SemCal: an automatic, targetless, extrinsic calibration algorithm for a LiDAR and camera system using semantic information. We leverage a neural information estimator to estimate the mutual information (MI) of semantic information extracted from each sensor measurement, facilitating semantic-level data association. By using a matrix exponential formulation of the $se(3)$ transformation and a kernel-based sampling method to sample from camera measurements based on LiDAR projected points, we can formulate the LiDAR-camera calibration problem as a novel differentiable objective function that supports gradient-based optimization methods. We also introduce a semantic-based initial calibration method using 2D MI-based image registration and a Perspective-n-Point (PnP) solver. To evaluate performance, we demonstrate the robustness of our method and quantitatively analyze the accuracy using a synthetic dataset. We also evaluate our algorithm qualitatively on an urban benchmark dataset (KITTI360) and an off-road benchmark dataset (RELLIS-3D), using both hand-annotated ground truth labels and labels predicted by state-of-the-art deep learning models, showing improvement over recent comparable calibration approaches.
Submitted 21 September, 2021;
originally announced September 2021.
-
Calibrating LiDAR and Camera using Semantic Mutual information
Authors:
Peng Jiang,
Philip Osteen,
Srikanth Saripalli
Abstract:
We propose an algorithm for automatic, targetless, extrinsic calibration of a LiDAR and camera system using semantic information. We achieve this goal by maximizing mutual information (MI) of semantic information between sensors, leveraging a neural network to estimate semantic mutual information, and matrix exponential for calibration computation. Using kernel-based sampling to sample data from camera measurement based on LiDAR projected points, we formulate the problem as a novel differentiable objective function which supports the use of gradient-based optimization methods. We also introduce an initial calibration method using 2D MI-based image registration. Finally, we demonstrate the robustness of our method and quantitatively analyze the accuracy on a synthetic dataset and also evaluate our algorithm qualitatively on KITTI360 and RELLIS-3D benchmark datasets, showing improvement over recent comparable approaches.
Submitted 24 April, 2021;
originally announced April 2021.
-
RELLIS-3D Dataset: Data, Benchmarks and Analysis
Authors:
Peng Jiang,
Philip Osteen,
Maggie Wigness,
Srikanth Saripalli
Abstract:
Semantic scene understanding is crucial for robust and safe autonomous navigation, particularly so in off-road environments. Recent deep learning advances for 3D semantic segmentation rely heavily on large sets of training data; however, existing autonomy datasets either represent urban environments or lack multimodal off-road data. We fill this gap with RELLIS-3D, a multimodal dataset collected in an off-road environment, which contains annotations for 13,556 LiDAR scans and 6,235 images. The data was collected on the Rellis Campus of Texas A&M University and presents challenges to existing algorithms related to class imbalance and environmental topography. Additionally, we evaluate the current state-of-the-art deep learning semantic segmentation models on this dataset. Experimental results show that RELLIS-3D presents challenges for algorithms designed for segmentation in urban environments. This novel dataset provides the resources needed by researchers to continue to develop more advanced algorithms and investigate new research directions to enhance autonomous navigation in off-road environments. RELLIS-3D is available at https://github.com/unmannedlab/RELLIS-3D
Submitted 25 May, 2022; v1 submitted 17 November, 2020;
originally announced November 2020.
-
Experimental Evaluation of 3D-LIDAR Camera Extrinsic Calibration
Authors:
Subodh Mishra,
Philip R. Osteen,
Gaurav Pandey,
Srikanth Saripalli
Abstract:
In this paper we perform an experimental comparison of three different target-based 3D-LIDAR camera calibration algorithms. We briefly elucidate the mathematical background behind each method and provide insights into practical aspects like ease of data collection for all of them. We extensively evaluate these algorithms on a sensor suite which consists of multiple cameras and LIDARs by assessing their robustness to random initialization and by using metrics like Mean Line Re-projection Error (MLRE) and Factory Stereo Calibration Error. We also show the effect of noisy sensors on the calibration results from all the algorithms and conclude with a note on which calibration algorithm should be used under what circumstances.
Submitted 3 July, 2020;
originally announced July 2020.