Search | arXiv e-print repository

TRADE: Object Tracking with 3D Trajectory and Ground Depth Estimates for UAVs

Authors: Pedro F. Proença, Patrick Spieler, Robert A. Hewitt, Jeff Delaune

Abstract: We propose TRADE for robust tracking and 3D localization of a moving target in cluttered environments, from UAVs equipped with a single camera. Ultimately TRADE enables 3d-aware target following. Tracking-by-detection approaches are vulnerable to target switching, especially between similar objects. Thus, TRADE predicts and incorporates the target 3D trajectory to select the right target from th… ▽ More We propose TRADE for robust tracking and 3D localization of a moving target in cluttered environments, from UAVs equipped with a single camera. Ultimately TRADE enables 3d-aware target following. Tracking-by-detection approaches are vulnerable to target switching, especially between similar objects. Thus, TRADE predicts and incorporates the target 3D trajectory to select the right target from the tracker's response map. Unlike static environments, depth estimation of a moving target from a single camera is a ill-posed problem. Therefore we propose a novel 3D localization method for ground targets on complex terrain. It reasons about scene geometry by combining ground plane segmentation, depth-from-motion and single-image depth estimation. The benefits of using TRADE are demonstrated as tracking robustness and depth accuracy on several dynamic scenes simulated in this work. Additionally, we demonstrate autonomous target following using a thermal camera by running TRADE on a quadcopter's board computer. △ Less

Submitted 6 October, 2022; originally announced October 2022.

arXiv:2205.03522 [pdf, other]

Optimizing Terrain Mapping and Landing Site Detection for Autonomous UAVs

Authors: Pedro F. Proença, Jeff Delaune, Roland Brockers

Abstract: The next generation of Mars rotorcrafts requires on-board autonomous hazard avoidance landing. To this end, this work proposes a system that performs continuous multi-resolution height map reconstruction and safe landing spot detection. Structure-from-Motion measurements are aggregated in a pyramid structure using a novel Optimal Mixture of Gaussians formulation that provides a comprehensive uncer… ▽ More The next generation of Mars rotorcrafts requires on-board autonomous hazard avoidance landing. To this end, this work proposes a system that performs continuous multi-resolution height map reconstruction and safe landing spot detection. Structure-from-Motion measurements are aggregated in a pyramid structure using a novel Optimal Mixture of Gaussians formulation that provides a comprehensive uncertainty model. Our multiresolution pyramid is built more efficiently and accurately than past work by decoupling pyramid filling from the measurement updates of different resolutions. To detect the safest landing location, after an optimized hazard segmentation, we use a mean shift algorithm on multiple distance transform peaks to account for terrain roughness and uncertainty. The benefits of our contributions are evaluated on real and synthetic flight data. △ Less

Submitted 6 May, 2022; originally announced May 2022.

Comments: Accepted to ICRA 2022

arXiv:2111.06271 [pdf, other]

Multi-Resolution Elevation Mapping and Safe Landing Site Detection with Applications to Planetary Rotorcraft

Authors: Pascal Schoppmann, Pedro F. Proença, Jeff Delaune, Michael Pantic, Timo Hinzmann, Larry Matthies, Roland Siegwart, Roland Brockers

Abstract: In this paper, we propose a resource-efficient approach to provide an autonomous UAV with an on-board perception method to detect safe, hazard-free landing sites during flights over complex 3D terrain. We aggregate 3D measurements acquired from a sequence of monocular images by a Structure-from-Motion approach into a local, robot-centric, multi-resolution elevation map of the overflown terrain, wh… ▽ More In this paper, we propose a resource-efficient approach to provide an autonomous UAV with an on-board perception method to detect safe, hazard-free landing sites during flights over complex 3D terrain. We aggregate 3D measurements acquired from a sequence of monocular images by a Structure-from-Motion approach into a local, robot-centric, multi-resolution elevation map of the overflown terrain, which fuses depth measurements according to their lateral surface resolution (pixel-footprint) in a probabilistic framework based on the concept of dynamic Level of Detail. Map aggregation only requires depth maps and the associated poses, which are obtained from an onboard Visual Odometry algorithm. An efficient landing site detection method then exploits the features of the underlying multi-resolution map to detect safe landing sites based on slope, roughness, and quality of the reconstructed terrain surface. The evaluation of the performance of the mapping and landing site detection modules are analyzed independently and jointly in simulated and real-world experiments in order to establish the efficacy of the proposed approach. △ Less

Submitted 11 November, 2021; originally announced November 2021.

Comments: 8 pages, 12 figures. Accepted at IROS 2021

arXiv:2003.06975 [pdf, other]

TACO: Trash Annotations in Context for Litter Detection

Authors: Pedro F Proença, Pedro Simões

Abstract: TACO is an open image dataset for litter detection and segmentation, which is growing through crowdsourcing. Firstly, this paper describes this dataset and the tools developed to support it. Secondly, we report instance segmentation performance using Mask R-CNN on the current version of TACO. Despite its small size (1500 images and 4784 annotations), our results are promising on this challenging p… ▽ More TACO is an open image dataset for litter detection and segmentation, which is growing through crowdsourcing. Firstly, this paper describes this dataset and the tools developed to support it. Secondly, we report instance segmentation performance using Mask R-CNN on the current version of TACO. Despite its small size (1500 images and 4784 annotations), our results are promising on this challenging problem. However, to achieve satisfactory trash detection in the wild for deployment, TACO still needs much more manual annotations. These can be contributed using: http://tacodataset.org/ △ Less

Submitted 17 March, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

arXiv:1907.04298 [pdf, other]

Deep Learning for Spacecraft Pose Estimation from Photorealistic Rendering

Authors: Pedro F. Proenca, Yang Gao

Abstract: On-orbit proximity operations in space rendezvous, docking and debris removal require precise and robust 6D pose estimation under a wide range of lighting conditions and against highly textured background, i.e., the Earth. This paper investigates leveraging deep learning and photorealistic rendering for monocular pose estimation of known uncooperative spacecrafts. We first present a simulator buil… ▽ More On-orbit proximity operations in space rendezvous, docking and debris removal require precise and robust 6D pose estimation under a wide range of lighting conditions and against highly textured background, i.e., the Earth. This paper investigates leveraging deep learning and photorealistic rendering for monocular pose estimation of known uncooperative spacecrafts. We first present a simulator built on Unreal Engine 4, named URSO, to generate labeled images of spacecrafts orbiting the Earth, which can be used to train and evaluate neural networks. Secondly, we propose a deep learning framework for pose estimation based on orientation soft classification, which allows modelling orientation ambiguity as a mixture of Gaussians. This framework was evaluated both on URSO datasets and the ESA pose estimation challenge. In this competition, our best model achieved 3rd place on the synthetic test set and 2nd place on the real test set. Moreover, our results show the impact of several architectural and training aspects, and we demonstrate qualitatively how models learned on URSO datasets can perform on real images from space. △ Less

Submitted 29 August, 2019; v1 submitted 9 July, 2019; originally announced July 2019.

Comments: * Adding more related work and references

arXiv:1803.02380 [pdf, other]

Fast Cylinder and Plane Extraction from Depth Cameras for Visual Odometry

Authors: Pedro F. Proença, Yang Gao

Abstract: This paper presents CAPE, a method to extract planes and cylinder segments from organized point clouds, which processes 640x480 depth images on a single CPU core at an average of 300 Hz, by operating on a grid of planar cells. While, compared to state-of-the-art plane extraction, the latency of CAPE is more consistent and 4-10 times faster, depending on the scene, we also demonstrate empirically t… ▽ More This paper presents CAPE, a method to extract planes and cylinder segments from organized point clouds, which processes 640x480 depth images on a single CPU core at an average of 300 Hz, by operating on a grid of planar cells. While, compared to state-of-the-art plane extraction, the latency of CAPE is more consistent and 4-10 times faster, depending on the scene, we also demonstrate empirically that applying CAPE to visual odometry can improve trajectory estimation on scenes made of cylindrical surfaces (e.g. tunnels), whereas using a plane extraction approach that is not curve-aware deteriorates performance on these scenes. To use these geometric primitives in visual odometry, we propose extending a probabilistic RGB-D odometry framework based on points, lines and planes to cylinder primitives. Following this framework, CAPE runs on fused depth maps and the parameters of cylinders are modelled probabilistically to account for uncertainty and weight accordingly the pose optimization residuals. △ Less

Submitted 5 July, 2018; v1 submitted 6 March, 2018; originally announced March 2018.

Comments: Accepted to IROS 2018

arXiv:1708.02837 [pdf, other]

SPLODE: Semi-Probabilistic Point and Line Odometry with Depth Estimation from RGB-D Camera Motion

Authors: Pedro F. Proença, Yang Gao

Abstract: Active depth cameras suffer from several limitations, which cause incomplete and noisy depth maps, and may consequently affect the performance of RGB-D Odometry. To address this issue, this paper presents a visual odometry method based on point and line features that leverages both measurements from a depth sensor and depth estimates from camera motion. Depth estimates are generated continuously b… ▽ More Active depth cameras suffer from several limitations, which cause incomplete and noisy depth maps, and may consequently affect the performance of RGB-D Odometry. To address this issue, this paper presents a visual odometry method based on point and line features that leverages both measurements from a depth sensor and depth estimates from camera motion. Depth estimates are generated continuously by a probabilistic depth estimation framework for both types of features to compensate for the lack of depth measurements and inaccurate feature depth associations. The framework models explicitly the uncertainty of triangulating depth from both point and line observations to validate and obtain precise estimates. Furthermore, depth measurements are exploited by propagating them through a depth map registration module and using a frame-to-frame motion estimation method that considers 3D-to-2D and 2D-to-3D reprojection errors, independently. Results on RGB-D sequences captured on large indoor and outdoor scenes, where depth sensor limitations are critical, show that the combination of depth measurements and estimates through our approach is able to overcome the absence and inaccuracy of depth measurements. △ Less

Submitted 9 August, 2017; originally announced August 2017.

Comments: IROS 2017

arXiv:1706.04034 [pdf, other]

Probabilistic RGB-D Odometry based on Points, Lines and Planes Under Depth Uncertainty

Authors: Pedro F. Proenca, Yang Gao

Abstract: This work proposes a robust visual odometry method for structured environments that combines point features with line and plane segments, extracted through an RGB-D camera. Noisy depth maps are processed by a probabilistic depth fusion framework based on Mixtures of Gaussians to denoise and derive the depth uncertainty, which is then propagated throughout the visual odometry pipeline. Probabilisti… ▽ More This work proposes a robust visual odometry method for structured environments that combines point features with line and plane segments, extracted through an RGB-D camera. Noisy depth maps are processed by a probabilistic depth fusion framework based on Mixtures of Gaussians to denoise and derive the depth uncertainty, which is then propagated throughout the visual odometry pipeline. Probabilistic 3D plane and line fitting solutions are used to model the uncertainties of the feature parameters and pose is estimated by combining the three types of primitives based on their uncertainties. Performance evaluation on RGB-D sequences collected in this work and two public RGB-D datasets: TUM and ICL-NUIM show the benefit of using the proposed depth fusion framework and combining the three feature-types, particularly in scenes with low-textured surfaces, dynamic objects and missing depth measurements. △ Less

Submitted 18 January, 2018; v1 submitted 13 June, 2017; originally announced June 2017.

Comments: Major update: more results, depth filter released as opensource, 34 pages

arXiv:1705.06516 [pdf, other]

Probabilistic Combination of Noisy Points and Planes for RGB-D Odometry

Authors: Pedro F. Proença, Yang Gao

Abstract: This work proposes a visual odometry method that combines points and plane primitives, extracted from a noisy depth camera. Depth measurement uncertainty is modelled and propagated through the extraction of geometric primitives to the frame-to-frame motion estimation, where pose is optimized by weighting the residuals of 3D point and planes matches, according to their uncertainties. Results on an… ▽ More This work proposes a visual odometry method that combines points and plane primitives, extracted from a noisy depth camera. Depth measurement uncertainty is modelled and propagated through the extraction of geometric primitives to the frame-to-frame motion estimation, where pose is optimized by weighting the residuals of 3D point and planes matches, according to their uncertainties. Results on an RGB-D dataset show that the combination of points and planes, through the proposed method, is able to perform well in poorly textured environments, where point-based odometry is bound to fail. △ Less

Submitted 18 May, 2017; originally announced May 2017.

Comments: Accepted to TAROS 2017

Showing 1–9 of 9 results for author: Proenca, P F