Search | arXiv e-print repository

doi 10.1016/j.isprsjprs.2018.11.011

Aerial Imagery for Roof Segmentation: A Large-Scale Dataset towards Automatic Mapping of Buildings

Authors: Qi Chen, Lei Wang, Yifan Wu, Guangming Wu, Zhiling Guo, Steven L. Waslander

Abstract: arXiv admin note: This version has been removed as the user did not have the right to agree to the license at the time of submission

Submitted 27 July, 2018; v1 submitted 25 July, 2018; originally announced July 2018.

Comments: arXiv admin note: This version has been removed as the user did not have the right to agree to the license at the time of submission

arXiv:1807.09304 [pdf, other]

Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters

Authors: Christopher L. Choi, Jason Rebello, Leonid Koppel, Pranav Ganti, Arun Das, Steven L. Waslander

Abstract: Dynamic Camera Clusters (DCCs) are multi-camera systems where one or more cameras are mounted on actuated mechanisms such as a gimbal. Existing methods for DCC calibration rely on joint angle measurements to resolve the time-varying transformation between the dynamic and static camera. This information is usually provided by motor encoders; however, joint angle measurements are not always readily available on off-the-shelf mechanisms. In this paper, we present an encoderless approach for DCC calibration which simultaneously estimates the kinematic parameters of the transformation chain as well as the unknown joint angles. We also demonstrate the integration of an encoderless gimbal mechanism with a state-of-the-art VIO algorithm, and show the extensions required in order to perform simultaneous online estimation of the joint angles and vehicle localization state. The proposed calibration approach is validated both in simulation and on a physical DCC composed of a 2-DOF gimbal mounted on a UAV. Finally, we show the experimental results of the calibrated mechanism integrated into the OKVIS VIO package, and demonstrate successful online joint angle estimation while maintaining localization accuracy that is comparable to a standard static multi-camera configuration.

Submitted 24 July, 2018; originally announced July 2018.

Comments: ICRA 2018

arXiv:1807.06072 [pdf, other]

Leveraging Pre-Trained 3D Object Detection Models For Fast Ground Truth Generation

Authors: Jungwook Lee, Sean Walsh, Ali Harakeh, Steven L. Waslander

Abstract: Training 3D object detectors for autonomous driving has been limited to small datasets due to the effort required to generate annotations. Reducing both task complexity and the amount of task switching done by annotators is key to reducing the effort and time required to generate 3D bounding box annotations. This paper introduces a novel ground truth generation method that combines human supervision with pretrained neural networks to generate per-instance 3D point cloud segmentation, 3D bounding boxes, and class annotations. The annotators provide object anchor clicks which behave as a seed to generate instance segmentation results in 3D. The points belonging to each instance are then used to regress object centroids, bounding box dimensions, and object orientation. Our proposed annotation scheme requires 30x lower human annotation time. We use the KITTI 3D object detection dataset to evaluate the efficiency and the quality of our annotation scheme. We also test the proposed scheme on previously unseen data from the Autonomoose self-driving vehicle to demonstrate generalization capabilities of the network.

Submitted 16 July, 2018; originally announced July 2018.

arXiv:1806.07987 [pdf, other]

A Hierarchical Deep Architecture and Mini-Batch Selection Method For Joint Traffic Sign and Light Detection

Authors: Alex D. Pon, Oles Andrienko, Ali Harakeh, Steven L. Waslander

Abstract: Traffic light and sign detectors on autonomous cars are integral for road scene perception. The literature is abundant with deep learning networks that detect either lights or signs, not both, which makes them unsuitable for real-life deployment due to the limited graphics processing unit (GPU) memory and power available on embedded systems. The root cause of this issue is that no public dataset contains both traffic light and sign labels, which leads to difficulties in developing a joint detection framework. We present a deep hierarchical architecture in conjunction with a mini-batch proposal selection mechanism that allows a network to detect both traffic lights and signs from training on separate traffic light and sign datasets. Our method solves the overlapping issue where instances from one dataset are not labelled in the other dataset. We are the first to present a network that performs joint detection on traffic lights and signs. We measure our network on the Tsinghua-Tencent 100K benchmark for traffic sign detection and the Bosch Small Traffic Lights benchmark for traffic light detection and show it outperforms the existing Bosch Small Traffic light state-of-the-art method. We focus on autonomous car deployment and show our network is more suitable than others because of its low memory footprint and real-time image processing time. Qualitative results can be viewed at https://youtu.be/_YmogPzBXOw

Submitted 13 September, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

Comments: Accepted in the IEEE 15th Conference on Computer and Robot Vision

arXiv:1806.00526 [pdf, ps, other]

Multi-Step Prediction of Dynamic Systems with Recurrent Neural Networks

Authors: Nima Mohajerin, Steven L. Waslander

Abstract: Recurrent Neural Networks (RNNs) can encode rich dynamics which makes them suitable for modeling dynamic systems. To train an RNN for multi-step prediction of dynamic systems, it is crucial to efficiently address the state initialization problem, which seeks proper values for the RNN initial states at the beginning of each prediction interval. In this work, the state initialization problem is addressed using Neural Networks (NNs) to effectively train a variety of RNNs for modeling two aerial vehicles, a helicopter and a quadrotor, from experimental data. It is shown that the RNN initialized by the NN-based initialization method outperforms the state of the art. Further, a comprehensive study of RNNs trained for multi-step prediction of the two aerial vehicles is presented. The multi-step prediction of the quadrotor is enhanced using a hybrid model which combines a simplified physics-based motion model of the vehicle with RNNs. While the maximum translational and rotational velocities in the quadrotor dataset are about 4 m/s and 3.8 rad/s, respectively, the hybrid model produces predictions, over 1.9 seconds, which remain within 9 cm/s and 0.12 rad/s of the measured translational and rotational velocities, with 99% confidence on the test dataset.

Submitted 19 May, 2018; originally announced June 2018.

arXiv:1805.01810 [pdf, other]

Manifold Geometry with Fast Automatic Derivatives and Coordinate Frame Semantics Checking in C++

Authors: Leonid Koppel, Steven L. Waslander

Abstract: Computer vision and robotics problems often require representation and estimation of poses on the SE(3) manifold. Developers of algorithms that must run in real time face several time-consuming programming tasks, including deriving and computing analytic derivatives and avoiding mathematical errors when handling poses in multiple coordinate frames. To support rapid and error-free development, we present wave_geometry, a C++ manifold geometry library with two key contributions: expression template-based automatic differentiation and compile-time enforcement of coordinate frame semantics. We contrast the library with existing open source packages and show that it can evaluate Jacobians in forward and reverse mode with little to no runtime overhead compared to hand-coded derivatives. The library is available at https://github.com/wavelab/wave_geometry .

Submitted 4 May, 2018; originally announced May 2018.

Comments: 8 pages, Conference on Computer and Robot Vision (CRV) 2018

arXiv:1802.00036 [pdf, other]

In Defense of Classical Image Processing: Fast Depth Completion on the CPU

Authors: Jason Ku, Ali Harakeh, Steven L. Waslander

Abstract: With the rise of data driven deep neural networks as a realization of universal function approximators, most research on computer vision problems has moved away from hand crafted classical image processing algorithms. This paper shows that with a well designed algorithm, we are capable of outperforming neural network based methods on the task of depth completion. The proposed algorithm is simple and fast, runs on the CPU, and relies only on basic image processing operations to perform depth completion of sparse LIDAR depth data. We evaluate our algorithm on the challenging KITTI depth completion benchmark, and at the time of submission, our method ranks first on the KITTI test server among all published methods. Furthermore, our algorithm is data independent, requiring no training data to perform the task at hand. The code written in Python will be made publicly available at https://github.com/kujason/ip_basic.

Submitted 31 January, 2018; originally announced February 2018.
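
As a rough illustration of what "basic image processing operations" for depth completion can look like, the sketch below fills empty pixels of a sparse depth grid using a single 3x3 dilation pass. This is an assumption made for illustration only: the abstract does not say which operations the authors use, and the function name `dilate_fill`, the `empty` marker value, and the choice of taking the smallest neighbouring depth are all hypothetical.

```python
def dilate_fill(depth, empty=0.0):
    """Fill each empty pixel with the smallest valid depth among its 3x3 neighbours.

    Hypothetical helper, not the authors' algorithm: one dilation-style pass
    over a sparse depth grid stored as a list of rows.
    """
    rows, cols = len(depth), len(depth[0])
    filled = [row[:] for row in depth]  # copy so the input stays untouched
    for r in range(rows):
        for c in range(cols):
            if depth[r][c] != empty:
                continue  # keep measured depths as they are
            neighbours = [
                depth[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if depth[rr][cc] != empty
            ]
            if neighbours:
                # Assumption: prefer the closest surface when several are nearby.
                filled[r][c] = min(neighbours)
    return filled


# Toy 3x3 grid with two LIDAR returns (5.0 and 8.0); 0.0 marks missing data.
sparse = [
    [0.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 8.0],
]
dense = dilate_fill(sparse)
```

A single pass only reaches pixels adjacent to a measurement, so a real sparse scan would likely need repeated passes or larger kernels; those choices are left open here.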

arXiv:1601.01289 [pdf, other]

Internet of Drones

Authors: Mirmojtaba Gharibi, Raouf Boutaba, Steven L. Waslander

Abstract: The Internet of Drones (IoD) is a layered network control architecture designed mainly for coordinating the access of unmanned aerial vehicles to controlled airspace, and providing navigation services between locations referred to as nodes. The IoD provides generic services for various drone applications such as package delivery, traffic surveillance, search and rescue and more. In this paper, we present a conceptual model of how such an architecture can be organized and we specify the features that an IoD system based on our architecture should implement. For doing so, we extract key concepts from three existing large scale networks, namely the air traffic control network, the cellular network, and the Internet and explore their connections to our novel architecture for drone traffic management.

Submitted 1 February, 2016; v1 submitted 6 January, 2016; originally announced January 2016.

arXiv:1509.07075 [pdf, other]

doi 10.1002/rob.21616

3D Scan Registration using Curvelet Features in Planetary Environments

Authors: Siddhant Ahuja, Peter Iles, Steven L. Waslander

Abstract: Topographic mapping in planetary environments relies on accurate 3D scan registration methods. However, most global registration algorithms relying on features such as FPFH and Harris-3D show poor alignment accuracy in these settings due to the poor structure of the Mars-like terrain and variable resolution, occluded, sparse range data that is hard to register without some a-priori knowledge of the environment. In this paper, we propose an alternative approach to 3D scan registration using the curvelet transform that performs multi-resolution geometric analysis to obtain a set of coefficients indexed by scale (coarsest to finest), angle and spatial position. Features are detected in the curvelet domain to take advantage of the directional selectivity of the transform. A descriptor is computed for each feature by calculating the 3D spatial histogram of the image gradients, and nearest neighbor based matching is used to calculate the feature correspondences. Correspondence rejection using Random Sample Consensus identifies inliers, and a locally optimal Singular Value Decomposition-based estimation of the rigid-body transformation aligns the laser scans given the re-projected correspondences in the metric space. Experimental results on a publicly available data-set of a planetary analogue indoor facility, as well as simulated and real-world scans from Neptec Design Group's IVIGMS 3D laser rangefinder at the outdoor CSA Mars yard, demonstrate improved performance over existing methods in the challenging sparse Mars-like terrain.

Submitted 23 September, 2015; originally announced September 2015.

Comments: 27 pages in Journal of Field Robotics, 2015

arXiv:1506.07597 [pdf, ps, other]

Degenerate Motions in Multicamera Cluster SLAM with Non-overlapping Fields of View

Authors: Michael J. Tribou, David W. L. Wang, Steven L. Waslander

Abstract: An analysis of the relative motion and point feature model configurations leading to solution degeneracy is presented, for the case of a Simultaneous Localization and Mapping system using multicamera clusters with non-overlapping fields-of-view. The SLAM optimization system seeks to minimize image space reprojection error and is formulated for a cluster containing any number of component cameras, observing any number of point features over two keyframes. The measurement Jacobian is transformed to expose a reduced-dimension representation such that the degeneracy of the system can be determined by the rank of a dense submatrix. A set of relative motions sufficient for degeneracy is identified for certain cluster configurations, independent of target model geometry. Furthermore, it is shown that increasing the number of cameras within the cluster and observing features across different cameras over the two keyframes reduces the size of the degenerate motion sets significantly.

Submitted 24 June, 2015; originally announced June 2015.

Comments: 18 pages, 18 figures
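
The rank test mentioned in the abstract above can be sketched generically. The example below computes the rank of a small matrix by exact Gaussian elimination and flags it as degenerate when the rank is below the number of columns. That criterion, the function names `matrix_rank` and `is_degenerate`, and the toy matrices are assumptions for illustration; the abstract does not give the submatrix or the exact rank condition.

```python
from fractions import Fraction


def matrix_rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def is_degenerate(submatrix):
    """Assumed criterion: fewer independent columns than unknowns."""
    return matrix_rank(submatrix) < len(submatrix[0])


# Hypothetical 3x3 examples: one full rank, one with a repeated direction.
well_posed = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rank_deficient = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]
```

Exact fractions avoid the tolerance choice that a floating-point rank test would need; with measured Jacobians a singular-value threshold would be the more usual tool.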

Showing 51–60 of 60 results for author: Waslander, S L