-
Learning 3D Scene Analogies with Neural Contextual Scene Maps
Authors:
Junho Kim,
Gwangtak Bae,
Eun Sun Lee,
Young Min Kim
Abstract:
Understanding scene contexts is crucial for machines to perform tasks and adapt prior knowledge in unseen or noisy 3D environments. As data-driven learning is intractable to comprehensively encapsulate diverse ranges of layouts and open spaces, we propose teaching machines to identify relational commonalities in 3D spaces. Instead of focusing on point-wise or object-wise representations, we introduce 3D scene analogies, which are smooth maps between 3D scene regions that align spatial relationships. Unlike well-studied single instance-level maps, these scene-level maps smoothly link large scene regions, potentially enabling unique applications in trajectory transfer in AR/VR, long demonstration transfer for imitation learning, and context-aware object rearrangement. To find 3D scene analogies, we propose neural contextual scene maps, which extract descriptor fields summarizing semantic and geometric contexts, and holistically align them in a coarse-to-fine manner for map estimation. This approach reduces reliance on individual feature points, making it robust to input noise or shape variations. Experiments demonstrate the effectiveness of our approach in identifying scene analogies and transferring trajectories or object placements in diverse indoor scenes, indicating its potential for robotics and AR/VR applications.
Submitted 20 March, 2025;
originally announced March 2025.
-
Chronic Disease Diagnoses Using Behavioral Data
Authors:
Di Wang,
Yidan Hu,
Eng Sing Lee,
Hui Hwang Teong,
Ray Tian Rui Lai,
Wai Han Hoi,
Chunyan Miao
Abstract:
Early detection of chronic diseases is beneficial to healthcare by providing a golden opportunity for timely interventions. Although numerous prior studies have successfully used machine learning (ML) models for disease diagnoses, they rely heavily on medical data, which are scarce for most patients in the early stages of chronic diseases. In this paper, we aim to diagnose hyperglycemia (diabetes), hyperlipidemia, and hypertension (collectively known as 3H) using our own collected behavioral data, thus enabling the early detection of 3H without using medical data collected in clinical settings. Specifically, we collected daily behavioral data from 629 participants over a 3-month study period, and trained various ML models after data preprocessing. Experimental results show that using only the participants' uploaded behavioral data, we can achieve accurate 3H diagnoses: 80.2%, 71.3%, and 81.2% for diabetes, hyperlipidemia, and hypertension, respectively. Furthermore, we conduct Shapley analysis on the trained models to identify the most influential features for each type of disease. The identified influential features are consistent with those reported in the literature.
Submitted 4 October, 2024;
originally announced October 2024.
-
J-Net: Improved U-Net for Terahertz Image Super-Resolution
Authors:
Woon-Ha Yeo,
Seung-Hwan Jung,
Seung Jae Oh,
Inhee Maeng,
Eui Su Lee,
Han-Cheol Ryu
Abstract:
Terahertz (THz) waves are electromagnetic waves in the 0.1 to 10 THz frequency range, and THz imaging is utilized in a range of applications, including security inspections, biomedical fields, and the non-destructive examination of materials. However, THz images have low resolution due to the long wavelength of THz waves. Therefore, improving the resolution of THz images is one of the current hot research topics. We propose a novel network architecture called J-Net, which is an improved version of U-Net, to solve THz image super-resolution. It employs the simple baseline blocks, which can extract low-resolution (LR) image features and learn the mapping of LR images to high-resolution (HR) images efficiently. All training was conducted using the DIV2K+Flickr2K dataset, and we employed the peak signal-to-noise ratio (PSNR) for quantitative comparison. In our comparisons with other THz image super-resolution methods, J-Net achieved a PSNR of 32.52 dB, surpassing other techniques by more than 1 dB. J-Net also demonstrates superior performance on real THz images compared to other methods. Experiments show that the proposed J-Net achieves better PSNR and visual improvement compared with other THz image super-resolution methods.
Submitted 4 December, 2023;
originally announced December 2023.
-
Calibrating Panoramic Depth Estimation for Practical Localization and Mapping
Authors:
Junho Kim,
Eun Sun Lee,
Young Min Kim
Abstract:
The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation. We propose that accurate depth estimated from panoramic images can serve as a powerful and light-weight input for a wide range of downstream tasks requiring 3D information. While panoramic images can easily capture the surrounding context from commodity devices, the estimated depth shares the limitations of conventional image-based depth estimation; the performance deteriorates under large domain shifts and the absolute values are still ambiguous to infer from 2D observations. By taking advantage of the holistic view, we mitigate such effects in a self-supervised way and fine-tune the network with geometric consistency during the test phase. Specifically, we construct a 3D point cloud from the current depth prediction and project the point cloud at various viewpoints or apply stretches on the current input image to generate synthetic panoramas. Then we minimize the discrepancy of the 3D structure estimated from synthetic images without collecting additional data. We empirically evaluate our method in robot navigation and map-free localization where our method shows large performance enhancements. Our calibration method can therefore widen the applicability under various external conditions, serving as a key component for practical panorama-based machine vision systems. Code is available through the following link: https://github.com/82magnolia/panoramic-depth-calibration.
Submitted 2 February, 2024; v1 submitted 27 August, 2023;
originally announced August 2023.
-
Graph Neural Networks for Decentralized Multi-Agent Perimeter Defense
Authors:
Elijah S. Lee,
Lifeng Zhou,
Alejandro Ribeiro,
Vijay Kumar
Abstract:
In this work, we study the problem of decentralized multi-agent perimeter defense that asks for computing actions for defenders with local perceptions and communications to maximize the capture of intruders. One major challenge for practical implementations is to make perimeter defense strategies scalable for large-scale problem instances. To this end, we leverage graph neural networks (GNNs) to develop an imitation learning framework that learns a mapping from defenders' local perceptions and their communication graph to their actions. The proposed GNN-based learning network is trained by imitating a centralized expert algorithm such that the learned actions are close to those generated by the expert algorithm. We demonstrate that our proposed network performs closer to the expert algorithm and is superior to other baseline algorithms by capturing more intruders. Our GNN-based network is trained at a small scale and can be generalized to large-scale cases. We run perimeter defense games in scenarios with different team sizes and configurations to demonstrate the performance of the learned network.
Submitted 23 January, 2023;
originally announced January 2023.
-
MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents
Authors:
Eun Sun Lee,
Junho Kim,
SangWon Park,
Young Min Kim
Abstract:
We propose a domain adaptation method, MoDA, which adapts a pretrained embodied agent to a new, noisy environment without ground-truth supervision. Map-based memory provides important contextual information for visual navigation, and exhibits unique spatial structure mainly composed of flat walls and rectangular obstacles. Our adaptation approach encourages the inherent regularities on the estimated maps to guide the agent to overcome the prevalent domain discrepancy in a novel environment. Specifically, we propose an efficient learning curriculum to handle the visual and dynamics corruptions in an online manner, self-supervised with pseudo clean maps generated by style transfer networks. Because the map-based representation provides spatial knowledge for the agent's policy, our formulation can deploy the pretrained policy networks from simulators in a new setting. We evaluate MoDA in various practical scenarios and show that our proposed method quickly enhances the agent's performance in downstream tasks including localization, mapping, exploration, and point-goal navigation.
Submitted 29 November, 2022;
originally announced November 2022.
-
Learning Decentralized Strategies for a Perimeter Defense Game with Graph Neural Networks
Authors:
Elijah S. Lee,
Lifeng Zhou,
Alejandro Ribeiro,
Vijay Kumar
Abstract:
We consider the problem of finding decentralized strategies for multi-agent perimeter defense games. In this work, we design a graph neural network-based learning framework to learn a mapping from defenders' local perceptions and the communication graph to defenders' actions such that the learned actions are close to those generated by a centralized expert algorithm. We demonstrate that our proposed networks stay closer to the expert policy and are superior to other baseline algorithms by capturing more intruders. Our GNN-based networks are trained at a small scale and can generalize to large scales. To validate our results, we run perimeter defense games in scenarios with different team sizes and initial configurations to evaluate the performance of the learned networks.
Submitted 24 September, 2022;
originally announced November 2022.
-
Vision-based Perimeter Defense via Multiview Pose Estimation
Authors:
Elijah S. Lee,
Giuseppe Loianno,
Dinesh Jayaraman,
Vijay Kumar
Abstract:
Previous studies in the perimeter defense game have largely focused on the fully observable setting where the true player states are known to all players. However, this is unrealistic for practical implementation since defenders may have to perceive the intruders and estimate their states. In this work, we study the perimeter defense game in a photo-realistic simulator and the real world, requiring defenders to estimate intruder states from vision. We train a deep machine learning-based system for intruder pose detection with domain randomization that aggregates multiple views to reduce state estimation errors and adapt the defensive strategy to account for this. We newly introduce performance metrics to evaluate the vision-based perimeter defense. Through extensive experiments, we show that our approach improves state estimation, and eventually, perimeter defense performance in both 1-defender-vs-1-intruder games, and 2-defenders-vs-1-intruder games.
Submitted 24 September, 2022;
originally announced September 2022.
-
Self-supervised Multi-modal Training from Uncurated Image and Reports Enables Zero-shot Oversight Artificial Intelligence in Radiology
Authors:
Sangjoon Park,
Eun Sun Lee,
Kyung Sook Shin,
Jeong Eun Lee,
Jong Chul Ye
Abstract:
Oversight AI is an emerging concept in radiology where the AI forms a symbiosis with radiologists by continuously supporting radiologists in their decision-making. Recent advances in vision-language models shed light on the long-standing problems of oversight AI by understanding both visual and textual concepts and their semantic correspondences. However, there have been limited successes in the application of vision-language models in the medical domain, as the current vision-language models and learning strategies for photographic images and captions call for a web-scale data corpus of image and text pairs, which is often not feasible in the medical domain. To address this, here we present a model dubbed Medical Cross-attention Vision-Language model (Medical X-VL), leveraging the key components to be tailored for the medical domain. Our Medical X-VL model is based on the following components: self-supervised uni-modal models in the medical domain and a fusion encoder to bridge them, momentum distillation, sentence-wise contrastive learning for medical reports, and sentence similarity-adjusted hard negative mining. We experimentally demonstrated that our model enables various zero-shot tasks for oversight AI, ranging from zero-shot classification to zero-shot error correction. Our model outperformed the current state-of-the-art models on two different medical image databases, suggesting the novel clinical usage of our oversight AI model for monitoring human errors. Our method was especially successful in the data-limited setting, which is frequently encountered in clinics, suggesting the potential for widespread applicability in the medical domain.
Submitted 12 April, 2023; v1 submitted 10 August, 2022;
originally announced August 2022.
-
MR Image Denoising and Super-Resolution Using Regularized Reverse Diffusion
Authors:
Hyungjin Chung,
Eun Sun Lee,
Jong Chul Ye
Abstract:
Patient scans from MRI often suffer from noise, which hampers the diagnostic capability of such images. As a method to mitigate such artifacts, denoising is largely studied both within the medical imaging community and beyond the community as a general subject. However, recent deep neural network-based approaches mostly rely on the minimum mean squared error (MMSE) estimates, which tend to produce a blurred output. Moreover, such models suffer when deployed in real-world situations: out-of-distribution data, and complex noise distributions that deviate from the usual parametric noise models. In this work, we propose a new denoising method based on score-based reverse diffusion sampling, which overcomes all the aforementioned drawbacks. Our network, trained only with coronal knee scans, excels even on out-of-distribution in vivo liver MRI data, contaminated with a complex mixture of noise. Even more, we propose a method to enhance the resolution of the denoised image with the same network. With extensive experiments, we show that our method establishes state-of-the-art performance, while having desirable properties which prior MMSE denoisers did not have: flexibly choosing the extent of denoising, and quantifying uncertainty.
Submitted 23 March, 2022;
originally announced March 2022.
-
Tunable Image Quality Control of 3-D Ultrasound using Switchable CycleGAN
Authors:
Jaeyoung Huh,
Shujaat Khan,
Sungjin Choi,
Dongkuk Shin,
Eun Sun Lee,
Jong Chul Ye
Abstract:
In contrast to 2-D ultrasound (US) for uniaxial plane imaging, a 3-D US imaging system can visualize a volume along three axial planes. This allows for a full view of the anatomy, which is useful for gynecological (GYN) and obstetrical (OB) applications. Unfortunately, the 3-D US has an inherent limitation in resolution compared to the 2-D US. In the case of 3-D US with a 3-D mechanical probe, for example, the image quality is comparable along the beam direction, but significant deterioration in image quality is often observed in the other two axial image planes. To address this, here we propose a novel unsupervised deep learning approach to improve 3-D US image quality. In particular, using unmatched high-quality 2-D US images as a reference, we trained a recently proposed switchable CycleGAN architecture so that every mapping plane in 3-D US can learn the image quality of 2-D US images. Thanks to the switchable architecture, our network can also provide real-time control of image enhancement level based on user preference, which is ideal for a user-centric scanner setup. Extensive experiments with clinical evaluation confirm that our method offers significantly improved image quality as well as user-friendly flexibility.
Submitted 6 December, 2021;
originally announced December 2021.
-
Self-Supervised Domain Adaptation for Visual Navigation with Global Map Consistency
Authors:
Eun Sun Lee,
Junho Kim,
Young Min Kim
Abstract:
We propose a light-weight, self-supervised adaptation for a visual navigation agent to generalize to unseen environments. Given an embodied agent trained in a noiseless environment, our objective is to transfer the agent to a noisy environment where actuation and odometry sensor noise is present. Our method encourages the agent to maximize the consistency between the global maps generated at different time steps in a round-trip trajectory. The proposed task is completely self-supervised, not requiring any supervision from ground-truth pose data or an explicit noise model. In addition, optimization of the task objective is extremely light-weight, as training terminates within a few minutes on a commodity GPU. Our experiments show that the proposed task helps the agent to successfully transfer to new, noisy environments. The transferred agent exhibits improved localization and mapping accuracy, further leading to enhanced performance in downstream visual navigation tasks. Moreover, we demonstrate test-time adaptation with our self-supervised task to show its potential applicability in real-world deployment.
Submitted 14 October, 2021;
originally announced October 2021.
-
SGoLAM: Simultaneous Goal Localization and Mapping for Multi-Object Goal Navigation
Authors:
Junho Kim,
Eun Sun Lee,
Mingi Lee,
Donsu Zhang,
Young Min Kim
Abstract:
We present SGoLAM, short for simultaneous goal localization and mapping, which is a simple and efficient algorithm for Multi-Object Goal navigation. Given an agent equipped with an RGB-D camera and a GPS/Compass sensor, our objective is to have the agent navigate to a sequence of target objects in realistic 3D environments. Our pipeline fully leverages the strength of classical approaches for visual navigation, by decomposing the problem into two key components: mapping and goal localization. The mapping module converts the depth observations into an occupancy map, and the goal localization module marks the locations of goal objects. The agent's policy is determined using the information provided by the two modules: if a current goal is found, plan towards the goal; otherwise, perform exploration. As our approach does not require any training of neural networks, it can be used in an off-the-shelf manner and is amenable to fast generalization in new, unseen environments. Nonetheless, our approach performs on par with the state-of-the-art learning-based approaches. SGoLAM is ranked 2nd in the CVPR 2021 MultiON (Multi-Object Goal Navigation) challenge. We have made our code publicly available at https://github.com/eunsunlee/SGoLAM.
Submitted 14 October, 2021;
originally announced October 2021.
-
Infrastructure Node-based Vehicle Localization for Autonomous Driving
Authors:
Elijah S. Lee,
Ankit Vora,
Armin Parchami,
Punarjay Chakravarty,
Gaurav Pandey,
Vijay Kumar
Abstract:
Vehicle localization is essential for autonomous vehicle (AV) navigation and Advanced Driver Assistance Systems (ADAS). Accurate vehicle localization is often achieved via expensive inertial navigation systems or by employing compute-intensive vision processing (LiDAR/camera) to augment the low-cost and noisy inertial sensors. Here we have developed a framework for fusing the information obtained from a smart infrastructure node (ix-node) with the autonomous vehicle's on-board localization engine to estimate the robust and accurate pose of the ego-vehicle even with cheap inertial sensors. A smart ix-node is typically used to augment the perception capability of an autonomous vehicle, especially when the onboard perception sensors of AVs are blocked by the dynamic and static objects in the environment, thereby making them ineffectual. In this work, we utilize this perception output from an ix-node to increase the localization accuracy of the AV. The fusion of ix-node perception output with the vehicle's low-cost inertial sensors allows us to perform reliable vehicle localization without the need for relying on expensive inertial navigation systems or compute-intensive vision processing onboard the AVs. The proposed approach has been tested on real-world datasets collected from a test track in Ann Arbor, Michigan. Detailed analysis of the experimental results shows that incorporating ix-node data improves localization performance.
Submitted 21 September, 2021;
originally announced September 2021.
-
Defending a Perimeter from a Ground Intruder Using an Aerial Defender: Theory and Practice
Authors:
Elijah S. Lee,
Daigo Shishika,
Giuseppe Loianno,
Vijay Kumar
Abstract:
The perimeter defense game has received interest in recent years as a variant of the pursuit-evasion game. A number of previous works have solved this game to obtain the optimal strategies for defender and intruder, but the derived theory considers the players as point particles with first-order assumptions. In this work, we aim to apply the theory derived from the perimeter defense problem to robots with realistic models of actuation and sensing and observe performance discrepancy in relaxing the first-order assumptions. In particular, we focus on the hemisphere perimeter defense problem where a ground intruder tries to reach the base of a hemisphere while an aerial defender constrained to move on the hemisphere aims to capture the intruder. The transition from theory to practice is detailed, and the designed system is simulated in Gazebo. Two metrics for parametric analysis and comparative study are proposed to evaluate the performance discrepancy.
Submitted 7 September, 2021;
originally announced September 2021.
-
Active Perception with Neural Networks
Authors:
Elijah S. Lee
Abstract:
Active perception has been employed in many domains, particularly in the field of robotics. The idea of active perception is to utilize the input data to predict the next action that can help robots to improve their performance. The main challenge lies in understanding the input data to be coupled with the action, and gathering meaningful information of the environment in an efficient way is necessary and desired. With recent developments of neural networks, interpreting the perceived data has become possible at the semantic level, and real-time interpretation based on deep learning has enabled the efficient closing of the perception-action loop. This report highlights recent progress in employing active perception based on neural networks for single and multi-agent systems.
Submitted 6 September, 2021;
originally announced September 2021.
-
Perimeter-defense Game between Aerial Defender and Ground Intruder
Authors:
Elijah S. Lee,
Daigo Shishika,
Vijay Kumar
Abstract:
We study a variant of pursuit-evasion game in the context of perimeter defense. In this problem, the intruder aims to reach the base plane of a hemisphere without being captured by the defender, while the defender tries to capture the intruder. The perimeter-defense game was previously studied under the assumption that the defender moves on a circle. We extend the problem to the case where the defender moves on a hemisphere. To solve this problem, we analyze the strategies based on the breaching point at which the intruder tries to reach the target and predict the goal position, defined as optimal breaching point, that is achieved by the optimal strategies on both players. We provide the barrier that divides the state space into defender-winning and intruder-winning regions and prove that the optimal strategies for both players are to move towards the optimal breaching point. Simulation results are presented to demonstrate that the optimality of the game is given as a Nash equilibrium.
Submitted 29 December, 2020;
originally announced December 2020.
-
SLOAM: Semantic Lidar Odometry and Mapping for Forest Inventory
Authors:
Steven W. Chen,
Guilherme V. Nardari,
Elijah S. Lee,
Chao Qu,
Xu Liu,
Roseli A. F. Romero,
Vijay Kumar
Abstract:
This paper describes an end-to-end pipeline for tree diameter estimation based on semantic segmentation and lidar odometry and mapping. Accurate mapping of this type of environment is challenging since the ground and the trees are surrounded by leaves, thorns and vines, and the sensor typically experiences extreme motion. We propose a semantic feature based pose optimization that simultaneously refines the tree models while estimating the robot pose. The pipeline utilizes a custom virtual reality tool for labeling 3D scans that is used to train a semantic segmentation network. The masked point cloud is used to compute a trellis graph that identifies individual instances and extracts relevant features that are used by the SLAM module. We show that traditional lidar and image based methods fail in the forest environment on both Unmanned Aerial Vehicle (UAV) and hand-carry systems, while our method is more robust, scalable, and automatically generates tree diameter estimations.
Submitted 29 December, 2019;
originally announced December 2019.
-
Mine Tunnel Exploration using Multiple Quadrupedal Robots
Authors:
Ian D. Miller,
Fernando Cladera,
Anthony Cowley,
Shreyas S. Shivakumar,
Elijah S. Lee,
Laura Jarin-Lipschitz,
Akhilesh Bhat,
Neil Rodrigues,
Alex Zhou,
Avraham Cohen,
Adarsh Kulkarni,
James Laney,
Camillo Jose Taylor,
Vijay Kumar
Abstract:
Robotic exploration of underground environments is a particularly challenging problem due to communication, endurance, and traversability constraints, which necessitate high degrees of autonomy and agility. These challenges are further exacerbated by the need to minimize human intervention for practical applications. While legged robots have the ability to traverse extremely challenging terrain, they also engender new challenges for planning, estimation, and control. In this work, we describe a fully autonomous system for multi-robot mine exploration and mapping using legged quadrupeds, as well as a distributed database mesh networking system for reporting data. In addition, we show results from the DARPA Subterranean Challenge (SubT) Tunnel Circuit demonstrating localization of artifacts after traversals of hundreds of meters. These experiments demonstrate fully autonomous exploration of an unknown Global Navigation Satellite System (GNSS)-denied environment undertaken by legged robots.
Submitted 3 February, 2020; v1 submitted 20 September, 2019;
originally announced September 2019.
-
MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks
Authors:
Ty Nguyen,
Shreyas S. Shivakumar,
Ian D. Miller,
James Keller,
Elijah S. Lee,
Alex Zhou,
Tolga Ozaslan,
Giuseppe Loianno,
Joseph H. Harwood,
Jennifer Wozencraft,
Camillo J. Taylor,
Vijay Kumar
Abstract:
Real-time semantic image segmentation on platforms subject to size, weight and power (SWaP) constraints is a key area of interest for air surveillance and inspection. In this work, we propose MAVNet: a small, lightweight, deep neural network for real-time semantic segmentation on Micro Aerial Vehicles (MAVs). MAVNet, inspired by ERFNet, features 400 times fewer parameters and achieves performance comparable to some reference models in empirical experiments. Our model achieves a trade-off between speed and accuracy, reaching up to 48 FPS on an NVIDIA 1080Ti and 9 FPS on the NVIDIA Jetson Xavier when processing high-resolution imagery. Additionally, we provide two novel datasets that represent challenges in semantic segmentation for real-time MAV tracking and infrastructure inspection tasks and verify MAVNet on these datasets. Our algorithm and datasets are made publicly available.
Submitted 8 June, 2019; v1 submitted 3 April, 2019;
originally announced April 2019.