-
An Efficient Real-Time Planning Method for Swarm Robotics Based on an Optimal Virtual Tube
Authors:
Pengda Mao,
Shuli Lv,
Chen Min,
Zhaolong Shen,
Quan Quan
Abstract:
Navigating robot swarms through unknown obstacle environments is an emerging and challenging research area. Performing tasks in such environments requires swarms to achieve autonomous localization, perception, decision-making, control, and planning. The limited computational resources of onboard platforms present significant challenges for planning and control. Reactive planners offer low computational demands and high re-planning frequencies but lack predictive capabilities, often resulting in local minima. Long-horizon planners, on the other hand, can perform multi-step predictions to reduce deadlocks but are computationally expensive, leading to lower re-planning frequencies. This paper proposes a real-time optimal virtual tube planning method for swarm robotics in unknown environments, which generates approximate solutions for optimal trajectories through affine functions. As a result, the computational complexity of approximate solutions is $O(n_t)$, where $n_t$ is the number of parameters in the trajectory, thereby significantly reducing the overall computational burden. By integrating reactive methods, the proposed method enables low-computation, safe swarm motion in unknown environments. The effectiveness of the proposed method is validated through several simulations and experiments.
Submitted 2 May, 2025;
originally announced May 2025.
-
Correspondence-Free Pose Estimation with Patterns: A Unified Approach for Multi-Dimensional Vision
Authors:
Quan Quan,
Dun Dai
Abstract:
6D pose estimation is a central problem in robot vision. Compared with pose estimation based on point correspondences or its robust versions, correspondence-free methods are often more flexible. However, existing correspondence-free methods often rely on feature representation alignment or end-to-end regression. To this end, a new correspondence-free pose estimation method and its practical algorithms are proposed; the key idea is to eliminate unknowns by a process of addition so as to separate pose estimation from correspondence. By taking the considered point sets as patterns, feature functions used to describe these patterns are introduced to establish a sufficient number of equations for optimization. The proposed method is applicable to nonlinear transformations such as perspective projection and can cover various pose estimations from 3D-to-3D points, 3D-to-2D points, and 2D-to-2D points. Experimental results on both simulated and real data are presented to demonstrate the effectiveness of the proposed method.
Submitted 26 February, 2025;
originally announced March 2025.
-
Navigating Robot Swarm Through a Virtual Tube with Flow-Adaptive Distribution Control
Authors:
Yongwei Zhang,
Shuli Lv,
Kairong Liu,
Quanyi Liang,
Quan Quan,
Zhikun She
Abstract:
With the rapid development of robot swarm technology and its diverse applications, navigating robot swarms through complex environments has emerged as a critical research direction. To ensure safe navigation and avoid potential collisions with obstacles, the concept of virtual tubes has been introduced to define safe and navigable regions. However, current control methods in virtual tubes face congestion issues, particularly in narrow virtual tubes with low throughput. To address these challenges, we first introduce the concepts of virtual tube area and flow capacity, and develop a new evolution model for the spatial density function. Next, we propose a novel control method that combines a modified artificial potential field (APF) for swarm navigation and density feedback control for distribution regulation, under which a saturated velocity command is designed. Then, we generate a global velocity field that not only ensures collision-free navigation through the virtual tube but also achieves local input-to-state stability (LISS) for density tracking errors, both of which are rigorously proven. Finally, numerical simulations and realistic applications validate the effectiveness and advantages of the proposed method in managing robot swarms within narrow virtual tubes.
Submitted 21 January, 2025;
originally announced January 2025.
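The abstract above mentions a saturated velocity command built from a navigation term and a density-feedback term. A minimal sketch of that general idea follows, assuming the two terms are simply summed and the result is clipped in norm; the function names, the summation, and the norm-based saturation are illustrative assumptions and are not taken from the paper.

```python
import math


def saturate(v, v_max):
    """Scale vector v so that its Euclidean norm does not exceed v_max."""
    norm = math.hypot(*v)
    if norm <= v_max or norm == 0.0:
        return tuple(v)
    scale = v_max / norm
    return tuple(scale * c for c in v)


def velocity_command(v_nav, v_density, v_max):
    """Hypothetical command: navigation term plus density term, then saturated.

    v_nav     -- velocity suggested by a potential-field style navigation law
    v_density -- correction suggested by density feedback
    v_max     -- speed limit of the robot (assumed positive)
    """
    total = tuple(a + b for a, b in zip(v_nav, v_density))
    return saturate(total, v_max)


cmd = velocity_command((3.0, 0.0), (0.0, 4.0), v_max=2.0)
print(cmd)
```

Norm-based saturation keeps the direction of the combined command and only limits its magnitude, which is one common choice when the speed bound is isotropic.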
-
A Degree of Flowability for Virtual Tubes
Authors:
Quan Quan,
Shuhan Huang,
Kai-Yuan Cai
Abstract:
With the rapid development of robot swarm technology, more tasks require the swarm to pass through complicated environments safely and efficiently. Virtual tube technology is a novel way to achieve this goal. Virtual tubes are free spaces connecting two places that provide safety boundaries and direction of motion for swarm robotics. How to determine the design quality of a virtual tube is a fundamental problem. To this end, this paper presents a degree of flowability (DOF) for two-dimensional virtual tubes according to a minimum energy principle. After that, methods to calculate DOF are proposed with a feasibility analysis. Simulations of swarm robotics in different kinds of two-dimensional virtual tubes are performed to demonstrate the effectiveness of the proposed method of calculating DOF.
Submitted 3 November, 2024; v1 submitted 29 October, 2024;
originally announced October 2024.
-
HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training
Authors:
Fenghe Tang,
Ronghao Xu,
Qingsong Yao,
Xueming Fu,
Quan Quan,
Heqin Zhu,
Zaiyi Liu,
S. Kevin Zhou
Abstract:
The generative self-supervised learning strategy exhibits remarkable representation learning capabilities. However, limited attention has been paid to end-to-end pre-training methods based on a hybrid architecture of CNN and Transformer, which can learn strong local and global representations simultaneously. To address this issue, we propose a generative pre-training strategy called Hybrid Sparse masKing (HySparK) based on masked image modeling and apply it to large-scale pre-training on medical images. First, we perform a bottom-up 3D hybrid masking strategy on the encoder to keep the masking consistent. Then we utilize sparse convolution for the top CNNs and encode unmasked patches for the bottom vision Transformers. Second, we employ a simple hierarchical decoder with skip-connections to achieve dense multi-scale feature reconstruction. Third, we implement our pre-training method on a collection of multiple large-scale 3D medical imaging datasets. Extensive experiments indicate that our proposed pre-training strategy demonstrates robust transferability in supervised downstream tasks and sheds light on HySparK's promising prospects. The code is available at https://github.com/FengheTan9/HySparK
Submitted 11 August, 2024;
originally announced August 2024.
-
Tube RRT*: Efficient Homotopic Path Planning for Swarm Robotics Passing-Through Large-Scale Obstacle Environments
Authors:
Pengda Mao,
Shuli Lv,
Quan Quan
Abstract:
Recently, the concept of homotopic trajectory planning has emerged as a novel solution to navigation in large-scale obstacle environments for swarm robotics, offering a wide range of applications. However, an efficient homotopic path planning method for large-scale obstacle environments is still lacking. This paper introduces Tube RRT*, an innovative homotopic path planning method that builds upon and improves the Rapidly-exploring Random Tree (RRT) algorithm. Tube RRT* is specifically designed to generate homotopic paths, strategically considering gap volume and path length to mitigate swarm congestion and ensure agile navigation. Through comprehensive simulations and experiments, the effectiveness of Tube RRT* is validated.
Submitted 31 October, 2024; v1 submitted 14 April, 2024;
originally announced April 2024.
-
High-Speed Interception Multicopter Control by Image-based Visual Servoing
Authors:
Kun Yang,
Chenggang Bai,
Zhikun She,
Quan Quan
Abstract:
In recent years, reports of illegal drones threatening public safety have increased. Against intrusions by fully autonomous drones, traditional methods such as radio frequency interference and GPS shielding may fail. This paper proposes a scheme that uses an autonomous multicopter with a strapdown camera to intercept a maneuvering intruder UAV. The interceptor multicopter can autonomously detect and intercept intruders moving at high speed in the air. The strapdown camera avoids the complex mechanical structure of the electro-optical pod, making the interceptor multicopter compact. However, the coupling of the camera and multicopter motion makes interception tasks difficult. To solve this problem, an Image-Based Visual Servoing (IBVS) controller is proposed to make the interception fast and accurate. Then, in response to the time delay of sensor imaging and image processing relative to attitude changes in high-speed scenarios, a Delayed Kalman Filter (DKF) observer is generalized to predict the current image position and increase the update frequency. Finally, Hardware-in-the-Loop (HITL) simulations and outdoor flight experiments verify that this method achieves high interception accuracy and a high success rate. In the flight experiments, a high-speed interception is achieved with a terminal speed of 20 m/s.
Submitted 12 April, 2024;
originally announced April 2024.
-
APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation
Authors:
Zikang Xu,
Fenghe Tang,
Quan Quan,
Qingsong Yao,
S. Kevin Zhou
Abstract:
Ensuring fairness in deep-learning-based segmentors is crucial for health equity. Much effort has been dedicated to mitigating unfairness in the training datasets or procedures. However, with the increasing prevalence of foundation models in medical image analysis, it is hard to train fair models from scratch while preserving utility. In this paper, we propose a novel method, Adversarial Privacy-aware Perturbations on Latent Embedding (APPLE), that can improve the fairness of deployed segmentors by introducing a small latent feature perturber without updating the weights of the original model. By adding perturbation to the latent vector, APPLE decorates the latent vector of segmentors such that no fairness-related features can be passed to the decoder of the segmentors while preserving the architecture and parameters of the segmentor. Experiments on two segmentation datasets and five segmentors (three U-Net-like and two SAM-like) illustrate the effectiveness of our proposed method compared to several unfairness mitigation methods.
Submitted 8 March, 2024;
originally announced March 2024.
-
Inspecting Model Fairness in Ultrasound Segmentation Tasks
Authors:
Zikang Xu,
Fenghe Tang,
Quan Quan,
Jianrui Ding,
Chunping Ning,
S. Kevin Zhou
Abstract:
With the rapid expansion of machine learning and deep learning (DL), researchers are increasingly employing learning-based algorithms to alleviate diagnostic challenges across diverse medical tasks and applications. While advancements in diagnostic precision are notable, some researchers have identified a concerning trend: their models exhibit biased performance across subgroups characterized by different sensitive attributes. This bias not only infringes upon the rights of patients but also has the potential to lead to life-altering consequences. In this paper, we inspect a series of DL segmentation models using two ultrasound datasets, aiming to assess the presence of model unfairness in these specific tasks. Our findings reveal that even state-of-the-art DL algorithms demonstrate unfair behavior in ultrasound segmentation tasks. These results serve as a crucial warning, underscoring the necessity for careful model evaluation before their deployment in real-world scenarios. Such assessments are imperative to ensure ethical considerations and mitigate the risk of adverse impacts on patient outcomes.
Submitted 5 December, 2023;
originally announced December 2023.
-
MobileUtr: Revisiting the relationship between light-weight CNN and Transformer for efficient medical image segmentation
Authors:
Fenghe Tang,
Bingkun Nian,
Jianrui Ding,
Quan Quan,
Jie Yang,
Wei Liu,
S. Kevin Zhou
Abstract:
Due to the scarcity and specific imaging characteristics of medical images, making Vision Transformers (ViTs) lightweight for efficient medical image segmentation is a significant challenge, and current studies have not yet paid attention to this issue. This work revisits the relationship between CNNs and Transformers in lightweight universal networks for medical image segmentation, aiming to integrate the advantages of both worlds at the infrastructure design level. In order to leverage the inductive bias inherent in CNNs, we abstract a Transformer-like lightweight CNN block (ConvUtr) as the patch embeddings of ViTs, feeding the Transformer with denoised, non-redundant and highly condensed semantic information. Moreover, an adaptive Local-Global-Local (LGL) block is introduced to facilitate efficient local-to-global information flow exchange, maximizing the Transformer's global context information extraction capabilities. Finally, we build an efficient medical image segmentation model (MobileUtr) based on CNN and Transformer. Extensive experiments on five public medical image datasets with three different modalities demonstrate the superiority of MobileUtr over the state-of-the-art methods, while boasting lighter weights and lower computational cost. Code is available at https://github.com/FengheTan9/MobileUtr.
Submitted 4 December, 2023;
originally announced December 2023.
-
RflyMAD: A Dataset for Multicopter Fault Detection and Health Assessment
Authors:
Xiangli Le,
Bo Jin,
Gen Cui,
Xunhua Dai,
Quan Quan
Abstract:
This paper presents an open-source dataset, RflyMAD, a Multicopter Abnormal Dataset developed by the Reliable Flight Control (Rfly) Group, aiming to promote the development of research fields like fault detection and isolation (FDI) or health assessment (HA). The entire 114 GB dataset includes 11 types of faults under 6 flight statuses, which are adapted from the ADS-33 file to cover more occasions in which the multicopters have different mobility levels when faults occur. In the total 5629 flight cases, the fault time is up to 3283 minutes, and there are 2566 cases for software-in-the-loop (SIL) simulation, 2566 cases for hardware-in-the-loop (HIL) simulation and 497 cases for real flight. As it contains simulation data based on RflySim and real flight data, it is possible to improve the quantity while increasing the data quality. In each case, there are ULog, Telemetry log, Flight information and processed files for researchers to use and check. The RflyMAD dataset could be used as a benchmark for fault diagnosis methods, and the support relationship between simulation data and real flight is verified through transfer learning methods. More baseline methods will be presented in the future, and RflyMAD will be updated with more data and types. In addition, the dataset and related toolkit can be accessed through https://rfly-openha.github.io/documents/4_resources/dataset.html.
Submitted 11 January, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Slide-SAM: Medical SAM Meets Sliding Window
Authors:
Quan Quan,
Fenghe Tang,
Zikang Xu,
Heqin Zhu,
S. Kevin Zhou
Abstract:
The Segment Anything Model (SAM) has achieved notable success in two-dimensional image segmentation in natural images. However, the substantial gap between medical and natural images hinders its direct application to medical image segmentation tasks. Particularly in 3D medical images, SAM struggles to learn contextual relationships between slices, limiting its practical applicability. Moreover, applying 2D SAM to 3D images requires prompting the entire volume, which is time- and label-consuming. To address these problems, we propose Slide-SAM, which treats a stack of three adjacent slices as a prediction window. It first takes three slices from a 3D volume and point or bounding-box prompts on the central slice as inputs to predict segmentation masks for all three slices. Subsequently, the masks of the top and bottom slices are used to generate new prompts for adjacent slices. Finally, step-wise prediction can be achieved by sliding the prediction window forward or backward through the entire volume. Our model is trained on multiple public and private medical datasets and demonstrates its effectiveness through extensive 3D segmentation experiments, with the help of minimal prompts. Code is available at https://github.com/Curli-quan/Slide-SAM.
Submitted 16 April, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
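The sliding-window procedure described in the Slide-SAM abstract above can be sketched roughly as follows. This is only an outline under stated assumptions: `predict_window` and `mask_to_prompt` are hypothetical callables standing in for the model and the prompt generator, the window is assumed to advance one slice at a time, and the first mask predicted for a slice is the one kept. None of these details are confirmed by the abstract.

```python
def slide_predict(num_slices, start, user_prompt, predict_window, mask_to_prompt):
    """Propagate a prompt from slice `start` through a volume of `num_slices`.

    predict_window(center, prompt) -> (mask_below, mask_center, mask_above)
        hypothetical model call on the three-slice window around `center`
    mask_to_prompt(mask) -> prompt
        hypothetical conversion of a predicted mask into the next prompt
    """
    masks = {}

    def run(center, prompt):
        window = predict_window(center, prompt)
        for offset, mask in zip((-1, 0, 1), window):
            idx = center + offset
            if 0 <= idx < num_slices:
                masks.setdefault(idx, mask)  # keep the first prediction
        return window

    first = run(start, user_prompt)

    # Slide forward: the mask of the upper slice seeds the next window.
    prompt = mask_to_prompt(first[2])
    for center in range(start + 1, num_slices):
        prompt = mask_to_prompt(run(center, prompt)[2])

    # Slide backward: the mask of the lower slice seeds the next window.
    prompt = mask_to_prompt(first[0])
    for center in range(start - 1, -1, -1):
        prompt = mask_to_prompt(run(center, prompt)[0])

    return [masks[i] for i in range(num_slices)]


# Stub model for illustration: each "mask" is just a label for its slice.
def stub_model(center, prompt):
    return (f"m{center - 1}", f"m{center}", f"m{center + 1}")


result = slide_predict(5, 2, "click", stub_model, lambda mask: mask)
print(result)
```

With the stub, only one user prompt on slice 2 is needed and every slice of the five-slice volume receives a mask, which mirrors the "minimal prompts" point of the abstract.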
-
A Survey on Passing-through Control of Multi-Robot Systems in Cluttered Environments
Authors:
Yan Gao,
Chenggang Bai,
Quan Quan
Abstract:
This survey presents a comprehensive review of various methods and algorithms related to passing-through control of multi-robot systems in cluttered environments. Numerous studies have investigated this area, and we identify several avenues for enhancing existing methods. This survey describes some models of robots and commonly considered control objectives, followed by an in-depth analysis of four types of algorithms that can be employed for passing-through control: leader-follower formation control, multi-robot trajectory planning, control-based methods, and virtual tube planning and control. Furthermore, we conduct a comparative analysis of these techniques and provide some subjective and general evaluations.
Submitted 13 November, 2023;
originally announced November 2023.
-
Speed and Density Planning for a Speed-Constrained Robot Swarm Through a Virtual Tube
Authors:
Wenqi Song,
Yan Gao,
Quan Quan
Abstract:
The planning and control of a robot swarm in a complex environment have attracted increasing attention. To this end, the idea of virtual tubes has been taken up in our previous work. Specifically, a virtual tube with varying widths has been planned to avoid collisions with obstacles in a complex environment. Based on the planned virtual tube for a large number of speed-constrained robots, the average forward speed and density along the virtual tube are further planned in this paper to ensure safety and improve efficiency. Compared with the existing methods, the proposed method is based on global information and can be applied to traversing narrow spaces for speed-constrained robot swarms. Numerical simulations and experiments are conducted to show that the safety and efficiency of the passing-through process are improved. A video about simulations and experiments is available at https://youtu.be/lJHdMQMqSpc.
Submitted 1 October, 2023;
originally announced October 2023.
-
Autonomous Drone Racing: Time-Optimal Spatial Iterative Learning Control within a Virtual Tube
Authors:
Shuli Lv,
Yan Gao,
Jiaxing Che,
Quan Quan
Abstract:
It is often necessary for drones to complete delivery, photography, and rescue in the shortest time to increase efficiency. Many autonomous drone races provide platforms to pursue algorithms to finish races as quickly as possible for the above purpose. Unfortunately, existing methods often fail to keep training and racing time short in drone racing competitions. This motivates us to develop a highly efficient learning method by imitating the training experience of top racing drivers. Unlike traditional iterative learning control methods for accurate tracking, the proposed approach iteratively learns a trajectory online to finish the race as quickly as possible. Simulations and experiments using different models show that the proposed approach is model-free and is able to achieve the optimal result with low computation requirements. Furthermore, this approach surpasses some state-of-the-art methods in racing time on a benchmark drone racing platform. An experiment on a real quadcopter is also performed to demonstrate its effectiveness.
Submitted 28 June, 2023;
originally announced June 2023.
-
UOD: Universal One-shot Detection of Anatomical Landmarks
Authors:
Heqin Zhu,
Quan Quan,
Qingsong Yao,
Zaiyi Liu,
S. Kevin Zhou
Abstract:
One-shot medical landmark detection has gained much attention and achieved great success thanks to its label-efficient training process. However, existing one-shot learning methods are highly specialized in a single domain and suffer heavily from domain preference when faced with multi-domain unlabeled data. Moreover, one-shot learning is not robust: it suffers a performance drop when a sub-optimal image is annotated. To tackle these issues, we develop a domain-adaptive one-shot landmark detection framework for handling multi-domain medical images, named Universal One-shot Detection (UOD). UOD consists of two stages and two corresponding universal models, which are designed as combinations of domain-specific modules and domain-shared modules. In the first stage, a domain-adaptive convolution model is trained in a self-supervised manner to generate pseudo landmark labels. In the second stage, we design a domain-adaptive transformer to eliminate domain preference and build the global context for multi-domain data. Even though only one annotated sample from each domain is available for training, the domain-shared modules help UOD aggregate all one-shot samples to detect more robust and accurate landmarks. We evaluated the proposed UOD both qualitatively and quantitatively on three widely used public X-ray datasets in different anatomical domains (i.e., head, hand, chest) and obtained state-of-the-art performance in each domain. The code is available at https://github.com/heqin-zhu/UOD_universal_oneshot_detection.
Submitted 17 July, 2023; v1 submitted 13 June, 2023;
originally announced June 2023.
-
Unsupervised augmentation optimization for few-shot medical image segmentation
Authors:
Quan Quan,
Shang Zhao,
Qingsong Yao,
Heqin Zhu,
S. Kevin Zhou
Abstract:
The augmentation parameters matter to few-shot semantic segmentation since they directly affect the training outcome by feeding the networks with varying perturbed samples. However, searching for optimal augmentation parameters for few-shot segmentation models without annotations is a challenge that current methods fail to address. In this paper, we first propose a framework to determine the "optimal" parameters without human annotations by solving a distribution-matching problem between the intra-instance and intra-class similarity distributions, with the intra-instance similarity describing the similarity between the original sample of a particular anatomy and its augmented ones and the intra-class similarity representing the similarity between the selected sample and the others in the same class. Extensive experiments demonstrate the superiority of our optimized augmentation in boosting few-shot segmentation models. We improve the top competing method by 1.27% and 1.11% on the Abd-MRI and Abd-CT datasets, respectively, and even achieve a significant improvement for SSL-ALP on the left kidney by 3.39% on the Abd-CT dataset.
Submitted 8 June, 2023;
originally announced June 2023.
-
An Image Based Visual Servo Method for Probe-and-Drogue Autonomous Aerial Refueling
Authors:
Quan Quan,
Runxiao Liu,
Hao Liu,
Zeqing Ma,
Jinrui Ren
Abstract:
With the recent focus on autonomous aerial refueling (AAR), it has become increasingly urgent to design efficient methods or algorithms to solve AAR problems in complicated aerial environments. Apart from the complex aerodynamic disturbance, another problem is the pose estimation error caused by the camera calibration error, installation error, or 3D object modeling error, which may not meet the accuracy required for docking. The main objective of the effort described in this paper is the implementation of an image-based visual servo control method, which comprises the establishment of an image-based visual servo model involving the receiver's dynamics and the design of the corresponding controller. Simulation results indicate that the proposed method can make the system dock successfully under complicated conditions and improve the robustness against pose estimation error.
Submitted 27 May, 2023;
originally announced May 2023.
-
Optimal Virtual Tube Planning and Control for Swarm Robotics
Authors:
Pengda Mao,
Rao Fu,
Quan Quan
Abstract:
This paper presents a novel method for efficiently solving a trajectory planning problem for swarm robotics in cluttered environments. Recent research has demonstrated high success rates in real-time local trajectory planning for swarm robotics in cluttered environments, but optimizing trajectories for each robot is still computationally expensive, with a computational complexity from $O\left(k\left(n_t,\varepsilon \right)n_t^2\right)$ to $O\left(k\left(n_t,\varepsilon \right)n_t^3\right)$, where $n_t$ is the number of parameters in the parameterized trajectory, $\varepsilon$ is the precision, and $k\left(n_t,\varepsilon \right)$ is the number of iterations with respect to $n_t$ and $\varepsilon$. Furthermore, it is difficult for the swarm to move as a group. To address these issues, we define and then construct the optimal virtual tube, which includes infinitely many optimal trajectories. Under certain conditions, any optimal trajectory in the optimal virtual tube can be expressed as a convex combination of a finite number of optimal trajectories, with a computational complexity of $O\left(n_t\right)$. Afterward, a hierarchical approach is proposed that combines an energy-minimizing planning method for the optimal virtual tube with distributed model predictive control. In simulations and experiments, the proposed approach is validated and its effectiveness over other methods is demonstrated through comparison.
Submitted 23 October, 2023; v1 submitted 22 April, 2023;
originally announced April 2023.
-
Conductivity Imaging from Internal Measurements with Mixed Least-Squares Deep Neural Networks
Authors:
Bangti Jin,
Xiyao Li,
Qimeng Quan,
Zhi Zhou
Abstract:
In this work we develop a novel approach using deep neural networks to reconstruct the conductivity distribution in elliptic problems from one measurement of the solution over the whole domain. The approach is based on a mixed reformulation of the governing equation and utilizes the standard least-squares objective, with deep neural networks as ansatz functions to approximate the conductivity and flux simultaneously. We provide a thorough analysis of the deep neural network approximations of the conductivity for both continuous and empirical losses, including rigorous error estimates that are explicit in terms of the noise level, various penalty parameters and neural network architectural parameters (depth, width and parameter bound). We also provide multiple numerical experiments in two- and multi-dimensions to illustrate distinct features of the approach, e.g., excellent stability with respect to data noise and capability of solving high-dimensional problems.
Submitted 19 December, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
GDDS: Pulmonary Bronchioles Segmentation with Group Deep Dense Supervision
Authors:
Mingyue Zhao,
Shang Zhao,
Quan Quan,
Li Fan,
Xiaolan Qiu,
Shiyuan Liu,
S. Kevin Zhou
Abstract:
Airway segmentation, especially bronchiole segmentation, is an important but challenging task because distal bronchi are sparsely distributed and of a fine scale. Existing neural networks usually exploit sparse topology to learn the connectivity of bronchioles and inefficient shallow features to capture such high-frequency information, leading to the breakage or missed detection of individual thin branches. To address these problems, we contribute a new bronchial segmentation method based on Group Deep Dense Supervision (GDDS) that emphasizes fine-scale bronchiole segmentation in a simple but effective manner. First, Deep Dense Supervision (DDS) is proposed by constructing a local dense topology and implementing dense topological learning on a specific shallow feature layer. GDDS further gives the shallow features a better ability to detect bronchioles, even those that are not easily discernible to the naked eye. Extensive experiments on the BAS benchmark dataset show that our method gives the network a high sensitivity in capturing fine-scale branches and outperforms state-of-the-art methods by a large margin (+12.8% in BD and +8.8% in TD) while introducing only a small number of extra parameters.
Submitted 16 March, 2023;
originally announced March 2023.
-
FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification
Authors:
Zikang Xu,
Shang Zhao,
Quan Quan,
Qingsong Yao,
S. Kevin Zhou
Abstract:
Deep learning is becoming increasingly ubiquitous in medical research and applications while involving sensitive information and even critical diagnosis decisions. Researchers observe a significant performance disparity among subgroups with different demographic attributes, which is called model unfairness, and put lots of effort into carefully designing elegant architectures to address unfairness, which poses heavy training burden, brings poor generalization, and reveals the trade-off between model performance and fairness. To tackle these issues, we propose FairAdaBN by making batch normalization adaptive to sensitive attribute. This simple but effective design can be adopted to several classification backbones that are originally unaware of fairness. Additionally, we derive a novel loss function that restrains statistical parity between subgroups on mini-batches, encouraging the model to converge with considerable fairness. In order to evaluate the trade-off between model performance and fairness, we propose a new metric, named Fairness-Accuracy Trade-off Efficiency (FATE), to compute normalized fairness improvement over accuracy drop. Experiments on two dermatological datasets show that our proposed method outperforms other methods on fairness criteria and FATE.
Submitted 4 July, 2023; v1 submitted 14 March, 2023;
originally announced March 2023.
-
Lifting-wing Quadcopter Modeling and Unified Control
Authors:
Quan Quan,
Wang Shuai,
Gao Wenhan
Abstract:
Hybrid unmanned aerial vehicles (UAVs) integrate the efficient forward flight of fixed-wing UAVs and the vertical takeoff and landing (VTOL) capabilities of multicopter UAVs. This paper presents the modeling, control and simulation of a new type of hybrid micro-small UAV, termed the lifting-wing quadcopter. The airframe orientation of the lifting wing needs to tilt by a specific angle, often within $45$ degrees, neither nearly $90$ nor approximately $0$ degrees. Compared with some convertiplane and tail-sitter UAVs, the lifting-wing quadcopter has a highly reliable structure, robust wind resistance, low cruise speed and reliable transition flight, giving it the potential to work fully autonomously outdoors or in confined indoor airspace. In the modeling part, forces and moments generated by both the lifting wing and the rotors are considered. Based on the established model, a unified controller for the full flight phase is designed. The controller treats hovering and forward flight uniformly and enables a continuous transition between the two modes, depending on the velocity command. Moreover, by taking rotor thrust and aerodynamic force into consideration simultaneously, a control allocation based on optimization is utilized to realize cooperative control for energy saving. Finally, comprehensive Hardware-In-the-Loop (HIL) simulations are performed to verify the advantages of the designed aircraft and the proposed controller.
Submitted 2 January, 2023;
originally announced January 2023.
-
Differential Flatness of Lifting-Wing Quadcopters Subject to Drag and Lift for Accurate Tracking
Authors:
Shuai Wang,
Wenhan Gao,
Quan Quan
Abstract:
In this paper, we propose an effective unified control law for accurately tracking agile trajectories for lifting-wing quadcopters with different installation angles, which have the capability of vertical takeoff and landing (VTOL) as well as high-speed cruise flight. First, we derive a differential flatness transform for the lifting-wing dynamics with a nonlinear model under the coordinated-turn condition. To increase the tracking performance on agile trajectories, the proposed controller incorporates the state and input variables calculated from differential flatness as feedforward. In particular, the jerk, the third-order derivative of the trajectory, is converted into angular velocity as a feedforward term, which significantly improves the system bandwidth. At the same time, feedback and feedforward outputs are combined to deal with external disturbances and model mismatch. The control algorithm has been thoroughly evaluated in outdoor flight tests, which show that it can achieve accurate trajectory tracking.
Submitted 25 December, 2022;
originally announced December 2022.
-
Distributed Control within a Trapezoid Virtual Tube Containing Obstacles for Robotic Swarms Subject to Speed Constraints
Authors:
Yan Gao,
Chenggang Bai,
Quan Quan
Abstract:
In our previous work, we designed a trapezoid virtual tube to guide robotic swarms through narrow openings. This paper extends the application of the trapezoid virtual tube to situations where there are static obstacles inside and robots have strict speed constraints. We first propose a distributed swarm controller for the trapezoid virtual tube without obstacles and present the relationship between the trapezoid virtual tube and speed constraints. Then a switching logic for obstacle avoidance is proposed by dividing the trapezoid virtual tube containing static obstacles into several sub trapezoid virtual tubes without obstacles. Formal analyses and proofs are presented to demonstrate that all robots can pass through the trapezoid virtual tube safely. In addition, we validate the effectiveness of our method through numerical simulations and real experiments.
Submitted 22 September, 2024; v1 submitted 23 December, 2022;
originally announced December 2022.
-
Uniform Passive Fault-Tolerant Control of a Quadcopter with One, Two, or Three Rotor Failure
Authors:
Chenxu Ke,
Kai-Yuan Cai,
Quan Quan
Abstract:
This study proposes a uniform passive fault-tolerant control (FTC) method that does not rely on fault information for a quadcopter subject to the failure of one, two adjacent, two opposite, or three rotors. The uniform control implies that the passive FTC is able to cover conditions from a fault-free quadcopter to rotor failure without controller switching. To achieve the purpose of the passive FTC, the rotors' fault is modeled as a disturbance acting on the virtual control of the quadcopter system. The disturbance estimate is used directly for the passive FTC with rotor failure. To avoid controller switching between normal control and FTC, a dynamic control allocation is used. In addition, the closed-loop stability is analyzed and a virtual control feedback is adopted to achieve the passive FTC for the quadcopter with two and three rotor failures. To validate the proposed uniform passive FTC method, outdoor experiments are performed for the first time, which demonstrate that the hovering quadcopter is able to recover from one rotor failure by the proposed controller and continue to fly even if two adjacent, two opposite, or three rotors fail, without any rotor fault information or controller switching.
Submitted 25 December, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Information-guided pixel augmentation for pixel-wise contrastive learning
Authors:
Quan Quan,
Qingsong Yao,
Jun Li,
S. Kevin Zhou
Abstract:
Contrastive learning (CL) is a form of self-supervised learning and has been widely used for various tasks. Different from widely studied instance-level contrastive learning, pixel-wise contrastive learning mainly helps with pixel-wise tasks such as medical landmark detection. The counterpart to an instance in instance-level CL is a pixel, along with its neighboring context, in pixel-wise CL. Aiming to build better feature representation, there is a vast literature about designing instance augmentation strategies for instance-level CL; but there is little similar work on pixel augmentation for pixel-wise CL with a pixel granularity. In this paper, we attempt to bridge this gap. We first classify a pixel into three categories, namely low-, medium-, and high-informative, based on the information quantity the pixel contains. Inspired by the "InfoMin" principle, we then design separate augmentation strategies for each category in terms of augmentation intensity and sampling ratio. Extensive experiments validate that our information-guided pixel augmentation strategy succeeds in encoding more discriminative representations and surpassing other competitive approaches in unsupervised local feature matching. Furthermore, our pretrained model improves the performance of both one-shot and fully supervised models. To the best of our knowledge, we are the first to propose a pixel augmentation method with a pixel granularity for enhancing unsupervised pixel-wise contrastive learning.
Submitted 14 November, 2022;
originally announced November 2022.
-
OA-Bug: An Olfactory-Auditory Augmented Bug Algorithm for Swarm Robots in a Denied Environment
Authors:
Siqi Tan,
Xiaoya Zhang,
Jingyao Li,
Ruitao Jing,
Mufan Zhao,
Yang Liu,
Quan Quan
Abstract:
Searching in a denied environment is challenging for swarm robots as no assistance from GNSS, mapping, data sharing, and central processing is allowed. However, using olfactory and auditory signals to cooperate like animals could be an important way to improve the collaboration of swarm robots. In this paper, an Olfactory-Auditory augmented Bug algorithm (OA-Bug) is proposed for a swarm of autonomous robots to explore a denied environment. A simulation environment is built to measure the performance of OA-Bug. The coverage of the search task can reach 96.93% using OA-Bug, which is significantly improved compared with a similar algorithm, SGBA. Furthermore, experiments are conducted on real swarm robots to prove the validity of OA-Bug. Results show that OA-Bug can improve the performance of swarm robots in a denied environment.
Submitted 29 September, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Distributed Control for a Multi-Agent System to Pass through a Connected Quadrangle Virtual Tube
Authors:
Yan Gao,
Chenggang Bai,
Quan Quan
Abstract:
In order to guide the multi-agent system in a cluttered environment, a connected quadrangle virtual tube is designed for all agents to keep moving within it, whose basis is called the single trapezoid virtual tube. There is no obstacle inside the tube, namely the area inside the tube can be seen as a safety zone. Then, a distributed swarm controller is proposed for the single trapezoid virtual tube passing problem. This issue is resolved by a gradient vector field method with no local minima. Formal analyses and proofs are made to show that all agents are able to pass the single trapezoid virtual tube. Finally, a modified controller is put forward for convenience in practical use. For the connected quadrangle virtual tube, a modified switching logic is proposed to avoid the deadlock and prevent agents from moving outside the virtual tube. Finally, the effectiveness of the proposed method is validated by numerical simulations and real experiments.
Submitted 14 July, 2022;
originally announced July 2022.
-
Control with Patterns: A D-learning Method
Authors:
Quan Quan,
Kai-Yuan Cai,
Chenyu Wang
Abstract:
Learning-based control policies are widely used in various tasks in the field of robotics and control. However, formal (Lyapunov) stability guarantees for learning-based controllers with nonlinear dynamical systems are difficult to obtain. We propose a novel control approach, namely Control with Patterns (CWP), to address the stability issue over data sets corresponding to nonlinear dynamical systems. For such data sets, we introduce a new definition, namely exponential attraction on data sets, to describe the nonlinear dynamical systems under consideration. The problem of exponential attraction on data sets is transformed into a pattern classification problem based on the data sets and parameterized Lyapunov functions. Furthermore, D-learning is proposed as a method to perform CWP without knowledge of the system dynamics. Finally, the effectiveness of CWP based on D-learning is demonstrated through simulations and real flight experiments. In these experiments, the position of the multicopter is stabilized using real-time images as feedback, which can be considered an Image-Based Visual Servoing (IBVS) problem.
Submitted 14 September, 2024; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Reliable Flight Control: Gravity-Compensation-First Principle
Authors:
Quan Quan
Abstract:
Safety is always the priority in aviation. However, current state-of-the-art passive fault-tolerant control is too conservative to use, while current state-of-the-art active fault-tolerant control requires time to perform fault detection and diagnosis and control switching. By then, however, it may be too late to recover the impaired aircraft. Most designs depend on failures determined a priori and cannot deal with faults that make the original system's state uncontrollable. However, experienced human pilots can save a severely impaired aircraft as far as they can. Motivated by this, this paper develops a principle, coined the gravity-compensation-first principle, that tries to explain the underlying human pilot behavior. This further supports reliable flight control for aircraft such as quadcopters and tail-sitter unmanned aerial vehicles.
Submitted 8 June, 2022;
originally announced June 2022.
-
Robust Distributed Control within a Curve Virtual Tube for a Robotic Swarm under Self-Localization Drift and Precise Relative Navigation
Authors:
Yan Gao,
Chenggang Bai,
Quan Quan
Abstract:
To guide the movement of a robotic swarm in a corridor-like environment, a curve virtual tube with no obstacle inside is designed in our previous work. This paper generalizes the controller design to the condition that all robots have self-localization drifts and precise relative navigation, where the flocking algorithm is introduced to reduce the negative impact of the self-localization drift. It is shown that the cohesion behavior and the velocity alignment behavior are able to reduce the influence of the position measurement drift and the velocity measurement error, respectively. For the convenience in practical use, a modified vector field controller with five control terms is put forward. Finally, the effectiveness of the proposed method is validated by numerical simulations and real experiments.
Submitted 27 May, 2022;
originally announced May 2022.
-
Recovering medical images from CT film photos
Authors:
Quan Quan,
Qiyuan Wang,
Yuanqi Du,
Liu Li,
S. Kevin Zhou
Abstract:
While medical images such as computed tomography (CT) are stored in DICOM format in hospital PACS, it is still quite routine in many countries to print a film as a transferable medium for the purposes of self-storage and secondary consultation. Also, with the ubiquity of mobile phone cameras, it is quite common to take pictures of CT films, which unfortunately suffer from geometric deformation and illumination variation. In this work, we study the problem of recovering a CT film, which marks the first attempt in the literature, to the best of our knowledge. We start by building a large-scale head CT film database, CTFilm20K, consisting of approximately 20,000 pictures, using the widely used computer graphics software Blender. We also record all accompanying information related to the geometric deformation (such as 3D coordinate, depth, normal, and UV maps) and illumination variation (such as the albedo map). Then we propose a deep framework called Film Image Recovery Network (FIReNet) to tackle geometric deformation and illumination variation using the multiple maps extracted from the CT films to collaboratively guide the recovery process. Finally, we convert the dewarped images to DICOM files with our cascade model for further analysis such as radiomics feature extraction. Extensive experiments demonstrate the superiority of our approach over previous approaches. We plan to open source the simulated images and deep models to promote research on CT film image analysis.
Submitted 9 March, 2022;
originally announced March 2022.
-
MixCL: Pixel label matters to contrastive learning
Authors:
Jun Li,
Quan Quan,
S. Kevin Zhou
Abstract:
Contrastive learning and self-supervised techniques have gained prevalence in computer vision over the past few years. They are essential for medical image analysis, which is often notorious for its lack of annotations. Most existing self-supervised methods applied in natural imaging tasks focus on designing proxy tasks for unlabeled data. For example, contrastive learning is often based on the fact that an image and its transformed version share the same identity. However, pixel annotations contain much valuable information for medical image segmentation, which is largely ignored in contrastive learning. In this work, we propose a novel pre-training framework called Mixed Contrastive Learning (MixCL) that leverages both image identities and pixel labels for better modeling by maintaining identity consistency, label consistency, and reconstruction consistency together. Consequently, the pre-trained model has more robust representations that characterize medical images. Extensive experiments demonstrate the effectiveness of the proposed method, improving the baseline by 5.28% and 14.12% in Dice coefficient when 5% labeled data of Spleen and 15% of BTVC are used in fine-tuning, respectively.
Submitted 3 March, 2022;
originally announced March 2022.
-
Universal Segmentation of 33 Anatomies
Authors:
Pengbo Liu,
Yang Deng,
Ce Wang,
Yuan Hui,
Qian Li,
Jun Li,
Shiwei Luo,
Mengke Sun,
Quan Quan,
Shuxin Yang,
You Hao,
Honghu Xiao,
Chunpeng Zhao,
Xinbao Wu,
S. Kevin Zhou
Abstract:
In this paper, we present an approach for learning a single model that universally segments 33 anatomical structures, including vertebrae, pelvic bones, and abdominal organs. Our model building has to address the following challenges. Firstly, while it is ideal to learn such a model from a large-scale, fully-annotated dataset, it is practically hard to curate such a dataset. Thus, we resort to learning from a union of multiple datasets, with each dataset containing images that are partially labeled. Secondly, along the line of partial labelling, we contribute an open-source, large-scale vertebra segmentation dataset for the benefit of the spine analysis community, CTSpine1K, boasting over 1,000 3D volumes and over 11K annotated vertebrae. Thirdly, in a 3D medical image segmentation task, due to the limitation of GPU memory, we always train a model using cropped patches as inputs instead of a whole 3D volume, which limits the amount of contextual information to be learned. To address this, we propose a cross-patch transformer module to fuse more information in adjacent patches, which enlarges the aggregated receptive field for improved segmentation performance. This is especially important for segmenting, say, the elongated spine. Based on 7 partially labeled datasets that collectively contain about 2,800 3D volumes, we successfully learn such a universal model. Finally, we evaluate the universal model on multiple open-source datasets, showing that our model has good generalization performance and can potentially serve as a solid foundation for downstream tasks.
Submitted 3 March, 2022;
originally announced March 2022.
-
Relative distance matters for one-shot landmark detection
Authors:
Qingsong Yao,
Jianji Wang,
Yihua Sun,
Quan Quan,
Heqin Zhu,
S. Kevin Zhou
Abstract:
Contrastive learning based methods such as cascade comparing to detect (CC2D) have shown great potential for one-shot medical landmark detection. However, the important cue of relative distance between landmarks is ignored in CC2D. In this paper, we upgrade CC2D to version II (CC2Dv2) by incorporating a simple yet effective relative distance bias in the training stage, which is theoretically proven to encourage the encoder to project relatively distant landmarks to embeddings with low similarities. As a consequence, CC2Dv2 is less likely to detect a wrong point far from the correct landmark. Furthermore, we present an open-source, landmark-labeled dataset for the measurement of biomechanical parameters of the lower extremity to alleviate the burden of orthopedic surgeons. The effectiveness of CC2Dv2 is evaluated on the public dataset from the ISBI 2015 Grand-Challenge of cephalometric radiographs and on our new dataset, where it greatly outperforms the state-of-the-art one-shot landmark detection approaches.
Submitted 4 March, 2022; v1 submitted 3 March, 2022;
originally announced March 2022.
-
Which images to label for few-shot medical landmark detection?
Authors:
Quan Quan,
Qingsong Yao,
Jun Li,
S. Kevin Zhou
Abstract:
The success of deep learning methods relies on the availability of well-labeled large-scale datasets. However, for medical images, annotating such abundant training data often requires experienced radiologists and consumes their limited time. Few-shot learning has been developed to alleviate this burden and achieves competitive performance with only a few labeled samples. However, a crucial yet previously overlooked problem in few-shot learning is the selection of template images for annotation before learning, which affects the final performance. We herein propose a novel Sample Choosing Policy (SCP) to select "the most worthy" images for annotation, in the context of few-shot medical landmark detection. SCP consists of three parts: 1) Self-supervised training for building a pre-trained deep model to extract features from radiological images, 2) Key Point Proposal for localizing informative patches, and 3) Representative Score Estimation for searching the most representative samples or templates. The advantage of SCP is demonstrated by various experiments on three widely-used public datasets. For one-shot medical landmark detection, its use reduces the mean radial errors on the Cephalometric and HandXray datasets by 14.2% (from 3.595mm to 3.083mm) and 35.5% (from 4.114mm to 2.653mm), respectively.
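The two percentages quoted at the end of this abstract can be reproduced from the before and after mean radial errors. The short sketch below only checks that arithmetic; the helper name `relative_reduction` is hypothetical and is not taken from the paper or its code.

```python
def relative_reduction(before_mm: float, after_mm: float) -> float:
    """Return the percentage reduction going from before_mm to after_mm."""
    return 100.0 * (before_mm - after_mm) / before_mm


# Mean radial errors (in mm) as quoted in the abstract.
cephalometric = relative_reduction(3.595, 3.083)
hand_xray = relative_reduction(4.114, 2.653)

print(f"Cephalometric reduction: {cephalometric:.1f}%")
print(f"HandXray reduction: {hand_xray:.1f}%")
```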
Submitted 28 April, 2024; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Distributed Control for a Robotic Swarm to Pass through a Curve Virtual Tube
Authors:
Quan Quan,
Yan Gao,
Chenggang Bai
Abstract:
Robotic swarm systems are now becoming increasingly attractive for many challenging applications. The main task for any robot is to reach the destination while keeping a safe separation from other robots and obstacles. In many scenarios, robots need to move within a narrow corridor, through a window or a doorframe. In order to guide all robots through a cluttered environment, a curve virtual tube is carefully designed in this paper. Because the tube contains no obstacles, the area inside it can be regarded as a safety zone. Then, a distributed swarm controller is proposed with three elaborate control terms: a line approaching term, a robot avoidance term and a tube keeping term. Formal analysis and proofs are given to show that the curve virtual tube passing problem can be solved in finite time. For convenience in practical use, a modified controller with approximately the same control performance is put forward. Finally, the effectiveness of the proposed method is validated by numerical simulations and real experiments. To show the advantages of the proposed method, a comparison between our method and the control barrier function method is also presented in terms of calculation speed.
Submitted 17 May, 2022; v1 submitted 2 December, 2021;
originally announced December 2021.
-
Practical Distributed Control for Cooperative Multicopters in Structured Free Flight Concepts
Authors:
Rao Fu,
Quan Quan,
Mengxin Li,
Kai-Yuan Cai
Abstract:
Unmanned Aerial Vehicles (UAVs) are now becoming increasingly accessible to amateur and commercial users alike. Several types of airspace structures have been proposed in recent research, including several structured free flight concepts. In this paper, for simplicity, we focus on the distributed coordination of the motions of multicopters in structured airspace concepts. This is formulated as a free flight problem, which includes convergence to destination lines and inter-agent collision avoidance. The destination line of each multicopter is known a priori. Further, Lyapunov-like functions are designed elaborately, and formal analysis and proofs of the proposed distributed control are made to show that the free flight control problem can be solved. What is more, with the proposed controller, a multicopter can move away from another as soon as possible once it enters the safety area of that multicopter. Simulations and experiments are given to show the effectiveness of the proposed method.
Submitted 22 November, 2021;
originally announced November 2021.
-
How Far Two UAVs Should Be subject to Communication Uncertainties
Authors:
Quan Quan,
Rao Fu,
Kai-Yuan
Abstract:
Unmanned aerial vehicles are now becoming increasingly accessible to amateur and commercial users alike. A safe air traffic management system is needed to help ensure that every new entrant into the sky does not collide with others. Much research has been done to design various methods to perform collision avoidance with obstacles. However, how to decide the safety radius subject to communication uncertainties remains an open problem. Based on assumptions on communication uncertainties and supposed control performance, a separation principle of the safety radius design and controller design is proposed. With it, the safety radii corresponding to the safety area in the design phase (without uncertainties) and the flight phase (subject to uncertainties) are studied. Furthermore, the results are extended to multiple obstacles. Simulations and experiments are carried out to show the effectiveness of the proposed methods.
Submitted 18 October, 2021;
originally announced October 2021.
-
Where is the disease? Semi-supervised pseudo-normality synthesis from an abnormal image
Authors:
Yuanqi Du,
Quan Quan,
Hu Han,
S. Kevin Zhou
Abstract:
Pseudo-normality synthesis, which computationally generates a pseudo-normal image from an abnormal one (e.g., with lesions), is critical from many perspectives, ranging from lesion detection and data augmentation to clinical surgery suggestion. However, it is challenging to generate high-quality pseudo-normal images in the absence of lesion information. Thus, expensive lesion segmentation data have been introduced to provide lesion information for the generative models and improve the quality of the synthetic images. In this paper, we aim to alleviate the need for a large amount of lesion segmentation data when generating pseudo-normal images. We propose a Semi-supervised Medical Image generative LEarning network (SMILE) which not only utilizes limited medical images with segmentation masks, but also leverages massive medical images without segmentation masks to generate realistic pseudo-normal images. Extensive experiments show that our model outperforms the best state-of-the-art model by up to 6% on the data augmentation task and 3% in generating high-quality images. Moreover, the proposed semi-supervised learning achieves medical image synthesis quality comparable to that of a supervised learning model, using only 50 of segmentation data.
Submitted 24 June, 2021;
originally announced June 2021.
-
CTSpine1K: A Large-Scale Dataset for Spinal Vertebrae Segmentation in Computed Tomography
Authors:
Yang Deng,
Ce Wang,
Yuan Hui,
Qian Li,
Jun Li,
Shiwei Luo,
Mengke Sun,
Quan Quan,
Shuxin Yang,
You Hao,
Pengbo Liu,
Honghu Xiao,
Chunpeng Zhao,
Xinbao Wu,
S. Kevin Zhou
Abstract:
Spine-related diseases have high morbidity and cause a huge burden of social cost. Spine imaging is an essential tool for noninvasively visualizing and assessing spinal pathology. Segmenting vertebrae in computed tomography (CT) images is the basis of quantitative medical image analysis for clinical diagnosis and surgery planning of spine diseases. Current publicly available annotated datasets on spinal vertebrae are small in size. Due to the lack of a large-scale annotated spine image dataset, the mainstream deep learning-based segmentation methods, which are data-driven, are heavily restricted. In this paper, we introduce a large-scale spine CT dataset, called CTSpine1K, curated from multiple sources for vertebra segmentation, which contains 1,005 CT volumes with over 11,100 labeled vertebrae belonging to different spinal conditions. Based on this dataset, we conduct several spinal vertebrae segmentation experiments to set the first benchmark. We believe that this large-scale dataset will facilitate further research in many spine-related image analysis tasks, including but not limited to vertebrae segmentation, labeling, 3D spine reconstruction from biplanar radiographs, image super-resolution, and enhancement.
Submitted 3 October, 2024; v1 submitted 31 May, 2021;
originally announced May 2021.
-
One-Shot Medical Landmark Detection
Authors:
Qingsong Yao,
Quan Quan,
Li Xiao,
S. Kevin Zhou
Abstract:
The success of deep learning methods relies on the availability of a large number of datasets with annotations; however, curating such datasets is burdensome, especially for medical images. To relieve such a burden for a landmark detection task, we explore the feasibility of using only a single annotated image and propose a novel framework named Cascade Comparing to Detect (CC2D) for one-shot landmark detection. CC2D consists of two stages: 1) Self-supervised learning (CC2D-SSL) and 2) Training with pseudo-labels (CC2D-TPL). CC2D-SSL captures the consistent anatomical information in a coarse-to-fine fashion by comparing the cascade feature representations and generates predictions on the training set. CC2D-TPL further improves the performance by training a new landmark detector with those predictions. The effectiveness of CC2D is evaluated on a widely-used public dataset of cephalometric landmark detection, which achieves a competitive detection accuracy of 81.01% within 4.0mm, comparable to the state-of-the-art fully-supervised methods using a lot more than one training image.
Submitted 7 March, 2021;
originally announced March 2021.
-
Practical Distributed Control for VTOL UAVs to Pass a Virtual Tube
Authors:
Quan Quan,
Rao Fu,
Mengxin Li,
Donghui Wei,
Yan Gao,
Kai-Yuan Cai
Abstract:
Unmanned Aerial Vehicles (UAVs) are now becoming increasingly accessible to amateur and commercial users alike. An air traffic management (ATM) system is needed to help ensure that this newest entrant into the skies does not collide with others. In an ATM, airspace can be composed of airways, intersections and nodes. In this paper, for simplicity, we focus on the distributed coordination of the motions of Vertical TakeOff and Landing (VTOL) UAVs passing an airway. This is formulated as a virtual tube passing problem, which includes passing a virtual tube, inter-agent collision avoidance and keeping within the virtual tube. Lyapunov-like functions are designed elaborately, and formal analysis based on the invariant set theorem is made to show that all UAVs can pass the virtual tube without getting trapped, avoid collision and keep within the virtual tube. What is more, with the proposed distributed control, a VTOL UAV can move away from another VTOL UAV or return to the virtual tube as soon as possible, once it enters the safety area of another UAV or collides with the virtual tube while passing through it. Simulations and experiments are carried out to show the effectiveness of the proposed method and to compare it with other methods.
Submitted 30 July, 2021; v1 submitted 19 January, 2021;
originally announced January 2021.
-
Practical Control for Multicopters to Avoid Non-Cooperative Moving Obstacles
Authors:
Quan Quan,
Rao Fu,
Kai-Yuan Cai
Abstract:
Unmanned Aerial Vehicles (UAVs) are now becoming increasingly accessible to amateur and commercial users alike. The main task for UAVs is to keep a prescribed separation from obstacles in the air. In this paper, a collision-avoidance control method for non-cooperative moving obstacles is proposed for a multicopter in altitude hold mode by using a Lyapunov-like barrier function. Lyapunov-like functions are designed elaborately, based on which formal analysis and proofs of the proposed control are made to show that the collision-avoidance control problem can be solved if the moving obstacle is slower than the multicopter. The result can be extended to some cases of multiple obstacles. What is more, with the proposed control, a multicopter can move away from obstacles as soon as possible once they accidentally enter its safety area, and then converge to the waypoint. Simulations and experiments are given to show the effectiveness of the proposed method by showing the distances from the UAV to the waypoint and to the obstacles.
Submitted 8 January, 2021;
originally announced January 2021.
-
CT Film Recovery via Disentangling Geometric Deformation and Illumination Variation: Simulated Datasets and Deep Models
Authors:
Quan Quan,
Qiyuan Wang,
Liu Li,
Yuanqi Du,
S. Kevin Zhou
Abstract:
While medical images such as computed tomography (CT) are stored in DICOM format in hospital PACS, it is still quite routine in many countries to print a film as a transferable medium for the purposes of self-storage and secondary consultation. Also, with the ubiquitousness of mobile phone cameras, it is quite common to take pictures of the CT films, which unfortunately suffer from geometric deformation and illumination variation. In this work, we study the problem of recovering a CT film, which marks the first attempt in the literature, to the best of our knowledge. We start with building a large-scale head CT film database CTFilm20K, consisting of approximately 20,000 pictures, using the widely used computer graphics software Blender. We also record all accompanying information related to the geometric deformation (such as 3D coordinate, depth, normal, and UV maps) and illumination variation (such as albedo map). Then we propose a deep framework to disentangle geometric deformation and illumination variation using the multiple maps extracted from the CT films to collaboratively guide the recovery process. Extensive experiments on simulated and real images demonstrate the superiority of our approach over the previous approaches. We plan to open source the simulated images and deep models for promoting the research on CT film recovery (https://anonymous.4open.science/r/e6b1f6e3-9b36-423f-a225-55b7d0b55523/).
Submitted 17 December, 2020;
originally announced December 2020.
-
Quantum-Inspired Classical Algorithm for Slow Feature Analysis
Authors:
Daniel Chen,
Yekun Xu,
Betis Baheri,
Samuel A. Stein,
Chuan Bi,
Ying Mao,
Qiang Quan,
Shuai Xu
Abstract:
Recently, there has been a surge of interest in quantum computation for its ability to exponentially speed up algorithms, including machine learning algorithms. However, Tang suggested that the exponential speed up can also be achieved on a classical computer. In this paper, we propose an algorithm for slow feature analysis, a machine learning algorithm that extracts slowly varying features, with a run time of O(polylog(n)poly(d)). To achieve this, we assume the necessary preprocessing of the input data as well as the existence of a data structure supporting a particular sampling scheme. The analysis of the algorithm borrows results from matrix perturbation theory, which are crucial for the algorithm's correctness. This work demonstrates the possible applications of quantum-inspired computation and the extent to which it can be used.
Submitted 1 December, 2020;
originally announced December 2020.
-
Sky Highway Design for Dense Traffic
Authors:
Quan Quan,
Mengxin Li
Abstract:
The number of Unmanned Aerial Vehicles (UAVs) continues to explode. Within the total spectrum of Unmanned Aircraft System (UAS) operations, Urban Air Mobility (UAM) is also on the way. Dense air traffic is getting ever closer to us. Current research focuses either on traffic network design and route design for safety purposes or on swarm control in open airspace to accommodate a large volume of UAVs. In order to achieve a tradeoff between safety and the volume of UAVs, a sky highway with its basic operation for Vertical Take-Off and Landing (VTOL) UAVs is proposed, where traffic network, route and swarm control design are all considered. In the sky highway, each UAV will have its route, and an airway like a highway road can allow many UAVs to perform free flight. The geometrical structure of the proposed sky highway and the corresponding flight modes that support dense traffic are studied one by one. The effectiveness of the proposed sky highway is shown by the given demonstration.
Submitted 18 October, 2020;
originally announced October 2020.
-
Quantum-Inspired Classical Algorithm for Principal Component Regression
Authors:
Daniel Chen,
Yekun Xu,
Betis Baheri,
Chuan Bi,
Ying Mao,
Qiang Quan,
Shuai Xu
Abstract:
This paper presents a sublinear classical algorithm for principal component regression. The algorithm uses quantum-inspired linear algebra, an idea developed by Tang. Using this technique, her algorithm for recommendation systems achieved a runtime only polynomially slower than its quantum counterpart. Her work was quickly adapted to solve many other problems in sublinear time. In this work, we develop an algorithm for principal component regression that runs in time polylogarithmic in the number of data points, an exponential speed up over the state-of-the-art algorithm, under the mild assumption that the input is given in some data structure that supports a norm-based sampling procedure. This exponential speed up allows for potential applications to much larger data sets.
Submitted 16 October, 2020;
originally announced October 2020.
-
Fast Collision Probability Estimation Based on Finite-Dimensional Monte Carlo Method
Authors:
Zhang Hepeng,
Quan Quan
Abstract:
The safety concern for unmanned systems, namely the concern for potential casualties caused by system abnormalities, has been a bottleneck for their development, especially in populated areas. Evidently, collisions between the unmanned system and obstacles, including both moving and static objects, account for a great proportion of system abnormalities. Route planning and the corresponding controller are established in order to avoid collisions; in the presence of uncertainties, however, the unmanned system may deviate from the predetermined route and collide with obstacles. Therefore, for the safety of unmanned systems, collision probability estimation and the subsequent safety decision are very important. The Monte Carlo method could be applied to estimate the collision probability, but it is generally rather slow. This paper introduces a fast collision probability estimation method based on finite-dimensional distributions, whose main idea is to filter out the sampling points needed and generate the states directly from samples of the finite-dimensional distribution, reducing the estimation time significantly. In addition, further techniques, including probabilistic equidistance sampling and dimension reduction, also serve to reduce the estimation time. The simulation shows that the proposed method reduces the estimation time by over 99%.
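For orientation, the plain Monte Carlo baseline that this abstract describes as slow can be sketched as follows. This is not the paper's finite-dimensional method; it is a minimal illustration under assumed dynamics (a vehicle moving along a straight route at constant speed with Gaussian lateral drift) and an assumed static circular obstacle. All names and parameter values are hypothetical.

```python
import random


def trajectory_collides(rng, steps, dt, speed, noise_std, obstacle, radius):
    """Simulate one noisy trajectory and report whether it enters the obstacle."""
    x, y = 0.0, 0.0
    ox, oy = obstacle
    for _ in range(steps):
        x += speed * dt                              # nominal motion along the route
        y += rng.gauss(0.0, noise_std) * dt ** 0.5   # assumed lateral random drift
        if (x - ox) ** 2 + (y - oy) ** 2 <= radius ** 2:
            return True
    return False


def estimate_collision_probability(n_samples, seed=0, steps=100, dt=0.1,
                                   speed=1.0, noise_std=0.5,
                                   obstacle=(5.0, 1.0), radius=0.5):
    """Plain Monte Carlo estimate: the fraction of sampled trajectories that collide."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same estimate
    hits = sum(
        trajectory_collides(rng, steps, dt, speed, noise_std, obstacle, radius)
        for _ in range(n_samples)
    )
    return hits / n_samples


if __name__ == "__main__":
    print(estimate_collision_probability(2000, seed=1))
```

Every sample requires simulating a full trajectory, so the cost grows with both the number of samples and the number of time steps, which is the slowness the abstract refers to.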
Submitted 9 March, 2020;
originally announced March 2020.