Search | arXiv e-print repository

Unsupervised Deformable Image Registration with Structural Nonparametric Smoothing

Authors: Hang Zhang, Xiang Chen, Renjiu Hu, Rongguang Wang, Jinwei Zhang, Min Liu, Yaonan Wang, Gaolei Li, Xinxing Cheng, Jinming Duan

Abstract: Learning-based deformable image registration (DIR) accelerates alignment by amortizing traditional optimization via neural networks. Label supervision further enhances accuracy, enabling efficient and precise nonlinear alignment of unseen scans. However, images with sparse features amid large smooth regions, such as retinal vessels, introduce aperture and large-displacement challenges that unsuper… ▽ More Learning-based deformable image registration (DIR) accelerates alignment by amortizing traditional optimization via neural networks. Label supervision further enhances accuracy, enabling efficient and precise nonlinear alignment of unseen scans. However, images with sparse features amid large smooth regions, such as retinal vessels, introduce aperture and large-displacement challenges that unsupervised DIR methods struggle to address. This limitation occurs because neural networks predict deformation fields in a single forward pass, leaving fields unconstrained post-training and shifting the regularization burden entirely to network weights. To address these issues, we introduce SmoothProper, a plug-and-play neural module enforcing smoothness and promoting message passing within the network's forward pass. By integrating a duality-based optimization layer with tailored interaction terms, SmoothProper efficiently propagates flow signals across spatial locations, enforces smoothness, and preserves structural consistency. It is model-agnostic, seamlessly integrates into existing registration frameworks with minimal parameter overhead, and eliminates regularizer hyperparameter tuning. Preliminary results on a retinal vessel dataset exhibiting aperture and large-displacement challenges demonstrate our method reduces registration error to 1.88 pixels on 2912x2912 images, marking the first unsupervised DIR approach to effectively address both challenges. The source code will be available at https://github.com/tinymilky/SmoothProper. △ Less

Submitted 12 June, 2025; originally announced June 2025.

Comments: Accepted for publication at Information Processing in Medical Imaging (IPMI) 2025

arXiv:2506.04552 [pdf]

DAS-MAE: A self-supervised pre-training framework for universal and high-performance representation learning of distributed fiber-optic acoustic sensing

Authors: Junyi Duan, Jiageng Chen, Zuyuan He

Abstract: Distributed fiber-optic acoustic sensing (DAS) has emerged as a transformative approach for distributed vibration measurement with high spatial resolution and long measurement range while maintaining cost-efficiency. However, the two-dimensional spatial-temporal DAS signals present analytical challenges. The abstract signal morphology lacking intuitive physical correspondence complicates human int… ▽ More Distributed fiber-optic acoustic sensing (DAS) has emerged as a transformative approach for distributed vibration measurement with high spatial resolution and long measurement range while maintaining cost-efficiency. However, the two-dimensional spatial-temporal DAS signals present analytical challenges. The abstract signal morphology lacking intuitive physical correspondence complicates human interpretation, and its unique spatial-temporal coupling renders conventional image processing methods suboptimal. This study investigates spatial-temporal characteristics and proposes a self-supervised pre-training framework that learns signals' representations through a mask-reconstruction task. This framework is named the DAS Masked AutoEncoder (DAS-MAE). The DAS-MAE learns high-level representations (e.g., event class) without using labels. It achieves up to 1% error and 64.5% relative improvement (RI) over the semi-supervised baseline in few-shot classification tasks. In a practical external damage prevention application, DAS-MAE attains a 5.0% recognition error, marking a 75.7% RI over supervised training from scratch. These results demonstrate the high-performance and universal representations learned by the DAS-MAE framework, highlighting its potential as a foundation model for analyzing massive unlabeled DAS signals. △ Less

Submitted 4 June, 2025; originally announced June 2025.

arXiv:2505.09327 [pdf, ps, other]

Adaptive control for multi-scale stochastic dynamical systems with stochastic next generation reservoir computing

Authors: Jiani Cheng, Ting Gao, Jinqiao Duan

Abstract: The rapid advancement of neuroscience and machine learning has established data-driven stochastic dynamical system modeling as a powerful tool for understanding and controlling high-dimensional, spatio-temporal processes. We introduce the stochastic next-generation reservoir computing (NG-RC) controller, a framework that integrates the computational efficiency of NG-RC with stochastic analysis to… ▽ More The rapid advancement of neuroscience and machine learning has established data-driven stochastic dynamical system modeling as a powerful tool for understanding and controlling high-dimensional, spatio-temporal processes. We introduce the stochastic next-generation reservoir computing (NG-RC) controller, a framework that integrates the computational efficiency of NG-RC with stochastic analysis to enable robust event-triggered control in multiscale stochastic systems. The asymptotic stability of the controller is rigorously proven via an extended stochastic LaSalle theorem, providing theoretical guarantees for amplitude regulation in nonlinear stochastic dynamics. Numerical experiments on a stochastic Van-der-Pol system subject to both additive and multiplicative noise validate the algorithm, demonstrating its convergence rate across varying temporal scales and noise intensities. To bridge theoretical insights with real-world applications, we deploy the controller to modulate pathological dynamics reconstructed from epileptic EEG data. This work advances a theoretically guaranteed scalable framework for adaptive control of stochastic systems, with broad potential for data-driven decision making in engineering, neuroscience, and beyond. △ Less

Submitted 14 May, 2025; originally announced May 2025.

Comments: 30 pages, 14 figures

arXiv:2503.23179 [pdf, other]

OncoReg: Medical Image Registration for Oncological Challenges

Authors: Wiebke Heyer, Yannic Elser, Lennart Berkel, Xinrui Song, Xuanang Xu, Pingkun Yan, Xi Jia, Jinming Duan, Zi Li, Tony C. W. Mok, BoWen LI, Christian Staackmann, Christoph Großbröhmer, Lasse Hansen, Alessa Hering, Malte M. Sieren, Mattias P. Heinrich

Abstract: In modern cancer research, the vast volume of medical data generated is often underutilised due to challenges related to patient privacy. The OncoReg Challenge addresses this issue by enabling researchers to develop and validate image registration methods through a two-phase framework that ensures patient privacy while fostering the development of more generalisable AI models. Phase one involves w… ▽ More In modern cancer research, the vast volume of medical data generated is often underutilised due to challenges related to patient privacy. The OncoReg Challenge addresses this issue by enabling researchers to develop and validate image registration methods through a two-phase framework that ensures patient privacy while fostering the development of more generalisable AI models. Phase one involves working with a publicly available dataset, while phase two focuses on training models on a private dataset within secure hospital networks. OncoReg builds upon the foundation established by the Learn2Reg Challenge by incorporating the registration of interventional cone-beam computed tomography (CBCT) with standard planning fan-beam CT (FBCT) images in radiotherapy. Accurate image registration is crucial in oncology, particularly for dynamic treatment adjustments in image-guided radiotherapy, where precise alignment is necessary to minimise radiation exposure to healthy tissues while effectively targeting tumours. This work details the methodology and data behind the OncoReg Challenge and provides a comprehensive analysis of the competition entries and results. Findings reveal that feature extraction plays a pivotal role in this registration task. A new method emerging from this challenge demonstrated its versatility, while established approaches continue to perform comparably to newer techniques. Both deep learning and classical approaches still play significant roles in image registration, with the combination of methods - particularly in feature extraction - proving most effective. △ Less

Submitted 1 April, 2025; v1 submitted 29 March, 2025; originally announced March 2025.

Comments: 26 pages, 6 figures

arXiv:2503.14386 [pdf, other]

A Comprehensive Scatter Correction Model for Micro-Focus Dual-Source Imaging Systems: Combining Ambient, Cross, and Forward Scatter

Authors: Jianing Sun, Jigang Duan, Guangyin Li, Xu Jiang, Xing Zhao

Abstract: Compared to single-source imaging systems, dual-source imaging systems equipped with two cross-distributed scanning beams significantly enhance temporal resolution and capture more comprehensive object scanning information. Nevertheless, the interaction between the two scanning beams introduces more complex scatter signals into the acquired projection data. Existing methods typically model these s… ▽ More Compared to single-source imaging systems, dual-source imaging systems equipped with two cross-distributed scanning beams significantly enhance temporal resolution and capture more comprehensive object scanning information. Nevertheless, the interaction between the two scanning beams introduces more complex scatter signals into the acquired projection data. Existing methods typically model these scatter signals as the sum of cross-scatter and forward scatter, with cross-scatter estimation limited to single-scatter along primary paths. Through experimental measurements on our selfdeveloped micro-focus dual-source imaging system, we observed that the peak ratio of hardware-induced ambient scatter to single-source projection intensity can even exceed 60%, a factor often overlooked in conventional models. To address this limitation, we propose a more comprehensive model that decomposes the total scatter signals into three distinct components: ambient scatter, cross-scatter, and forward scatter. Furthermore, we introduce a cross-scatter kernel superposition (xSKS) module to enhance the accuracy of cross-scatter estimation by modeling both single and multiple crossscatter events along non-primary paths. Additionally, we employ a fast object-adaptive scatter kernel superposition (FOSKS) module for efficient forward scatter estimation. In Monte Carlo (MC) simulation experiments performed on a custom-designed waterbone phantom, our model demonstrated remarkable superiority, achieving a scatter-toprimary-weighted mean absolute percentage error (SPMAPE) of 1.32%, significantly lower than the 12.99% attained by the state-of-the-art method. Physical experiments further validate the superior performance of our model in correcting scatter artifacts. △ Less

Submitted 18 March, 2025; originally announced March 2025.

arXiv:2502.04399 [pdf, other]

Online Location Planning for AI-Defined Vehicles: Optimizing Joint Tasks of Order Serving and Spatio-Temporal Heterogeneous Model Fine-Tuning

Authors: Bokeng Zheng, Bo Rao, Tianxiang Zhu, Chee Wei Tan, Jingpu Duan, Zhi Zhou, Xu Chen, Xiaoxi Zhang

Abstract: Advances in artificial intelligence (AI) including foundation models (FMs), are increasingly transforming human society, with smart city driving the evolution of urban living.Meanwhile, vehicle crowdsensing (VCS) has emerged as a key enabler, leveraging vehicles' mobility and sensor-equipped capabilities. In particular, ride-hailing vehicles can effectively facilitate flexible data collection and… ▽ More Advances in artificial intelligence (AI) including foundation models (FMs), are increasingly transforming human society, with smart city driving the evolution of urban living.Meanwhile, vehicle crowdsensing (VCS) has emerged as a key enabler, leveraging vehicles' mobility and sensor-equipped capabilities. In particular, ride-hailing vehicles can effectively facilitate flexible data collection and contribute towards urban intelligence, despite resource limitations. Therefore, this work explores a promising scenario, where edge-assisted vehicles perform joint tasks of order serving and the emerging foundation model fine-tuning using various urban data. However, integrating the VCS AI task with the conventional order serving task is challenging, due to their inconsistent spatio-temporal characteristics: (i) The distributions of ride orders and data point-of-interests (PoIs) may not coincide in geography, both following a priori unknown patterns; (ii) they have distinct forms of temporal effects, i.e., prolonged waiting makes orders become instantly invalid while data with increased staleness gradually reduces its utility for model fine-tuning.To overcome these obstacles, we propose an online framework based on multi-agent reinforcement learning (MARL) with careful augmentation. A new quality-of-service (QoS) metric is designed to characterize and balance the utility of the two joint tasks, under the effects of varying data volumes and staleness. We also integrate graph neural networks (GNNs) with MARL to enhance state representations, capturing graph-structured, time-varying dependencies among vehicles and across locations. Extensive experiments on our testbed simulator, utilizing various real-world foundation model fine-tuning tasks and the New York City Taxi ride order dataset, demonstrate the advantage of our proposed method. △ Less

Submitted 6 February, 2025; originally announced February 2025.

arXiv:2501.15217 [pdf, other]

Predictive Lagrangian Optimization for Constrained Reinforcement Learning

Authors: Tianqi Zhang, Puzhen Yuan, Guojian Zhan, Ziyu Lin, Yao Lyu, Zhenzhi Qin, Jingliang Duan, Liping Zhang, Shengbo Eben Li

Abstract: Constrained optimization is popularly seen in reinforcement learning for addressing complex control tasks. From the perspective of dynamic system, iteratively solving a constrained optimization problem can be framed as the temporal evolution of a feedback control system. Classical constrained optimization methods, such as penalty and Lagrangian approaches, inherently use proportional and integral… ▽ More Constrained optimization is popularly seen in reinforcement learning for addressing complex control tasks. From the perspective of dynamic system, iteratively solving a constrained optimization problem can be framed as the temporal evolution of a feedback control system. Classical constrained optimization methods, such as penalty and Lagrangian approaches, inherently use proportional and integral feedback controllers. In this paper, we propose a more generic equivalence framework to build the connection between constrained optimization and feedback control system, for the purpose of developing more effective constrained RL algorithms. Firstly, we define that each step of the system evolution determines the Lagrange multiplier by solving a multiplier feedback optimal control problem (MFOCP). In this problem, the control input is multiplier, the state is policy parameters, the dynamics is described by policy gradient descent, and the objective is to minimize constraint violations. Then, we introduce a multiplier guided policy learning (MGPL) module to perform policy parameters updating. And we prove that the resulting optimal policy, achieved through alternating MFOCP and MGPL, aligns with the solution of the primal constrained RL problem, thereby establishing our equivalence framework. Furthermore, we point out that the existing PID Lagrangian is merely one special case within our framework that utilizes a PID controller. We also accommodate the integration of other various feedback controllers, thereby facilitating the development of new algorithms. As a representative, we employ model predictive control (MPC) as the feedback controller and consequently propose a new algorithm called predictive Lagrangian optimization (PLO). Numerical experiments demonstrate its superiority over the PID Lagrangian method, achieving a larger feasible region up to 7.2% and a comparable average reward. △ Less

Submitted 25 January, 2025; originally announced January 2025.

arXiv:2410.16821 [pdf, other]

Guiding Reinforcement Learning with Incomplete System Dynamics

Authors: Shuyuan Wang, Jingliang Duan, Nathan P. Lawrence, Philip D. Loewen, Michael G. Forbes, R. Bhushan Gopaluni, Lixian Zhang

Abstract: Model-free reinforcement learning (RL) is inherently a reactive method, operating under the assumption that it starts with no prior knowledge of the system and entirely depends on trial-and-error for learning. This approach faces several challenges, such as poor sample efficiency, generalization, and the need for well-designed reward functions to guide learning effectively. On the other hand, cont… ▽ More Model-free reinforcement learning (RL) is inherently a reactive method, operating under the assumption that it starts with no prior knowledge of the system and entirely depends on trial-and-error for learning. This approach faces several challenges, such as poor sample efficiency, generalization, and the need for well-designed reward functions to guide learning effectively. On the other hand, controllers based on complete system dynamics do not require data. This paper addresses the intermediate situation where there is not enough model information for complete controller design, but there is enough to suggest that a model-free approach is not the best approach either. By carefully decoupling known and unknown information about the system dynamics, we obtain an embedded controller guided by our partial model and thus improve the learning efficiency of an RL-enhanced approach. A modular design allows us to deploy mainstream RL algorithms to refine the policy. Simulation results show that our method significantly improves sample efficiency compared with standard RL methods on continuous control tasks, and also offers enhanced performance over traditional control approaches. Experiments on a real ground vehicle also validate the performance of our method, including generalization and robustness. △ Less

Submitted 23 October, 2024; v1 submitted 22 October, 2024; originally announced October 2024.

Comments: Accepted to IROS 2024

arXiv:2404.03179 [pdf, other]

UniAV: Unified Audio-Visual Perception for Multi-Task Video Event Localization

Authors: Tiantian Geng, Teng Wang, Yanfu Zhang, Jinming Duan, Weili Guan, Feng Zheng, Ling shao

Abstract: Video localization tasks aim to temporally locate specific instances in videos, including temporal action localization (TAL), sound event detection (SED) and audio-visual event localization (AVEL). Existing methods over-specialize on each task, overlooking the fact that these instances often occur in the same video to form the complete video content. In this work, we present UniAV, a Unified Audio… ▽ More Video localization tasks aim to temporally locate specific instances in videos, including temporal action localization (TAL), sound event detection (SED) and audio-visual event localization (AVEL). Existing methods over-specialize on each task, overlooking the fact that these instances often occur in the same video to form the complete video content. In this work, we present UniAV, a Unified Audio-Visual perception network, to achieve joint learning of TAL, SED and AVEL tasks for the first time. UniAV can leverage diverse data available in task-specific datasets, allowing the model to learn and share mutually beneficial knowledge across tasks and modalities. To tackle the challenges posed by substantial variations in datasets (size/domain/duration) and distinct task characteristics, we propose to uniformly encode visual and audio modalities of all videos to derive generic representations, while also designing task-specific experts to capture unique knowledge for each task. Besides, we develop a unified language-aware classifier by utilizing a pre-trained text encoder, enabling the model to flexibly detect various types of instances and previously unseen ones by simply changing prompts during inference. UniAV outperforms its single-task counterparts by a large margin with fewer parameters, achieving on-par or superior performances compared to state-of-the-art task-specific methods across ActivityNet 1.3, DESED and UnAV-100 benchmarks. △ Less

Submitted 11 August, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2402.03585 [pdf, other]

Decoder-Only Image Registration

Authors: Xi Jia, Wenqi Lu, Xinxing Cheng, Jinming Duan

Abstract: In unsupervised medical image registration, the predominant approaches involve the utilization of a encoder-decoder network architecture, allowing for precise prediction of dense, full-resolution displacement fields from given paired images. Despite its widespread use in the literature, we argue for the necessity of making both the encoder and decoder learnable in such an architecture. For this, w… ▽ More In unsupervised medical image registration, the predominant approaches involve the utilization of a encoder-decoder network architecture, allowing for precise prediction of dense, full-resolution displacement fields from given paired images. Despite its widespread use in the literature, we argue for the necessity of making both the encoder and decoder learnable in such an architecture. For this, we propose a novel network architecture, termed LessNet in this paper, which contains only a learnable decoder, while entirely omitting the utilization of a learnable encoder. LessNet substitutes the learnable encoder with simple, handcrafted features, eliminating the need to learn (optimize) network parameters in the encoder altogether. Consequently, this leads to a compact, efficient, and decoder-only architecture for 3D medical image registration. Evaluated on two publicly available brain MRI datasets, we demonstrate that our decoder-only LessNet can effectively and efficiently learn both dense displacement and diffeomorphic deformation fields in 3D. Furthermore, our decoder-only LessNet can achieve comparable registration performance to state-of-the-art methods such as VoxelMorph and TransMorph, while requiring significantly fewer computational resources. Our code and pre-trained models are available at https://github.com/xi-jia/LessNet. △ Less

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2310.19022 [pdf, other]

doi 10.1109/TCYB.2023.3323316

Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback

Authors: Jingliang Duan, Jie Li, Xuyang Chen, Kai Zhao, Shengbo Eben Li, Lin Zhao

Abstract: In recent times, significant advancements have been made in delving into the optimization landscape of policy gradient methods for achieving optimal control in linear time-invariant (LTI) systems. Compared with state-feedback control, output-feedback control is more prevalent since the underlying state of the system may not be fully observed in many practical settings. This paper analyzes the opti… ▽ More In recent times, significant advancements have been made in delving into the optimization landscape of policy gradient methods for achieving optimal control in linear time-invariant (LTI) systems. Compared with state-feedback control, output-feedback control is more prevalent since the underlying state of the system may not be fully observed in many practical settings. This paper analyzes the optimization landscape inherent to policy gradient methods when applied to static output feedback (SOF) control in discrete-time LTI systems subject to quadratic cost. We begin by establishing crucial properties of the SOF cost, encompassing coercivity, L-smoothness, and M-Lipschitz continuous Hessian. Despite the absence of convexity, we leverage these properties to derive novel findings regarding convergence (and nearly dimension-free rate) to stationary points for three policy gradient methods, including the vanilla policy gradient method, the natural policy gradient method, and the Gauss-Newton method. Moreover, we provide proof that the vanilla policy gradient method exhibits linear convergence towards local minima when initialized near such minima. The paper concludes by presenting numerical examples that validate our theoretical findings. These results not only characterize the performance of gradient descent for optimizing the SOF problem but also provide insights into the effectiveness of general policy gradient methods within the realm of reinforcement learning. △ Less

Submitted 29 October, 2023; originally announced October 2023.

Journal ref: IEEE Transactions on Cybernetics, 2023

arXiv:2310.05858 [pdf, other]

doi 10.1109/TPAMI.2025.3537087

Distributional Soft Actor-Critic with Three Refinements

Authors: Jingliang Duan, Wenxuan Wang, Liming Xiao, Jiaxin Gao, Shengbo Eben Li, Chang Liu, Ya-Qin Zhang, Bo Cheng, Keqiang Li

Abstract: Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSA… ▽ More Reinforcement learning (RL) has shown remarkable success in solving complex decision-making and control tasks. However, many model-free RL algorithms experience performance degradation due to inaccurate value estimation, particularly the overestimation of Q-values, which can lead to suboptimal policies. To address this issue, we previously proposed the Distributional Soft Actor-Critic (DSAC or DSACv1), an off-policy RL algorithm that enhances value estimation accuracy by learning a continuous Gaussian value distribution. Despite its effectiveness, DSACv1 faces challenges such as training instability and sensitivity to reward scaling, caused by high variance in critic gradients due to return randomness. In this paper, we introduce three key refinements to DSACv1 to overcome these limitations and further improve Q-value estimation accuracy: expected value substitution, twin value distribution learning, and variance-based critic gradient adjustment. The enhanced algorithm, termed DSAC with Three refinements (DSAC-T or DSACv2), is systematically evaluated across a diverse set of benchmark tasks. Without the need for task-specific hyperparameter tuning, DSAC-T consistently matches or outperforms leading model-free RL algorithms, including SAC, TD3, DDPG, TRPO, and PPO, in all tested environments. Additionally, DSAC-T ensures a stable learning process and maintains robust performance across varying reward scales. Its effectiveness is further demonstrated through real-world application in controlling a wheeled robot, highlighting its potential for deployment in practical robotic tasks. △ Less

Submitted 1 February, 2025; v1 submitted 9 October, 2023; originally announced October 2023.

Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

arXiv:2310.04992 [pdf, other]

doi 10.1056/AIoa2300221

VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv , et al. (17 additional authors not shown)

Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassifi… ▽ More We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassification of disease phenotype, and systemic biomarker and disease prediction, with each application enhanced with expert-level intelligence and accuracy. The generalist intelligence of VisionFM outperformed ophthalmologists with basic and intermediate levels in jointly diagnosing 12 common ophthalmic diseases. Evaluated on a new large-scale ophthalmic disease diagnosis benchmark database, as well as a new large-scale segmentation and detection benchmark database, VisionFM outperformed strong baseline deep neural networks. The ophthalmic image representations learned by VisionFM exhibited noteworthy explainability, and demonstrated strong generalizability to new ophthalmic modalities, disease spectrum, and imaging devices. As a foundation model, VisionFM has a large capacity to learn from diverse ophthalmic imaging data and disparate datasets. To be commensurate with this capacity, in addition to the real data used for pre-training, we also generated and leveraged synthetic ophthalmic imaging data. Experimental results revealed that synthetic data that passed visual Turing tests, can also enhance the representation learning capability of VisionFM, leading to substantial performance gains on downstream ophthalmic AI tasks. Beyond the ophthalmic AI applications developed, validated, and demonstrated in this work, substantial further applications can be achieved in an efficient and cost-effective manner using VisionFM as the foundation. △ Less

Submitted 7 October, 2023; originally announced October 2023.

Journal ref: The latest VisionFM work has been published in NEJM AI, 2024

arXiv:2307.15273 [pdf, other]

doi 10.1002/mrm.30187

Recovering high-quality FODs from a reduced number of diffusion-weighted images using a model-driven deep learning architecture

Authors: J Bartlett, C E Davey, L A Johnston, J Duan

Abstract: Fibre orientation distribution (FOD) reconstruction using deep learning has the potential to produce accurate FODs from a reduced number of diffusion-weighted images (DWIs), decreasing total imaging time. Diffusion acquisition invariant representations of the DWI signals are typically used as input to these methods to ensure that they can be applied flexibly to data with different b-vectors and b-… ▽ More Fibre orientation distribution (FOD) reconstruction using deep learning has the potential to produce accurate FODs from a reduced number of diffusion-weighted images (DWIs), decreasing total imaging time. Diffusion acquisition invariant representations of the DWI signals are typically used as input to these methods to ensure that they can be applied flexibly to data with different b-vectors and b-values; however, this means the network cannot condition its output directly on the DWI signal. In this work, we propose a spherical deconvolution network, a model-driven deep learning FOD reconstruction architecture, that ensures intermediate and output FODs produced by the network are consistent with the input DWI signals. Furthermore, we implement a fixel classification penalty within our loss function, encouraging the network to produce FODs that can subsequently be segmented into the correct number of fixels and improve downstream fixel-based analysis. Our results show that the model-based deep learning architecture achieves competitive performance compared to a state-of-the-art FOD super-resolution network, FOD-Net. Moreover, we show that the fixel classification penalty can be tuned to offer improved performance with respect to metrics that rely on accurately segmented of FODs. Our code is publicly available at https://github.com/Jbartlett6/SDNet . △ Less

Submitted 27 July, 2023; originally announced July 2023.

Comments: 10 pages, 7 figures, This work has been submitted to the IEEE for possible publication

Journal ref: Magn Reson Med.2024;92:2193-2206

arXiv:2307.05382 [pdf, other]

Protecting the Future: Neonatal Seizure Detection with Spatial-Temporal Modeling

Authors: Ziyue Li, Yuchen Fang, You Li, Kan Ren, Yansen Wang, Xufang Luo, Juanyong Duan, Congrui Huang, Dongsheng Li, Lili Qiu

Abstract: A timely detection of seizures for newborn infants with electroencephalogram (EEG) has been a common yet life-saving practice in the Neonatal Intensive Care Unit (NICU). However, it requires great human efforts for real-time monitoring, which calls for automated solutions to neonatal seizure detection. Moreover, the current automated methods focusing on adult epilepsy monitoring often fail due to… ▽ More A timely detection of seizures for newborn infants with electroencephalogram (EEG) has been a common yet life-saving practice in the Neonatal Intensive Care Unit (NICU). However, it requires great human efforts for real-time monitoring, which calls for automated solutions to neonatal seizure detection. Moreover, the current automated methods focusing on adult epilepsy monitoring often fail due to (i) dynamic seizure onset location in human brains; (ii) different montages on neonates and (iii) huge distribution shift among different subjects. In this paper, we propose a deep learning framework, namely STATENet, to address the exclusive challenges with exquisite designs at the temporal, spatial and model levels. The experiments over the real-world large-scale neonatal EEG dataset illustrate that our framework achieves significantly better seizure detection performance. △ Less

Submitted 2 July, 2023; originally announced July 2023.

Comments: Accepted in IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2023

arXiv:2307.02997 [pdf, other]

Fourier-Net+: Leveraging Band-Limited Representation for Efficient 3D Medical Image Registration

Authors: Xi Jia, Alexander Thorley, Alberto Gomez, Wenqi Lu, Dipak Kotecha, Jinming Duan

Abstract: U-Net style networks are commonly utilized in unsupervised image registration to predict dense displacement fields, which for high-resolution volumetric image data is a resource-intensive and time-consuming task. To tackle this challenge, we first propose Fourier-Net, which replaces the costly U-Net style expansive path with a parameter-free model-driven decoder. Instead of directly predicting a f… ▽ More U-Net style networks are commonly utilized in unsupervised image registration to predict dense displacement fields, which for high-resolution volumetric image data is a resource-intensive and time-consuming task. To tackle this challenge, we first propose Fourier-Net, which replaces the costly U-Net style expansive path with a parameter-free model-driven decoder. Instead of directly predicting a full-resolution displacement field, our Fourier-Net learns a low-dimensional representation of the displacement field in the band-limited Fourier domain which our model-driven decoder converts to a full-resolution displacement field in the spatial domain. Expanding upon Fourier-Net, we then introduce Fourier-Net+, which additionally takes the band-limited spatial representation of the images as input and further reduces the number of convolutional layers in the U-Net style network's contracting path. Finally, to enhance the registration performance, we propose a cascaded version of Fourier-Net+. We evaluate our proposed methods on three datasets, on which our proposed Fourier-Net and its variants achieve comparable results with current state-of-the art methods, while exhibiting faster inference speeds, lower memory footprint, and fewer multiply-add operations. With such small computational cost, our Fourier-Net+ enables the efficient training of large-scale 3D registration on low-VRAM GPUs. Our code is publicly available at \url{https://github.com/xi-jia/Fourier-Net}. △ Less

Submitted 6 July, 2023; originally announced July 2023.

Comments: Under review. arXiv admin note: text overlap with arXiv:2211.16342

arXiv:2305.18355 [pdf, other]

An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization

Authors: Fei Kong, Jinhao Duan, RuiPeng Ma, Hengtao Shen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu

Abstract: Recently, diffusion models have achieved remarkable success in generating tasks, including image and audio generation. However, like other generative models, diffusion models are prone to privacy issues. In this paper, we propose an efficient query-based membership inference attack (MIA), namely Proximal Initialization Attack (PIA), which utilizes groundtruth trajectory obtained by $ε$ initialized… ▽ More Recently, diffusion models have achieved remarkable success in generating tasks, including image and audio generation. However, like other generative models, diffusion models are prone to privacy issues. In this paper, we propose an efficient query-based membership inference attack (MIA), namely Proximal Initialization Attack (PIA), which utilizes groundtruth trajectory obtained by $ε$ initialized in $t=0$ and predicted point to infer memberships. Experimental results indicate that the proposed method can achieve competitive performance with only two queries on both discrete-time and continuous-time diffusion models. Moreover, previous works on the privacy of diffusion models have focused on vision tasks without considering audio tasks. Therefore, we also explore the robustness of diffusion models to MIA in the text-to-speech (TTS) task, which is an audio generation task. To the best of our knowledge, this work is the first to study the robustness of diffusion models to MIA in the TTS task. Experimental results indicate that models with mel-spectrogram (image-like) output are vulnerable to MIA, while models with audio output are relatively robust to MIA. {Code is available at \url{https://github.com/kong13661/PIA}}. △ Less

Submitted 9 October, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

arXiv:2304.01041 [pdf, other]

Integrated Behavior Planning and Motion Control for Autonomous Vehicles with Traffic Rules Compliance

Authors: Haichao Liu, Kai Chen, Yulin Li, Zhenmin Huang, Jianghua Duan, Jun Ma

Abstract: In this article, we propose an optimization-based integrated behavior planning and motion control scheme, which is an interpretable and adaptable urban autonomous driving solution that complies with complex traffic rules while ensuring driving safety. Inherently, to ensure compliance with traffic rules, an innovative design of potential functions (PFs) is presented to characterize various traffic… ▽ More In this article, we propose an optimization-based integrated behavior planning and motion control scheme, which is an interpretable and adaptable urban autonomous driving solution that complies with complex traffic rules while ensuring driving safety. Inherently, to ensure compliance with traffic rules, an innovative design of potential functions (PFs) is presented to characterize various traffic rules related to traffic lights, traversable and non-traversable traffic line markings, etc. These PFs are further incorporated as part of the model predictive control (MPC) formulation. In this sense, high-level behavior planning is attained implicitly along with motion control as an integrated architecture, facilitating flexible maneuvers with safety guarantees. Due to the well-designed objective function of the MPC scheme, our integrated behavior planning and motion control scheme is competent for various urban driving scenarios and able to generate versatile behaviors, such as overtaking with adaptive cruise control, turning in the intersection, and merging in and out of the roundabout. As demonstrated from a series of simulations with challenging scenarios in CARLA, it is noteworthy that the proposed framework admits real-time performance and high generalizability. △ Less

Submitted 30 November, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

Comments: 7 pages, 5 figures, accepted for publication in The 2023 IEEE International Conference on Robotics and Biomimetics (ROBIO)

arXiv:2303.12930 [pdf, other]

Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline

Authors: Tiantian Geng, Teng Wang, Jinming Duan, Runmin Cong, Feng Zheng

Abstract: Existing audio-visual event localization (AVE) handles manually trimmed videos with only a single instance in each of them. However, this setting is unrealistic as natural videos often contain numerous audio-visual events with different categories. To better adapt to real-life applications, in this paper we focus on the task of dense-localizing audio-visual events, which aims to jointly localize a… ▽ More Existing audio-visual event localization (AVE) handles manually trimmed videos with only a single instance in each of them. However, this setting is unrealistic as natural videos often contain numerous audio-visual events with different categories. To better adapt to real-life applications, in this paper we focus on the task of dense-localizing audio-visual events, which aims to jointly localize and recognize all audio-visual events occurring in an untrimmed video. The problem is challenging as it requires fine-grained audio-visual scene and context understanding. To tackle this problem, we introduce the first Untrimmed Audio-Visual (UnAV-100) dataset, which contains 10K untrimmed videos with over 30K audio-visual events. Each video has 2.8 audio-visual events on average, and the events are usually related to each other and might co-occur as in real-life scenes. Next, we formulate the task using a new learning-based framework, which is capable of fully integrating audio and visual modalities to localize audio-visual events with various lengths and capture dependencies between them in a single pass. Extensive experiments demonstrate the effectiveness of our method as well as the significance of multi-scale cross-modal perception and dependency modeling for this task. △ Less

Submitted 24 March, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: Accepted by CVPR2023

arXiv:2210.07553 [pdf, other]

Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate

Authors: Dongjie Yu, Wenjun Zou, Yujie Yang, Haitong Ma, Shengbo Eben Li, Jingliang Duan, Jianyu Chen

Abstract: Safe reinforcement learning (RL) that solves constraint-satisfactory policies provides a promising way to the broader safety-critical applications of RL in real-world problems such as robotics. Among all safe RL approaches, model-based methods reduce training time violations further due to their high sample efficiency. However, lacking safety robustness against the model uncertainties remains an i… ▽ More Safe reinforcement learning (RL) that solves constraint-satisfactory policies provides a promising way to the broader safety-critical applications of RL in real-world problems such as robotics. Among all safe RL approaches, model-based methods reduce training time violations further due to their high sample efficiency. However, lacking safety robustness against the model uncertainties remains an issue in safe model-based RL, especially in training time safety. In this paper, we propose a distributional reachability certificate (DRC) and its Bellman equation to address model uncertainties and characterize robust persistently safe states. Furthermore, we build a safe RL framework to resolve constraints required by the DRC and its corresponding shield policy. We also devise a line search method to maintain safety and reach higher returns simultaneously while leveraging the shield policy. Comprehensive experiments on classical benchmarks such as constrained tracking and navigation indicate that the proposed algorithm achieves comparable returns with much fewer constraint violations during training. △ Less

Submitted 14 October, 2022; originally announced October 2022.

Comments: 12 pages, 6 figures

arXiv:2209.05042 [pdf, other]

doi 10.1109/CDC51059.2022.9992503

On the Optimization Landscape of Dynamic Output Feedback: A Case Study for Linear Quadratic Regulator

Authors: Jingliang Duan, Wenhan Cao, Yang Zheng, Lin Zhao

Abstract: The convergence of policy gradient algorithms in reinforcement learning hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control. However, most of the existing literature only considers the optimization landscape for static full-state or output feedback policies… ▽ More The convergence of policy gradient algorithms in reinforcement learning hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control. However, most of the existing literature only considers the optimization landscape for static full-state or output feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a rather complicated optimization landscape. We first show how the dLQR cost varies with the coordinate transformation of the dynamic controller and then derive the optimal transformation for a given observable stabilizing controller. At the core of our results is the uniqueness of the stationary point of dLQR when it is observable, which is in a concise form of an observer-based controller with the optimal similarity transformation. These results shed light on designing efficient algorithms for general decision-making problems with partially observed information. △ Less

Submitted 29 October, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2201.09598

Journal ref: 2022 IEEE 61st Conference on Decision and Control (CDC)

arXiv:2208.04939 [pdf, other]

U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration?

Authors: Xi Jia, Joseph Bartlett, Tianyang Zhang, Wenqi Lu, Zhaowen Qiu, Jinming Duan

Abstract: Due to their extreme long-range modeling capability, vision transformer-based networks have become increasingly popular in deformable image registration. We believe, however, that the receptive field of a 5-layer convolutional U-Net is sufficient to capture accurate deformations without needing long-range dependencies. The purpose of this study is therefore to investigate whether U-Net-based metho… ▽ More Due to their extreme long-range modeling capability, vision transformer-based networks have become increasingly popular in deformable image registration. We believe, however, that the receptive field of a 5-layer convolutional U-Net is sufficient to capture accurate deformations without needing long-range dependencies. The purpose of this study is therefore to investigate whether U-Net-based methods are outdated compared to modern transformer-based approaches when applied to medical image registration. For this, we propose a large kernel U-Net (LKU-Net) by embedding a parallel convolutional block to a vanilla U-Net in order to enhance the effective receptive field. On the public 3D IXI brain dataset for atlas-based registration, we show that the performance of the vanilla U-Net is already comparable with that of state-of-the-art transformer-based networks (such as TransMorph), and that the proposed LKU-Net outperforms TransMorph by using only 1.12% of its parameters and 10.8% of its mult-adds operations. We further evaluate LKU-Net on a MICCAI Learn2Reg 2021 challenge dataset for inter-subject registration, our LKU-Net also outperforms TransMorph on this dataset and ranks first on the public leaderboard as of the submission of this work. With only modest modifications to the vanilla U-Net, we show that U-Net can outperform transformer-based architectures on inter-subject and atlas-based 3D medical image registration. Code is available at https://github.com/xi-jia/LKU-Net. △ Less

Submitted 13 August, 2022; v1 submitted 7 August, 2022; originally announced August 2022.

Comments: Accepted to MICCAI-MLMI 2022

arXiv:2206.02346 [pdf, other]

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

Authors: Dongsheng Ding, Kaiqing Zhang, Jiali Duan, Tamer Başar, Mihailo R. Jovanović

Abstract: We study sequential decision making problems aimed at maximizing the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve the discounted infinite-horizon optimal control problem for Constrained Markov Decision Processes (constrained MDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD)… ▽ More We study sequential decision making problems aimed at maximizing the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve the discounted infinite-horizon optimal control problem for Constrained Markov Decision Processes (constrained MDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method that updates the primal variable via natural policy gradient ascent and the dual variable via projected sub-gradient descent. Although the underlying maximization involves a nonconcave objective function and a nonconvex constraint set, under the softmax policy parametrization we prove that our method achieves global convergence with sublinear rates regarding both the optimality gap and the constraint violation. Such convergence is independent of the size of the state-action space, i.e., it is~dimension-free. Furthermore, for log-linear and general smooth policy parametrizations, we establish sublinear convergence rates up to a function approximation error caused by restricted policy parametrization. We also provide convergence and finite-sample complexity guarantees for two sample-based NPG-PD algorithms. Finally, we use computational experiments to showcase the merits and the effectiveness of our approach. △ Less

Submitted 28 August, 2024; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: 74 pages, 4 figures, 2 tables

arXiv:2205.12857 [pdf, other]

Structure Unbiased Adversarial Model for Medical Image Segmentation

Authors: Tianyang Zhang, Shaoming Zheng, Jun Cheng, Xi Jia, Joseph Bartlett, Xinxing Cheng, Huazhu Fu, Zhaowen Qiu, Jiang Liu, Jinming Duan

Abstract: Generative models have been widely proposed in image recognition to generate more images where the distribution is similar to that of the real ones. It often introduces a discriminator network to differentiate the real data from the generated ones. Such models utilise a discriminator network tasked with differentiating style transferred data from data contained in the target dataset. However in do… ▽ More Generative models have been widely proposed in image recognition to generate more images where the distribution is similar to that of the real ones. It often introduces a discriminator network to differentiate the real data from the generated ones. Such models utilise a discriminator network tasked with differentiating style transferred data from data contained in the target dataset. However in doing so the network focuses on discrepancies in the intensity distribution and may overlook structural differences between the datasets. In this paper we formulate a new image-to-image translation problem to ensure that the structure of the generated images is similar to that in the target dataset. We propose a simple, yet powerful Structure-Unbiased Adversarial (SUA) network which accounts for both intensity and structural differences between the training and test sets when performing image segmentation. It consists of a spatial transformation block followed by an intensity distribution rendering module. The spatial transformation block is proposed to reduce the structure gap between the two images, and also produce an inverse deformation field to warp the final segmented image back. The intensity distribution rendering module then renders the deformed structure to an image with the target intensity distribution. Experimental results show that the proposed SUA method has the capability to transfer both intensity distribution and structural content between multiple datasets. △ Less

Submitted 30 July, 2024; v1 submitted 25 May, 2022; originally announced May 2022.

Comments: Will revise the paper and resubmit

arXiv:2204.04403 [pdf, other]

Improve Generalization of Driving Policy at Signalized Intersections with Adversarial Learning

Authors: Yangang Ren, Guojian Zhan, Liye Tang, Shengbo Eben Li, Jianhua Jiang, Jingliang Duan

Abstract: Intersections are quite challenging among various driving scenes wherein the interaction of signal lights and distinct traffic actors poses great difficulty to learn a wise and robust driving policy. Current research rarely considers the diversity of intersections and stochastic behaviors of traffic participants. For practical applications, the randomness usually leads to some devastating events,… ▽ More Intersections are quite challenging among various driving scenes wherein the interaction of signal lights and distinct traffic actors poses great difficulty to learn a wise and robust driving policy. Current research rarely considers the diversity of intersections and stochastic behaviors of traffic participants. For practical applications, the randomness usually leads to some devastating events, which should be the focus of autonomous driving. This paper introduces an adversarial learning paradigm to boost the intelligence and robustness of driving policy for signalized intersections with dense traffic flow. Firstly, we design a static path planner which is capable of generating trackable candidate paths for multiple intersections with diversified topology. Next, a constrained optimal control problem (COCP) is built based on these candidate paths wherein the bounded uncertainty of dynamic models is considered to capture the randomness of driving environment. We propose adversarial policy gradient (APG) to solve the COCP wherein the adversarial policy is introduced to provide disturbances by seeking the most severe uncertainty while the driving policy learns to handle this situation by competition. Finally, a comprehensive system is established to conduct training and testing wherein the perception module is introduced and the human experience is incorporated to solve the yellow light dilemma. Experiments indicate that the trained policy can handle the signal lights flexibly meanwhile realizing the smooth and efficient passing with a humanoid paradigm. Besides, APG enables a large-margin improvement of the resistance to the abnormal behaviors and thus ensures a high safety level for the autonomous vehicle. △ Less

Submitted 9 April, 2022; originally announced April 2022.

arXiv:2204.02857 [pdf, other]

Primal-dual Estimator Learning: an Offline Constrained Moving Horizon Estimation Method with Feasibility and Near-optimality Guarantees

Authors: Wenhan Cao, Jingliang Duan, Shengbo Eben Li, Chen Chen, Chang Liu, Yu Wang

Abstract: This paper proposes a primal-dual framework to learn a stable estimator for linear constrained estimation problems leveraging the moving horizon approach. To avoid the online computational burden in most existing methods, we learn a parameterized function offline to approximate the primal estimate. Meanwhile, a dual estimator is trained to check the suboptimality of the primal estimator during exe… ▽ More This paper proposes a primal-dual framework to learn a stable estimator for linear constrained estimation problems leveraging the moving horizon approach. To avoid the online computational burden in most existing methods, we learn a parameterized function offline to approximate the primal estimate. Meanwhile, a dual estimator is trained to check the suboptimality of the primal estimator during execution time. Both the primal and dual estimators are learned from data using supervised learning techniques, and the explicit sample size is provided, which enables us to guarantee the quality of each learned estimator in terms of feasibility and optimality. This in turn allows us to bound the probability of the learned estimator being infeasible or suboptimal. Furthermore, we analyze the stability of the resulting estimator with a bounded error in the minimization of the cost function. Since our algorithm does not require the solution of an optimization problem during runtime, state estimates can be generated online almost instantly. Simulation results are presented to show the accuracy and time efficiency of the proposed framework compared to online optimization of moving horizon estimation and Kalman filter. To the best of our knowledge, this is the first learning-based state estimator with feasibility and near-optimality guarantees for linear constrained systems. △ Less

Submitted 6 April, 2022; originally announced April 2022.

arXiv:2201.09598 [pdf, other]

doi 10.1109/TAC.2023.3275732

On the Optimization Landscape of Dynamic Output Feedback Linear Quadratic Control

Authors: Jingliang Duan, Wenhan Cao, Yang Zheng, Lin Zhao

Abstract: The convergence of policy gradient algorithms hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control. However, most of the existing literature only considers the optimization landscape for static full-state or output feedback policies (controllers). We investig… ▽ More The convergence of policy gradient algorithms hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control. However, most of the existing literature only considers the optimization landscape for static full-state or output feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a rather complicated optimization landscape. We first show how the dLQR cost varies with the coordinate transformation of the dynamic controller and then derive the optimal transformation for a given observable stabilizing controller. One of our core results is the uniqueness of the stationary point of dLQR when it is observable, which provides an optimality certificate for solving dynamic controllers using policy gradient methods. Moreover, we establish conditions under which dLQR and linear quadratic Gaussian control are equivalent, thus providing a unified viewpoint of optimal control of both deterministic and stochastic linear systems. These results further shed light on designing policy gradient algorithms for more general decision-making problems with partially observed information. △ Less

Submitted 29 October, 2023; v1 submitted 24 January, 2022; originally announced January 2022.

Journal ref: IEEE Transactions on Automatic Control (full paper), 2023

arXiv:2112.09357 [pdf, other]

Interpreting Audiograms with Multi-stage Neural Networks

Authors: Shufan Li, Congxi Lu, Linkai Li, Jirong Duan, Xinping Fu, Haoshuai Zhou

Abstract: Audiograms are a particular type of line charts representing individuals' hearing level at various frequencies. They are used by audiologists to diagnose hearing loss, and further select and tune appropriate hearing aids for customers. There have been several projects such as Autoaudio that aim to accelerate this process through means of machine learning. But all existing models at their best can… ▽ More Audiograms are a particular type of line charts representing individuals' hearing level at various frequencies. They are used by audiologists to diagnose hearing loss, and further select and tune appropriate hearing aids for customers. There have been several projects such as Autoaudio that aim to accelerate this process through means of machine learning. But all existing models at their best can only detect audiograms in images and classify them into general categories. They are unable to extract hearing level information from detected audiograms by interpreting the marks, axis, and lines. To address this issue, we propose a Multi-stage Audiogram Interpretation Network (MAIN) that directly reads hearing level data from photos of audiograms. We also established Open Audiogram, an open dataset of audiogram images with annotations of marks and axes on which we trained and evaluated our proposed model. Experiments show that our model is feasible and reliable. △ Less

Submitted 17 December, 2021; originally announced December 2021.

Comments: 12pages,12 figures. The code for this project is available at https://github.com/jacklishufan/MAIN2021

arXiv:2112.04489 [pdf, other]

Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning

Authors: Alessa Hering, Lasse Hansen, Tony C. W. Mok, Albert C. S. Chung, Hanna Siebert, Stephanie Häger, Annkristin Lange, Sven Kuckertz, Stefan Heldmann, Wei Shao, Sulaiman Vesal, Mirabela Rusu, Geoffrey Sonn, Théo Estienne, Maria Vakalopoulou, Luyi Han, Yunzhi Huang, Pew-Thian Yap, Mikael Brudfors, Yaël Balbastre, Samuel Joutard, Marc Modat, Gal Lifshitz, Dan Raviv, Jinxin Lv , et al. (28 additional authors not shown)

Abstract: Image registration is a fundamental medical image analysis task, and a wide variety of approaches have been proposed. However, only a few studies have comprehensively compared medical image registration approaches on a wide range of clinically relevant tasks. This limits the development of registration methods, the adoption of research advances into practice, and a fair benchmark across competing… ▽ More Image registration is a fundamental medical image analysis task, and a wide variety of approaches have been proposed. However, only a few studies have comprehensively compared medical image registration approaches on a wide range of clinically relevant tasks. This limits the development of registration methods, the adoption of research advances into practice, and a fair benchmark across competing approaches. The Learn2Reg challenge addresses these limitations by providing a multi-task medical image registration data set for comprehensive characterisation of deformable registration algorithms. A continuous evaluation will be possible at https://learn2reg.grand-challenge.org. Learn2Reg covers a wide range of anatomies (brain, abdomen, and thorax), modalities (ultrasound, CT, MR), availability of annotations, as well as intra- and inter-patient registration evaluation. We established an easily accessible framework for training and validation of 3D registration methods, which enabled the compilation of results of over 65 individual method submissions from more than 20 unique teams. We used a complementary set of metrics, including robustness, accuracy, plausibility, and runtime, enabling unique insight into the current state-of-the-art of medical image registration. This paper describes datasets, tasks, evaluation methods and results of the challenge, as well as results of further analysis of transferability to new datasets, the importance of label supervision, and resulting bias. While no single approach worked best across all tasks, many methodological aspects could be identified that push the performance of medical image registration to new state-of-the-art performance. Furthermore, we demystified the common belief that conventional registration methods have to be much slower than deep-learning-based methods. △ Less

Submitted 7 October, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

arXiv:2109.13132 [pdf, ps, other]

doi 10.23919/ACC53348.2022.9867384

Optimization Landscape of Gradient Descent for Discrete-time Static Output Feedback

Authors: Jingliang Duan, Jie Li, Shengbo Eben Li, Lin Zhao

Abstract: In this paper, we analyze the optimization landscape of gradient descent methods for static output feedback (SOF) control of discrete-time linear time-invariant systems with quadratic cost. The SOF setting can be quite common, for example, when there are unmodeled hidden states in the underlying process. We first establish several important properties of the SOF cost function, including coercivity… ▽ More In this paper, we analyze the optimization landscape of gradient descent methods for static output feedback (SOF) control of discrete-time linear time-invariant systems with quadratic cost. The SOF setting can be quite common, for example, when there are unmodeled hidden states in the underlying process. We first establish several important properties of the SOF cost function, including coercivity, L-smoothness, and M-Lipschitz continuous Hessian. We then utilize these properties to show that the gradient descent is able to converge to a stationary point at a dimension-free rate. Furthermore, we prove that under some mild conditions, gradient descent converges linearly to a local minimum if the starting point is close to one. These results not only characterize the performance of gradient descent in optimizing the SOF problem, but also shed light on the efficiency of general policy gradient methods in reinforcement learning. △ Less

Submitted 10 March, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

Journal ref: 2022 American Control Conference (ACC)

arXiv:2109.05540 [pdf, other]

Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios

Authors: Jingliang Duan, Yangang Ren, Fawang Zhang, Yang Guan, Dongjie Yu, Shengbo Eben Li, Bo Cheng, Lin Zhao

Abstract: In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in highe… ▽ More In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in higher policy performance and generality. We first develop an encoding distributional policy iteration (DPI) framework by embedding a permutation invariant module, which employs a feature neural network (NN) to encode the indicators of each vehicle, in the distributional RL framework. The proposed DPI framework is proved to exhibit important properties in terms of convergence and global optimality. Next, based on the developed encoding DPI framework, we propose the E-DSAC algorithm by adding the gradient-based update rule of the feature NN to the policy evaluation process of the DSAC algorithm. Then, the multi-lane driving task and the corresponding reward function are designed to verify the effectiveness of the proposed algorithm. Results show that the policy learned by E-DSAC can realize efficient, smooth, and relatively safe autonomous driving in the designed scenario. And the final policy performance learned by E-DSAC is about three times that of DSAC. Furthermore, its effectiveness has also been verified in real vehicle experiments. △ Less

Submitted 12 September, 2021; originally announced September 2021.

arXiv:2108.11623 [pdf, other]

Model-based Chance-Constrained Reinforcement Learning via Separated Proportional-Integral Lagrangian

Authors: Baiyu Peng, Jingliang Duan, Jianyu Chen, Shengbo Eben Li, Genjin Xie, Congsheng Zhang, Yang Guan, Yao Mu, Enxin Sun

Abstract: Safety is essential for reinforcement learning (RL) applied in the real world. Adding chance constraints (or probabilistic constraints) is a suitable way to enhance RL safety under uncertainty. Existing chance-constrained RL methods like the penalty methods and the Lagrangian methods either exhibit periodic oscillations or learn an over-conservative or unsafe policy. In this paper, we address thes… ▽ More Safety is essential for reinforcement learning (RL) applied in the real world. Adding chance constraints (or probabilistic constraints) is a suitable way to enhance RL safety under uncertainty. Existing chance-constrained RL methods like the penalty methods and the Lagrangian methods either exhibit periodic oscillations or learn an over-conservative or unsafe policy. In this paper, we address these shortcomings by proposing a separated proportional-integral Lagrangian (SPIL) algorithm. We first review the constrained policy optimization process from a feedback control perspective, which regards the penalty weight as the control input and the safe probability as the control output. Based on this, the penalty method is formulated as a proportional controller, and the Lagrangian method is formulated as an integral controller. We then unify them and present a proportional-integral Lagrangian method to get both their merits, with an integral separation technique to limit the integral value in a reasonable range. To accelerate training, the gradient of safe probability is computed in a model-based manner. We demonstrate our method can reduce the oscillations and conservatism of RL policy in a car-following simulation. To prove its practicality, we also apply our method to a real-world mobile robot navigation task, where our robot successfully avoids a moving obstacle with highly uncertain or even aggressive behaviors. △ Less

Submitted 26 August, 2021; originally announced August 2021.

arXiv:2108.04517 [pdf, other]

doi 10.1016/mri.2022.01.012.

Iterative Self-consistent Parallel Magnetic Resonance Imaging Reconstruction based on Nonlocal Low-Rank Regularization

Authors: Ting Pan, Jizhong Duan, Junfeng Wang, Yu Liu

Abstract: Iterative self-consistent parallel imaging reconstruction (SPIRiT) is an effective self-calibrated reconstruction model for parallel magnetic resonance imaging (PMRI). The joint L1 norm of wavelet coefficients and joint total variation (TV) regularization terms are incorporated into the SPIRiT model to improve the reconstruction performance. The simultaneous two-directional low-rankness (STDLR) in… ▽ More Iterative self-consistent parallel imaging reconstruction (SPIRiT) is an effective self-calibrated reconstruction model for parallel magnetic resonance imaging (PMRI). The joint L1 norm of wavelet coefficients and joint total variation (TV) regularization terms are incorporated into the SPIRiT model to improve the reconstruction performance. The simultaneous two-directional low-rankness (STDLR) in k-space data is incorporated into SPIRiT to realize improved reconstruction. Recent methods have exploited the nonlocal self-similarity (NSS) of images by imposing nonlocal low-rankness of similar patches to achieve a superior performance. To fully utilize both the NSS in Magnetic resonance (MR) images and calibration consistency in the k-space domain, we propose a nonlocal low-rank (NLR)-SPIRiT model by incorporating NLR regularization into the SPIRiT model. We apply the weighted nuclear norm (WNN) as a surrogate of the rank and employ the Nash equilibrium (NE) formulation and alternating direction method of multipliers (ADMM) to efficiently solve the NLR-SPIRiT model. The experimental results demonstrate the superior performance of NLR-SPIRiT over the state-of-the-art methods via three objective metrics and visual comparison. △ Less

Submitted 17 April, 2022; v1 submitted 10 August, 2021; originally announced August 2021.

Journal ref: Magnetic Resonance Imaging, vol. 88, pp. 62-75, 2022

arXiv:2107.07907 [pdf, other]

Lightness Modulated Deep Inverse Tone Mapping

Authors: Kanglin Liu, Gaofeng Cao, Jiang Duan, Guoping Qiu

Abstract: Single-image HDR reconstruction or inverse tone mapping (iTM) is a challenging task. In particular, recovering information in over-exposed regions is extremely difficult because details in such regions are almost completely lost. In this paper, we present a deep learning based iTM method that takes advantage of the feature extraction and mapping power of deep convolutional neural networks (CNNs) a… ▽ More Single-image HDR reconstruction or inverse tone mapping (iTM) is a challenging task. In particular, recovering information in over-exposed regions is extremely difficult because details in such regions are almost completely lost. In this paper, we present a deep learning based iTM method that takes advantage of the feature extraction and mapping power of deep convolutional neural networks (CNNs) and uses a lightness prior to modulate the CNN to better exploit observations in the surrounding areas of the over-exposed regions to enhance the quality of HDR image reconstruction. Specifically, we introduce a Hierarchical Synthesis Network (HiSN) for inferring a HDR image from a LDR input and a Lightness Adpative Modulation Network (LAMN) to incorporate the the lightness prior knowledge in the inferring process. The HiSN hierarchically synthesizes the high-brightness component and the low-brightness component of the HDR image whilst the LAMN uses a lightness adaptive mask that separates detail-less saturated bright pixels from well-exposed lower light pixels to enable HiSN to better infer the missing information, particularly in the difficult over-exposed detail-less areas. We present experimental results to demonstrate the effectiveness of the new technique based on quantitative measures and visual comparisons. In addition, we present ablation studies of HiSN and visualization of the activation maps inside LAMN to help gain a deeper understanding of the internal working of the new iTM algorithm and explain why it can achieve much improved performance over state-of-the-art algorithms. △ Less

Submitted 16 July, 2021; originally announced July 2021.

Comments: 11 pages, 10 figures

arXiv:2105.12227 [pdf, other]

Learning a Model-Driven Variational Network for Deformable Image Registration

Authors: Xi Jia, Alexander Thorley, Wei Chen, Huaqi Qiu, Linlin Shen, Iain B Styles, Hyung Jin Chang, Ales Leonardis, Antonio de Marvao, Declan P. O'Regan, Daniel Rueckert, Jinming Duan

Abstract: Data-driven deep learning approaches to image registration can be less accurate than conventional iterative approaches, especially when training data is limited. To address this whilst retaining the fast inference speed of deep learning, we propose VR-Net, a novel cascaded variational network for unsupervised deformable image registration. Using the variable splitting optimization scheme, we first… ▽ More Data-driven deep learning approaches to image registration can be less accurate than conventional iterative approaches, especially when training data is limited. To address this whilst retaining the fast inference speed of deep learning, we propose VR-Net, a novel cascaded variational network for unsupervised deformable image registration. Using the variable splitting optimization scheme, we first convert the image registration problem, established in a generic variational framework, into two sub-problems, one with a point-wise, closed-form solution while the other one is a denoising problem. We then propose two neural layers (i.e. warping layer and intensity consistency layer) to model the analytical solution and a residual U-Net to formulate the denoising problem (i.e. generalized denoising layer). Finally, we cascade the warping layer, intensity consistency layer, and generalized denoising layer to form the VR-Net. Extensive experiments on three (two 2D and one 3D) cardiac magnetic resonance imaging datasets show that VR-Net outperforms state-of-the-art deep learning methods on registration accuracy, while maintains the fast inference speed of deep learning and the data-efficiency of variational model. △ Less

Submitted 25 May, 2021; originally announced May 2021.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2105.11299 [pdf, other]

doi 10.1109/TITS.2021.3136588

Fixed-Dimensional and Permutation Invariant State Representation of Autonomous Driving

Authors: Jingliang Duan, Dongjie Yu, Shengbo Eben Li, Wenxuan Wang, Yangang Ren, Ziyu Lin, Bo Cheng

Abstract: In this paper, we propose a new state representation method, called encoding sum and concatenation (ESC), for the state representation of decision-making in autonomous driving. Unlike existing state representation methods, ESC is applicable to a variable number of surrounding vehicles and eliminates the need for manually pre-designed sorting rules, leading to higher representation ability and gene… ▽ More In this paper, we propose a new state representation method, called encoding sum and concatenation (ESC), for the state representation of decision-making in autonomous driving. Unlike existing state representation methods, ESC is applicable to a variable number of surrounding vehicles and eliminates the need for manually pre-designed sorting rules, leading to higher representation ability and generality. The proposed ESC method introduces a representation neural network (NN) to encode each surrounding vehicle into an encoding vector, and then adds these vectors to obtain the representation vector of the set of surrounding vehicles. By concatenating the set representation with other variables, such as indicators of the ego vehicle and road, we realize the fixed-dimensional and permutation invariant state representation. This paper has further proved that the proposed ESC method can realize the injective representation if the output dimension of the representation NN is greater than the number of variables of all surrounding vehicles. This means that by taking the ESC representation as policy inputs, we can find the nearly optimal representation NN and policy NN by simultaneously optimizing them using gradient-based updating. Experiments demonstrate that compared with the fixed-permutation representation method, the proposed method improves the representation ability of the surrounding vehicles, and the corresponding approximation error is reduced by 62.2%. △ Less

Submitted 4 March, 2022; v1 submitted 24 May, 2021; originally announced May 2021.

Journal ref: IEEE Transactions on Intelligent Transportation Systems, 2021

arXiv:2104.05810 [pdf, other]

doi 10.1109/TII.2019.2907380

A Distributed and Resilient Bargaining Game for Weather-Predictive Microgrid Energy Cooperation

Authors: Lu An, Jie Duan, Mo-Yuen Chow, Alexandra Duel-Hallen

Abstract: A bargaining game is investigated for cooperative energy management in microgrids. This game incorporates a fully distributed and realistic cooperative power scheduling algorithm (CoDES) as well as a distributed Nash Bargaining Solution (NBS)-based method of allocating the overall power bill resulting from CoDES. A novel weather-based stochastic renewable generation (RG) prediction method is incor… ▽ More A bargaining game is investigated for cooperative energy management in microgrids. This game incorporates a fully distributed and realistic cooperative power scheduling algorithm (CoDES) as well as a distributed Nash Bargaining Solution (NBS)-based method of allocating the overall power bill resulting from CoDES. A novel weather-based stochastic renewable generation (RG) prediction method is incorporated in the power scheduling. We demonstrate the proposed game using a 4-user grid-connected microgrid model with diverse user demands, storage, and RG profiles and examine the effect of weather prediction on day-ahead power scheduling and cost/profit allocation. Finally, the impact of users' ambivalence about cooperation and /or dishonesty on the bargaining outcome is investigated, and it is shown that the proposed game is resilient to malicious users' attempts to avoid payment of their fair share of the overall bill. △ Less

Submitted 12 April, 2021; originally announced April 2021.

Comments: 9 pages, 8 figures, published in IEEE Transactions on Industrial Informatics

Journal ref: IEEE Transactions on Industrial Informatics 15 (8), 4721-4730, 2019

arXiv:2103.05505 [pdf]

Approximate Optimal Filter for Linear Gaussian Time-invariant Systems

Authors: Kaiming Tang, Shengbo Eben Li, Yuming Yin, Yang Guan, Jingliang Duan, Wenhan Cao, Jie Li

Abstract: State estimation is critical to control systems, especially when the states cannot be directly measured. This paper presents an approximate optimal filter, which enables to use policy iteration technique to obtain the steady-state gain in linear Gaussian time-invariant systems. This design transforms the optimal filtering problem with minimum mean square error into an optimal control problem, call… ▽ More State estimation is critical to control systems, especially when the states cannot be directly measured. This paper presents an approximate optimal filter, which enables to use policy iteration technique to obtain the steady-state gain in linear Gaussian time-invariant systems. This design transforms the optimal filtering problem with minimum mean square error into an optimal control problem, called Approximate Optimal Filtering (AOF) problem. The equivalence holds given certain conditions about initial state distributions and policy formats, in which the system state is the estimation error, control input is the filter gain, and control objective function is the accumulated estimation error. We present a policy iteration algorithm to solve the AOF problem in steady-state. A classic vehicle state estimation problem finally evaluates the approximate filter. The results show that the policy converges to the steady-state Kalman gain, and its accuracy is within 2 %. △ Less

Submitted 9 March, 2021; originally announced March 2021.

arXiv:2102.11736 [pdf, other]

Recurrent Model Predictive Control

Authors: Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Qi Sun, Bo Cheng

Abstract: This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems. Unlike traditional Model Predictive Control (MPC) algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs a recurrent function to approximate the… ▽ More This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems. Unlike traditional Model Predictive Control (MPC) algorithms, it can make full use of the current computing resources and adaptively select the longest model prediction horizon. Our algorithm employs a recurrent function to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. The number of prediction steps is equal to the number of recurrent cycles of the learned policy function. With an arbitrary initial policy function, the proposed RMPC algorithm can converge to the optimal policy by directly minimizing the designed loss function. We further prove the convergence and optimality of the RMPC algorithm thorough Bellman optimality principle, and demonstrate its generality and efficiency using two numerical examples. △ Less

Submitted 23 February, 2021; originally announced February 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2102.10289

arXiv:2102.10289 [pdf, other]

doi 10.1109/TIE.2022.3153800

Recurrent Model Predictive Control: Learning an Explicit Recurrent Controller for Nonlinear Systems

Authors: Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Bo Cheng

Abstract: This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems. It can be regarded as an explicit solver of traditional Model Predictive Control (MPC) algorithms, which can adaptively select appropriate model prediction horizon according to current computing resources, so as to improve the p… ▽ More This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems. It can be regarded as an explicit solver of traditional Model Predictive Control (MPC) algorithms, which can adaptively select appropriate model prediction horizon according to current computing resources, so as to improve the policy performance. Our algorithm employs a recurrent function to approximate the optimal policy, which maps the system states and reference values directly to the control inputs. The output of the learned policy network after N recurrent cycles corresponds to the nearly optimal solution of N-step MPC. A policy optimization objective is designed by decomposing the MPC cost function according to the Bellman's principle of optimality. The optimal recurrent policy can be obtained by directly minimizing the designed objective function, which is applicable for general nonlinear and non input-affine systems. Both simulation-based and real-robot path-tracking tasks are utilized to demonstrate the effectiveness of the proposed method. △ Less

Submitted 8 April, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

Journal ref: IEEE Transactions on Industrial Electronics, 2022

arXiv:2102.08539 [pdf, other]

Separated Proportional-Integral Lagrangian for Chance Constrained Reinforcement Learning

Authors: Baiyu Peng, Yao Mu, Jingliang Duan, Yang Guan, Shengbo Eben Li, Jianyu Chen

Abstract: Safety is essential for reinforcement learning (RL) applied in real-world tasks like autonomous driving. Chance constraints which guarantee the satisfaction of state constraints at a high probability are suitable to represent the requirements in real-world environment with uncertainty. Existing chance constrained RL methods like the penalty method and the Lagrangian method either exhibit periodic… ▽ More Safety is essential for reinforcement learning (RL) applied in real-world tasks like autonomous driving. Chance constraints which guarantee the satisfaction of state constraints at a high probability are suitable to represent the requirements in real-world environment with uncertainty. Existing chance constrained RL methods like the penalty method and the Lagrangian method either exhibit periodic oscillations or cannot satisfy the constraints. In this paper, we address these shortcomings by proposing a separated proportional-integral Lagrangian (SPIL) algorithm. Taking a control perspective, we first interpret the penalty method and the Lagrangian method as proportional feedback and integral feedback control, respectively. Then, a proportional-integral Lagrangian method is proposed to steady learning process while improving safety. To prevent integral overshooting and reduce conservatism, we introduce the integral separation technique inspired by PID control. Finally, an analytical gradient of the chance constraint is utilized for model-based policy optimization. The effectiveness of SPIL is demonstrated by a narrow car-following task. Experiments indicate that compared with previous methods, SPIL improves the performance while guaranteeing safety, with a steady learning process. △ Less

Submitted 16 February, 2021; originally announced February 2021.

arXiv:2012.11974 [pdf, other]

Complementary Time-Frequency Domain Networks for Dynamic Parallel MR Image Reconstruction

Authors: Chen Qin, Jinming Duan, Kerstin Hammernik, Jo Schlemper, Thomas Küstner, René Botnar, Claudia Prieto, Anthony N. Price, Joseph V. Hajnal, Daniel Rueckert

Abstract: Purpose: To introduce a novel deep learning based approach for fast and high-quality dynamic multi-coil MR reconstruction by learning a complementary time-frequency domain network that exploits spatio-temporal correlations simultaneously from complementary domains. Theory and Methods: Dynamic parallel MR image reconstruction is formulated as a multi-variable minimisation problem, where the data… ▽ More Purpose: To introduce a novel deep learning based approach for fast and high-quality dynamic multi-coil MR reconstruction by learning a complementary time-frequency domain network that exploits spatio-temporal correlations simultaneously from complementary domains. Theory and Methods: Dynamic parallel MR image reconstruction is formulated as a multi-variable minimisation problem, where the data is regularised in combined temporal Fourier and spatial (x-f) domain as well as in spatio-temporal image (x-t) domain. An iterative algorithm based on variable splitting technique is derived, which alternates among signal de-aliasing steps in x-f and x-t spaces, a closed-form point-wise data consistency step and a weighted coupling step. The iterative model is embedded into a deep recurrent neural network which learns to recover the image via exploiting spatio-temporal redundancies in complementary domains. Results: Experiments were performed on two datasets of highly undersampled multi-coil short-axis cardiac cine MRI scans. Results demonstrate that our proposed method outperforms the current state-of-the-art approaches both quantitatively and qualitatively. The proposed model can also generalise well to data acquired from a different scanner and data with pathologies that were not seen in the training set. Conclusion: The work shows the benefit of reconstructing dynamic parallel MRI in complementary time-frequency domains with deep neural networks. The method can effectively and robustly reconstruct high-quality images from highly undersampled dynamic multi-coil data ($16 \times$ and $24 \times$ yielding 15s and 10s scan times respectively) with fast reconstruction speed (2.8s). This could potentially facilitate achieving fast single-breath-hold clinical 2D cardiac cine imaging. △ Less

Submitted 18 June, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

Comments: Accepted by Magnetic Resonance in Medicine

arXiv:2012.06458 [pdf, other]

On Training Effective Reinforcement Learning Agents for Real-time Power Grid Operation and Control

Authors: Ruisheng Diao, Di Shi, Bei Zhang, Siqi Wang, Haifeng Li, Chunlei Xu, Tu Lan, Desong Bian, Jiajun Duan

Abstract: Deriving fast and effectively coordinated control actions remains a grand challenge affecting the secure and economic operation of today's large-scale power grid. This paper presents a novel artificial intelligence (AI) based methodology to achieve multi-objective real-time power grid control for real-world implementation. State-of-the-art off-policy reinforcement learning (RL) algorithm, soft act… ▽ More Deriving fast and effectively coordinated control actions remains a grand challenge affecting the secure and economic operation of today's large-scale power grid. This paper presents a novel artificial intelligence (AI) based methodology to achieve multi-objective real-time power grid control for real-world implementation. State-of-the-art off-policy reinforcement learning (RL) algorithm, soft actor-critic (SAC) is adopted to train AI agents with multi-thread offline training and periodic online training for regulating voltages and transmission losses without violating thermal constraints of lines. A software prototype was developed and deployed in the control center of SGCC Jiangsu Electric Power Company that interacts with their Energy Management System (EMS) every 5 minutes. Massive numerical studies using actual power grid snapshots in the real-time environment verify the effectiveness of the proposed approach. Well-trained SAC agents can learn to provide effective and subsecond control actions in regulating voltage profiles and reducing transmission losses. △ Less

Submitted 11 December, 2020; originally announced December 2020.

arXiv:2009.04395 [pdf, other]

Automated Model Selection for Time-Series Anomaly Detection

Authors: Yuanxiang Ying, Juanyong Duan, Chunlei Wang, Yujing Wang, Congrui Huang, Bixiong Xu

Abstract: Time-series anomaly detection is a popular topic in both academia and industrial fields. Many companies need to monitor thousands of temporal signals for their applications and services and require instant feedback and alerts for potential incidents in time. The task is challenging because of the complex characteristics of time-series, which are messy, stochastic, and often without proper labels.… ▽ More Time-series anomaly detection is a popular topic in both academia and industrial fields. Many companies need to monitor thousands of temporal signals for their applications and services and require instant feedback and alerts for potential incidents in time. The task is challenging because of the complex characteristics of time-series, which are messy, stochastic, and often without proper labels. This prohibits training supervised models because of lack of labels and a single model hardly fits different time series. In this paper, we propose a solution to address these issues. We present an automated model selection framework to automatically find the most suitable detection model with proper parameters for the incoming data. The model selection layer is extensible as it can be updated without too much effort when a new detector is available to the service. Finally, we incorporate a customized tuning algorithm to flexibly filter anomalies to meet customers' criteria. Experiments on real-world datasets show the effectiveness of our solution. △ Less

Submitted 25 August, 2020; originally announced September 2020.

arXiv:2007.06810 [pdf]

Ternary Policy Iteration Algorithm for Nonlinear Robust Control

Authors: Jie Li, Shengbo Eben Li, Yang Guan, Jingliang Duan, Wenyu Li, Yuming Yin

Abstract: The uncertainties in plant dynamics remain a challenge for nonlinear control problems. This paper develops a ternary policy iteration (TPI) algorithm for solving nonlinear robust control problems with bounded uncertainties. The controller and uncertainty of the system are considered as game players, and the robust control problem is formulated as a two-player zero-sum differential game. In order t… ▽ More The uncertainties in plant dynamics remain a challenge for nonlinear control problems. This paper develops a ternary policy iteration (TPI) algorithm for solving nonlinear robust control problems with bounded uncertainties. The controller and uncertainty of the system are considered as game players, and the robust control problem is formulated as a two-player zero-sum differential game. In order to solve the differential game, the corresponding Hamilton-Jacobi-Isaacs (HJI) equation is then derived. Three loss functions and three update phases are designed to match the identity equation, minimization and maximization of the HJI equation, respectively. These loss functions are defined by the expectation of the approximate Hamiltonian in a generated state set to prevent operating all the states in the entire state set concurrently. The parameters of value function and policies are directly updated by diminishing the designed loss functions using the gradient descent method. Moreover, zero-initialization can be applied to the parameters of the control policy. The effectiveness of the proposed TPI algorithm is demonstrated through two simulation studies. The simulation results show that the TPI algorithm can converge to the optimal solution for the linear plant, and has high resistance to disturbances for the nonlinear plant. △ Less

Submitted 14 July, 2020; originally announced July 2020.

arXiv:2007.05993 [pdf, other]

Deep Network Interpolation for Accelerated Parallel MR Image Reconstruction

Authors: Chen Qin, Jo Schlemper, Kerstin Hammernik, Jinming Duan, Ronald M Summers, Daniel Rueckert

Abstract: We present a deep network interpolation strategy for accelerated parallel MR image reconstruction. In particular, we examine the network interpolation in parameter space between a source model that is formulated in an unrolled scheme with L1 and SSIM losses and its counterpart that is trained with an adversarial loss. We show that by interpolating between the two different models of the same netwo… ▽ More We present a deep network interpolation strategy for accelerated parallel MR image reconstruction. In particular, we examine the network interpolation in parameter space between a source model that is formulated in an unrolled scheme with L1 and SSIM losses and its counterpart that is trained with an adversarial loss. We show that by interpolating between the two different models of the same network structure, the new interpolated network can model a trade-off between perceptual quality and fidelity. △ Less

Submitted 12 July, 2020; originally announced July 2020.

Comments: Presented at 2020 ISMRM Conference & Exhibition (Abstract #4958)

arXiv:2007.02070 [pdf, other]

Continuous-time finite-horizon ADP for automated vehicle controller design with high efficiency

Authors: Ziyu Lin, Jingliang Duan, Shengbo Eben Li, Haitong Ma, Yuming Yin

Abstract: The design of an automated vehicle controller can be generally formulated into an optimal control problem. This paper proposes a continuous-time finite-horizon approximate dynamicprogramming (ADP) method, which can synthesis off-line near-optimal control policy with analytical vehicle dynamics. Lying on the general Policy Iteration framework, it employs value andpolicy neural networks to approxima… ▽ More The design of an automated vehicle controller can be generally formulated into an optimal control problem. This paper proposes a continuous-time finite-horizon approximate dynamicprogramming (ADP) method, which can synthesis off-line near-optimal control policy with analytical vehicle dynamics. Lying on the general Policy Iteration framework, it employs value andpolicy neural networks to approximate the mappings from thesystem states to value function and control inputs, respectively. The proposed method can converge to the near-optimal solutionof the finite-horizon Hamilton-Jacobi-Bellman (HJB) equation. We further applied our algorithm to the simulation of automated vehicle control for the path tracking maneuver. The results suggest that the proposed ADP method can obtain the near-optimal policy with 1% error and less calculation time. What is more, the proposed ADP algorithm is also suitable for nonlinear control systems, where ADP is almost 500 times faster than the nonlinear MPC ipopt solver. △ Less

Submitted 4 July, 2020; originally announced July 2020.

Comments: 7 pages,conference

arXiv:2001.09816 [pdf, other]

doi 10.1049/iet-its.2019.0317

Hierarchical Reinforcement Learning for Self-Driving Decision-Making without Reliance on Labeled Driving Data

Authors: Jingliang Duan, Shengbo Eben Li, Yang Guan, Qi Sun, Bo Cheng

Abstract: Decision making for self-driving cars is usually tackled by manually encoding rules from drivers' behaviors or imitating drivers' manipulation using supervised learning techniques. Both of them rely on mass driving data to cover all possible driving scenarios. This paper presents a hierarchical reinforcement learning method for decision making of self-driving cars, which does not depend on a large… ▽ More Decision making for self-driving cars is usually tackled by manually encoding rules from drivers' behaviors or imitating drivers' manipulation using supervised learning techniques. Both of them rely on mass driving data to cover all possible driving scenarios. This paper presents a hierarchical reinforcement learning method for decision making of self-driving cars, which does not depend on a large amount of labeled driving data. This method comprehensively considers both high-level maneuver selection and low-level motion control in both lateral and longitudinal directions. We firstly decompose the driving tasks into three maneuvers, including driving in lane, right lane change and left lane change, and learn the sub-policy for each maneuver. Then, a master policy is learned to choose the maneuver policy to be executed in the current state. All policies including master policy and maneuver policies are represented by fully-connected neural networks and trained by using asynchronous parallel reinforcement learners (APRL), which builds a mapping from the sensory outputs to driving decisions. Different state spaces and reward functions are designed for each maneuver. We apply this method to a highway driving scenario, which demonstrates that it can realize smooth and safe decision making for self-driving cars. △ Less

Submitted 27 January, 2020; originally announced January 2020.

Journal ref: IET Intelligent Transport Systems, 2020, 14(5): 297-305

arXiv:2001.02811 [pdf, other]

doi 10.1109/TNNLS.2021.3082568

Distributional Soft Actor-Critic: Off-Policy Reinforcement Learning for Addressing Value Estimation Errors

Authors: Jingliang Duan, Yang Guan, Shengbo Eben Li, Yangang Ren, Bo Cheng

Abstract: In reinforcement learning (RL), function approximation errors are known to easily lead to the Q-value overestimations, thus greatly reducing policy performance. This paper presents a distributional soft actor-critic (DSAC) algorithm, which is an off-policy RL method for continuous control setting, to improve the policy performance by mitigating Q-value overestimations. We first discover in theory… ▽ More In reinforcement learning (RL), function approximation errors are known to easily lead to the Q-value overestimations, thus greatly reducing policy performance. This paper presents a distributional soft actor-critic (DSAC) algorithm, which is an off-policy RL method for continuous control setting, to improve the policy performance by mitigating Q-value overestimations. We first discover in theory that learning a distribution function of state-action returns can effectively mitigate Q-value overestimations because it is capable of adaptively adjusting the update stepsize of the Q-value function. Then, a distributional soft policy iteration (DSPI) framework is developed by embedding the return distribution function into maximum entropy RL. Finally, we present a deep off-policy actor-critic variant of DSPI, called DSAC, which directly learns a continuous return distribution by keeping the variance of the state-action returns within a reasonable range to address exploding and vanishing gradient problems. We evaluate DSAC on the suite of MuJoCo continuous control tasks, achieving the state-of-the-art performance. △ Less

Submitted 11 June, 2021; v1 submitted 8 January, 2020; originally announced January 2020.

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2021

arXiv:1912.09278 [pdf, other]

$Σ$-net: Systematic Evaluation of Iterative Deep Neural Networks for Fast Parallel MR Image Reconstruction

Authors: Kerstin Hammernik, Jo Schlemper, Chen Qin, Jinming Duan, Ronald M. Summers, Daniel Rueckert

Abstract: Purpose: To systematically investigate the influence of various data consistency layers, (semi-)supervised learning and ensembling strategies, defined in a $Σ$-net, for accelerated parallel MR image reconstruction using deep learning. Theory and Methods: MR image reconstruction is formulated as learned unrolled optimization scheme with a Down-Up network as regularization and varying data consist… ▽ More Purpose: To systematically investigate the influence of various data consistency layers, (semi-)supervised learning and ensembling strategies, defined in a $Σ$-net, for accelerated parallel MR image reconstruction using deep learning. Theory and Methods: MR image reconstruction is formulated as learned unrolled optimization scheme with a Down-Up network as regularization and varying data consistency layers. The different architectures are split into sensitivity networks, which rely on explicit coil sensitivity maps, and parallel coil networks, which learn the combination of coils implicitly. Different content and adversarial losses, a semi-supervised fine-tuning scheme and model ensembling are investigated. Results: Evaluated on the fastMRI multicoil validation set, architectures involving raw k-space data outperform image enhancement methods significantly. Semi-supervised fine-tuning adapts to new k-space data and provides, together with reconstructions based on adversarial training, the visually most appealing results although quantitative quality metrics are reduced. The $Σ$-net ensembles the benefits from different models and achieves similar scores compared to the single state-of-the-art approaches. Conclusion: This work provides an open-source framework to perform a systematic wide-range comparison of state-of-the-art reconstruction approaches for parallel MR image reconstruction on the fastMRI knee dataset and explores the importance of data consistency. A suitable trade-off between perceptual image quality and quantitative scores are achieved with the ensembled $Σ$-net. △ Less

Submitted 18 December, 2019; originally announced December 2019.

Comments: Submitted to Magnetic Resonance in Medicine

Showing 1–50 of 67 results for author: Duan, J