-
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
Authors:
Xin Li,
Kun Yuan,
Bingchen Li,
Fengbin Guan,
Yizhen Shao,
Zihao Yu,
Xijun Wang,
Yiting Lu,
Wei Luo,
Suhang Yao,
Ming Sun,
Chao Zhou,
Zhibo Chen,
Radu Timofte,
Yabin Zhang,
Ao-Xiang Zhang,
Tianwu Zhi,
Jianzhao Liu,
Yang Li,
Jingwen Xu,
Yiting Liao,
Yushen Zuo,
Mingyang Wu,
Renjie Li,
Shengyun Zhong
, et al. (88 additional authors not shown)
Abstract:
This paper presents a review of the NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement. The challenge comprises two tracks: (i) Efficient Video Quality Assessment (KVQ), and (ii) Diffusion-based Image Super-Resolution (KwaiSR). Track 1 aims to advance the development of lightweight and efficient video quality assessment (VQA) models, with an emphasis on eliminating reliance on model ensembles, redundant weights, and other computationally expensive components used in previous IQA/VQA competitions. Track 2 introduces a new short-form UGC dataset tailored for single image super-resolution, i.e., the KwaiSR dataset. It consists of 1,800 synthetically generated S-UGC image pairs and 1,900 real-world S-UGC images, which are split into training, validation, and test sets using a ratio of 8:1:1. The primary objective of the challenge is to drive research that benefits the user experience of short-form UGC platforms such as Kwai and TikTok. This challenge attracted 266 participants and received 18 valid final submissions with corresponding fact sheets, significantly contributing to the progress of short-form UGC VQA and image super-resolution. The project is publicly available at https://github.com/lixinustc/KVQE-ChallengeCVPR-NTIRE2025.
Submitted 17 April, 2025;
originally announced April 2025.
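The 8:1:1 split mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the split was drawn or whether the synthetic and real-world subsets were split separately; the seeded shuffle and the function name `split_indices` below are assumptions made only for illustration.

```python
import random


def split_indices(n_items, ratios=(8, 1, 1), seed=0):
    """Shuffle item indices and cut them into train/val/test by the given ratios.

    Hypothetical helper: the challenge's actual splitting procedure is not
    described in the abstract.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    total = sum(ratios)
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Assumption: the two subsets are split independently.
synthetic = split_indices(1800)
real_world = split_indices(1900)
print([len(part) for part in synthetic])   # sizes of the synthetic split
print([len(part) for part in real_world])  # sizes of the real-world split
```

Under these assumptions, 1,800 pairs divide into 1,440/180/180 and 1,900 images into 1,520/190/190.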
-
DeProPose: Deficiency-Proof 3D Human Pose Estimation via Adaptive Multi-View Fusion
Authors:
Jianbin Jiao,
Xina Cheng,
Kailun Yang,
Xiangrong Zhang,
Licheng Jiao
Abstract:
3D human pose estimation has wide applications in fields such as intelligent surveillance, motion capture, and virtual reality. However, in real-world scenarios, issues such as occlusion, noise interference, and missing viewpoints can severely affect pose estimation. To address these challenges, we introduce the task of Deficiency-Aware 3D Pose Estimation. Traditional 3D pose estimation methods often rely on multi-stage networks and modular combinations, which can lead to cumulative errors and increased training complexity, making them unable to effectively address deficiency-aware estimation. To this end, we propose DeProPose, a flexible method that simplifies the network architecture to reduce training complexity and avoid information loss in multi-stage designs. Additionally, the model innovatively introduces a multi-view feature fusion mechanism based on relative projection error, which effectively utilizes information from multiple viewpoints and dynamically assigns weights, enabling efficient integration and enhanced robustness to overcome deficiency-aware 3D Pose Estimation challenges. Furthermore, to thoroughly evaluate this end-to-end multi-view 3D human pose estimation model and to advance research on occlusion-related challenges, we have developed a novel 3D human pose estimation dataset, termed the Deficiency-Aware 3D Pose Estimation (DA-3DPE) dataset. This dataset encompasses a wide range of deficiency scenarios, including noise interference, missing viewpoints, and occlusion challenges. Compared to state-of-the-art methods, DeProPose not only excels in addressing the deficiency-aware problem but also shows improvement in conventional scenarios, providing a powerful and user-friendly solution for 3D human pose estimation. The source code will be available at https://github.com/WUJINHUAN/DeProPose.
Submitted 22 February, 2025;
originally announced February 2025.
-
Pathologist-like explainable AI for interpretable Gleason grading in prostate cancer
Authors:
Gesa Mittmann,
Sara Laiouar-Pedari,
Hendrik A. Mehrtens,
Sarah Haggenmüller,
Tabea-Clara Bucher,
Tirtha Chanda,
Nadine T. Gaisa,
Mathias Wagner,
Gilbert Georg Klamminger,
Tilman T. Rau,
Christina Neppl,
Eva Maria Compérat,
Andreas Gocht,
Monika Hämmerle,
Niels J. Rupp,
Jula Westhoff,
Irene Krücken,
Maximillian Seidl,
Christian M. Schürch,
Marcus Bauer,
Wiebke Solass,
Yu Chun Tam,
Florian Weber,
Rainer Grobholz,
Jaroslaw Augustyniak
, et al. (41 additional authors not shown)
Abstract:
The aggressiveness of prostate cancer, the most common cancer in men worldwide, is primarily assessed based on histopathological data using the Gleason scoring system. While artificial intelligence (AI) has shown promise in accurately predicting Gleason scores, these predictions often lack inherent explainability, potentially leading to distrust in human-machine interactions. To address this issue, we introduce a novel dataset of 1,015 tissue microarray core images, annotated by an international group of 54 pathologists. The annotations provide detailed localized pattern descriptions for Gleason grading in line with international guidelines. Utilizing this dataset, we develop an inherently explainable AI system based on a U-Net architecture that provides predictions leveraging pathologists' terminology. This approach circumvents post-hoc explainability methods while maintaining or exceeding the performance of methods trained directly for Gleason pattern segmentation (Dice score: 0.713 $\pm$ 0.003 trained on explanations vs. 0.691 $\pm$ 0.010 trained on Gleason patterns). By employing soft labels during training, we capture the intrinsic uncertainty in the data, yielding strong results in Gleason pattern segmentation even in the context of high interobserver variability. With the release of this dataset, we aim to encourage further research into segmentation in medical tasks with high levels of subjectivity and to advance the understanding of pathologists' reasoning processes.
Submitted 19 October, 2024;
originally announced October 2024.
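The Dice score reported in the abstract above is the standard overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch for flat binary masks follows; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions, and the paper's exact evaluation protocol (for example, per-class averaging or soft labels) is not reproduced here.

```python
def dice_score(pred, target):
    """Dice coefficient between two equally sized binary masks (flat sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    size_sum = sum(pred) + sum(target)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


# Toy example with made-up masks (not data from the paper).
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```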
-
DNN-based Enhanced DOA Sensing via Massive MIMO Receiver with Switches-based Hybrid Architecture
Authors:
Yifan Li,
Kang Wei,
Linqiong Jia,
Jun Zou,
Feng Shu,
Yaoliang Song,
Jiangzhou Wang
Abstract:
Switches-based hybrid architecture has attracted much attention, especially in direction-of-arrival (DOA) sensing, due to its ability to significantly reduce the hardware cost by compressing massive multiple-input multiple-output (MIMO) arrays with switching networks. However, this structure leads to a degradation in the degrees of freedom (DOF) and the accuracy of DOA estimation. To address these two issues, we first propose a switches-based sparse hybrid array (SW-SHA). In this method, we design a dynamic switching network to form a synthesized sparse array, i.e., SW-SHA, that can enlarge the virtual aperture obtained by the difference co-array, thereby significantly enhancing the DOF. Second, in order to improve the DOA estimation accuracy of switches-based hybrid arrays, a deep neural network (DNN)-based method called ASN-DNN is proposed. It includes an antenna selection network (ASN) for optimizing the switch connections based on the criterion of minimizing the Cramér-Rao lower bound (CRLB) under the peak sidelobe level (PSL) constraint and a DNN for DOA estimation. Then, by integrating ASN and DNN into an iterative process, the ASN-DNN is obtained. Furthermore, the closed-form expression of the CRLB for DOA estimation is derived to evaluate the performance lower bound of switches-based hybrid arrays and provide a benchmark for ASN-DNN. The simulation results show that the proposed ASN-DNN can achieve better performance than traditional methods, especially in the low signal-to-noise ratio (SNR) regions.
Submitted 13 January, 2025; v1 submitted 21 September, 2024;
originally announced September 2024.
-
Coverage Analysis of Downlink Transmission in Multi-Connectivity Cellular V2X Networks
Authors:
Luofang Jiao,
Tianqi Zhang,
Jiwei Zhao,
Yunting Xu,
Haibo Zhou
Abstract:
With the increasing number of connected vehicles in fifth-generation (5G) and beyond-5G (B5G) mobile communication networks, ensuring reliable and high-speed cellular vehicle-to-everything (C-V2X) communication poses significant challenges due to the high mobility of vehicles. To improve network performance and reliability, multi-connectivity technology has emerged as a crucial transmission mode for C-V2X in the 5G era. To this end, this paper proposes a framework for analyzing the performance of multi-connectivity in C-V2X downlink transmission, with a focus on the performance indicators of joint distance distribution and coverage probability. Specifically, we first derive the joint distance distribution of multi-connectivity. By leveraging the tools of stochastic geometry, we then obtain the analytical expressions of coverage probability based on the previous results for general multi-connectivity cases in C-V2X. Subsequently, we evaluate the effect of the path loss exponent and the downlink base station density on coverage probability based on the proposed analytical framework. Finally, extensive Monte Carlo simulations are conducted to validate the effectiveness of the proposed analytical framework, and the simulation results reveal that multi-connectivity technology can significantly enhance the coverage probability in C-V2X.
Submitted 26 May, 2024;
originally announced May 2024.
-
Performance Analysis of Uplink/Downlink Decoupled Access in Cellular-V2X Networks
Authors:
Luofang Jiao,
Kai Yu,
Jiacheng Chen,
Tingting Liu,
Haibo Zhou,
Lin Cai
Abstract:
This paper first develops an analytical framework to investigate the performance of uplink (UL) / downlink (DL) decoupled access in cellular vehicle-to-everything (C-V2X) networks, in which a vehicle's UL/DL can be connected to different macro/small base stations (MBSs/SBSs) separately. Using the stochastic geometry analytical tool, the UL/DL decoupled access C-V2X is modeled as a Cox process, and we obtain the following theoretical results: 1) the probability of different UL/DL joint association cases, i.e., both the UL and DL are associated with the different MBSs or SBSs, or they are associated with different types of BSs; 2) the distance distribution of a vehicle to its serving BSs in each case; 3) the spectral efficiency of UL/DL in each case; and 4) the UL/DL coverage probability of MBS/SBS. The analyses reveal the insights and performance gain of UL/DL decoupled access. Through extensive simulations, the accuracy of the proposed analytical framework is validated. Both the analytical and simulation results show that UL/DL decoupled access can improve spectral efficiency. The theoretical results can be directly used for estimating the statistical performance of a UL/DL decoupled access C-V2X network.
Submitted 10 May, 2024;
originally announced May 2024.
-
Performance Analysis for Downlink Transmission in Multi-Connectivity Cellular V2X Networks
Authors:
Luofang Jiao,
Jiwei Zhao,
Yunting Xu,
Tianqi Zhang,
Haibo Zhou,
Dongmei Zhao
Abstract:
With the ever-increasing number of connected vehicles in the fifth-generation mobile communication networks (5G) and beyond 5G (B5G), ensuring the reliability and high-speed demand of cellular vehicle-to-everything (C-V2X) communication in scenarios where vehicles are moving at high speeds poses a significant challenge. Recently, multi-connectivity technology has become a promising network access paradigm for improving network performance and reliability for C-V2X in the 5G and B5G era. To this end, this paper proposes an analytical framework for the performance of downlink in multi-connectivity C-V2X networks. Specifically, by modeling the vehicles and base stations as one-dimensional Poisson point processes, we first derive and analyze the joint distance distribution of multi-connectivity. Then, by leveraging the tools of stochastic geometry, the coverage probability and spectral efficiency are obtained based on the previous results for general multi-connectivity cases in C-V2X. Additionally, we evaluate the effect of the path loss exponent and the density of downlink base stations on system performance indicators. We demonstrate through extensive Monte Carlo simulations that multi-connectivity technology can effectively enhance network performance in C-V2X. Our findings have important implications for the research and application of multi-connectivity C-V2X in the 5G and B5G era.
Submitted 27 April, 2024;
originally announced April 2024.
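The setup described in the abstract above (base stations modeled as a one-dimensional Poisson point process, coverage probability checked by Monte Carlo simulation) can be sketched as a toy experiment. This is not the paper's model: the interference-limited SIR, the Rayleigh fading, the reading of "multi-connectivity" as picking the best of the `k` nearest base stations, and every parameter value below are assumptions chosen only to keep the example short.

```python
import math
import random


def simulate_coverage(density=0.01, alpha=3.0, threshold=1.0, k=2,
                      half_length=2000.0, trials=2000, seed=0):
    """Estimate single-link and best-of-k coverage probability for a vehicle at the origin."""
    rng = random.Random(seed)
    covered_single = 0
    covered_multi = 0
    for _ in range(trials):
        # Sample a 1D Poisson point process on [-half_length, half_length]
        # by accumulating exponentially distributed gaps.
        positions = []
        x = -half_length + rng.expovariate(density)
        while x < half_length:
            positions.append(x)
            x += rng.expovariate(density)
        if len(positions) < 2:
            continue  # too few base stations: count the trial as not covered
        positions.sort(key=abs)  # nearest base station first
        # Received power: exponential (Rayleigh) fading times distance^-alpha,
        # with distances clamped to 1 to avoid a singular path loss.
        powers = [rng.expovariate(1.0) * max(abs(p), 1.0) ** (-alpha)
                  for p in positions]
        total = sum(powers)
        sirs = []
        for signal in powers[:k]:
            interference = total - signal
            sirs.append(signal / interference if interference > 0 else math.inf)
        if sirs[0] >= threshold:
            covered_single += 1
        if max(sirs) >= threshold:
            covered_multi += 1
    return covered_single / trials, covered_multi / trials


p_single, p_multi = simulate_coverage()
print(f"single link: {p_single:.3f}, best of 2 nearest: {p_multi:.3f}")
```

Because both estimates are computed from the same random draws and the best-of-`k` rule includes the nearest base station, the multi-connectivity estimate is never lower than the single-link one in this sketch.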
-
Interpretable Tsetlin Machine-based Premature Ventricular Contraction Identification
Authors:
Jinbao Zhang,
Xuan Zhang,
Lei Jiao,
Ole-Christoffer Granmo,
Yongjun Qian,
Fan Pan
Abstract:
Neural network-based models have found wide use in automatic long-term electrocardiogram (ECG) analysis. However, such black box models are inadequate for analysing physiological signals where credibility and interpretability are crucial. Indeed, how to make ECG analysis transparent is still an open problem. In this study, we develop a Tsetlin machine (TM) based architecture for premature ventricular contraction (PVC) identification by analysing long-term ECG signals. The architecture is transparent by describing patterns directly with logical AND rules. To validate the accuracy of our approach, we compare the TM performance with those of convolutional neural networks (CNNs). Our numerical results demonstrate that TM provides comparable performance with CNNs on the MIT-BIH database. To validate interpretability, we provide explanatory diagrams that show how TM makes the PVC identification from confirming and invalidating patterns. We argue that these are compatible with medical knowledge so that they can be readily understood and verified by a medical doctor. Accordingly, we believe this study paves the way for machine learning (ML) for ECG analysis in clinical practice.
Submitted 20 January, 2023;
originally announced January 2023.
-
Spectral Efficiency Analysis of Uplink-Downlink Decoupled Access in C-V2X Networks
Authors:
Luofang Jiao,
Kai Yu,
Yunting Xu,
Tianqi Zhang,
Haibo Zhou,
Xuemin Shen
Abstract:
Uplink (UL)/downlink (DL) decoupled access has been emerging as a novel access architecture to improve performance in cellular networks. In this paper, we investigate the UL/DL decoupled access performance in cellular vehicle-to-everything (C-V2X). We propose a unified analytical framework for the UL/DL decoupled access in C-V2X from the perspective of spectral efficiency (SE). By modeling the UL/DL decoupled access C-V2X as a Cox process and leveraging stochastic geometry, we obtain the joint association probability, the UL/DL distance distributions to serving base stations, and the SE for the UL/DL decoupled access in C-V2X networks with different association cases. We conduct extensive Monte Carlo simulations to verify the accuracy of the proposed unified analytical framework, and the results show a better system average SE of UL/DL decoupled access in C-V2X.
Submitted 12 December, 2022; v1 submitted 5 December, 2022;
originally announced December 2022.
-
Extraction of Vascular Wall in Carotid Ultrasound via a Novel Boundary-Delineation Network
Authors:
Qinghua Huang,
Lizhi Jia,
Guanqing Ren,
Xiaoyi Wang,
Chunying Liu
Abstract:
Ultrasound imaging plays an important role in the diagnosis of vascular lesions. Accurate segmentation of the vascular wall is important for the prevention, diagnosis and treatment of vascular diseases. However, existing methods have inaccurate localization of the vascular wall boundary. Segmentation errors occur in discontinuous vascular wall boundaries and dark boundaries. To overcome these problems, we propose a new boundary-delineation network (BDNet). We use the boundary refinement module to re-delineate the boundary of the vascular wall to obtain the correct boundary location. We designed the feature extraction module to extract and fuse multi-scale features and different receptive field features to solve the problem of dark boundaries and discontinuous boundaries. We use a new loss function to optimize the model. The interference of class imbalance on model optimization is prevented to obtain finer and smoother boundaries. Finally, to facilitate clinical applications, we design the model to be lightweight. Experimental results show that our model achieves the best segmentation results and significantly reduces memory consumption compared to existing models for the dataset.
Submitted 27 July, 2022;
originally announced July 2022.
-
RSBNet: One-Shot Neural Architecture Search for A Backbone Network in Remote Sensing Image Recognition
Authors:
Cheng Peng,
Yangyang Li,
Ronghua Shang,
Licheng Jiao
Abstract:
Recently, a massive number of deep learning based approaches have been successfully applied to various remote sensing image (RSI) recognition tasks. However, most existing advances of deep learning methods in the RSI field heavily rely on the features extracted by the manually designed backbone network, which severely hinders the potential of deep learning models due to the complexity of RSI and the limitation of prior knowledge. In this paper, we research a new design paradigm for the backbone architecture in RSI recognition tasks, including scene classification, land-cover classification and object detection. A novel one-shot architecture search framework based on a weight-sharing strategy and an evolutionary algorithm is proposed, called RSBNet, which consists of three stages: First, a supernet constructed in a layer-wise search space is pretrained on a self-assembled large-scale RSI dataset based on an ensemble single-path training strategy. Next, the pre-trained supernet is equipped with different recognition heads through the switchable recognition module and respectively fine-tuned on the target dataset to obtain a task-specific supernet. Finally, we search the optimal backbone architecture for different recognition tasks based on the evolutionary algorithm without any network training. Extensive experiments have been conducted on five benchmark datasets for different recognition tasks; the results show the effectiveness of the proposed search paradigm and demonstrate that the searched backbone is able to flexibly adapt to different RSI recognition tasks and achieve impressive performance.
Submitted 6 December, 2021;
originally announced December 2021.
-
The Report on China-Spain Joint Clinical Testing for Rapid COVID-19 Risk Screening by Eye-region Manifestations
Authors:
Yanwei Fu,
Feng Li,
Paula Boned Fustel,
Lei Zhao,
Lijie Jia,
Haojie Zheng,
Qiang Sun,
Shisong Rong,
Haicheng Tang,
Xiangyang Xue,
Li Yang,
Hong Li,
Jiao Xie,
Wenxuan Wang,
Yuan Li,
Wei Wang,
Yantao Pei,
Jianmin Wang,
Xiuqi Wu,
Yanhua Zheng,
Hongxia Tian,
Mengwei Gu
Abstract:
Background: The worldwide surge in coronavirus cases has led to a surge in demand for COVID-19 testing. Rapid, accurate, and cost-effective COVID-19 screening tests working at a population level are in imperative demand globally.
Methods: Based on the eye symptoms of COVID-19, we developed and tested a COVID-19 rapid prescreening model using the eye-region images captured in China and Spain with cellphone cameras. The convolutional neural network (CNN)-based model was trained on these eye images to complete the binary classification task of identifying the COVID-19 cases. The performance was measured using the area under the receiver-operating-characteristic curve (AUC), sensitivity, specificity, accuracy, and F1. The application programming interface was open access.
Findings: The multicenter study included 2436 pictures corresponding to 657 subjects (155 COVID-19 infections, 23.6%) in the development dataset (train and validation) and 2138 pictures corresponding to 478 subjects (64 COVID-19 infections, 13.4%) in the test dataset. The image-level performance of the COVID-19 prescreening model in the China-Spain multicenter study achieved an AUC of 0.913 (95% CI, 0.898-0.927), with a sensitivity of 0.695 (95% CI, 0.643-0.748), a specificity of 0.904 (95% CI, 0.891-0.919), an accuracy of 0.875 (0.861-0.889), and an F1 of 0.611 (0.568-0.655).
Interpretation: The CNN-based model for COVID-19 rapid prescreening has reliable specificity and sensitivity. This system provides a low-cost, fully self-performed, non-invasive, real-time feedback solution for continuous surveillance and large-scale rapid prescreening for COVID-19.
Funding: This project is supported by Aimomics (Shanghai) Intelligent
Submitted 17 September, 2021;
originally announced September 2021.
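The sensitivity, specificity, accuracy, and F1 figures quoted in the abstract above all follow from the four confusion-matrix counts of a binary classifier. A minimal sketch of those standard definitions is given below; the function name `binary_metrics` is hypothetical, the counts in the usage line are invented for illustration (they are not the study's data), and the helper assumes every denominator is non-zero.

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive cases predicted positive/negative.
    tn/fp: negative cases predicted negative/positive.
    Assumes all denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)               # recall on positive cases
    specificity = tn / (tn + fp)               # recall on negative cases
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)           # harmonic mean of precision and recall
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "f1": f1}


# Invented counts, for illustration only.
print(binary_metrics(tp=70, fn=30, tn=180, fp=20))
```

With a minority positive class, a high specificity can coexist with a noticeably lower F1, since F1 depends on precision as well as sensitivity.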
-
Switching Controller Synthesis for Delay Hybrid Systems under Perturbations
Authors:
Yunjun Bai,
Ting Gan,
Li Jiao,
Bican Xia,
Bai Xue,
Naijun Zhan
Abstract:
Delays are ubiquitous in modern hybrid systems, which exhibit both continuous and discrete dynamical behaviors. Induced by signal transmission, conversion, the nature of plants, and so on, delays may appear either in the continuous evolution of a hybrid system such that the evolution depends not only on the present state but also on its execution history, or in the discrete switching between its different control modes. In this paper we propose a new model of hybrid systems, called delay hybrid automata, to capture the dynamics of systems with the aforementioned two kinds of delays. Furthermore, based upon this model we study the robust switching controller synthesis problem such that the controlled delay system is able to satisfy the specified safety properties regardless of perturbations. To this end, a novel method is proposed to synthesize switching controllers based on the computation of differential invariants for continuous evolution and backward reachable sets of discrete jumps with delays. Finally, we implement a prototypical tool of our approach and demonstrate it on some case studies.
Submitted 21 March, 2021;
originally announced March 2021.
-
Scannerless non-line-of-sight three dimensional imaging with a 32x32 SPAD array
Authors:
Chenfei Jin,
Meng Tang,
Legeng Jia,
Xiaorui Tian,
Jie Yang,
Kai Qiao,
Siqi Zhang
Abstract:
We develop a scannerless non-line-of-sight three-dimensional imaging system based on a commercial 32x32 SPAD camera combined with a 70 ps pulsed laser. In our experiment, 1024 time histograms can be acquired synchronously in 3 s with an average time resolution of about 165 ps. The result with filtered back projection shows a discernible reconstruction, while the result using the virtual wave field demonstrates better quality, similar to that of earlier scanning imaging systems with a single-pixel SPAD. Comparatively, our system has large potential advantages in frame frequency, power requirements, compactness and robustness. The research results will pave a path for scannerless non-line-of-sight three-dimensional imaging applications.
Submitted 10 November, 2020;
originally announced November 2020.
-
Contralaterally Enhanced Networks for Thoracic Disease Detection
Authors:
Gangming Zhao,
Chaowei Fang,
Guanbin Li,
Licheng Jiao,
Yizhou Yu
Abstract:
Identifying and locating diseases in chest X-rays are very challenging, due to the low visual contrast between normal and abnormal regions, and distortions caused by other overlapping tissues. An interesting phenomenon is that there exist many similar structures in the left and right parts of the chest, such as ribs, lung fields and bronchial tubes. Such similarities can be used to identify diseases in chest X-rays, according to the experience of board-certified radiologists. Aimed at improving the performance of existing detection methods, we propose a deep end-to-end module to exploit the contralateral context information for enhancing feature representations of disease proposals. First of all, under the guidance of the spine line, the spatial transformer network is employed to extract local contralateral patches, which can provide valuable context information for disease proposals. Then, we build up a specific module, based on both additive and subtractive operations, to fuse the features of the disease proposal and the contralateral patch. Our method can be integrated into both fully and weakly supervised disease detection frameworks. It achieves 33.17 AP50 on a carefully annotated private chest X-ray dataset which contains 31,000 images. Experiments on the NIH chest X-ray dataset indicate that our method achieves state-of-the-art performance in weakly-supervised disease localization.
Submitted 9 October, 2020;
originally announced October 2020.
-
A Light-Weighted Convolutional Neural Network for Bitemporal SAR Image Change Detection
Authors:
Rongfang Wang,
Fan Ding,
Licheng Jiao,
Jia-Wei Chen,
Bo Liu,
Wenping Ma,
Mi Wang
Abstract:
Recently, many convolutional neural networks (CNNs) have been successfully employed in bitemporal SAR image change detection. However, most of the existing networks are too heavy and occupy a large volume of memory for storage and calculation. Motivated by this, in this paper, we propose a lightweight neural network to reduce the computational and spatial complexity and facilitate change detection on an edge device. In the proposed network, we replace normal convolutional layers with bottleneck layers that keep the same number of channels between input and output. Next, we employ dilated convolutional kernels with a few non-zero entries that reduce the running time in convolutional operators. Compared with the conventional convolutional neural network, our light-weighted neural network is more efficient, with fewer parameters. We verify our light-weighted neural network on four sets of bitemporal SAR images. The experimental results show that the proposed network can obtain better performance than the conventional CNN and has better model generalization, especially on the challenging datasets with complex scenes.
△ Less
Submitted 20 June, 2020; v1 submitted 29 May, 2020;
originally announced May 2020.
-
A Convolutional Neural Network with Parallel Multi-Scale Spatial Pooling to Detect Temporal Changes in SAR Images
Authors:
Jia-Wei Chen,
Rongfang Wang,
Fan Ding,
Bo Liu,
Licheng Jiao,
Jie Zhang
Abstract:
In synthetic aperture radar (SAR) image change detection, it is quite challenging to exploit the change information in the noisy difference image subject to speckle. In this paper, we propose a multi-scale spatial pooling (MSSP) network to exploit the change information in the noisy difference image. Unlike the traditional convolutional network with only mono-scale pooling kernels, the proposed method equips a convolutional network with multi-scale pooling kernels to exploit the spatial context information on changed regions from the difference image. Furthermore, to verify the generalization of the proposed method, we apply it to cross-dataset bitemporal SAR image change detection, where the MSSP network (MSSP-Net) is trained on one dataset and then applied to an unknown testing dataset. We compare the proposed method with other state-of-the-art methods on four challenging datasets of bitemporal SAR images. Experimental results demonstrate that our proposed method obtains results comparable to S-PCA-Net on the YR-A and YR-B datasets and outperforms other state-of-the-art methods, especially on the Sendai-A and Sendai-B datasets with more complex scenes. More importantly, MSSP-Net is more efficient than S-PCA-Net and convolutional neural networks (CNN), with less execution time in both training and testing phases.
Submitted 21 May, 2020;
originally announced May 2020.
-
Temperate Fish Detection and Classification: a Deep Learning based Approach
Authors:
Kristian Muri Knausgård,
Arne Wiklund,
Tonje Knutsen Sørdalen,
Kim Halvorsen,
Alf Ring Kleiven,
Lei Jiao,
Morten Goodwin
Abstract:
A wide range of applications in marine ecology extensively uses underwater cameras. Still, to efficiently process the vast amount of data generated, we need to develop tools that can automatically detect and recognize species captured on film. Classifying fish species from videos and images in natural environments can be challenging because of noise and variation in illumination and the surrounding habitat. In this paper, we propose a two-step deep learning approach for the detection and classification of temperate fishes without pre-filtering. The first step is to detect each single fish in an image, independent of species and sex. For this purpose, we employ the You Only Look Once (YOLO) object detection technique. In the second step, we adopt a Convolutional Neural Network (CNN) with the Squeeze-and-Excitation (SE) architecture for classifying each fish in the image without pre-filtering. We apply transfer learning to overcome the limited training samples of temperate fishes and to improve the accuracy of the classification. This is done by training the object detection model with ImageNet and the fish classifier via a public dataset (Fish4Knowledge), whereupon both the object detection and classifier are updated with temperate fishes of interest. The weights obtained from pre-training are applied to post-training as a priori. Our solution achieves the state-of-the-art accuracy of 99.27% on the pre-training. The percentage values for accuracy on the post-training are good; 83.68% and 87.74% with and without image augmentation, respectively, indicating that the solution is viable with a more extensive dataset.
Submitted 14 May, 2020;
originally announced May 2020.
-
PolSF: PolSAR image dataset on San Francisco
Authors:
Xu Liu,
Licheng Jiao,
Fang Liu
Abstract:
Polarimetric SAR data can be acquired in all weather conditions and at all times, and is widely used in many fields. However, the amount of annotated data is relatively small, which hinders research. In this paper, we have collected five open polarimetric SAR images of the San Francisco area. These five images come from different satellites at different times, which gives them great scientific research value. We annotate the collected images at the pixel level for image classification and segmentation. For the convenience of researchers, the annotated data is available as open source at https://github.com/liuxuvip/PolSF.
Submitted 16 December, 2019;
originally announced December 2019.
-
Enhanced Secure Wireless Information and Power Transfer via Intelligent Reflecting Surface
Authors:
Weiping Shi,
Xiaobo Zhou,
Linqiong Jia,
Yongpeng Wu,
Feng Shu,
Jiangzhou Wang
Abstract:
In this paper, secure wireless information and power transfer with intelligent reflecting surface (IRS) is proposed for a multiple-input single-output (MISO) system. Under the secrecy rate (SR) and the reflecting phase shifts of IRS constraints, the secure transmit beamforming at access point (AP) and phase shifts at IRS are jointly optimized to maximize the harvested power of energy harvesting receiver (EHR). Due to the non-convexity of optimization problem and coupled optimization variables, firstly, we convert the optimization problem into a semidefinite relaxation (SDR) problem and a sub-optimal solution is achieved. To reduce the high-complexity of the proposed SDR method, a low-complexity successive convex approximation (SCA) technique is proposed. Simulation results show the power harvested by the proposed SDR and SCA methods approximately double that of the existing method without IRS given the same SR. In particular, the proposed SCA achieves almost the same performance as the proposed SDR but with a much lower complexity.
Submitted 3 November, 2019;
originally announced November 2019.
-
Hierarchical Feature-Aware Tracking
Authors:
Wenhua Zhang,
Licheng Jiao,
Jia Liu
Abstract:
In this paper, we propose a hierarchical feature-aware tracking framework for efficient visual tracking. In recent years, ensembled trackers which combine multiple component trackers have achieved impressive performance. In ensembled trackers, the decision of results is usually a post-event process, i.e., the tracking result for each tracker is first obtained and then the suitable one is selected according to result ensemble. In this paper, we propose a pre-event method. We construct an expert pool with each expert being one set of features. For each frame, several experts are first selected from the pool according to their past performance and then they are used to predict the object. The selection rate of each expert in the pool is then updated and the tracking result is obtained according to result ensemble. We propose a novel pre-known expert-adaptive selection strategy. Since the process is more efficient, more experts can be constructed by fusing more types of features, which leads to more robustness. Moreover, with the novel expert selection strategy, overfitting caused by fixed experts for each frame can be mitigated. Experiments on several publicly available datasets demonstrate the superiority of the proposed method and its state-of-the-art performance among ensembled trackers.
Submitted 18 October, 2019; v1 submitted 13 October, 2019;
originally announced October 2019.
-
Physical Layer Key Generation in 5G Wireless Networks
Authors:
Long Jiao,
Ning Wang,
Pu Wang,
Amir Alipour-Fanid,
Jie Tang,
Kai Zeng
Abstract:
The bloom of the fifth generation (5G) communication and beyond serves as a catalyst for physical layer key generation techniques. In 5G communications systems, many challenges in traditional physical layer key generation schemes, such as co-located eavesdroppers, the high bit disagreement ratio, and high temporal correlation, could be overcome. This paper lists the key-enabler techniques in 5G wireless networks, which offer opportunities to address existing issues in physical layer key generation. We survey the existing key generation methods and introduce possible solutions for the existing issues. The new solutions include applying the high signal directionality in beamforming to resist co-located eavesdroppers, utilizing the sparsity of millimeter wave (mmWave) channel to achieve a low bit disagreement ratio under low signal-to-noise-ratio (SNR), and exploiting hybrid precoding to reduce the temporal correlation among measured samples. Finally, the future trends of physical layer key generation in 5G and beyond communications are discussed.
Submitted 27 August, 2019;
originally announced August 2019.
-
Semi-supervised Complex-valued GAN for Polarimetric SAR Image Classification
Authors:
Qigong Sun,
Xiufang Li,
Lingling Li,
Xu Liu,
Fang Liu,
Licheng Jiao
Abstract:
Polarimetric synthetic aperture radar (PolSAR) images are widely used in applications such as disaster detection and military reconnaissance. However, their interpretation faces some challenges, e.g., deficiency of labeled data and inadequate utilization of data information. In this paper, a complex-valued generative adversarial network (GAN) is proposed for the first time to address these issues. The complex-valued form of the model complies with the physical mechanism of PolSAR data and favors utilizing and retaining the amplitude and phase information of PolSAR data. The GAN architecture and semi-supervised learning are combined to handle the deficiency of labeled data. The GAN expands the training data, and semi-supervised learning is used to train the network with generated, labeled and unlabeled data. Experimental results on two benchmark data sets show that our model outperforms existing state-of-the-art models, especially for conditions with fewer labeled data.
Submitted 9 June, 2019;
originally announced June 2019.
-
Complex Scene Classification of PolSAR Imagery based on a Self-paced Learning Approach
Authors:
Wenshuai Chen,
Shuiping Gou,
Xinlin Wang,
Licheng Jiao,
Changzhe Jiao,
Alina Zare
Abstract:
Existing polarimetric synthetic aperture radar (PolSAR) image classification methods cannot achieve satisfactory performance on complex scenes characterized by several types of land cover with significant levels of noise or similar scattering properties across land cover types. Hence, we propose a supervised classification method aimed at constructing a classifier based on self-paced learning (SPL). SPL has been demonstrated to be effective at dealing with complex data while providing classifier. In this paper, a novel Support Vector Machine (SVM) algorithm based on SPL with neighborhood constraints (SVM_SPLNC) is proposed. The proposed method leverages the easiest samples first to obtain an initial parameter vector. Then, more complex samples are gradually incorporated to update the parameter vector iteratively. Moreover, neighborhood constraints are introduced during the training process to further improve performance. Experimental results on three real PolSAR images show that the proposed method performs well on complex scenes.
Submitted 17 March, 2019;
originally announced March 2019.
-
Learning Automata Based Q-learning for Content Placement in Cooperative Caching
Authors:
Zhong Yang,
Yuanwei Liu,
Yue Chen,
Lei Jiao
Abstract:
An optimization problem of content placement in cooperative caching is formulated, with the aim of maximizing the sum mean opinion score (MOS) of mobile users. Firstly, a supervised feed-forward back-propagation connectionist model based neural network (SFBC-NN) is invoked for user mobility and content popularity prediction. More particularly, practical data collected from a GPS-tracker app on smartphones is used to test the accuracy of mobility prediction. Then, a learning automata-based Q-learning (LAQL) algorithm for cooperative caching is proposed, in which learning automata (LA) is invoked for Q-learning to obtain an optimal action selection in a random and stationary environment. It is proven that the LA-based action selection scheme is capable of enabling every state to select the optimal action with arbitrarily high probability if Q-learning is able to converge to the optimal Q value eventually. To characterize the performance of the proposed algorithms, the sum MOS of users is applied to define the reward function. Extensive simulations reveal that: 1) the prediction error of SFBC-NN lessens with the increase of iterations and nodes; 2) the proposed LAQL achieves significant performance improvement against traditional Q-learning; 3) the cooperative caching scheme is capable of outperforming non-cooperative caching and random caching by 3% and 4%.
Submitted 30 March, 2019; v1 submitted 14 March, 2019;
originally announced March 2019.
-
A Collaborative Multi-agent Reinforcement Learning Anti-jamming Algorithm in Wireless Networks
Authors:
Fuqiang Yao,
Luliang Jia
Abstract:
In this letter, we investigate the anti-jamming defense problem in multi-user scenarios, where the coordination among users is taken into consideration. The Markov game framework is employed to model and analyze the anti-jamming defense problem, and a collaborative multi-agent anti-jamming algorithm (CMAA) is proposed to obtain the optimal anti-jamming strategy. In sweep jamming scenarios, on the one hand, the proposed CMAA can tackle the external malicious jamming. On the other hand, it can effectively cope with the mutual interference among users. Simulation results show that the proposed CMAA is superior to both sensing based method and independent Q-learning method, and has the highest normalized rate.
Submitted 12 September, 2018;
originally announced September 2018.
-
Power Allocation Strategies for Secure Spatial Modulation
Authors:
Guiyang Xia,
Linqiong Jia,
Yuwen Qian,
Feng Shu,
Zhihong Zhuang,
Jiangzhou Wang
Abstract:
In secure spatial modulation (SM) networks, power allocation (PA) strategies are investigated in this paper under the total power constraint. Considering that there is no closed-form expression for secrecy rate (SR), an approximate closed-form expression of SR is presented, which is used as an efficient metric to optimize PA factor and can greatly reduce the computation complexity. Based on this expression, a convex optimization (CO) method of maximizing SR (Max-SR) is proposed accordingly. Furthermore, a method of maximizing the product of signal-to-leakage and noise ratio (SLNR) and artificial noise-to-leakage-and noise ratio (ANLNR) (Max-P-SAN) is proposed to provide an analytic solution to PA with extremely low-complexity. Simulation results demonstrate that the SR performance of the proposed CO method is close to that of the optimal PA strategy of Max-SR with exhaustive search and better than that of Max-P-SAN in the high signal-to-noise ratio (SNR) region. However, in the low and medium SNR regions, the SR performance of the proposed Max-P-SAN slightly exceeds that of the proposed CO.
Submitted 1 August, 2018;
originally announced August 2018.
-
Structure-Based Self-Triggered Consensus in Networks of Multiagents with Switching Topologies
Authors:
Bo Liu,
Wenlian Lu,
Licheng Jiao,
Tianping Chen
Abstract:
In this paper, we propose a new self-triggered consensus algorithm in networks of multi-agents. Different from existing works, which are based on the observation of states, here, each agent determines its next update time based on its coupling structure. Both centralized and distributed approaches of the algorithm are discussed. By transforming the algorithm into a proper discrete-time system without self delays, we establish a new analysis framework to prove the convergence of the algorithm. Then we extend the algorithm to networks with switching topologies, especially stochastically switching topologies. Compared to existing works, our algorithm is easier to understand and implement. It explicitly provides positive lower and upper bounds for the update time interval of each agent based on its coupling structure, which can also be independently adjusted by each agent according to its own situation. Our work reveals that the event/self triggered algorithms are essentially discrete and more suitable to a discrete analysis framework. Numerical simulations are also provided to illustrate the theoretical results.
Submitted 29 January, 2015;
originally announced January 2015.
-
A nonlinear tracking algorithm with range-rate measurements based on unbiased measurement conversion
Authors:
Lianmeng Jiao,
Quan Pan,
Yan Liang,
Feng Yang
Abstract:
The three-dimensional CMKF-U with only position measurements is extended to solve the nonlinear tracking problem with range-rate measurements in this paper. A pseudo measurement is constructed by the product of range and range-rate measurements to reduce the high nonlinearity of the range-rate measurements with respect to the target state; then the mean and covariance of the converted measurement errors are derived by the measurement conditioned method, showing better consistency than the transitional nested conditioning method; finally, the sequential filter was used to process the converted position and range-rate measurements sequentially to reduce the approximation error in the second-order EKF. Monte Carlo simulations show that the performance of the new tracking algorithm is better than the traditional one based on CMKF-D.
Submitted 12 December, 2014;
originally announced December 2014.