-
FDG-Diff: Frequency-Domain-Guided Diffusion Framework for Compressed Hazy Image Restoration
Authors:
Ruicheng Zhang,
Kanghui Tian,
Zeyu Zhang,
Qixiang Liu,
Zhi Jin
Abstract:
In this study, we reveal that the interaction between haze degradation and JPEG compression introduces complex joint loss effects, which significantly complicate image restoration. Existing dehazing models often neglect compression effects, which limits their effectiveness in practical applications. To address these challenges, we introduce three key contributions. First, we design FDG-Diff, a novel frequency-domain-guided dehazing framework that improves JPEG image restoration by leveraging frequency-domain information. Second, we introduce the High-Frequency Compensation Module (HFCM), which enhances spatial-domain detail restoration by incorporating frequency-domain augmentation techniques into a diffusion-based restoration framework. Lastly, the Degradation-Aware Denoising Timestep Predictor (DADTP) module further enhances restoration quality by enabling adaptive region-specific restoration, effectively addressing regional degradation inconsistencies in compressed hazy images. Experimental results across multiple compressed dehazing datasets demonstrate that our method consistently outperforms the latest state-of-the-art approaches. Code is available at https://github.com/SYSUzrc/FDG-Diff.
Submitted 22 January, 2025;
originally announced January 2025.
-
Scattering Environment Aware Joint Multi-user Channel Estimation and Localization with Spatially Reused Pilots
Authors:
Kaiyuan Tian,
Yani Chi,
Yufan Zhou,
An Liu
Abstract:
The increasing number of users leads to an increase in pilot overhead, and the limited pilot resources make it challenging to support all users using orthogonal pilots. By fully capturing the inherent physical characteristics of the multi-user (MU) environment, it is possible to reduce pilot costs and improve the channel estimation performance. In reality, users nearby may share the same scatterer, while users further apart tend to have orthogonal channels. This paper proposes a two-timescale approach for joint MU uplink channel estimation and localization in MIMO-OFDM systems, which fully captures the spatial characteristics of MUs. To accurately represent the structure of the MU channel, the channel is modeled in the 3-D location domain. In the long-timescale phase, the time-space-time multiple signal classification (TST-MUSIC) algorithm initially offers a rough approximation of scatterer positions for each user, which is subsequently refined through the scatterer association algorithm based on density-based spatial clustering of applications with noise (DBSCAN) algorithm. The BS then utilizes this prior information to apply a graph-coloring-based user grouping algorithm, enabling spatial division multiplexing of pilots and reducing pilot overhead. In the short timescale phase, a low-complexity scattering environment aware location-domain turbo channel estimation (SEA-LD-TurboCE) algorithm is introduced to merge the overlapping scatterer information from MUs, facilitating high-precision joint MU channel estimation and localization under spatially reused pilots. Simulation results verify the superior channel estimation and localization performance of our proposed scheme over the baselines.
Submitted 4 January, 2025;
originally announced January 2025.
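The graph-coloring-based user grouping mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not describe the authors' actual algorithm, so this uses a plain greedy coloring as a stand-in; the conflict graph (users that share a scatterer must not reuse the same pilot), the function name `group_users`, and the user labels are all hypothetical.

```python
def group_users(conflicts):
    """Greedy graph coloring.

    conflicts maps each user to the set of users it must NOT share a pilot
    with (assumed here: users whose channels share a scatterer).
    Returns a dict user -> pilot group index.
    """
    group_of = {}
    # Visit users with the most conflicts first (a common greedy heuristic);
    # ties are broken by name so the result is deterministic.
    for user in sorted(conflicts, key=lambda u: (-len(conflicts[u]), u)):
        taken = {group_of[v] for v in conflicts[user] if v in group_of}
        group = 0
        while group in taken:
            group += 1
        group_of[user] = group
    return group_of


# Hypothetical example: u1-u2 and u2-u3 share scatterers, u4 is isolated.
conflicts = {
    "u1": {"u2"},
    "u2": {"u1", "u3"},
    "u3": {"u2"},
    "u4": set(),
}
groups = group_users(conflicts)
num_pilots = len(set(groups.values()))
```

In this toy case four users are served with two pilot groups instead of four orthogonal pilots, which is the kind of overhead reduction the abstract refers to. Greedy coloring does not guarantee the minimum number of groups in general.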
-
Deep unrolled primal dual network for TOF-PET list-mode image reconstruction
Authors:
Rui Hu,
Chenxu Li,
Kun Tian,
Jianan Cui,
Yunmei Chen,
Huafeng Liu
Abstract:
Time-of-flight (TOF) information provides more accurate location data for annihilation photons, thereby enhancing the quality of PET reconstruction images and reducing noise. List-mode reconstruction has a significant advantage in handling TOF information. However, current advanced TOF PET list-mode reconstruction algorithms still require improvements when dealing with low-count data. Deep learning algorithms have shown promising results in PET image reconstruction. Nevertheless, the incorporation of TOF information poses significant challenges related to the storage space required by deep learning methods, particularly for the advanced deep unrolled methods. In this study, we propose a deep unrolled primal dual network for TOF-PET list-mode reconstruction. The network is unrolled into multiple phases, with each phase comprising a dual network for list-mode domain updates and a primal network for image domain updates. We utilize CUDA for parallel acceleration and computation of the system matrix for TOF list-mode data, and we adopt a dynamic access strategy to mitigate memory consumption. Reconstructed images of different TOF resolutions and different count levels show that the proposed method outperforms the LM-OSEM, LM-EMTV, LM-SPDHG, LM-SPDHG-TV, and FastPET methods in both visual and quantitative analysis. These results demonstrate the potential application of deep unrolled methods for TOF-PET list-mode data and show better performance than current mainstream TOF-PET list-mode reconstruction algorithms, providing new insights for the application of deep learning methods in TOF list-mode data. The codes for this work are available at https://github.com/RickHH/LMPDnet
Submitted 14 October, 2024;
originally announced October 2024.
-
Physically Analyzable AI-Based Nonlinear Platoon Dynamics Modeling During Traffic Oscillation: A Koopman Approach
Authors:
Kexin Tian,
Haotian Shi,
Yang Zhou,
Sixu Li
Abstract:
Given the complexity and nonlinearity inherent in traffic dynamics within vehicular platoons, there is a critical need for a modeling methodology that is both highly accurate and physically analyzable. Currently, there are two predominant approaches: the physics model-based approach and the Artificial Intelligence (AI)-based approach. Because physics-based models often lack sufficient modeling accuracy and can suffer from function mismatches, while purely AI-based methods lack analyzability, this paper proposes an AI-based Koopman approach that models the unknown nonlinear platoon dynamics by harnessing the power of AI while maintaining physical analyzability, with a particular focus on periods of traffic oscillation. Specifically, this research first employs a deep learning framework to generate the embedding function that lifts the original space into the embedding space. Given a sufficiently descriptive embedding space, the platoon dynamics can be expressed as a linear dynamical system grounded in Koopman theory. On that basis, routine linear dynamical system analysis can be conducted on the learned linear traffic dynamics in the embedding space. In this way, the physical interpretability and analyzability of model-based methods can be combined with the higher precision of data-driven approaches. Comparative experiments with existing modeling approaches suggest our method's superiority in accuracy. Additionally, a phase plane analysis further demonstrates our approach's effectiveness in replicating the complex dynamic patterns. Moreover, the proposed methodology is proven capable of analyzing stability, attesting to its physical analyzability.
Submitted 20 June, 2024;
originally announced June 2024.
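As a reading aid for the abstract above, the lifting idea can be written out in generic Koopman notation. The symbols below ($x_k$ for the platoon state, $g$ for the learned embedding, $K$ for the linear operator) and the discrete-time, input-free form are assumptions for illustration; the abstract does not give the paper's own formulation.

```latex
\begin{aligned}
z_k &= g(x_k)
  && \text{(lift the original state into the embedding space)} \\
z_{k+1} &\approx K z_k
  && \text{(linear dynamics in the embedding space)} \\
\rho(K) &= \max_i \lvert \lambda_i(K) \rvert < 1
  \;\Longrightarrow\; z_k \to 0 \ \text{as}\ k \to \infty
  && \text{(standard stability test for a discrete-time linear system)}
\end{aligned}
```

The last line is the textbook criterion that makes a learned linear model analyzable: stability reduces to checking the eigenvalues $\lambda_i(K)$.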
-
A Fast Power Spectrum Sensing Solution for Generalized Coprime Sampling
Authors:
Kaili Jiang,
Dechang Wang,
Kailun Tian,
Hancong Feng,
Yuxin Zhao,
Junyu Yuan,
Bin Tang
Abstract:
With the growing scarcity of spectrum resources, wideband spectrum sensing is required to process a prohibitive volume of data at a high sampling rate. For some applications, spectrum estimation only requires second-order statistics. In this case, a fast power spectrum sensing solution is proposed based on generalized coprime sampling. By exploring the inherent structure of the sensing vector, the autocorrelation sequence of inputs can be reconstructed from sub-Nyquist samples by only utilizing the parallel Fourier transform and simple multiplication operations. Thus, it takes less time than the state-of-the-art methods while maintaining the same performance, and it achieves higher performance than the existing methods within the same execution time, without the need for pre-estimating the number of inputs. Furthermore, model mismatch has only a minor impact on the estimation performance, which allows for more efficient use of the spectrum resource in a distributed swarm scenario. Simulation results demonstrate the low complexity in sampling and computation, making it a more practical solution for real-time and distributed wideband spectrum sensing applications.
Submitted 22 November, 2023;
originally announced November 2023.
-
Social Interaction-Aware Dynamical Models and Decision Making for Autonomous Vehicles
Authors:
Luca Crosato,
Kai Tian,
Hubert P. H Shum,
Edmond S. L. Ho,
Yafei Wang,
Chongfeng Wei
Abstract:
Interaction-aware Autonomous Driving (IAAD) is a rapidly growing field of research that focuses on the development of autonomous vehicles (AVs) that are capable of interacting safely and efficiently with human road users. This is a challenging task, as it requires the autonomous vehicle to be able to understand and predict the behaviour of human road users. This literature review surveys the current state of IAAD research. Commencing with an examination of terminology, attention is drawn to challenges and existing models employed for modelling the behaviour of drivers and pedestrians. Next, a comprehensive review is conducted on various techniques proposed for interaction modelling, encompassing cognitive methods, machine learning approaches, and game-theoretic methods. The review concludes with a discussion of the potential advantages and risks associated with IAAD, and highlights pivotal research questions that require future exploration.
Submitted 30 October, 2023; v1 submitted 28 October, 2023;
originally announced October 2023.
-
Towards Real-Time Neural Video Codec for Cross-Platform Application Using Calibration Information
Authors:
Kuan Tian,
Yonghang Guan,
Jinxi Xiang,
Jun Zhang,
Xiao Han,
Wei Yang
Abstract:
The state-of-the-art neural video codecs have outperformed the most sophisticated traditional codecs in terms of RD performance in certain cases. However, utilizing them for practical applications is still challenging for two major reasons. 1) Cross-platform computational errors resulting from floating point operations can lead to inaccurate decoding of the bitstream. 2) The high computational complexity of the encoding and decoding process poses a challenge in achieving real-time performance. In this paper, we propose a real-time cross-platform neural video codec, which is capable of efficiently decoding 720P video bitstreams from other encoding platforms on a consumer-grade GPU. First, to solve the problem of codec inconsistency caused by the uncertainty of floating point calculations across platforms, we design a calibration transmitting system to guarantee the consistent quantization of entropy parameters between the encoding and decoding stages. The parameters that may have transboundary quantization between encoding and decoding are identified in the encoding stage, and their coordinates are delivered by an auxiliary transmitted bitstream. By doing so, these inconsistent parameters can be processed properly in the decoding stage. Furthermore, to reduce the bitrate of the auxiliary bitstream, we rectify the distribution of entropy parameters using a piecewise Gaussian constraint. Second, to match the computational limitations on the decoding side for a real-time video codec, we design a lightweight model. A series of efficiency techniques enable our model to achieve 25 FPS decoding speed on an NVIDIA RTX 2080 GPU. Experimental results demonstrate that our model can achieve real-time decoding of 720P videos while encoding on another platform. Furthermore, the real-time model brings up to a 24.2% BD-rate improvement from the perspective of PSNR with the anchor H.265.
Submitted 20 September, 2023;
originally announced September 2023.
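The calibration idea in the abstract above (flagging, at encode time, the parameters whose quantization could land on different sides of a boundary on another platform) can be sketched roughly as follows. This is only an illustration under stated assumptions: round-to-nearest quantization, a fixed safety margin, and a flat list of parameters. The function name `find_boundary_coords` and the `margin` value are hypothetical and not taken from the paper.

```python
import math


def find_boundary_coords(params, margin=1e-3):
    """Return indices of parameters that sit within `margin` of a rounding
    boundary (x.5), i.e. values that a small cross-platform floating point
    difference could quantize to a different integer."""
    risky = []
    for index, value in enumerate(params):
        fractional = value - math.floor(value)
        if abs(fractional - 0.5) < margin:
            risky.append(index)
    return risky


# Hypothetical entropy parameters computed on the encoder side.
params = [1.25, 2.4996, 3.5004, 4.75]
coords = find_boundary_coords(params)  # coordinates to send as side information
```

Only the flagged coordinates would need to travel in an auxiliary bitstream; parameters far from a boundary quantize identically despite small numerical differences.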
-
Wideband Spectrum Acquisition for UAV Swarm Using the Sparse Coding Fourier Transform
Authors:
Kaili Jiang,
Kailun Tian,
Hancong Feng,
Junyu Yuan,
Bin Tang
Abstract:
As the trend towards small, safe, smart, speedy and swarm development grows, unmanned aerial vehicles (UAVs) are becoming increasingly popular for a wide range of applications. In this letter, the challenge of wideband spectrum acquisition for UAV swarms is studied by proposing a processing method that features lower power consumption, higher compression rates, and a lower signal-to-noise ratio. Our system is equipped with multiple UAVs, each with a different sub-sampling rate. That allows for frequency bucketization and estimation based on sparse Fourier transform theory. Unlike other techniques, the collisions and iterations caused by non-sparse environments are considered. We introduce the sparse coding Fourier transform to address these issues. The key is to code the entire spectrum and decode it through spectrum correlation in the code. Simulation results show that our proposed method performs well in acquiring both narrowband and wideband signals simultaneously, compared to the other methods.
Submitted 14 August, 2023;
originally announced August 2023.
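The frequency bucketization mentioned in the abstract above rests on a standard aliasing property: subsampling a length-N signal by a factor p folds frequency bin k into bucket k mod (N/p). The sketch below demonstrates only that generic property with two different sub-sampling rates, not the paper's sparse coding Fourier transform; the signal length, tone index, and function names are hypothetical.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform (fine for tiny inputs)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def peak_bucket(signal, step):
    """Subsample by `step` and return the index of the strongest bucket."""
    spectrum = dft(signal[::step])
    return max(range(len(spectrum)), key=lambda f: abs(spectrum[f]))


N = 12  # full-rate signal length
k = 7   # true frequency bin of a single tone
tone = [cmath.exp(2j * cmath.pi * k * t / N) for t in range(N)]

bucket_of_3 = peak_bucket(tone, 4)  # step 4 leaves 3 buckets: 7 mod 3
bucket_of_4 = peak_bucket(tone, 3)  # step 3 leaves 4 buckets: 7 mod 4

# With coprime bucket counts (3 and 4), the two residues identify the bin.
recovered = [
    f for f in range(N) if f % 3 == bucket_of_3 and f % 4 == bucket_of_4
]
```

With several tones, two frequencies can fall into the same bucket (a collision), which is the situation the abstract says its coding scheme is designed to handle.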
-
Distributed UAV Swarm Augmented Wideband Spectrum Sensing Using Nyquist Folding Receiver
Authors:
Kaili Jiang,
Kailun Tian,
Hancong Feng,
Yuxin Zhao,
Dechang Wang,
Sen Cao,
Jian Gao,
Xuying Zhang,
Yanfei Li,
Junyu Yuan,
Ying Xiong,
Bin Tang
Abstract:
Distributed unmanned aerial vehicle (UAV) swarms are formed by multiple UAVs with increased portability, higher levels of sensing capabilities, and more powerful autonomy. These features make them attractive for many recent applications, potentially increasing the shortage of spectrum resources. In this paper, wideband spectrum sensing augmented technology is discussed for distributed UAV swarms to improve the utilization of spectrum. However, the sub-Nyquist sampling applied in existing schemes has high hardware complexity, power consumption, and low recovery efficiency for non-strictly sparse conditions. Thus, the Nyquist folding receiver (NYFR) is considered for the distributed UAV swarms, which can theoretically achieve full-band spectrum detection and reception using a single analog-to-digital converter (ADC) at low speed for all circuit components. There is a focus on the sensing model of two multichannel scenarios for the distributed UAV swarms, one with a complete functional receiver for the UAV swarm with RIS, and another with a decentralized UAV swarm equipped with a complete functional receiver for each UAV element. The key issue is to consider whether the application of RIS technology will bring advantages to spectrum sensing and the data fusion problem of decentralized UAV swarms based on the NYFR architecture. Therefore, the property for multiple pulse reconstruction is analyzed through the Gershgorin circle theorem, especially for very short pulses. Further, the block sparse recovery property is analyzed for wide bandwidth signals. The proposed technology can improve the processing capability for multiple signals and wide bandwidth signals while reducing interference from folded noise and subsampled harmonics. Experiment results show augmented spectrum sensing efficiency under non-strictly sparse conditions.
Submitted 14 August, 2023;
originally announced August 2023.
-
Wideband Power Spectrum Sensing: a Fast Practical Solution for Nyquist Folding Receiver
Authors:
Kaili Jiang,
Dechang Wang,
Kailun Tian,
Hancong Feng,
Yuxin Zhao,
Sen Cao,
Jian Gao,
Xuying Zhang,
Yanfei Li,
Junyu Yuan,
Ying Xiong,
Bin Tang
Abstract:
The limited availability of spectrum resources has been growing into a critical problem in wireless communications, remote sensing, and electronic surveillance. To address the high-speed sampling bottleneck of wideband spectrum sensing, a fast and practical solution of power spectrum estimation for the Nyquist folding receiver (NYFR) is proposed in this paper. The NYFR architecture can theoretically achieve full-band signal sensing with a one hundred percent probability of intercept. However, the existing algorithm is difficult to realize in real time due to its high complexity and complicated calculations. By exploring the sub-sampling principle inherent in NYFR, a computationally efficient method is introduced with compressive covariance sensing. It can be efficiently implemented via only the non-uniform fast Fourier transform, the fast Fourier transform, and some simple multiplication operations. Meanwhile, the state-of-the-art power spectrum reconstruction model for NYFR in the time domain and frequency domain is constructed in this paper as a comparison. Furthermore, the computational complexity of the proposed method scales linearly with the number of Nyquist-rate samples and the sparsity of spectrum occupancy. Simulation results and discussion demonstrate that the low complexity in sampling and computation makes it a more practical solution for real-time wideband spectrum sensing applications.
Submitted 14 August, 2023;
originally announced August 2023.
-
Hybrid Offline-Online Design for Reconfigurable Intelligent Surface Aided UAV Communication
Authors:
Kaiyuan Tian,
Bin Duo,
Xiaojun Yuan,
Wu Luo
Abstract:
This letter considers reconfigurable intelligent surface (RIS)-aided unmanned aerial vehicle (UAV) communication systems in urban areas under the general Rician fading channel. A hybrid offline-online design is proposed to improve the system performance by leveraging both the statistical channel state information (S-CSI) and instantaneous channel state information (I-CSI). For the offline phase, we aim to maximize the expected average achievable rate based on the S-CSI by jointly optimizing the RIS's phase-shift and UAV trajectory. The formulated stochastic optimization problem is difficult to solve due to its non-convexity. To tackle this problem, we propose an efficient algorithm by leveraging the stochastic successive convex approximation (SSCA) techniques. For the online phase, the UAV adaptively adjusts the transmit beamforming and user scheduling according to the effective I-CSI. Numerical results verify that the proposed hybrid design performs better than various benchmark schemes, and also demonstrate a favorable trade-off between system performance and CSI overhead.
Submitted 27 May, 2022;
originally announced May 2022.
-
Unsupervised heart abnormality detection based on phonocardiogram analysis with Beta Variational Auto-Encoders
Authors:
Shengchen Li,
Ke Tian,
Rui Wang
Abstract:
Heart sound (also known as phonocardiogram (PCG)) analysis is a popular way to detect cardiovascular diseases (CVDs). Most PCG analysis is supervised, which demands both normal and abnormal samples. This paper proposes a method of unsupervised PCG analysis that uses a beta variational auto-encoder ($β$-VAE) to model normal PCG signals. The best-performing model reaches an AUC (Area Under Curve) value of 0.91 in the ROC (Receiver Operating Characteristic) test for PCG signals collected from the same source. Unlike the majority of $β$-VAEs, which are used as generative models, the best-performing $β$-VAE has a $β$ value smaller than 1. Further experiments then find that introducing a lightly weighted KL divergence between the distribution of the latent space and the normal distribution improves the performance of anomalous PCG detection based on anomaly scores derived from the reconstruction loss. This suggests that an anomaly score based on reconstruction loss may be better than anomaly scores based on the latent vectors of samples.
Submitted 13 January, 2021;
originally announced January 2021.
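To make the evaluation metric in the abstract above concrete, here is a minimal sketch of how an AUC value can be computed from anomaly scores such as reconstruction losses. It uses the standard pairwise (Mann-Whitney) definition of ROC AUC; the scores are made-up numbers, not results from the paper, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(scores_normal, scores_abnormal):
    """ROC AUC as the probability that a random abnormal sample receives a
    higher anomaly score than a random normal one (ties count as half)."""
    wins = 0.0
    for abnormal in scores_abnormal:
        for normal in scores_normal:
            if abnormal > normal:
                wins += 1.0
            elif abnormal == normal:
                wins += 0.5
    return wins / (len(scores_normal) * len(scores_abnormal))


# Hypothetical reconstruction losses from a model trained on normal data only.
normal_losses = [0.10, 0.20, 0.35]
abnormal_losses = [0.30, 0.60, 0.90]
auc = roc_auc(normal_losses, abnormal_losses)
```

An AUC of 1.0 means every abnormal recording scores above every normal one, while 0.5 corresponds to chance. Because the model is fitted only to normal signals, labels are needed only at evaluation time, which is what makes the approach unsupervised.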
-
Microscope Based HER2 Scoring System
Authors:
Jun Zhang,
Kuan Tian,
Pei Dong,
Haocheng Shen,
Kezhou Yan,
Jianhua Yao,
Junzhou Huang,
Xiao Han
Abstract:
The overexpression of human epidermal growth factor receptor 2 (HER2) has been established as a therapeutic target in multiple types of cancers, such as breast and gastric cancers. Immunohistochemistry (IHC) is employed as a basic HER2 test to identify the HER2-positive, borderline, and HER2-negative patients. However, the reliability and accuracy of HER2 scoring are affected by many factors, such as pathologists' experience. Recently, artificial intelligence (AI) has been used in the diagnosis of various diseases to improve diagnostic accuracy and reliability, but the interpretation of diagnosis results is still an open problem. In this paper, we propose a real-time HER2 scoring system, which follows the HER2 scoring guidelines to complete the diagnosis, and thus each step is explainable. Unlike the previous scoring systems based on whole-slide imaging, our HER2 scoring system is integrated into an augmented reality (AR) microscope that can feed AI results back to the pathologists while they read the slide. The pathologists can help select informative fields of view (FOVs), avoiding confounding regions such as DCIS. Importantly, we illustrate the intermediate results with membrane staining condition and cell classification results, making it possible to evaluate the reliability of the diagnostic results. Also, we support interactive modification of the selected regions of interest, making our system more flexible in clinical practice. The collaboration of AI and pathologists can significantly improve the robustness of our system. We evaluate our system with 285 breast IHC HER2 slides, and the classification accuracy of 95% shows the effectiveness of our HER2 scoring system.
Submitted 14 September, 2020;
originally announced September 2020.
-
A 6G White Paper on Connectivity for Remote Areas
Authors:
Harri Saarnisaari,
Sudhir Dixit,
Mohamed-Slim Alouini,
Abdelaali Chaoub,
Marco Giordani,
Adrian Kliks,
Marja Matinmikko-Blue,
Nan Zhang,
Anuj Agrawal,
Mats Andersson,
Vimal Bhatia,
Wei Cao,
Yunfei Chen,
Wei Feng,
Marjo Heikkilä,
Josep M. Jornet,
Luciano Mendes,
Heikki Karvonen,
Brejesh Lall,
Matti Latva-aho,
Xiangling Li,
Kalle Lähetkangas,
Moshe T. Masonta,
Alok Pandey,
Pekka Pirinen
, et al. (9 additional authors not shown)
Abstract:
In many places all over the world, rural and remote areas lack proper connectivity, which has led to an increasing digital divide. These areas might have low population density, low incomes, etc., making them less attractive places to invest in and operate connectivity networks. 6G could be the first mobile radio generation truly aiming to close the digital divide. However, in order to do so, special requirements and challenges have to be considered from the beginning of the design process. The aim of this white paper is to discuss requirements and challenges and point out related, identified research topics that have to be solved in 6G. This white paper first provides a generic discussion, shows some facts and discusses targets set in international bodies related to rural and remote connectivity and the digital divide. Then the paper digs into technical details, i.e., into a solutions space. Each technical section ends with a discussion and then highlights identified 6G challenges and research ideas as a list.
Submitted 30 April, 2020;
originally announced April 2020.
-
Non-Local ConvLSTM for Video Compression Artifact Reduction
Authors:
Yi Xu,
Longwen Gao,
Kai Tian,
Shuigeng Zhou,
Huyang Sun
Abstract:
Video compression artifact reduction aims to recover high-quality videos from low-quality compressed videos. Most existing approaches use a single neighboring frame or a pair of neighboring frames (preceding and/or following the target frame) for this task. Furthermore, as frames of high quality overall may contain low-quality patches, and high-quality patches may exist in frames of low quality overall, current methods focusing on nearby peak-quality frames (PQFs) may miss high-quality details in low-quality frames. To remedy these shortcomings, in this paper we propose a novel end-to-end deep neural network called non-local ConvLSTM (NL-ConvLSTM in short) that exploits multiple consecutive frames. An approximate non-local strategy is introduced in NL-ConvLSTM to capture global motion patterns and trace the spatiotemporal dependency in a video sequence. This approximate strategy makes the non-local module work in a fast and low space-cost way. Our method uses the preceding and following frames of the target frame to generate a residual, from which a higher quality frame is reconstructed. Experiments on two datasets show that NL-ConvLSTM outperforms the existing methods.
Submitted 27 October, 2019;
originally announced October 2019.