-
Fast Online Video Super-Resolution with Deformable Attention Pyramid
Authors:
Dario Fuoli,
Martin Danelljan,
Radu Timofte,
Luc Van Gool
Abstract:
Video super-resolution (VSR) has many applications that pose strict causal, real-time, and latency constraints, including video streaming and TV. We address the VSR problem under these settings, which poses additional important challenges since information from future frames is unavailable. Importantly, designing efficient, yet effective frame alignment and fusion modules remain central problems.…
▽ More
Video super-resolution (VSR) has many applications that pose strict causal, real-time, and latency constraints, including video streaming and TV. We address the VSR problem under these settings, which poses additional important challenges since information from future frames is unavailable. Importantly, designing efficient, yet effective frame alignment and fusion modules remain central problems. In this work, we propose a recurrent VSR architecture based on a deformable attention pyramid (DAP). Our DAP aligns and integrates information from the recurrent state into the current frame prediction. To circumvent the computational cost of traditional attention-based methods, we only attend to a limited number of spatial locations, which are dynamically predicted by the DAP. Comprehensive experiments and analysis of the proposed key innovations show the effectiveness of our approach. We significantly reduce processing time and computational complexity in comparison to state-of-the-art methods, while maintaining a high performance. We surpass state-of-the-art method EDVR-M on two standard benchmarks with a speed-up of over $3\times$.
△ Less
Submitted 6 April, 2022; v1 submitted 3 February, 2022;
originally announced February 2022.
-
Fourier Space Losses for Efficient Perceptual Image Super-Resolution
Authors:
Dario Fuoli,
Luc Van Gool,
Radu Timofte
Abstract:
Many super-resolution (SR) models are optimized for high performance only and therefore lack efficiency due to large model complexity. As large models are often not practical in real-world applications, we investigate and propose novel loss functions, to enable SR with high perceptual quality from much more efficient models. The representative power for a given low-complexity generator network can…
▽ More
Many super-resolution (SR) models are optimized for high performance only and therefore lack efficiency due to large model complexity. As large models are often not practical in real-world applications, we investigate and propose novel loss functions, to enable SR with high perceptual quality from much more efficient models. The representative power for a given low-complexity generator network can only be fully leveraged by strong guidance towards the optimal set of parameters. We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions. In particular, we use a Fourier space supervision loss for improved restoration of missing high-frequency (HF) content from the ground truth image and design a discriminator architecture working directly in the Fourier domain to better match the target HF distribution. We show that our losses' direct emphasis on the frequencies in Fourier-space significantly boosts the perceptual image quality, while at the same time retaining high restoration quality in comparison to previously proposed loss functions for this task. The performance is further improved by utilizing a combination of spatial and frequency domain losses, as both representations provide complementary information during training. On top of that, the trained generator achieves comparable results with and is 2.4x and 48x faster than state-of-the-art perceptual SR methods RankSRGAN and SRFlow respectively.
△ Less
Submitted 1 June, 2021;
originally announced June 2021.
-
An Efficient Recurrent Adversarial Framework for Unsupervised Real-Time Video Enhancement
Authors:
Dario Fuoli,
Zhiwu Huang,
Danda Pani Paudel,
Luc Van Gool,
Radu Timofte
Abstract:
Video enhancement is a challenging problem, more than that of stills, mainly due to high computational cost, larger data volumes and the difficulty of achieving consistency in the spatio-temporal domain. In practice, these challenges are often coupled with the lack of example pairs, which inhibits the application of supervised learning strategies. To address these challenges, we propose an efficie…
▽ More
Video enhancement is a challenging problem, more than that of stills, mainly due to high computational cost, larger data volumes and the difficulty of achieving consistency in the spatio-temporal domain. In practice, these challenges are often coupled with the lack of example pairs, which inhibits the application of supervised learning strategies. To address these challenges, we propose an efficient adversarial video enhancement framework that learns directly from unpaired video examples. In particular, our framework introduces new recurrent cells that consist of interleaved local and global modules for implicit integration of spatial and temporal information. The proposed design allows our recurrent cells to efficiently propagate spatio-temporal information across frames and reduces the need for high complexity networks. Our setting enables learning from unpaired videos in a cyclic adversarial manner, where the proposed recurrent units are employed in all architectures. Efficient training is accomplished by introducing one single discriminator that learns the joint distribution of source and target domain simultaneously. The enhancement results demonstrate clear superiority of the proposed video enhancer over the state-of-the-art methods, in all terms of visual quality, quantitative metrics, and inference speed. Notably, our video enhancer is capable of enhancing over 35 frames per second of FullHD video (1080x1920).
△ Less
Submitted 11 December, 2022; v1 submitted 23 December, 2020;
originally announced December 2020.
-
NTIRE 2020 Challenge on Video Quality Mapping: Methods and Results
Authors:
Dario Fuoli,
Zhiwu Huang,
Martin Danelljan,
Radu Timofte,
Hua Wang,
Longcun Jin,
Dewei Su,
Jing Liu,
Jaehoon Lee,
Michal Kudelski,
Lukasz Bala,
Dmitry Hrybov,
Marcin Mozejko,
Muchen Li,
Siyao Li,
Bo Pang,
Cewu Lu,
Chao Li,
Dongliang He,
Fu Li,
Shilei Wen
Abstract:
This paper reviews the NTIRE 2020 challenge on video quality mapping (VQM), which addresses the issues of quality mapping from source video domain to target video domain. The challenge includes both a supervised track (track 1) and a weakly-supervised track (track 2) for two benchmark datasets. In particular, track 1 offers a new Internet video benchmark, requiring algorithms to learn the map from…
▽ More
This paper reviews the NTIRE 2020 challenge on video quality mapping (VQM), which addresses the issues of quality mapping from source video domain to target video domain. The challenge includes both a supervised track (track 1) and a weakly-supervised track (track 2) for two benchmark datasets. In particular, track 1 offers a new Internet video benchmark, requiring algorithms to learn the map from more compressed videos to less compressed videos in a supervised training manner. In track 2, algorithms are required to learn the quality mapping from one device to another when their quality varies substantially and weakly-aligned video pairs are available. For track 1, in total 7 teams competed in the final test phase, demonstrating novel and effective solutions to the problem. For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.
△ Less
Submitted 15 June, 2020; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Efficient Video Super-Resolution through Recurrent Latent Space Propagation
Authors:
Dario Fuoli,
Shuhang Gu,
Radu Timofte
Abstract:
With the recent trend for ultra high definition displays, the demand for high quality and efficient video super-resolution (VSR) has become more important than ever. Previous methods adopt complex motion compensation strategies to exploit temporal information when estimating the missing high frequency details. However, as the motion estimation problem is a highly challenging problem, inaccurate mo…
▽ More
With the recent trend for ultra high definition displays, the demand for high quality and efficient video super-resolution (VSR) has become more important than ever. Previous methods adopt complex motion compensation strategies to exploit temporal information when estimating the missing high frequency details. However, as the motion estimation problem is a highly challenging problem, inaccurate motion compensation may affect the performance of VSR algorithms. Furthermore, the complex motion compensation module may also introduce a heavy computational burden, which limits the application of these methods in real systems. In this paper, we propose an efficient recurrent latent space propagation (RLSP) algorithm for fast VSR. RLSP introduces high-dimensional latent states to propagate temporal information between frames in an implicit manner. Our experimental results show that RLSP is a highly efficient and effective method to deal with the VSR problem. We outperform current state-of-the-art method DUF with over 70x speed-up.
△ Less
Submitted 17 September, 2019;
originally announced September 2019.