Showing 1–50 of 63 results for author: Bull, D

  1. arXiv:2506.14381  [pdf, ps, other]

    eess.IV cs.CV

    Compressed Video Super-Resolution based on Hierarchical Encoding

    Authors: Yuxuan Jiang, Siyue Teng, Qiang Zhu, Chen Feng, Chengxi Zeng, Fan Zhang, Shuyuan Zhu, Bing Zeng, David Bull

    Abstract: This paper presents a general-purpose video super-resolution (VSR) method, dubbed VSR-HE, specifically designed to enhance the perceptual quality of compressed content. Targeting scenarios characterized by heavy compression, the method upscales low-resolution videos by a ratio of four, from 180p to 720p or from 270p to 1080p. VSR-HE adopts hierarchical encoding transformer blocks and has been soph…

    Submitted 17 June, 2025; originally announced June 2025.

  2. arXiv:2504.13131  [pdf, other]

    eess.IV cs.AI cs.CV

    NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results

    Authors: Xin Li, Kun Yuan, Bingchen Li, Fengbin Guan, Yizhen Shao, Zihao Yu, Xijun Wang, Yiting Lu, Wei Luo, Suhang Yao, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Yabin Zhang, Ao-Xiang Zhang, Tianwu Zhi, Jianzhao Liu, Yang Li, Jingwen Xu, Yiting Liao, Yushen Zuo, Mingyang Wu, Renjie Li, Shengyun Zhong, et al. (88 additional authors not shown)

    Abstract: This paper presents a review for the NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement. The challenge comprises two tracks: (i) Efficient Video Quality Assessment (KVQ), and (ii) Diffusion-based Image Super-Resolution (KwaiSR). Track 1 aims to advance the development of lightweight and efficient video quality assessment (VQA) models, with an emphasis on eliminating re…

    Submitted 17 April, 2025; originally announced April 2025.

    Comments: Challenge Report of NTIRE 2025; Methods from 18 Teams; Accepted by CVPR Workshop; 21 pages

  3. arXiv:2504.12169  [pdf, other]

    cs.CV eess.IV

    Towards a General-Purpose Zero-Shot Synthetic Low-Light Image and Video Pipeline

    Authors: Joanne Lin, Crispian Morris, Ruirui Lin, Fan Zhang, David Bull, Nantheera Anantrasirichai

    Abstract: Low-light conditions pose significant challenges for both human and machine annotation. This in turn has led to a lack of research into machine understanding for low-light images and (in particular) videos. A common approach is to apply annotations obtained from high quality datasets to synthetically created low light versions. In addition, these approaches are often limited through the use of unr…

    Submitted 16 April, 2025; originally announced April 2025.

  4. arXiv:2504.10686  [pdf, other]

    cs.CV eess.IV

    The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Hang Guo, Lei Sun, Zongwei Wu, Radu Timofte, Yawei Li, Yao Zhang, Xinning Chai, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Li Song, Hongyuan Yu, Pufan Xu, Cheng Wan, Zhijuan Huang, Peng Guo, Shuyuan Cui, Chenjun Li, Xuehai Hu, Pan Pan, Xin Zhang, Heng Zhang, Qing Luo, Linyan Jiang, et al. (122 additional authors not shown)

    Abstract: This paper presents a comprehensive review of the NTIRE 2025 Challenge on Single-Image Efficient Super-Resolution (ESR). The challenge aimed to advance the development of deep models that optimize key computational metrics, i.e., runtime, parameters, and FLOPs, while achieving a PSNR of at least 26.90 dB on the $\operatorname{DIV2K\_LSDIR\_valid}$ dataset and 26.99 dB on the…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted by CVPR2025 NTIRE Workshop, Efficient Super-Resolution Challenge Report. 50 pages

  5. arXiv:2503.19604  [pdf, other]

    eess.IV cs.CV

    GIViC: Generative Implicit Video Compression

    Authors: Ge Gao, Siyue Teng, Tianhao Peng, Fan Zhang, David Bull

    Abstract: While video compression based on implicit neural representations (INRs) has recently demonstrated great potential, existing INR-based video codecs still cannot achieve state-of-the-art (SOTA) performance compared to their conventional or autoencoder-based counterparts given the same coding configuration. In this context, we propose a Generative Implicit Video Compression framework, GIViC, aiming a…

    Submitted 25 March, 2025; originally announced March 2025.

  6. RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content

    Authors: Yuxuan Jiang, Jakub Nawała, Chen Feng, Fan Zhang, Xiaoqing Zhu, Joel Sole, David Bull

    Abstract: Super-resolution (SR) is a key technique for improving the visual quality of video content by increasing its spatial resolution while reconstructing fine details. SR has been employed in many applications including video streaming, where compressed low-resolution content is typically transmitted to end users and then reconstructed with a higher resolution and enhanced quality. To support real-time…

    Submitted 20 November, 2024; originally announced November 2024.

  7. BVI-CR: A Multi-View Human Dataset for Volumetric Video Compression

    Authors: Ge Gao, Adrian Azzarelli, Ho Man Kwan, Nantheera Anantrasirichai, Fan Zhang, Oliver Moolan-Feroze, David Bull

    Abstract: The advances in immersive technologies and 3D reconstruction have enabled the creation of digital replicas of real-world objects and environments with fine details. These processes generate vast amounts of 3D data, requiring more efficient compression methods to satisfy the memory and bandwidth constraints associated with data storage and transmission. However, the development and validation of ef…

    Submitted 17 November, 2024; originally announced November 2024.

  8. arXiv:2409.07414  [pdf, other]

    cs.CV eess.IV

    NVRC: Neural Video Representation Compression

    Authors: Ho Man Kwan, Ge Gao, Fan Zhang, Andrew Gower, David Bull

    Abstract: Recent advances in implicit neural representation (INR)-based video coding have demonstrated its potential to compete with both conventional and other learning-based approaches. With INR methods, a neural network is trained to overfit a video sequence, with its parameters compressed to obtain a compact representation of the video content. However, although promising results have been achieved, the…

    Submitted 11 September, 2024; originally announced September 2024.

  9. arXiv:2408.07171  [pdf, other]

    eess.IV cs.CV

    BVI-UGC: A Video Quality Database for User-Generated Content Transcoding

    Authors: Zihao Qi, Chen Feng, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

    Abstract: In recent years, user-generated content (UGC) has become one of the major video types consumed via streaming networks. Numerous research contributions have focused on assessing its visual quality through subjective tests and objective modeling. In most cases, objective assessments are based on a no-reference scenario, where the corresponding reference content is assumed not to be available. Howeve…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 12 pages, 11 figures

  10. Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration

    Authors: Siyue Teng, Yuxuan Jiang, Ge Gao, Fan Zhang, Thomas Davis, Zoe Liu, David Bull

    Abstract: Recent advances in video compression have seen significant coding performance improvements with the development of new standards and learning-based video codecs. However, most of these works focus on application scenarios that allow a certain amount of system delay (e.g., Random Access mode in MPEG codecs), which is not always acceptable for live delivery. This paper conducts a comparative study o…

    Submitted 9 August, 2024; originally announced August 2024.

  11. BVI-AOM: A New Training Dataset for Deep Video Compression Optimization

    Authors: Jakub Nawała, Yuxuan Jiang, Fan Zhang, Xiaoqing Zhu, Joel Sole, David Bull

    Abstract: Deep learning is now playing an important role in enhancing the performance of conventional hybrid video codecs. These learning-based methods typically require diverse and representative training material for optimization in order to achieve model generalization and optimal coding performance. However, existing datasets either offer limited content variability or come with restricted licensing ter…

    Submitted 23 October, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

    Comments: 5 pages, 5 figures. Swapped the PSNR-HVS plot in Fig. 3 for a PSNR-YUV plot. Updated Fig. 3 (SI/TI/CF plots) and added the URL to the dataset

  12. arXiv:2407.11496  [pdf, other]

    eess.IV cs.CV cs.MM

    ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment

    Authors: Xinyi Wang, Angeliki Katsenou, David Bull

    Abstract: With the rapid growth of User-Generated Content (UGC) exchanged between users and sharing platforms, the need for video quality assessment in the wild is increasingly evident. UGC is typically acquired using consumer devices and undergoes multiple rounds of compression (transcoding) before reaching the end user. Therefore, traditional quality metrics that employ the original content as a reference…

    Submitted 12 March, 2025; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: 10 pages, 3 figures

  13. MVAD: A Multiple Visual Artifact Detector for Video Streaming

    Authors: Chen Feng, Duolikun Danier, Fan Zhang, Alex Mackin, Andrew Collins, David Bull

    Abstract: Visual artifacts are often introduced into streamed video content, due to prevailing conditions during content production and delivery. Since these can degrade the quality of the user's experience, it is important to automatically and accurately detect them in order to enable effective quality measurement and enhancement. Existing detection methods often focus on a single type of artifact and/or d…

    Submitted 9 December, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

    Comments: Paper has been accepted by WACV 2025

  14. RMT-BVQA: Recurrent Memory Transformer-based Blind Video Quality Assessment for Enhanced Video Content

    Authors: Tianhao Peng, Chen Feng, Duolikun Danier, Fan Zhang, Benoit Vallade, Alex Mackin, David Bull

    Abstract: With recent advances in deep learning, numerous algorithms have been developed to enhance video quality, reduce visual artifacts, and improve perceptual quality. However, little research has been reported on the quality assessment of enhanced content - the evaluation of enhancement methods is often based on quality metrics that were designed for compression applications. In this paper, we propose…

    Submitted 10 October, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by the ECCV 2024 AIM Advances in Image Manipulation workshop

  15. MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution

    Authors: Yuxuan Jiang, Chen Feng, Fan Zhang, David Bull

    Abstract: Knowledge distillation (KD) has emerged as a promising technique in deep learning, typically employed to enhance a compact student network through learning from their high-performance but more complex teacher variant. When applied in the context of image super-resolution, most KD approaches are modified versions of methods developed for other computer vision tasks, which are based on training stra…

    Submitted 15 April, 2024; originally announced April 2024.

  16. arXiv:2403.02408  [pdf, other]

    eess.IV cs.CV

    A Spatio-temporal Aligned SUNet Model for Low-light Video Enhancement

    Authors: Ruirui Lin, Nantheera Anantrasirichai, Alexandra Malyugina, David Bull

    Abstract: Distortions caused by low-light conditions are not only visually unpleasant but also degrade the performance of computer vision tasks. The restoration and enhancement have proven to be highly beneficial. However, there are only a limited number of enhancement methods explicitly designed for videos acquired in low-light conditions. We propose a Spatio-Temporal Aligned SUNet (STA-SUNet) model using…

    Submitted 12 July, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  17. arXiv:2402.19041  [pdf, other]

    cs.CV cs.AI eess.IV

    Atmospheric Turbulence Removal with Video Sequence Deep Visual Priors

    Authors: P. Hill, N. Anantrasirichai, A. Achim, D. R. Bull

    Abstract: Atmospheric turbulence poses a challenge for the interpretation and visual perception of visual imagery due to its distortion effects. Model-based approaches have been used to address this, but such methods often suffer from artefacts associated with moving content. Conversely, deep learning based methods are dependent on large and diverse datasets that may not effectively represent any specific c…

    Submitted 29 February, 2024; originally announced February 2024.

  18. arXiv:2402.07057  [pdf, other]

    eess.IV cs.MM

    Rate-Quality or Energy-Quality Pareto Fronts for Adaptive Video Streaming?

    Authors: Angeliki Katsenou, Xinyi Wang, Daniel Schien, David Bull

    Abstract: Adaptive video streaming is a key enabler for optimising the delivery of offline encoded video content. The research focus to date has been on optimisation, based solely on rate-quality curves. This paper adds an additional dimension, the energy expenditure, and explores construction of bitrate ladders based on decoding energy-quality curves rather than the conventional rate-quality curves. Pareto…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 6, submitted to a conference

  19. Immersive Video Compression using Implicit Neural Representations

    Authors: Ho Man Kwan, Fan Zhang, Andrew Gower, David Bull

    Abstract: Recent work on implicit neural representations (INRs) has evidenced their potential for efficiently representing and encoding conventional video content. In this paper we, for the first time, extend their application to immersive (multi-view) videos, by proposing MV-HiNeRV, a new INR-based immersive video codec. MV-HiNeRV is an enhanced version of a state-of-the-art INR-based video codec, HiNeRV,…

    Submitted 23 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  20. Compressing Deep Image Super-resolution Models

    Authors: Yuxuan Jiang, Jakub Nawala, Fan Zhang, David Bull

    Abstract: Deep learning techniques have been applied in the context of image super-resolution (SR), achieving remarkable advances in terms of reconstruction performance. Existing techniques typically employ highly complex model structures which result in large model sizes and slow inference speeds. This often leads to high energy consumption and restricts their adoption for practical applications. To addres…

    Submitted 21 February, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  21. Full-reference Video Quality Assessment for User Generated Content Transcoding

    Authors: Zihao Qi, Chen Feng, Duolikun Danier, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

    Abstract: Unlike video coding for professional content, the delivery pipeline of User Generated Content (UGC) involves transcoding where unpristine reference content needs to be compressed repeatedly. In this work, we observe that existing full-/no-reference quality metrics fail to accurately predict the perceptual quality difference between transcoded UGC content and the corresponding unpristine references…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 5 pages, 4 figures

  22. arXiv:2312.12150  [pdf, ps, other]

    eess.IV cs.MM

    Comparative Study of Hardware and Software Power Measurements in Video Compression

    Authors: Angeliki Katsenou, Xinyi Wang, Daniel Schien, David Bull

    Abstract: The environmental impact of video streaming services has been discussed as part of the strategies towards sustainable information and communication technologies. A first step towards that is the energy profiling and assessment of energy consumption of existing video technologies. This paper presents a comprehensive study of power measurement techniques in video compression, comparing the use of ha…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 5 pages

  23. RankDVQA-mini: Knowledge Distillation-Driven Deep Video Quality Assessment

    Authors: Chen Feng, Duolikun Danier, Haoran Wang, Fan Zhang, Benoit Vallade, Alex Mackin, David Bull

    Abstract: Deep learning-based video quality assessment (deep VQA) has demonstrated significant potential in surpassing conventional metrics, with promising improvements in terms of correlation with human perception. However, the practical deployment of such deep VQA models is often limited due to their high computational complexity and large memory requirements. To address this issue, we aim to significantl…

    Submitted 7 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: The paper has been accepted by Picture Coding Symposium (PCS) 2024

  24. Accelerating Learnt Video Codecs with Gradient Decay and Layer-wise Distillation

    Authors: Tianhao Peng, Ge Gao, Heming Sun, Fan Zhang, David Bull

    Abstract: In recent years, end-to-end learnt video codecs have demonstrated their potential to compete with conventional coding algorithms in terms of compression efficiency. However, most learning-based video compression models are associated with high computational complexity and latency, in particular at the decoder side, which limits their deployment in practical applications. In this paper, we present a…

    Submitted 5 December, 2023; originally announced December 2023.

    Report number: 2312.02605

  25. arXiv:2309.08975  [pdf, other]

    eess.IV

    Wavelet-based Topological Loss for Low-Light Image Denoising

    Authors: Alexandra Malyugina, Nantheera Anantrasirichai, David Bull

    Abstract: Despite extensive research conducted in the field of image denoising, many algorithms still heavily depend on supervised learning and their effectiveness primarily relies on the quality and diversity of training data. It is widely assumed that digital image distortions are caused by spatially invariant Additive White Gaussian Noise (AWGN). However, the analysis of real-world data suggests that thi…

    Submitted 20 September, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

  26. HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation

    Authors: Ho Man Kwan, Ge Gao, Fan Zhang, Andrew Gower, David Bull

    Abstract: Learning-based video compression is currently a popular research topic, offering the potential to compete with conventional standard video codecs. In this context, Implicit Neural Representations (INRs) have previously been used to represent and compress image and video content, demonstrating relatively high decoding speed compared to other methods. However, existing INR-based methods have failed…

    Submitted 26 January, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

  27. LDMVFI: Video Frame Interpolation with Latent Diffusion Models

    Authors: Duolikun Danier, Fan Zhang, David Bull

    Abstract: Existing works on video frame interpolation (VFI) mostly employ deep neural networks that are trained by minimizing the L1, L2, or deep feature space distance (e.g. VGG loss) between their outputs and ground-truth frames. However, recent works have shown that these metrics are poor indicators of perceptual VFI quality. Towards developing perceptually-oriented VFI methods, in this work we propose l…

    Submitted 11 December, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted by AAAI 2024

  28. arXiv:2302.08455  [pdf, other]

    eess.IV

    ST-MFNet Mini: Knowledge Distillation-Driven Frame Interpolation

    Authors: Crispian Morris, Duolikun Danier, Fan Zhang, Nantheera Anantrasirichai, David R. Bull

    Abstract: Currently, one of the major challenges in deep learning-based video frame interpolation (VFI) is the large model sizes and high computational complexity associated with many high performance VFI approaches. In this paper, we present a distillation-based two-stage workflow for obtaining compressed VFI models which perform competitively with the state of the art, at a greatly reduced model size and c…

    Submitted 23 February, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  29. BVI-VFI: A Video Quality Database for Video Frame Interpolation

    Authors: Duolikun Danier, Fan Zhang, David Bull

    Abstract: Video frame interpolation (VFI) is a fundamental research topic in video processing, which is currently attracting increased attention across the research community. While the development of more advanced VFI algorithms has been extensively researched, there remains little understanding of how humans perceive the quality of interpolated content and how well existing objective quality assessment me…

    Submitted 21 October, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  30. arXiv:2208.04573  [pdf, other]

    eess.IV

    A Topological Loss Function: Image Denoising on a Low-Light Dataset

    Authors: Alexandra Malyugina, Nantheera Anantrasirichai, David Bull

    Abstract: Although image denoising algorithms have attracted significant research attention, surprisingly few have been proposed for, or evaluated on, noise from imagery acquired under real low-light conditions. Moreover, noise characteristics are often assumed to be spatially invariant, leading to edges and textures being distorted after denoising. Here, we introduce a novel topological loss function which…

    Submitted 26 June, 2023; v1 submitted 9 August, 2022; originally announced August 2022.

  31. arXiv:2207.08634  [pdf, other]

    eess.IV

    Enhancing HDR Video Compression through CNN-based Effective Bit Depth Adaptation

    Authors: Chen Feng, Zihao Qi, Duolikun Danier, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull

    Abstract: It is well known that high dynamic range (HDR) video can provide more immersive visual experiences compared to conventional standard dynamic range content. However, HDR content is typically more challenging to encode due to the increased detail associated with the wider dynamic range. In this paper, we improve HDR compression performance using the effective bit depth adaptation approach (EBDA). Th…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: 5 pages, 3 figures

  32. arXiv:2207.08119  [pdf, other]

    eess.IV cs.CV

    FloLPIPS: A Bespoke Video Quality Metric for Frame Interpolation

    Authors: Duolikun Danier, Fan Zhang, David Bull

    Abstract: Video frame interpolation (VFI) serves as a useful tool for many video processing applications. Recently, it has also been applied in the video compression domain for enhancing both conventional video codecs and learning-based compression architectures. While there has been an increased focus on the development of enhanced frame interpolation algorithms in recent years, the perceptual quality asse…

    Submitted 22 June, 2023; v1 submitted 17 July, 2022; originally announced July 2022.

  33. arXiv:2205.09458  [pdf, other]

    eess.IV

    Enhancing VVC with Deep Learning based Multi-Frame Post-Processing

    Authors: Duolikun Danier, Chen Feng, Fan Zhang, David Bull

    Abstract: This paper describes a CNN-based multi-frame post-processing approach based on a perceptually-inspired Generative Adversarial Network architecture, CVEGAN. This method has been integrated with the Versatile Video Coding Test Model (VTM) 15.2 to enhance the visual quality of the final reconstructed content. The evaluation results on the CLIC 2022 validation sequences show consistent coding gains ov…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted in CVPR 2022 Workshop and Challenge on Learned Image Compression

  34. arXiv:2203.02407  [pdf, other]

    eess.IV

    Sparse InSAR Data 3D Inpainting for Ground Deformation Detection Along the Rail Corridor

    Authors: Odysseas Pappas, Juliet Biggs, David Bull, Alin Achim, Nantheera Anantrasirichai

    Abstract: Monitoring of ground movement close to the rail corridor, such as that associated with landslips caused by ground subsidence and/or uplift, is of great interest for the detection and prevention of possible railway faults. Interferometric synthetic-aperture radar (InSAR) data can be used to measure ground deformation, but its use poses distinct challenges, as the data is highly sparse and can be pa…

    Submitted 4 March, 2022; originally announced March 2022.

    Comments: in submission to ICIP 2022

  35. arXiv:2202.12852  [pdf, other]

    eess.IV

    A CNN-based Post-Processor for Perceptually-Optimized Immersive Media Compression

    Authors: Angeliki Katsenou, Fan Zhang, David Bull

    Abstract: In recent years, resolution adaptation based on deep neural networks has enabled significant performance gains for conventional (2D) video codecs. This paper investigates the effectiveness of spatial resolution resampling in the context of immersive content. The proposed approach reduces the spatial resolution of input multi-view videos before encoding, and reconstructs their original resolution a…

    Submitted 25 February, 2022; originally announced February 2022.

  36. RankDVQA: Deep VQA based on Ranking-inspired Hybrid Training

    Authors: Chen Feng, Duolikun Danier, Fan Zhang, David Bull

    Abstract: In recent years, deep learning techniques have shown significant potential for improving video quality assessment (VQA), achieving higher correlation with subjective opinions compared to conventional approaches. However, the development of deep VQA methods has been constrained by the limited availability of large-scale training databases and ineffective training methodologies. As a result, it is d…

    Submitted 20 November, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: 8 pages, 5 figures accepted by WACV 2024

  37. Enhancing Deformable Convolution based Video Frame Interpolation with Coarse-to-fine 3D CNN

    Authors: Duolikun Danier, Fan Zhang, David Bull

    Abstract: This paper presents a new deformable convolution-based video frame interpolation (VFI) method, using a coarse to fine 3D CNN to enhance the multi-flow prediction. This model first extracts spatio-temporal features at multiple scales using a 3D CNN, and estimates multi-flows using these features in a coarse-to-fine manner. The estimated multi-flows are then used to warp the original input frames as…

    Submitted 22 June, 2023; v1 submitted 15 February, 2022; originally announced February 2022.

  38. A Subjective Quality Study for Video Frame Interpolation

    Authors: Duolikun Danier, Fan Zhang, David Bull

    Abstract: Video frame interpolation (VFI) is one of the fundamental research areas in video processing and there has been extensive research on novel and enhanced interpolation algorithms. The same is not true for quality assessment of the interpolated content. In this paper, we describe a subjective quality study for VFI based on a newly developed video database, BVI-VFI. BVI-VFI contains 36 reference sequ…

    Submitted 22 June, 2023; v1 submitted 15 February, 2022; originally announced February 2022.

  39. arXiv:2111.15536  [pdf, other]

    eess.IV

    ViSTRA3: Video Coding with Deep Parameter Adaptation and Post Processing

    Authors: Chen Feng, Duolikun Danier, Charlie Tan, Fan Zhang, David Bull

    Abstract: This paper presents a deep learning-based video compression framework (ViSTRA3). The proposed framework intelligently adapts video format parameters of the input video before encoding, subsequently employing a CNN at the decoder to restore their original format and enhance reconstruction quality. ViSTRA3 has been integrated with the H.266/VVC Test Model VTM 14.0, and evaluated under the Joint Vide…

    Submitted 30 November, 2021; originally announced November 2021.

  40. ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation

    Authors: Duolikun Danier, Fan Zhang, David Bull

    Abstract: Video frame interpolation (VFI) is currently a very active research topic, with applications spanning computer vision, post production and video encoding. VFI can be extremely challenging, particularly in sequences containing large motions, occlusions or dynamic textures, where existing approaches fail to offer perceptually robust interpolation performance. In this context, we present a novel deep…

    Submitted 30 March, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

    Comments: Accepted in CVPR 2022

  41. arXiv:2110.06740  [pdf, other]

    eess.IV cs.CV cs.LG

    Transform and Bitstream Domain Image Classification

    Authors: P. R. Hill, D. R. Bull

    Abstract: Classification of images within the compressed domain offers significant benefits. These benefits include reduced memory and computational requirements of a classification system. This paper proposes two such methods as a proof of concept: The first classifies within the JPEG image transform domain (i.e. DCT transform data); the second classifies the JPEG compressed binary bitstream directly. Thes…

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: 7 pages, 3 figures, one table

  42. arXiv:2106.08147  [pdf, other]

    eess.IV cs.CV cs.LG

    Perceptually-inspired super-resolution of compressed videos

    Authors: Di Ma, Mariana Afonso, Fan Zhang, David R. Bull

    Abstract: Spatial resolution adaptation is a technique which has often been employed in video compression to enhance coding efficiency. This approach encodes a lower resolution version of the input video and reconstructs the original resolution during decoding. Instead of using conventional up-sampling filters, recent work has employed advanced super-resolution methods based on convolutional neural networks…

    Submitted 15 June, 2021; originally announced June 2021.

  43. An adaptive Lagrange multiplier determination method for rate-distortion optimisation in hybrid video codecs

    Authors: Fan Zhang, David R. Bull

    Abstract: This paper describes an adaptive Lagrange multiplier determination method for rate-quality optimisation in video compression. Inspired by the experimental results of a Lagrange multiplier selection test, the presented approach adaptively estimates the optimum Lagrange multiplier for different video content, based on distortion statistics of recently encoded frames. The proposed algorithm has been…

    Submitted 15 June, 2021; originally announced June 2021.

  44. Quality assessment methods for perceptual video compression

    Authors: Fan Zhang, David R. Bull

    Abstract: This paper describes a quality assessment model for perceptual video compression applications (PVM), which simulates visual masking and distortion-artefact perception using an adaptive combination of noticeable distortions and blurring artefacts. The method shows significant improvement over existing quality metrics based on the VQEG database, and provides compatibility with in-loop rate-quality…

    Submitted 15 June, 2021; originally announced June 2021.

  45. A Subjective Study on Videos at Various Bit Depths

    Authors: Alex Mackin, Di Ma, Fan Zhang, David Bull

    Abstract: Bit depth adaptation, where the bit depth of a video sequence is reduced before transmission and up-sampled during display, can potentially reduce data rates with limited impact on perceptual quality. In this context, we conducted a subjective study on a UHD video database, BVI-BD, to explore the relationship between bit depth and visual quality. In this work, three bit depth adaptation methods ar…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: 5 pages; 7 figures; 1 table

  46. VMAF-based Bitrate Ladder Estimation for Adaptive Streaming

    Authors: Angeliki V. Katsenou, Fan Zhang, Kyle Swanson, Mariana Afonso, Joel Sole, David R. Bull

    Abstract: In HTTP Adaptive Streaming, video content is conventionally encoded by adapting its spatial resolution and quantization level to best match the prevailing network state and display characteristics. It is well known that the traditional solution, of using a fixed bitrate ladder, does not result in the highest quality of experience for the user. Hence, in this paper, we consider a content-driven app…

    Submitted 12 March, 2021; originally announced March 2021.

  47. Enhancing VMAF through New Feature Integration and Model Combination

    Authors: Fan Zhang, Angeliki Katsenou, Christos Bampis, Lukas Krasula, Zhi Li, David Bull

    Abstract: VMAF is a machine learning based video quality assessment method, originally designed for streaming applications, which combines multiple quality metrics and video features through SVM regression. It offers higher correlation with subjective opinions compared to many conventional quality assessment methods. In this paper we propose enhancements to VMAF through the integration of new video features…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 5 pages, 2 figures and 4 tables

  48. Texture-aware Video Frame Interpolation

    Authors: Duolikun Danier, David Bull

    Abstract: Temporal interpolation has the potential to be a powerful tool for video compression. Existing methods for frame interpolation do not discriminate between video textures and generally invoke a single general model capable of interpolating a wide range of video content. However, past work on video texture analysis and synthesis has shown that different textures exhibit vastly different motion chara…

    Submitted 26 February, 2021; originally announced February 2021.

  49. Efficient Bitrate Ladder Construction for Content-Optimized Adaptive Video Streaming

    Authors: Angeliki V. Katsenou, Joel Sole, David R. Bull

    Abstract: One of the challenges faced by many video providers is the heterogeneity of network specifications, user requirements, and content compression performance. The universal solution of a fixed bitrate ladder is inadequate in ensuring a high quality of user experience without re-buffering or introducing annoying compression artifacts. However, a content-tailored solution, based on extensively encoding…

    Submitted 8 February, 2021; originally announced February 2021.

  50. arXiv:2102.04167  [pdf]

    eess.IV

    Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture

    Authors: Angeliki V. Katsenou, Mariana Afonso, David R. Bull

    Abstract: Encoding textural content remains a challenge for current standardised video codecs. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimize encoding performance. In this paper, we analyse the spatio-temporal features and statistics of video textures, explore the rate-quality performance of di…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 17 pages