Search | arXiv e-print repository

Decoding Complexity-Rate-Quality Pareto-Front for Adaptive VVC Streaming

Authors: Angeliki Katsenou, Vignesh V Menon, Adam Wieckowski, Benjamin Bross, Detlev Marpe

Abstract: Pareto-front optimization is crucial for addressing the multi-objective challenges in video streaming, enabling the identification of optimal trade-offs between conflicting goals such as bitrate, video quality, and decoding complexity. This paper explores the construction of efficient bitrate ladders for adaptive Versatile Video Coding (VVC) streaming, focusing on optimizing these trade-offs. We i… ▽ More Pareto-front optimization is crucial for addressing the multi-objective challenges in video streaming, enabling the identification of optimal trade-offs between conflicting goals such as bitrate, video quality, and decoding complexity. This paper explores the construction of efficient bitrate ladders for adaptive Versatile Video Coding (VVC) streaming, focusing on optimizing these trade-offs. We investigate various ladder construction methods based on Pareto-front optimization, including exhaustive Rate-Quality and fixed ladder approaches. We propose a joint decoding time-rate-quality Pareto-front, providing a comprehensive framework to balance bitrate, decoding time, and video quality in video streaming. This allows streaming services to tailor their encoding strategies to meet specific requirements, prioritizing low decoding latency, bandwidth efficiency, or a balanced approach, thus enhancing the overall user experience. The experimental results confirm and demonstrate these opportunities for navigating the decoding time-rate-quality space to support various use cases. For example, when prioritizing low decoding latency, the proposed method achieves decoding time reduction of 14.86% while providing Bjontegaard delta rate savings of 4.65% and 0.32dB improvement in the eXtended Peak Signal-to-Noise Ratio (XPSNR)-Rate domain over the traditional fixed ladder solution. △ Less

Submitted 27 September, 2024; originally announced September 2024.

Comments: 5 pages

arXiv:2407.11496 [pdf, other]

ReLaX-VQA: Residual Fragment and Layer Stack Extraction for Enhancing Video Quality Assessment

Authors: Xinyi Wang, Angeliki Katsenou, David Bull

Abstract: With the rapid growth of User-Generated Content (UGC) exchanged between users and sharing platforms, the need for video quality assessment in the wild is increasingly evident. UGC is typically acquired using consumer devices and undergoes multiple rounds of compression (transcoding) before reaching the end user. Therefore, traditional quality metrics that employ the original content as a reference… ▽ More With the rapid growth of User-Generated Content (UGC) exchanged between users and sharing platforms, the need for video quality assessment in the wild is increasingly evident. UGC is typically acquired using consumer devices and undergoes multiple rounds of compression (transcoding) before reaching the end user. Therefore, traditional quality metrics that employ the original content as a reference are not suitable. In this paper, we propose ReLaX-VQA, a novel No-Reference Video Quality Assessment (NR-VQA) model that aims to address the challenges of evaluating the quality of diverse video content without reference to the original uncompressed videos. ReLaX-VQA uses frame differences to select spatio-temporal fragments intelligently together with different expressions of spatial features associated with the sampled frames. These are then used to better capture spatial and temporal variabilities in the quality of neighbouring frames. Furthermore, the model enhances abstraction by employing layer-stacking techniques in deep neural network features from Residual Networks and Vision Transformers. Extensive testing across four UGC datasets demonstrates that ReLaX-VQA consistently outperforms existing NR-VQA methods, achieving an average SRCC of 0.8658 and PLCC of 0.8873. Open-source code and trained models that will facilitate further research and applications of NR-VQA can be found at https://github.com/xinyiW915/ReLaX-VQA. △ Less

Submitted 12 March, 2025; v1 submitted 16 July, 2024; originally announced July 2024.

Comments: 10 pages, 3 figures

arXiv:2402.07057 [pdf, other]

Rate-Quality or Energy-Quality Pareto Fronts for Adaptive Video Streaming?

Authors: Angeliki Katsenou, Xinyi Wang, Daniel Schien, David Bull

Abstract: Adaptive video streaming is a key enabler for optimising the delivery of offline encoded video content. The research focus to date has been on optimisation, based solely on rate-quality curves. This paper adds an additional dimension, the energy expenditure, and explores construction of bitrate ladders based on decoding energy-quality curves rather than the conventional rate-quality curves. Pareto… ▽ More Adaptive video streaming is a key enabler for optimising the delivery of offline encoded video content. The research focus to date has been on optimisation, based solely on rate-quality curves. This paper adds an additional dimension, the energy expenditure, and explores construction of bitrate ladders based on decoding energy-quality curves rather than the conventional rate-quality curves. Pareto fronts are extracted from the rate-quality and energy-quality spaces to select optimal points. Bitrate ladders are constructed from these points using conventional rate-based rules together with a novel quality-based approach. Evaluation on a subset of YouTube-UGC videos encoded with x.265 shows that the energy-quality ladders reduce energy requirements by 28-31% on average at the cost of slightly higher bitrates. The results indicate that optimising based on energy-quality curves rather than rate-quality curves and using quality levels to create the rungs could potentially improve energy efficiency for a comparable quality of experience. △ Less

Submitted 10 February, 2024; originally announced February 2024.

Comments: 6, submitted to a conference

arXiv:2312.12150 [pdf, ps, other]

Comparative Study of Hardware and Software Power Measurements in Video Compression

Authors: Angeliki Katsenou, Xinyi Wang, Daniel Schien, David Bull

Abstract: The environmental impact of video streaming services has been discussed as part of the strategies towards sustainable information and communication technologies. A first step towards that is the energy profiling and assessment of energy consumption of existing video technologies. This paper presents a comprehensive study of power measurement techniques in video compression, comparing the use of ha… ▽ More The environmental impact of video streaming services has been discussed as part of the strategies towards sustainable information and communication technologies. A first step towards that is the energy profiling and assessment of energy consumption of existing video technologies. This paper presents a comprehensive study of power measurement techniques in video compression, comparing the use of hardware and software power meters. An experimental methodology to ensure reliability of measurements is introduced. Key findings demonstrate the high correlation of hardware and software based energy measurements for two video codecs across different spatial and temporal resolutions at a lower computational overhead. △ Less

Submitted 19 December, 2023; originally announced December 2023.

Comments: 5 pages

arXiv:2306.14432 [pdf, other]

doi 10.1109/ICIP49359.2023.10222332

Subjective assessment of the impact of a content adaptive optimiser for compressing 4K HDR content with AV1

Authors: Vibhoothi, Angeliki Katsenou, François Pitié, Katarina Domijan, Anil Kokaram

Abstract: Since 2015 video dimensionality has expanded to higher spatial and temporal resolutions and a wider colour gamut. This High Dynamic Range (HDR) content has gained traction in the consumer space as it delivers an enhanced quality of experience. At the same time, the complexity of codecs is growing. This has driven the development of tools for content-adaptive optimisation that achieve optimal rate-… ▽ More Since 2015 video dimensionality has expanded to higher spatial and temporal resolutions and a wider colour gamut. This High Dynamic Range (HDR) content has gained traction in the consumer space as it delivers an enhanced quality of experience. At the same time, the complexity of codecs is growing. This has driven the development of tools for content-adaptive optimisation that achieve optimal rate-distortion performance for HDR video at 4K resolution. While improvements of just a few percentage points in BD-Rate (1-5\%) are significant for the streaming media industry, the impact on subjective quality has been less studied especially for HDR/AV1. In this paper, we conduct a subjective quality assessment (42 subjects) of 4K HDR content with a per-clip optimisation strategy. We correlate these subjective scores with existing popular objective metrics used in standard development and show that some perceptual metrics correlate surprisingly well even though they are not tuned for HDR. We find that the DSQCS protocol is too insensitive to categorically compare the methods but the data allows us to make recommendations about the use of experts vs non-experts in HDR studies, and explain the subjective impact of film grain in HDR content under compression. △ Less

Submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted Camera-ready version for the ICIP 2023 Paper

arXiv:2305.11858 [pdf, other]

Recommendations for Verifying HDR Subjective Testing Workflows

Authors: Vibhoothi, Angeliki Katsenou, John Squires, François Pitié, Anil Kokaram

Abstract: Over the past few years, there has been an increase in the demand and availability of High Dynamic Range (HDR) displays and content. To ensure the production of high-quality materials, human evaluation is required. However, ascertaining whether the full playback pipeline is indeed HDR-compliant can be challenging. In this paper, we present a set of recommendations for conformance testing to valida… ▽ More Over the past few years, there has been an increase in the demand and availability of High Dynamic Range (HDR) displays and content. To ensure the production of high-quality materials, human evaluation is required. However, ascertaining whether the full playback pipeline is indeed HDR-compliant can be challenging. In this paper, we present a set of recommendations for conformance testing to validate various aspects of the testing workflow, including playback, displays, brightness, colours, and viewing environment. We assessed the effectiveness of HDR conversion techniques used in current standards development (3GPP) for making source materials. Additionally, we evaluate HDR display technologies, including OLED and LCD, using both consumer television and a reference monitor. △ Less

Submitted 19 May, 2023; originally announced May 2023.

Comments: Accepted Camera-ready version of QOMEX 2023 Short-paper

arXiv:2303.16163 [pdf, other]

Comparison of HDR quality metrics in Per-Clip Lagrangian multiplier optimisation with AV1

Authors: Vibhoothi, François Pitié, Angeliki Katsenou, Yeping Su, Balu Adsumilli, Anil Kokaram

Abstract: The complexity of modern codecs along with the increased need of delivering high-quality videos at low bitrates has reinforced the idea of a per-clip tailoring of parameters for optimised rate-distortion performance. While the objective quality metrics used for Standard Dynamic Range (SDR) videos have been well studied, the transitioning of consumer displays to support High Dynamic Range (HDR) vid… ▽ More The complexity of modern codecs along with the increased need of delivering high-quality videos at low bitrates has reinforced the idea of a per-clip tailoring of parameters for optimised rate-distortion performance. While the objective quality metrics used for Standard Dynamic Range (SDR) videos have been well studied, the transitioning of consumer displays to support High Dynamic Range (HDR) videos, poses a new challenge to rate-distortion optimisation. In this paper, we review the popular HDR metrics DeltaE100 (DE100), PSNRL100, wPSNR, and HDR-VQM. We measure the impact of employing these metrics in per-clip direct search optimisation of the rate-distortion Lagrange multiplier in AV1. We report, on 35 HDR videos, average Bjontegaard Delta Rate (BD-Rate) gains of 4.675%, 2.226%, and 7.253% in terms of DE100, PSNRL100, and HDR-VQM. We also show that the inclusion of chroma in the quality metrics has a significant impact on optimisation, which can only be partially addressed by the use of chroma offsets. △ Less

Submitted 26 April, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

Comments: Accepted version for ICME 2023 Special Session, "Optimised Media Delivery"

arXiv:2210.00618 [pdf, other]

doi 10.1109/PCS56426.2022.10017999

Energy-Rate-Quality Tradeoffs of State-of-the-Art Video Codecs

Authors: Angeliki Katsenou, Jongwewi Mao, Ioannis Mavromatis

Abstract: The adoption of video conferencing and video communication services, accelerated by COVID-19, has driven a rapid increase in video data traffic. The demand for higher resolutions and quality, the need for immersive video formats, and the newest, more complex video codecs increase the energy consumption in data centers and display devices. In this paper, we explore and compare the energy consumptio… ▽ More The adoption of video conferencing and video communication services, accelerated by COVID-19, has driven a rapid increase in video data traffic. The demand for higher resolutions and quality, the need for immersive video formats, and the newest, more complex video codecs increase the energy consumption in data centers and display devices. In this paper, we explore and compare the energy consumption across optimized state-of-the-art video codecs, SVT-AV1, VVenC/VVdeC, VP9, and x.265. Furthermore, we align the energy usage with various objective quality metrics and the compression performance for a set of video sequences across different resolutions. The results indicate that from the tested codecs and configurations, SVT-AV1 provides the best tradeoff between energy consumption and quality. The reported results aim to serve as a guide towards sustainable video streaming while not compromising the quality of experience of the end user. △ Less

Submitted 2 October, 2022; originally announced October 2022.

Comments: 5 pages

arXiv:2208.11150 [pdf, other]

doi 10.1117/12.2632272

Direct Optimisation of $\boldsymbolλ$ for HDR Content Adaptive Transcoding in AV1

Authors: Vibhoothi, François Pitié, Angeliki Katsenou, Daniel Joseph Ringis, Yeping Su, Neil Birkbeck, Jessie Lin, Balu Adsumilli, Anil Kokaram

Abstract: Since the adoption of VP9 by Netflix in 2016, royalty-free coding standards continued to gain prominence through the activities of the AOMedia consortium. AV1, the latest open source standard, is now widely supported. In the early years after standardisation, HDR video tends to be under served in open source encoders for a variety of reasons including the relatively small amount of true HDR conten… ▽ More Since the adoption of VP9 by Netflix in 2016, royalty-free coding standards continued to gain prominence through the activities of the AOMedia consortium. AV1, the latest open source standard, is now widely supported. In the early years after standardisation, HDR video tends to be under served in open source encoders for a variety of reasons including the relatively small amount of true HDR content being broadcast and the challenges in RD optimisation with that material. AV1 codec optimisation has been ongoing since 2020 including consideration of the computational load. In this paper, we explore the idea of direct optimisation of the Lagrangian $λ$ parameter used in the rate control of the encoders to estimate the optimal Rate-Distortion trade-off achievable for a High Dynamic Range signalled video clip. We show that by adjusting the Lagrange multiplier in the RD optimisation process on a frame-hierarchy basis, we are able to increase the Bjontegaard difference rate gains by more than 3.98$\times$ on average without visually affecting the quality. △ Less

Submitted 7 October, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: SPIE2022:Applications of Digital Image Processing XLV accepted manuscript

arXiv:2202.12852 [pdf, other]

A CNN-based Post-Processor for Perceptually-Optimized Immersive Media Compression

Authors: Angeliki Katsenou, Fan Zhang, David Bull

Abstract: In recent years, resolution adaptation based on deep neural networks has enabled significant performance gains for conventional (2D) video codecs. This paper investigates the effectiveness of spatial resolution resampling in the context of immersive content. The proposed approach reduces the spatial resolution of input multi-view videos before encoding, and reconstructs their original resolution a… ▽ More In recent years, resolution adaptation based on deep neural networks has enabled significant performance gains for conventional (2D) video codecs. This paper investigates the effectiveness of spatial resolution resampling in the context of immersive content. The proposed approach reduces the spatial resolution of input multi-view videos before encoding, and reconstructs their original resolution after decoding. During the up-sampling process, an advanced CNN model is used to reduce potential re-sampling, compression, and synthesis artifacts. This work has been fully tested with the TMIV coding standard using a Versatile Video Coding (VVC) codec. The results demonstrate that the proposed method achieves a significant rate-quality performance improvement for the majority of the test sequences, with an average BD-VMAF improvement of 3.07 overall sequences. △ Less

Submitted 25 February, 2022; originally announced February 2022.

arXiv:2103.07564 [pdf, other]

doi 10.1109/PCS50896.2021.9477469

VMAF-based Bitrate Ladder Estimation for Adaptive Streaming

Authors: Angeliki V. Katsenou, Fan Zhang, Kyle Swanson, Mariana Afonso, Joel Sole, David R. Bull

Abstract: In HTTP Adaptive Streaming, video content is conventionally encoded by adapting its spatial resolution and quantization level to best match the prevailing network state and display characteristics. It is well known that the traditional solution, of using a fixed bitrate ladder, does not result in the highest quality of experience for the user. Hence, in this paper, we consider a content-driven app… ▽ More In HTTP Adaptive Streaming, video content is conventionally encoded by adapting its spatial resolution and quantization level to best match the prevailing network state and display characteristics. It is well known that the traditional solution, of using a fixed bitrate ladder, does not result in the highest quality of experience for the user. Hence, in this paper, we consider a content-driven approach for estimating the bitrate ladder, based on spatio-temporal features extracted from the uncompressed content. The method implements a content-driven interpolation. It uses the extracted features to train a machine learning model to infer the curvature points of the Rate-VMAF curves in order to guide a set of initial encodings. We employ the VMAF quality metric as a means of perceptually conditioning the estimation. When compared to exhaustive encoding that produces the reference ladder, the estimated ladder is composed by 74.3% of identical Rate-VMAF points with the reference ladder. The proposed method offers a significant reduction of the number of encodes required, 77.4%, at a small average Bjøntegaard Delta Rate cost, 1.12%. △ Less

Submitted 12 March, 2021; originally announced March 2021.

arXiv:2103.06338 [pdf, other]

doi 10.1109/PCS50896.2021.9477458

Enhancing VMAF through New Feature Integration and Model Combination

Authors: Fan Zhang, Angeliki Katsenou, Christos Bampis, Lukas Krasula, Zhi Li, David Bull

Abstract: VMAF is a machine learning based video quality assessment method, originally designed for streaming applications, which combines multiple quality metrics and video features through SVM regression. It offers higher correlation with subjective opinions compared to many conventional quality assessment methods. In this paper we propose enhancements to VMAF through the integration of new video features… ▽ More VMAF is a machine learning based video quality assessment method, originally designed for streaming applications, which combines multiple quality metrics and video features through SVM regression. It offers higher correlation with subjective opinions compared to many conventional quality assessment methods. In this paper we propose enhancements to VMAF through the integration of new video features and alternative quality metrics (selected from a diverse pool) alongside multiple model combination. The proposed combination approach enables training on multiple databases with varying content and distortion characteristics. Our enhanced VMAF method has been evaluated on eight HD video databases, and consistently outperforms the original VMAF model (0.6.1) and other benchmark quality metrics, exhibiting higher correlation with subjective ground truth data. △ Less

Submitted 10 March, 2021; originally announced March 2021.

Comments: 5 pages, 2 figures and 4 tables

arXiv:2102.04550 [pdf, other]

doi 10.1109/OJSP.2021.3086691

Efficient Bitrate Ladder Construction for Content-Optimized Adaptive Video Streaming

Authors: Angeliki V. Katsenou, Joel Sole, David R. Bull

Abstract: One of the challenges faced by many video providers is the heterogeneity of network specifications, user requirements, and content compression performance. The universal solution of a fixed bitrate ladder is inadequate in ensuring a high quality of user experience without re-buffering or introducing annoying compression artifacts. However, a content-tailored solution, based on extensively encoding… ▽ More One of the challenges faced by many video providers is the heterogeneity of network specifications, user requirements, and content compression performance. The universal solution of a fixed bitrate ladder is inadequate in ensuring a high quality of user experience without re-buffering or introducing annoying compression artifacts. However, a content-tailored solution, based on extensively encoding across all resolutions and over a wide quality range is highly expensive in terms of computational, financial, and energy costs. Inspired by this, we propose an approach that exploits machine learning to predict a content-optimized bitrate ladder. The method extracts spatio-temporal features from the uncompressed content, trains machine-learning models to predict the Pareto front parameters, and, based on that, builds the ladder within a defined bitrate range. The method has the benefit of significantly reducing the number of encodes required per sequence. The presented results, based on 100 HEVC-encoded sequences, demonstrate a reduction in the number of encodes required when compared to an exhaustive search and an interpolation-based method, by 89.06% and 61.46%, respectively, at the cost of an average Bjøntegaard Delta Rate difference of 1.78% compared to the exhaustive approach. Finally, a hybrid method is introduced that selects either the proposed or the interpolation-based method depending on the sequence features. This results in an overall 83.83% reduction of required encodings at the cost of an average Bjøntegaard Delta Rate difference of 1.26%. △ Less

Submitted 8 February, 2021; originally announced February 2021.

arXiv:2102.04167 [pdf]

Study of Compression Statistics and Prediction of Rate-Distortion Curves for Video Texture

Authors: Angeliki V. Katsenou, Mariana Afonso, David R. Bull

Abstract: Encoding textural content remains a challenge for current standardised video codecs. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimize encoding performance. In this paper, we analyse the spatio-temporal features and statistics of video textures, explore the rate-quality performance of di… ▽ More Encoding textural content remains a challenge for current standardised video codecs. It is therefore beneficial to understand video textures in terms of both their spatio-temporal characteristics and their encoding statistics in order to optimize encoding performance. In this paper, we analyse the spatio-temporal features and statistics of video textures, explore the rate-quality performance of different texture types and investigate models to mathematically describe them. For all considered theoretical models, we employ machine-learning regression to predict the rate-quality curves based solely on selected spatio-temporal features extracted from uncompressed content. All experiments were performed on homogeneous video textures to ensure validity of the observations. The results of the regression indicate that using an exponential model we can more accurately predict the expected rate-quality curve (with a mean Bjøntegaard Delta rate of 0.46% over the considered dataset) while maintaining a low relative complexity. This is expected to be adopted by in the loop processes for faster encoding decisions such as rate-distortion optimisation, adaptive quantization, partitioning, etc. △ Less

Submitted 8 February, 2021; originally announced February 2021.

Comments: 17 pages

arXiv:2005.03315 [pdf, other]

doi 10.1109/ICMEW46912.2020.9106011

Encoding in the Dark Grand Challenge: An Overview

Authors: Nantheera Anantrasirichai, Fan Zhang, Alexandra Malyugina, Paul Hill, Angeliki Katsenou

Abstract: A big part of the video content we consume from video providers consists of genres featuring low-light aesthetics. Low light sequences have special characteristics, such as spatio-temporal varying acquisition noise and light flickering, that make the encoding process challenging. To deal with the spatio-temporal incoherent noise, higher bitrates are used to achieve high objective quality. Addition… ▽ More A big part of the video content we consume from video providers consists of genres featuring low-light aesthetics. Low light sequences have special characteristics, such as spatio-temporal varying acquisition noise and light flickering, that make the encoding process challenging. To deal with the spatio-temporal incoherent noise, higher bitrates are used to achieve high objective quality. Additionally, the quality assessment metrics and methods have not been designed, trained or tested for this type of content. This has inspired us to trigger research in that area and propose a Grand Challenge on encoding low-light video sequences. In this paper, we present an overview of the proposed challenge, and test state-of-the-art methods that will be part of the benchmark methods at the stage of the participants' deliverable assessment. From this exploration, our results show that VVC already achieves a high performance compared to simply denoising the video source prior to encoding. Moreover, the quality of the video streams can be further improved by employing a post-processing image enhancement method. △ Less

Submitted 7 May, 2020; originally announced May 2020.

arXiv:2003.10282 [pdf, other]

doi 10.3389/frsip.2022.874200

BVI-CC: A Dataset for Research on Video Compression and Quality Assessment

Authors: Angeliki V. Katsenou, Fan Zhang, Mariana Afonso, Goce Dimitrov, David R. Bull

Abstract: The video technology scenery has been very vivid over the past years, with novel video coding technologies introduced that promise improved compression performance over state-of-the-art technologies. Despite the fact that a lot of video datasets are available, representative content of the wide parameter space along with subjective evaluations of variations of encoded content from an unpartial end… ▽ More The video technology scenery has been very vivid over the past years, with novel video coding technologies introduced that promise improved compression performance over state-of-the-art technologies. Despite the fact that a lot of video datasets are available, representative content of the wide parameter space along with subjective evaluations of variations of encoded content from an unpartial end is required. In response to this requirement, this paper features a dataset, the BVI-CC. Three video codecs were deployed to create the variations of the encoded sequences: High Efficiency Video Coding (HEVC) Test Model (HM), AOMedia Video 1 (AV1), and Versatile Video Coding (VVC) Test Model (VTM). Nine source video sequences were carefully selected to offer both diversity and representativeness in the spatio-temporal domain. Different spatial resolution versions of the sequences were created and encoded by all three codecs at pre-defined target bit rates. The compression efficiency of the codecs was evaluated with commonly used objective quality metrics, and the subjective quality of their reconstructed content was also evaluated through psychophysical experiments. Additionally, an adaptive bit rate (convex hull rate-distortion optimization across spatial resolutions) test case was assessed using both objective and subjective evaluations. Finally, the computational complexities of the tested codecs were examined. All data have been made publicly available as part of the dataset, which can be used for coding performance evaluation and video quality metric development. △ Less

Submitted 21 February, 2022; v1 submitted 23 March, 2020; originally announced March 2020.

Showing 1–16 of 16 results for author: Katsenou, A