Search | arXiv e-print repository

Modeling the Energy Consumption of the HEVC Software Encoding Process using Processor events

Authors: Geetha Ramasubbu, André Kaup, Christian Herglotz

Abstract: Developing energy-efficient video encoding algorithms is highly important due to the high processing complexities and, consequently, the high energy demand of the encoding process. To accomplish this, the energy consumption of the video encoders must be studied, which is only possible with a complex and dedicated energy measurement setup. This emphasizes the need for simple energy estimation models, which estimate the energy required for the encoding. Our paper investigates the possibility of estimating the energy demand of a HEVC software CPU-encoding process using processor events. First, we perform energy measurements and obtain processor events using dedicated profiling software. Then, by using the measured energy demand of the encoding process and profiling data, we build an encoding energy estimation model that uses the processor events of the ultrafast encoding preset to obtain the energy estimate for complex encoding presets with a mean absolute percentage error of 5.36% when averaged over all the presets. Additionally, we present an energy model that offers the possibility of obtaining energy distribution among various encoding sub-processes.

Submitted 3 October, 2024; v1 submitted 1 October, 2024; originally announced October 2024.

arXiv:2410.00533 [pdf, ps, other]

Design Space Exploration at Frame-Level for Joint Decoding Energy and Quality Optimization in VVC

Authors: Teresa Stürzenhofäcker, Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: In the pursuit of a reduced energy demand of VVC decoders, it was found that the coding tool configuration has a substantial influence on the bit rate efficiency and the decoding energy demand. The Advanced Design Space Exploration algorithm, as proposed in the literature, can derive coding tool configurations that provide optimal trade-offs between rate and energy efficiency. Yet, some trade-off points in the design space cannot be reached with the state-of-the-art methodology, which defines coding tools for an entire bitstream. This work proposes a novel, granular adjustment of the coding tool usage in VVC. Consequently, the optimization algorithm is adjusted to explore coding tool configurations that operate on frame-level. Moreover, new optimization criteria are introduced to focus the search on specific bit rates. As a result, coding tool configurations are obtained which yield so far inaccessible trade-offs between bit rate efficiency and decoding energy demand for VVC-coded sequences. The proposed methodology extends the design space and enhances the continuity of the Pareto front.

Submitted 1 October, 2024; originally announced October 2024.

Comments: submitted, accepted and published at EuSipCo 2024, Special Session on Frugality for Video Streaming

Journal ref: EuSipCo 2024, ISBN: 978-9-4645-9361-7

arXiv:2408.00052 [pdf, other]

doi 10.1109/QoMEX61742.2024.10598281

Exploiting Change Blindness for Video Coding: Perspectives from a Less Promising User Study

Authors: Mitra Amiri, Steven Le Moan, Christian Herglotz

Abstract: What the human visual system can perceive is strongly limited by the capacity of our working memory and attention. Such limitations result in the human observer's inability to perceive large-scale changes in a stimulus, a phenomenon known as change blindness. In this paper, we started with the premise that this phenomenon can be exploited in video coding, especially HDR-video compression where the bitrate is high. We designed an HDR-video encoding approach that relies on spatially and temporally varying quantization parameters within the framework of HEVC video encoding. In the absence of a reliable change blindness prediction model, to extract compression candidate regions (CCR) we used an existing saliency prediction algorithm. We explored different configurations and carried out a subjective study to test our hypothesis. While our methodology did not lead to significantly superior performance in terms of the ratio between perceived quality and bitrate, we were able to determine potential flaws in our methodology, such as the employed saliency model for CCR prediction (chosen for computational efficiency, but eventually not sufficiently accurate), as well as a very strong subjective bias due to observers priming themselves early on in the experiment about the type of artifacts they should look for, thus creating a scenario with little ecological validity.

Submitted 31 July, 2024; originally announced August 2024.

Comments: 16th International Conference on Quality of Multimedia Experience (QoMEX) 2024

arXiv:2407.05900 [pdf, other]

SVT-AV1 Encoding Bitrate Estimation Using Motion Search Information

Authors: Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, Christian Herglotz, André Kaup

Abstract: Enabling high compression efficiency while keeping encoding energy consumption at a low level requires prioritization of which videos need more sophisticated encoding techniques. However, the effects vary highly based on the content, and information on how well a video can be compressed is required. This can be measured by estimating the encoded bitstream size prior to encoding. We identified that the errors between estimated motion vectors from Motion Search, an algorithm that predicts temporal changes in videos, correlate well with the encoded bitstream size. Combining Motion Search with Random Forests, the encoding bitrate can be estimated with a Pearson correlation of above 0.96.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 5 pages, 4 figures, accepted for European Signal Processing Conference (EUSIPCO) 2024

arXiv:2406.11492 [pdf, ps, other]

Energy Reduction Opportunities in HDR Video Encoding

Authors: Christian Herglotz, Steven Le Moan, Alexandre Mercat

Abstract: This paper investigates the energy consumption of video encoding for high dynamic range videos. Specifically, we compare the energy consumption of the compression process using 10-bit input sequences, a tone-mapped 8-bit input sequence at 10-bit internal bit depth, and encoding an 8-bit input sequence using an encoder with an internal bit depth of 8 bit. We find that linear scaling of the luminance and chrominance values leads to degradations of the visual quality, but that significant encoding complexity and thus encoding energy can be saved. An important reason for this is the availability of vector instructions, which are not available for the 10-bit encoder. Furthermore, we find that at sufficiently low target bitrates, the compression efficiency at an internal bit depth of 8 bit exceeds the compression efficiency of regular 10-bit encoding.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 7 pages, 5 figures, 1 table

arXiv:2405.17866 [pdf, ps, other]

doi 10.1109/QoMEX61742.2024.10598269

Towards Video Codec Performance Evaluation: A Rate-Energy-Distortion Perspective

Authors: Geetha Ramasubbu, André Kaup, Christian Herglotz

Abstract: The Bjøntegaard Delta rate (BD-rate) objectively assesses the coding efficiency of video codecs using the rate-distortion (R-D) performance but overlooks encoding energy, which is crucial in practical applications, especially for those on handheld devices. Although R-D analysis can be extended to incorporate encoding energy as energy-distortion (E-D), it fails to integrate all three parameters seamlessly. This work proposes a novel approach to address this limitation by introducing a 3D representation of rate, encoding energy, and distortion through surface fitting. In addition, we evaluate various surface fitting techniques based on their accuracy and investigate the proposed 3D representation and its projections. The overlapping areas in projections help in encoder selection and recommend avoiding the slow presets of the older encoders (x264, x265), as the recent encoders (x265, VVenC) offer higher quality for the same bitrate-energy performance and provide a lower rate for the same energy-distortion performance.

Submitted 1 October, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

Comments: Proc. 2024 16th International Conference on Quality of Multimedia Experience (QoMEX)

Journal ref: 2024 16th International Conference on Quality of Multimedia Experience (QoMEX)

arXiv:2402.09926 [pdf, other]

Energy Demand Prediction for Hardware Video Decoders Using Software Profiling

Authors: Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: Energy efficiency for video communications is essential for mobile devices with a limited battery capacity. Therefore, hardware decoder implementations are commonly used to significantly reduce the energetic load of video playback. The energy consumption of such a hardware implementation largely depends on a previously published specification of a video coding standard that defines which coding tools and methods are included. However, during the standardization of a video coding standard, the energy demand of a hardware implementation is unknown. Hence, the hardware complexity of coding tools is judged subjectively by experts from the field of hardware programming without using standardized assessment procedures. To solve this problem, we propose a method that accurately models the energy demand of existing hardware decoders with an average error of 1.79% by exploiting information from software decoder profiling. Motivated by the low estimation error, we propose a hardware decoding energy metric that can predict and estimate the energy demand of an unknown hardware implementation using information from existing hardware decoder implementations and available software implementations of the future video decoder. By using multiple video coding standards for model training, we can predict the relative energy demand of an unknown hardware decoder with a minimum error of 4.54% without using the corresponding hardware decoder for training.

Submitted 12 December, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: Submitted to IEEE Journal on Consumer Electronics

arXiv:2402.09001 [pdf, other]

A Comprehensive Review of Software and Hardware Energy Efficiency of Video Decoders

Authors: Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: Energy and compression efficiency are two essential parts of modern video decoder implementations that have to be considered. This work comprehensively studies the following six video coding formats regarding compression and decoding energy efficiency: AVC, VP9, HEVC, AV1, VVC, and AVM. We first evaluate the energy demand of reference and optimized software decoder implementations. Furthermore, we consider the influence of the usage of SIMD instructions on those decoder implementations. We find that AV1 is a sweet spot for optimized software decoder implementations with an additional energy demand of 16.55% and bitrate savings of -43.95% compared to VP9. We furthermore evaluate the hardware decoding energy demand of four video coding formats. Thereby, we show that AV1 has an energy demand increase of 117.50% compared to VP9. For HEVC, we found a sweet spot in terms of energy demand with an increase of 6.06% with respect to VP9. Relative to their optimized software counterparts, hardware video decoders reduce the energy consumption to less than 9%.

Submitted 14 February, 2024; originally announced February 2024.

Comments: accepted as a conference paper for Picture Coding Symposium (PCS) 2024

arXiv:2401.16067 [pdf, other]

Encoding Time and Energy Model for SVT-AV1 based on Video Complexity

Authors: Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, Christian Herglotz, André Kaup

Abstract: The share of online video traffic in global carbon dioxide emissions is growing steadily. To comply with the demand for video media, dedicated compression techniques are continuously optimized, but at the expense of increasingly higher computational demands and thus rising energy consumption at the video encoder side. In order to find the best trade-off between compression and energy consumption, modeling encoding energy for a wide range of encoding parameters is crucial. We propose an encoding time and energy model for SVT-AV1 based on empirical relations between the encoding time and video parameters as well as encoder configurations. Furthermore, we model the influence of video content by established content descriptors such as spatial and temporal information. We then use the predicted encoding time to estimate the required energy demand and achieve a prediction error of 19.6 % for encoding time and 20.9 % for encoding energy.

Submitted 30 January, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 5 pages, 1 figure, accepted for IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2024

arXiv:2310.17346 [pdf, ps, other]

Extended Signaling Methods for Reduced Video Decoder Power Consumption Using Green Metadata

Authors: Christian Herglotz, Matthias Kränzler, Xixue Chu, Edouard Francois, Yong He, André Kaup

Abstract: In this paper, we discuss one aspect of the latest MPEG standard edition on energy-efficient media consumption, also known as Green Metadata (ISO/IEC 23001-11), which is the interactive signaling for remote decoder-power reduction for peer-to-peer video conferencing. In this scenario, the receiver of a video, e.g., a battery-driven portable device, can send a dedicated request to the sender which asks for a video bitstream representation that is less complex to decode and process. Consequently, the receiver saves energy and extends operating times. We provide an overview on latest studies from the literature dealing with energy-saving aspects, which motivate the extension of the legacy Green Metadata standard. Furthermore, we explain the newly introduced syntax elements and verify their effectiveness by performing dedicated experiments. We show that the integration of these syntax elements can lead to dynamic energy savings of up to 90% for software video decoding and 80% for hardware video decoding, respectively.

Submitted 26 October, 2023; originally announced October 2023.

Comments: 5 pages, 2 figures

arXiv:2309.06945 [pdf, ps, other]

doi 10.1109/ISM.2018.00063

Improving HEVC Encoding of Rendered Video Data Using True Motion Information

Authors: Christian Herglotz, David Müller, Andreas Weinlich, Frank Bauer, Michael Ortner, Marc Stamminger, André Kaup

Abstract: This paper shows that motion vectors representing the true motion of an object in a scene can be exploited to improve the encoding process of computer generated video sequences. Therefore, a set of sequences is presented for which the true motion vectors of the corresponding objects were generated on a per-pixel basis during the rendering process. In addition to conventional motion estimation methods, it is proposed to exploit the computer generated motion vectors to enhance the rate-distortion performance. To this end, a motion vector mapping method including disocclusion handling is presented. It is shown that mean rate savings of 3.78% can be achieved.

Submitted 13 September, 2023; originally announced September 2023.

Comments: 4 pages, 4 figures

Journal ref: Proc. 2018 IEEE International Symposium on Multimedia (ISM)

arXiv:2308.06570 [pdf, other]

doi 10.1109/QoMEX48832.2020.9123140

On Versatile Video Coding at UHD with Machine-Learning-Based Super-Resolution

Authors: Kristian Fischer, Christian Herglotz, André Kaup

Abstract: Coding 4K data has become of vital interest in recent years, since the amount of 4K data is significantly increasing. We propose a coding chain with spatial down- and upscaling that combines the next-generation VVC codec with machine learning based single image super-resolution algorithms for 4K. The investigated coding chain, which spatially downscales the 4K data before coding, shows superior quality compared to the conventional VVC reference software for low bitrate scenarios. Throughout several tests, we find that up to 12 % and 18 % Bjontegaard delta rate gains can be achieved on average when coding 4K sequences with VVC and QP values above 34 and 42, respectively. Additionally, the investigated scenario with up- and downscaling helps to reduce the loss of details and compression artifacts, as it is shown in a visual example.

Submitted 12 August, 2023; originally announced August 2023.

Comments: Originally published as conference paper at QoMEX 2020

arXiv:2307.14000 [pdf, ps, other]

doi 10.1109/ICIP.2017.8296731

Video Decoding Energy Estimation Using Processor Events

Authors: Christian Herglotz, André Kaup

Abstract: In this paper, we show that processor events like instruction counts or cache misses can be used to accurately estimate the processing energy of software video decoders. Therefore, we perform energy measurements on an ARM-based evaluation platform and count processor level events using a dedicated profiling software. Measurements are performed for various codecs and decoder implementations to prove the general viability of our observations. Using the estimation method proposed in this paper, the true decoding energy for various recent video coding standards including HEVC and VP9 can be estimated with a mean estimation error that is smaller than 6%.

Submitted 26 July, 2023; originally announced July 2023.

Comments: 5 pages, 2 figures

Journal ref: IEEE International Conference on Image Processing (ICIP), Beijing, China, 2017, pp. 2493-2497

arXiv:2307.08354 [pdf, ps, other]

doi 10.1109/TCE.2021.3122076

Component-wise Power Estimation of Electrical Devices Using Thermal Imaging

Authors: Christian Herglotz, Simon Grosche, Akarsh Bharadwaj, André Kaup

Abstract: This paper presents a novel method to estimate the power consumption of distinct active components on an electronic carrier board by using thermal imaging. The components and the board can be made of heterogeneous material such as plastic, coated microchips, and metal bonds or wires, where a special coating for high emissivity is not required. The thermal images are recorded when the components on the board are dissipating power. In order to enable reliable estimates, a segmentation of the thermal image must be available that can be obtained by manual labeling, object detection methods, or exploiting layout information. Evaluations show that with low-resolution consumer infrared cameras and dissipated powers larger than 300mW, mean estimation errors of 10% can be achieved.

Submitted 18 July, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: 10 pages, 8 figures

Journal ref: IEEE Transactions on Consumer Electronics, vol. 67, no. 4, pp. 383-392, Nov. 2021

arXiv:2307.08344 [pdf, ps, other]

doi 10.1109/ICIP.2019.8803759

Efficient coding of 360° videos exploiting inactive regions in projection formats

Authors: Christian Herglotz, Mohammadreza Jamali, Stéphane Coulombe, Carlos Vazquez, Ahmad Vakili

Abstract: This paper presents an efficient method for encoding common projection formats in 360° video coding, in which we exploit inactive regions. These regions are ignored in the reconstruction of the equirectangular format or the viewport in virtual reality applications. As the content of these pixels is irrelevant, we neglect the corresponding pixel values in rate-distortion optimization, residual transformation, as well as in-loop filtering and achieve bitrate savings of up to 10%.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 5 pages, 4 figures

Journal ref: 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 2019, pp. 1104-1108

arXiv:2307.08338 [pdf, ps, other]

doi 10.1109/ISCE.2019.8901018

Power Modeling for Virtual Reality Video Playback Applications

Authors: Christian Herglotz, Stéphane Coulombe, Ahmad Vakili, André Kaup

Abstract: This paper proposes a method to evaluate and model the power consumption of modern virtual reality playback and streaming applications on smartphones. Due to the high computational complexity of the virtual reality processing toolchain, the corresponding power consumption is very high, which reduces operating times of battery-powered devices. To tackle this problem, we analyze the power consumption in detail by performing power measurements. Furthermore, we construct a model to estimate the true power consumption with a mean error of less than 3.5%. The model can be used to save power at critical battery levels by changing the streaming video parameters. Particularly, the results show that the power consumption is significantly reduced by decreasing the input video resolution.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 6 pages, 5 figures

Journal ref: 2019 IEEE 23rd International Symposium on Consumer Technologies (ISCT), Ancona, Italy, 2019, pp. 105-110

arXiv:2307.08337 [pdf, ps, other]

doi 10.1109/ICCE-Berlin47944.2019.8966177

Power-Efficient Video Streaming on Mobile Devices Using Optimal Spatial Scaling

Authors: Christian Herglotz, André Kaup, Stéphane Coulombe, Ahmad Vakili

Abstract: This paper derives optimal spatial scaling and rate control parameters for power-efficient wireless video streaming on portable devices. A video streaming application is studied, which receives a high-resolution and high-quality video stream from a remote server and displays the content to the end-user. We show that the resolution of the input video can be adjusted such that the quality-power trade-off is optimized. Making use of a power model from the literature and subjective quality evaluation using a perceptual metric, we derive optimal combinations of the scaling factor and the rate-control parameter for encoding. For HD sequences, up to 10% of power can be saved at negligible quality losses and up to 15% of power can be saved at tolerable distortions. To show general validity, the method was tested for Wi-Fi and a mobile network as well as for two different smartphones.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 6 pages, 7 figures

Journal ref: Proc. IEEE 9th International Conference on Consumer Electronics (ICCE-Berlin), Berlin, Germany, 2019, pp. 233-238

arXiv:2307.05208 [pdf, other]

Encoder Complexity Control in SVT-AV1 by Speed-Adaptive Preset Switching

Authors: Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, André Kaup, Christian Herglotz

Abstract: Current developments in video encoding technology lead to continuously improving compression performance but at the expense of increasingly higher computational demands. Regarding the online video traffic increases during the last years and the concomitant need for video encoding, encoder complexity control mechanisms are required to restrict the processing time to a sufficient extent in order to find a reasonable trade-off between performance and complexity. We present a complexity control mechanism in SVT-AV1 by using speed-adaptive preset switching to comply with the remaining time budget. This method enables encoding with a user-defined time constraint within the complete preset range with an average precision of 8.9% without introducing any additional latencies.

Submitted 11 July, 2023; originally announced July 2023.

Comments: 5 pages, 2 figures, accepted for IEEE International Conference on Image Processing (ICIP) 2023

arXiv:2306.16755 [pdf, ps, other]

Processing Energy Modeling for Neural Network Based Image Compression

Authors: Christian Herglotz, Fabian Brand, Andy Regensky, Felix Rievel, André Kaup

Abstract: Nowadays, the compression performance of neural-network-based image compression algorithms outperforms state-of-the-art compression approaches such as JPEG or HEIC-based image compression. Unfortunately, most neural-network-based compression methods are executed on GPUs and consume a high amount of energy during execution. Therefore, this paper performs an in-depth analysis on the energy consumption of state-of-the-art neural-network-based compression methods on a GPU and shows that the energy consumption of compression networks can be estimated using the image size with mean estimation errors of less than 7%. Finally, using a correlation analysis, we find that the number of operations per pixel is the main driving force for energy consumption and deduce that the network layers up to the second downsampling step are consuming most energy.

Submitted 29 June, 2023; originally announced June 2023.

Comments: 5 pages, 3 figures, accepted for IEEE International Conference on Image Processing (ICIP) 2023

arXiv:2306.13694 [pdf, other]

doi 10.1109/ICIP49359.2023.10222661

Motion Plane Adaptive Motion Modeling for Spherical Video Coding in H.266/VVC

Authors: Andy Regensky, Christian Herglotz, André Kaup

Abstract: Motion compensation is one of the key technologies enabling the high compression efficiency of modern video coding standards. To allow compression of spherical video content, special mapping functions are required to project the video to the 2D image plane. Distortions inevitably occurring in these mappings impair the performance of classical motion models. In this paper, we propose a novel motion plane adaptive motion modeling technique (MPA) for spherical video that allows to perform motion compensation on different motion planes in 3D space instead of having to work on the - in theory arbitrarily mapped - 2D image representation directly. The integration of MPA into the state-of-the-art H.266/VVC video coding standard shows average Bjøntegaard Delta rate savings of 1.72% with a peak of 3.37% based on PSNR and 1.55% with a peak of 2.92% based on WS-PSNR compared to VTM-14.2.

Submitted 23 June, 2023; originally announced June 2023.

Comments: 5 pages, 4 figures, 1 table, accepted for IEEE International Conference on Image Processing 2023 (IEEE ICIP 2023). arXiv admin note: substantial text overlap with arXiv:2202.03323

arXiv:2306.06917 [pdf, ps, other]

doi 10.1145/3593908.3593948

Video Decoding Energy Reduction Using Temporal-Domain Filtering

Authors: Christian Herglotz, Matthias Kränzler, Robert Ludwig, André Kaup

Abstract: In this paper, we study decoding energy reduction opportunities using temporal-domain filtering and subsampling methods. In particular, we study spatiotemporal filtering using a contrast sensitivity function and temporal downscaling, i.e., frame rate reduction. We apply these concepts as a pre-filtering to the video before compression and evaluate the bitrate, the decoding energy, and the visual quality with a dedicated metric targeting temporally down-scaled sequences. We find that decoding energy savings yield 35% when halving the frame rate and that spatiotemporal filtering can lead to up to 5% of additional savings, depending on the content.

Submitted 12 June, 2023; originally announced June 2023.

Comments: 6 pages, 5 figures

arXiv:2305.15117 [pdf, other]

Power Reduction Opportunities on End-User Devices in Quality-Steady Video Streaming

Authors: Christian Herglotz, Werner Robitza, Alexander Raake, Tobias Hossfeld, André Kaup

Abstract: This paper uses a crowdsourced dataset of online video streaming sessions to investigate opportunities to reduce the power consumption while considering QoE. For this, we base our work on prior studies which model both the end-user's QoE and the end-user device's power consumption with the help of high-level video features such as the bitrate, the frame rate, and the resolution. On top of existing research, which focused on reducing the power consumption at the same QoE optimizing video parameters, we investigate potential power savings by other means such as using a different playback device, a different codec, or a predefined maximum quality level. We find that based on the power consumption of the streaming sessions from the crowdsourcing dataset, devices could save more than 55% of power if all participants adhere to low-power settings.

Submitted 24 May, 2023; originally announced May 2023.

Comments: 4 pages, 3 figures

arXiv:2304.12852 [pdf, ps, other]

doi 10.1109/TIP.2023.3346695

The Bjøntegaard Bible -- Why your Way of Comparing Video Codecs May Be Wrong

Authors: Christian Herglotz, Hannah Och, Anna Meyer, Geetha Ramasubbu, Lena Eichermüller, Matthias Kränzler, Fabian Brand, Kristian Fischer, Dat Thanh Nguyen, Andy Regensky, André Kaup

Abstract: In this paper, we provide an in-depth assessment of the Bjøntegaard Delta. We construct a large data set of video compression performance comparisons using a diverse set of metrics including PSNR, VMAF, bitrate, and processing energies. These metrics are evaluated for visual data types such as classic perspective video, 360° video, point clouds, and screen content. As compression technology, we consider multiple hybrid video codecs as well as state-of-the-art neural network based compression methods. Using additional supporting points in between standard points defined by parameters such as the quantization parameter, we assess the interpolation error of the Bjøntegaard-Delta (BD) calculus and its impact on the final BD value. From the analysis, we find that the BD calculus is most accurate in the standard application of rate-distortion comparisons with mean errors below 0.5 percentage points. For other applications and special cases, e.g., VMAF quality, energy considerations, or inter-codec comparisons, the errors are higher (up to 5 percentage points), but can be halved by using a higher number of supporting points. We finally come up with recommendations on how to use the BD calculus such that the validity of the resulting BD-values is maximized. Main recommendations are as follows: First, relative curve differences should be plotted and analyzed. Second, the logarithmic domain should be used for saturating metrics such as SSIM and VMAF. Third, BD values below a certain threshold indicated by the subset error should not be used to draw recommendations. Fourth, using two supporting points is sufficient to obtain rough performance estimates.

Submitted 22 December, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

Comments: 21 pages, 14 figures

arXiv:2301.08533 [pdf, other]

doi 10.1109/ICIP46576.2022.9897987

Learning Frequency-Specific Quantization Scaling in VVC for Standard-Compliant Task-driven Image Coding

Authors: Kristian Fischer, Fabian Brand, Christian Herglotz, André Kaup

Abstract: Today, visual data is often analyzed by a neural network without any human being involved, which demands for specialized codecs. For standard-compliant codec adaptations towards certain information sinks, HEVC or VVC provide the possibility of frequency-specific quantization with scaling lists. This is a well-known method for the human visual system, where scaling lists are derived from psycho-visual models. In this work, we employ scaling lists when performing VVC intra coding for neural networks as information sink. To this end, we propose a novel data-driven method to obtain optimal scaling lists for arbitrary neural networks. Experiments with Mask R-CNN as information sink reveal that coding the Cityscapes dataset with the proposed scaling lists result in peak bitrate savings of 8.9 % over VVC with constant quantization. By that, our approach also outperforms scaling lists optimized for the human visual system. The generated scaling lists can be found under https://github.com/FAU-LMS/VCM_scaling_lists.

Submitted 20 January, 2023; originally announced January 2023.

Comments: Originally submitted at IEEE ICIP 2022

ACM Class: I.4.2

Journal ref: ICIP2022

arXiv:2212.05609 [pdf, ps, other]

doi 10.1109/PCS56426.2022.10018048

A Bit Stream Feature-Based Energy Estimator for HEVC Software Encoding

Authors: Geetha Ramasubbu, André Kaup, Christian Herglotz

Abstract: The total energy consumption of today's video coding systems is globally significant and emphasizes the need for sustainable video coder applications. To develop such sustainable video coders, the knowledge of the energy consumption of state-of-the-art video coders is necessary. For that purpose, we need a dedicated setup that measures the energy of the encoding and decoding system. However, such measurements are costly and laborious. To this end, this paper presents an energy estimator that uses a subset of bit stream features to accurately estimate the energy consumption of the HEVC software encoding process. The proposed model reaches a mean estimation error of 4.88% when averaged over presets of the x265 encoder implementation. The results from this work help to identify properties of encoding energy-saving bit streams and, in turn, are useful for developing new energy-efficient video coding algorithms.

Submitted 1 October, 2024; v1 submitted 11 December, 2022; originally announced December 2022.

Comments: Proc. 2022 Picture Coding Symposium. arXiv admin note: substantial text overlap with arXiv:2207.02676

Journal ref: 2022 Picture Coding Symposium (PCS)

arXiv:2212.04324 [pdf, ps, other]

3-D mesh compensated wavelet lifting for 3-D+t medical CT data

Authors: Wolfgang Schnurrer, Thomas Richter, Jürgen Seiler, Christian Herglotz, André Kaup

Abstract: For scalable coding, a high quality of the lowpass band of a wavelet transform is crucial when it is used as a downscaled version of the original signal. However, blur and motion can lead to disturbing artifacts. By incorporating feasible compensation methods directly into the wavelet transform, the quality of the lowpass band can be improved. The displacement in dynamic medical 3-D+t volumes from Computed Tomography is mainly given by expansion and compression of tissue over time and can be modeled well by mesh-based methods. We extend a 2-D mesh-based compensation method to three dimensions to obtain a volume compensation method that can additionally compensate deforming displacements in the third dimension. We show that a 3-D mesh can obtain a higher quality of the lowpass band by 0.28 dB with less than 40% of the model parameters of a comparable 2-D mesh. Results from lossless coding with JPEG 2000 3D and SPECK3D show that the compensated subbands using a 3-D mesh need about 6% less data compared to using a 2-D mesh.

Submitted 8 December, 2022; originally announced December 2022.

Journal ref: IEEE International Conference on Image Processing (ICIP), 2014, pp. 3631-3635

arXiv:2210.05444 [pdf, other]

Modeling of Energy Consumption and Streaming Video QoE using a Crowdsourcing Dataset

Authors: Christian Herglotz, Werner Robitza, Matthias Kränzler, André Kaup, Alexander Raake

Abstract: In the past decade, we have witnessed an enormous growth in the demand for online video services. Recent studies estimate that nowadays, more than 1% of the global greenhouse gas emissions can be attributed to the production and use of devices performing online video tasks. As such, research on the true power consumption of devices and their energy efficiency during video streaming is highly important for a sustainable use of this technology. At the same time, over-the-top providers strive to offer high-quality streaming experiences to satisfy user expectations. Here, energy consumption and QoE partly depend on the same system parameters. Hence, a joint view is needed for their evaluation. In this paper, we perform a first analysis of both end-user power efficiency and Quality of Experience of a video streaming service. We take a crowdsourced dataset comprising 447,000 streaming events from YouTube and estimate both the power consumption and perceived quality. The power consumption is modeled based on previous work which we extended towards predicting the power usage of different devices and codecs. The user-perceived QoE is estimated using a standardized model. Our results indicate that an intelligent choice of streaming parameters can optimize both the QoE and the power efficiency of the end user device. Further, the paper discusses limitations of the approach and identifies directions for future research.

Submitted 11 October, 2022; originally announced October 2022.

Comments: 6 pages, 3 figures

Journal ref: Proc. 2022 14th International Conference on Quality of Multimedia Experience (QoMEX)

arXiv:2209.15405 [pdf, ps, other]

doi 10.1109/MCAS.2023.3234739

Sweet Streams are Made of This: The System Engineer's View on Energy Efficiency in Video Communications

Authors: Christian Herglotz, Matthias Kränzler, Robert Schober, André Kaup

Abstract: In recent years, the global use of online video services has increased rapidly. Today, a manifold of applications, such as video streaming, video conferencing, live broadcasting, and social networks, make use of this technology. A recent study found that the development and the success of these services had as a consequence that, nowadays, more than 1% of the global greenhouse-gas emissions are related to online video, with growth rates close to 10% per year. This article reviews the latest findings concerning energy consumption of online video from the system engineer's perspective, where the system engineer is the designer and operator of a typical online video service. We discuss all relevant energy sinks, highlight dependencies with quality-of-service variables as well as video properties, review energy consumption models for different devices from the literature, and aggregate these existing models into a global model for the overall energy consumption of a generic online video service. Analyzing this model and its implications, we find that end-user devices and video encoding have the largest potential for energy savings. Finally, we provide an overview of recent advances in energy efficiency improvement for video streaming and propose future research directions for energy-efficient video streaming services.

Submitted 30 September, 2022; originally announced September 2022.

Comments: 16 pages, 5 figures, accepted for IEEE Circuits and Systems Magazine

arXiv:2209.10353 [pdf, ps, other]

doi 10.1109/QoMEX48832.2020.9123084

Matched Quality Evaluation of Temporally Downsampled Videos with Non-Integer Factors

Authors: Christian Herglotz, Geetha Ramasubbu, André Kaup

Abstract: Recent research has shown that temporal downsampling of high-frame-rate sequences can be exploited to improve the rate-distortion performance in video coding. However, until now, research only targeted downsampling factors of powers of two, which greatly restricts the potential applicability of temporal downsampling. A major reason is that traditional, objective quality metrics such as peak signal-to-noise ratio or more recent approaches, which try to mimic subjective quality, can only be evaluated between two sequences whose frame rate ratio is an integer value. To relieve this problem, we propose a quality evaluation method that allows calculating the distortion between two sequences whose frame rate ratio is fractional. The proposed method can be applied to any full-reference quality metric.

Submitted 21 September, 2022; originally announced September 2022.

Comments: Proc. 12th International Conference on Quality of Multimedia Experience (QoMEX)

arXiv:2209.10283 [pdf, ps, other]

doi 10.1109/MMSP48831.2020.9287098

A Comparative Analysis of the Time and Energy Demand of Versatile Video Coding and High Efficiency Video Coding Reference Decoders

Authors: Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: This paper investigates the decoding energy and decoding time demand of VTM-7.0 in relation to HM-16.20. We present the first detailed comparison of two video codecs in terms of software decoder energy consumption. The evaluation shows that the energy demand of the VTM decoder is increased significantly compared to HM and that the increase depends on the coding configuration. For the coding configuration randomaccess, we find that the decoding energy is increased by over 80% at a decoding time increase of over 70%. Furthermore, results indicate that the energy demand increases by up to 207% when Single Instruction Multiple Data (SIMD) instructions are disabled, which corresponds to the HM implementation style. By measurements, it is revealed that the coding tools MIP, AMVR, TPM, LFNST, and MTS increase the energy efficiency of the decoder. Furthermore, we propose a new coding configuration based on our analysis, which reduces the energy demand of the VTM decoder by over 17% on average.

Submitted 21 September, 2022; originally announced September 2022.

Comments: in Proc. 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)

arXiv:2209.10268 [pdf, other]

doi 10.1109/PCS48520.2019.8954563

Extending Video Decoding Energy Models for 360° and HDR Video Formats in HEVC

Authors: Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: Research has shown that decoder energy models are helpful tools for improving the energy efficiency in video playback applications. For example, an accurate feature-based bit stream model can reduce the energy consumption of the decoding process. However, until now only sequences of the SDR video format were investigated. Therefore, this paper shows that the decoding energy of HEVC-coded bit streams can be estimated precisely for different video formats and coding bit depths. Therefore, we compare a state-of-the-art model from the literature with a proposed model. We show that bit streams of the 360°, HDR, and fisheye video format can be estimated with a mean estimation error lower than 3.88% if the setups have the same coding bit depth. Furthermore, it is shown that on average, the energy demand for the decoding of bit streams with a bit depth of 10-bit is 55% higher than with 8-bit.

Submitted 21 September, 2022; originally announced September 2022.

Comments: in Proc. Picture Coding Symposium (PCS) 2019

arXiv:2209.10266 [pdf, other]

doi 10.1109/ICIP40778.2020.9190840

Decoding Energy Modeling For Versatile Video Coding

Authors: Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: In previous research, it was shown that the software decoding energy demand of High Efficiency Video Coding (HEVC) can be reduced by 15% by using a decoding-energy-rate-distortion optimization algorithm. To achieve this, the energy demand of the decoder has to be modeled by a bit stream feature-based model with sufficiently high accuracy. Therefore, we propose two bit stream feature-based models for the upcoming Versatile Video Coding (VVC) standard. The newly introduced models are compared with models from literature, which are used for HEVC. An evaluation of the proposed models reveals that the mean estimation error is similar to the results of the literature and yields an estimation error of 1.85% with 10-fold cross-validation.

Submitted 21 September, 2022; originally announced September 2022.

Comments: in Proc. 2020 IEEE International Conference on Image Processing (ICIP)

arXiv:2209.10211 [pdf, other]

doi 10.1109/PCS56426.2022.10018037

Advanced Design Space Exploration for Joint Energy and Quality Optimization for VVC

Authors: Matthias Kränzler, Christian Herglotz, André Kaup

Abstract: In recent studies, it could be shown that the energy demand of Versatile Video Coding (VVC) decoders can be twice as high as comparable High Efficiency Video Coding (HEVC) decoders. A significant part of this increase in complexity is attributed to the usage of new coding tools. By using a design space exploration algorithm, it was shown that the energy demand of VVC-coded sequences could be reduced if different coding tool profiles were used for the encoding process. This work extends the algorithm with several optimization strategies, methodological adjustments to optimize perceptual quality, and a new minimization criterion. As a result, we significantly improve the Pareto front, and the rate-distortion and energy efficiency of the state-of-the-art design space exploration. Therefore, we show an energy demand reduction of up to 47% with less than 30% additional bit rate, or a reduction of over 35% with approximately 6% additional bit rate.

Submitted 21 September, 2022; originally announced September 2022.

Comments: accepted as a conference paper for Picture Coding Symposium (PCS) 2022

arXiv:2207.02676 [pdf, ps, other]

doi 10.1109/ICIP46576.2022.9897306

Modeling the HEVC Encoding Energy Using the Encoder Processing Time

Authors: Geetha Ramasubbu, André Kaup, Christian Herglotz

Abstract: The global significance of energy consumption of video communication renders research on the energy need of video coding an important task. To do so, usually, a dedicated setup is needed that measures the energy of the encoding and decoding system. However, such measurements are costly and complex. To this end, this paper presents the results of an exhaustive measurement series using the x265 encoder implementation of HEVC and analyzes the relation between encoding time and encoding energy. Finally, we introduce a simple encoding energy estimation model which employs the encoding time of a lightweight encoding process to estimate the encoding energy of complex encoding configurations. The proposed model reaches a mean estimation error of 11.35% when averaged over all presets. The results from this work are useful when the encoding energy estimate is required to develop new energy-efficient video compression algorithms.

Submitted 1 October, 2024; v1 submitted 6 July, 2022; originally announced July 2022.

Comments: Proc. 2022 IEEE International Conference on Image Processing (ICIP)

Journal ref: 2022 IEEE International Conference on Image Processing (ICIP)

arXiv:2206.13483 [pdf, other]

doi 10.1109/ICIP46576.2022.9897896

Optimized Decoding-Energy-Aware Encoding in Practical VVC Implementations

Authors: Matthias Kränzler, Adam Wieckowski, Geetha Ramasubbu, Benjamin Bross, André Kaup, Detlev Marpe, Christian Herglotz

Abstract: The optimization of the energy demand is crucial for modern video codecs. Previous studies show that the energy demand of VVC decoders can be improved by more than 50% if specific coding tools are disabled in the encoder. However, those approaches increase the bit rate by over 20% if the concept is applied to practical encoder implementations such as VVenC. Therefore, in this work, we investigate VVenC and study possibilities to reduce the additional bit rate, while still achieving low-energy decoding at reasonable encoding times. We show that encoding using our proposed coding tool profiles, the decoding energy efficiency is improved by over 25% with a bit rate increase of less than 5% with respect to standard encoding. Furthermore, we propose a second coding tool profile targeting maximum energy savings, which achieves 34% of energy savings at bitrate increases below 15%.

Submitted 27 June, 2022; originally announced June 2022.

arXiv:2206.12186 [pdf, other]

doi 10.1109/TCSVT.2022.3185026

Rate-Distortion Optimal Transform Coefficient Selection for Unoccupied Regions in Video-Based Point Cloud Compression

Authors: Christian Herglotz, Nils Genser, André Kaup

Abstract: This paper presents a novel method to determine rate-distortion optimized transform coefficients for efficient compression of videos generated from point clouds. The method exploits a generalized frequency selective extrapolation approach that iteratively determines rate-distortion-optimized coefficients for all basis functions of two-dimensional discrete cosine and sine transforms. The method is applied to blocks containing both occupied and unoccupied pixels in video based point cloud compression for HEVC encoding. In the proposed algorithm, only the values of the transform coefficients are changed such that resulting bit streams are compliant to the V-PCC standard. For all-intra coded point clouds, bitrate savings of more than 4% for geometry and more than 6% for texture error metrics with respect to standard encoding can be observed. These savings are more than twice as high as savings obtained using competing methods from literature. In the randomaccess case, our proposed method outperforms competing V-PCC methods by more than 0.5%.

Submitted 24 June, 2022; originally announced June 2022.

Comments: 14 pages, 9 figures

Journal ref: IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), early access, June 2022

arXiv:2205.06511 [pdf, other]

doi 10.1109/ICIP42928.2021.9506763

Analysis of Neural Image Compression Networks for Machine-to-Machine Communication

Authors: Kristian Fischer, Christian Forsch, Christian Herglotz, André Kaup

Abstract: Video and image coding for machines (VCM) is an emerging field that aims to develop compression methods resulting in optimal bitstreams when the decoded frames are analyzed by a neural network. Several approaches already exist improving classic hybrid codecs for this task. However, neural compression networks (NCNs) have made an enormous progress in coding images over the last years. Thus, it is reasonable to consider such NCNs, when the information sink at the decoder side is a neural network as well. Therefore, we build-up an evaluation framework analyzing the performance of four state-of-the-art NCNs, when a Mask R-CNN is segmenting objects from the decoded image. The compression performance is measured by the weighted average precision for the Cityscapes dataset. Based on that analysis, we find that networks with leaky ReLU as non-linearity and training with SSIM as distortion criteria results in the highest coding gains for the VCM task. Furthermore, it is shown that the GAN-based NCN architecture achieves the best coding performance and even out-performs the recently standardized Versatile Video Coding (VVC) for the given scenario.

Submitted 13 May, 2022; originally announced May 2022.

Comments: Originally submitted at IEEE ICIP 2021

ACM Class: I.4.2

Journal ref: IEEE International Conference on Image Processing (ICIP) 2021

arXiv:2205.06501 [pdf, other]

doi 10.1109/ISCAS51556.2021.9401621

Robust Deep Neural Object Detection and Segmentation for Automotive Driving Scenario with Compressed Image Data

Authors: Kristian Fischer, Christian Blum, Christian Herglotz, André Kaup

Abstract: Deep neural object detection or segmentation networks are commonly trained with pristine, uncompressed data. However, in practical applications the input images are usually deteriorated by compression that is applied to efficiently transmit the data. Thus, we propose to add deteriorated images to the training process in order to increase the robustness of the two state-of-the-art networks Faster and Mask R-CNN. Throughout our paper, we investigate an autonomous driving scenario by evaluating the newly trained models on the Cityscapes dataset that has been compressed with the upcoming video coding standard Versatile Video Coding (VVC). When employing the models that have been trained with the proposed method, the weighted average precision of the R-CNNs can be increased by up to 3.68 percentage points for compressed input images, which corresponds to bitrate savings of nearly 48 %.

Submitted 13 May, 2022; originally announced May 2022.

Comments: Originally submitted at IEEE ISCAS 2021

ACM Class: I.4.2

Journal ref: IEEE International Symposium on Circuits and Systems (ISCAS) 2021

arXiv:2204.10151 [pdf, ps, other]

doi 10.1109/PCS.2016.7906400

A Bitstream Feature Based Model for Video Decoding Energy Estimation

Authors: Christian Herglotz, Yongjun Wen, Bowen Dai, Matthias Kränzler, André Kaup

Abstract: In this paper we show that a small number of bit stream features can be used to accurately estimate the energy consumption of state-of-the-art software and hardware accelerated decoder implementations for four different video codecs. By testing the estimation performance on HEVC, H.264, H.263, and VP9 we show that the proposed model can be used for any hybrid video codec. We test our approach on a large number of different test sequences to prove the general validity. We show that fewer than 20 features are sufficient to obtain mean estimation errors that are smaller than 8%. Finally, an example will show the performance trade-offs in terms of rate, distortion, and decoding energy for all tested codecs.

Submitted 21 April, 2022; originally announced April 2022.

Comments: 5 pages, 2 figures, 2016 Picture Coding Symposium (PCS)

arXiv:2203.05944 [pdf, other]

doi 10.1109/ICASSP39728.2021.9415048

Saliency-Driven Versatile Video Coding for Neural Object Detection

Authors: Kristian Fischer, Felix Fleckenstein, Christian Herglotz, André Kaup

Abstract: Saliency-driven image and video coding for humans has gained importance in the recent past. In this paper, we propose such a saliency-driven coding framework for the video coding for machines task using the latest video coding standard Versatile Video Coding (VVC). To determine the salient regions before encoding, we employ the real-time-capable object detection network You Only Look Once (YOLO) in combination with a novel decision criterion. To measure the coding quality for a machine, the state-of-the-art object segmentation network Mask R-CNN was applied to the decoded frame. From extensive simulations we find that, compared to the reference VVC with a constant quality, up to 29 % of bitrate can be saved with the same detection accuracy at the decoder side by applying the proposed saliency-driven framework. Besides, we compare YOLO against other, more traditional saliency detection methods.

Submitted 11 March, 2022; originally announced March 2022.

Comments: 5 pages, 3 figures, 2 tables; Originally submitted at IEEE ICASSP 2021

ACM Class: I.4.2

Journal ref: IEEE ICASSP 2021

arXiv:2203.05927 [pdf, other]

doi 10.1109/ICIP40778.2020.9191023

On Intra Video Coding and In-loop Filtering for Neural Object Detection Networks

Authors: Kristian Fischer, Christian Herglotz, André Kaup

Abstract: Classical video coding for satisfying humans as the final user is a widely investigated field of studies for visual content, and common video codecs are all optimized for the human visual system (HVS). But are the assumptions and optimizations also valid when the compressed video stream is analyzed by a machine? To answer this question, we compared the performance of two state-of-the-art neural detection networks when being fed with deteriorated input images coded with HEVC and VVC in an autonomous driving scenario using intra coding. Additionally, the impact of the three VVC in-loop filters when coding images for a neural network is examined. The results are compared using the mean average precision metric to evaluate the object detection performance for the compressed inputs. Throughout these tests, we found that the Bjøntegaard Delta Rate savings with respect to PSNR of 22.2 % using VVC instead of HEVC cannot be reached when coding for object detection networks with only 13.6% in the best case. Besides, it is shown that disabling the VVC in-loop filters SAO and ALF results in bitrate savings of 6.4 % compared to the standard VTM at the same mean average precision.

Submitted 11 March, 2022; originally announced March 2022.

Comments: 5 pages, 6 figures, 2 tables; Originally published at IEEE ICIP 2020

ACM Class: I.4.2

Journal ref: IEEE ICIP 2020

arXiv:2203.05890 [pdf, other]

doi 10.1109/MMSP48831.2020.9287136

Video Coding for Machines with Feature-Based Rate-Distortion Optimization

Authors: Kristian Fischer, Fabian Brand, Christian Herglotz, André Kaup

Abstract: Common state-of-the-art video codecs are optimized to deliver a low bitrate by providing a certain quality for the final human observer, which is achieved by rate-distortion optimization (RDO). But, with the steady improvement of neural networks solving computer vision tasks, more and more multimedia data is not observed by humans anymore, but directly analyzed by neural networks. In this paper, we propose a standard-compliant feature-based RDO (FRDO) that is designed to increase the coding performance when the decoded frame is analyzed by a neural network in a video coding for machine scenario. To that end, we replace the pixel-based distortion metrics in conventional RDO of VTM-8.0 with distortion metrics calculated in the feature space created by the first layers of a neural network. Throughout several tests with the segmentation network Mask R-CNN and single images from the Cityscapes dataset, we compare the proposed FRDO and its hybrid version HFRDO with different distortion measures in the feature space against the conventional RDO. With HFRDO, up to 5.49 % bitrate can be saved compared to the VTM-8.0 implementation in terms of Bjøntegaard Delta Rate and using the weighted average precision as quality metric. Additionally, allowing the encoder to vary the quantization parameter results in coding gains for the proposed HFRDO of up to 9.95 % compared to conventional VTM.

Submitted 11 March, 2022; originally announced March 2022.

Comments: 6 pages, 7 figures, 2 tables; Originally published as conference paper at IEEE MMSP 2020

ACM Class: I.4.2

Journal ref: IEEE MMSP 2020

arXiv:2203.01782 [pdf, ps, other]

doi 10.1109/PCS.2016.7906327

Multi-Objective Design Space Exploration for the Optimization of the HEVC Mode Decision Process

Authors: Christian Herglotz, Rafael Rosales, Michael Glass, Jürgen Teich, André Kaup

Abstract: Finding the best possible encoding decisions for compressing a video sequence is a highly complex problem. In this work, we propose a multi-objective Design Space Exploration (DSE) method to automatically find HEVC encoder implementations that are optimized for several different criteria. The DSE shall optimize the coding mode evaluation order of the mode decision process and jointly explore early skip conditions to minimize the four objectives a) bitrate, b) distortion, c) encoding time, and d) decoding energy. In this context, we use a SystemC-based actor model of the HM test model encoder for the evaluation of each explored solution. The evaluation that is based on real measurements shows that our framework can automatically generate encoder solutions that save more than 60% of encoding time or 3% of decoding energy when accepting bitrate increases of around 3%.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 5 pages, 4 figures, 2016 Picture Coding Symposium (PCS)

arXiv:2203.01771 [pdf, other]

doi 10.1109/IPDPSW.2015.58

Estimation of Non-Functional Properties for Embedded Hardware with Application to Image Processing

Authors: Christian Herglotz, Jürgen Seiler, André Kaup, Arne Hendricks, Marc Reichenbach, Dietmar Fey

Abstract: In recent years, due to a higher demand for portable devices, which provide restricted amounts of processing capacity and battery power, the need for energy and time efficient hard- and software solutions has increased. Preliminary estimations of time and energy consumption can thus be valuable to improve implementations and design decisions. To this end, this paper presents a method to estimate the time and energy consumption of a given software solution, without having to rely on the use of a traditional Cycle Accurate Simulator (CAS). Instead, we propose to utilize a combination of high-level functional simulation with a mechanistic extension to include non-functional properties: Instruction counts from virtual execution are multiplied with corresponding specific energies and times. By evaluating two common image processing algorithms on an FPGA-based CPU, where a mean relative estimation error of 3% is achieved for cacheless systems, we show that this estimation tool can be a valuable aid in the development of embedded processor architectures. The tool allows the developer to reach well-suited design decisions regarding the optimal processor hardware configuration for a given algorithm at an early stage in the design process.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 6 pages, 4 figures, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPS)

arXiv:2203.01767 [pdf, ps, other]

doi 10.1109/ISCAS.2015.7168683

Estimating the HEVC Decoding Energy Using the Decoder Processing Time

Authors: Christian Herglotz, Elisabeth Walencik, André Kaup

Abstract: This paper presents a method to accurately estimate the required decoding energy for a given HEVC software decoding solution. We show that the decoder's processing time as returned by common C++ and UNIX functions is a highly suitable parameter to obtain valid estimations for the actual decoding energy. We verify this hypothesis by performing an exhaustive measurement series using different decoder setups and video bit streams. Our findings can be used by developers and researchers in the search for new energy saving video compression algorithms.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 4 pages, 3 figures

Journal ref: IEEE International Symposium on Circuits and Systems (ISCAS), 2015

arXiv:2203.01765 [pdf, ps, other]

doi 10.1109/ICIP.2016.7532416

Joint Optimization of Rate, Distortion, and Decoding Energy for HEVC Intraframe Coding

Authors: Christian Herglotz, André Kaup

Abstract: This paper presents a novel algorithm that aims at minimizing the required decoding energy by exploiting a general energy model for HEVC-decoder solutions. We incorporate the energy model into the HEVC encoder such that it is capable of constructing a bit stream whose decoding process consumes less energy than the decoding process of a conventional bit stream. To achieve this, we propose to extend the traditional Rate-Distortion-Optimization scheme to a Decoding-Energy-Rate-Distortion approach. To obtain fast encoding decisions in the optimization process, we derive a fixed relation between the quantization parameter and the Lagrange multiplier for energy optimization. Our experiments show that this concept is applicable for intraframe-coded videos and that for local playback as well as online streaming scenarios, up to 15% of the decoding energy can be saved at the expense of a bitrate increase of approximately the same magnitude.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 5 pages, 3 figures

Journal ref: IEEE International Conference on Image Processing (ICIP), 2016

arXiv:2203.01755 [pdf, ps, other]

doi 10.1109/IWSSIP.2013.6623457

Modeling the Energy Consumption of HEVC Intra Decoding

Authors: Christian Herglotz, Dominic Springer, Andrea Eichenseer, André Kaup

Abstract: Battery life is one of the major limitations to mobile device use, which makes research on energy efficient soft- and hardware an important task. This paper investigates the energy required by a CPU when decoding compressed bitstream videos on mobile platforms. A model is derived that describes the energy consumption of the new HEVC decoder for intra coded videos. We show that the relative estimation error of the model is smaller than 3.2% and that the model can be used to build encoders aiming at minimizing decoding energy.

Submitted 3 March, 2022; originally announced March 2022.

Comments: 4 pages, 7 figures

Journal ref: 20th International Conference on Systems, Signals and Image Processing (IWSSIP), 2013

arXiv:2203.01099 [pdf, ps, other]

doi 10.1109/TCSVT.2017.2771819

Decoding-Energy-Rate-Distortion Optimization for Video Coding

Authors: Christian Herglotz, Andreas Heindel, André Kaup

Abstract: This paper presents a method for generating coded video bit streams requiring less decoding energy than conventionally coded bit streams. To this end, we propose extending the standard rate-distortion optimization approach to also consider the decoding energy. In the encoder, the decoding energy is estimated during runtime using a feature-based energy model. These energy estimates are then used to calculate decoding-energy-rate-distortion costs that are minimized by the encoder. This ultimately leads to optimal trade-offs between these three parameters. Therefore, we introduce the mathematical theory for describing decoding-energy-rate-distortion optimization and the proposed encoder algorithm is explained in detail. For rate-energy control, a new encoder parameter is introduced. Finally, measurements of the software decoding process for HEVC-coded bit streams are performed. Results show that this approach can lead to up to 30% of decoding energy reduction at a constant visual objective quality when accepting a bitrate increase at the same order of magnitude.

Submitted 2 March, 2022; originally announced March 2022.

Comments: 12 pages, 10 figures

Journal ref: IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), volume 29, issue 1, pp. 171 - 182, Jan. 2019

arXiv:2203.00466 [pdf, ps, other]

doi 10.1109/TCSVT.2016.2598705

Modeling the Energy Consumption of the HEVC Decoding Process

Authors: Christian Herglotz, Dominic Springer, Marc Reichenbach, Benno Stabernack, André Kaup

Abstract: In this paper, we present a bit stream feature based energy model that accurately estimates the energy required to decode a given HEVC-coded bit stream. Therefore, we take a model from literature and extend it by explicitly modeling the inloop filters, which was not done before. Furthermore, to prove its superior estimation performance, it is compared to seven different energy models from literature. By using a unified evaluation framework we show how accurately the required decoding energy for different decoding systems can be approximated. We give thorough explanations on the model parameters and explain how the model variables are derived. To show the modeling capabilities in general, we test the estimation performance for different decoding software and hardware solutions, where we find that the proposed model outperforms the models from literature by reaching frame-wise mean estimation errors of less than 7% for software and less than 15% for hardware based systems.

Submitted 1 March, 2022; originally announced March 2022.

Comments: 13 pages, 4 figures

Journal ref: IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), volume 28, issue 1, pp. 217 - 229, Jan. 2018

arXiv:2202.13892 [pdf, other]

doi 10.1109/ICASSP39728.2021.9413576

A Novel Viewport-Adaptive Motion Compensation Technique for Fisheye Video

Authors: Andy Regensky, Christian Herglotz, André Kaup

Abstract: Although fisheye cameras are in high demand in many application areas due to their large field of view, many image and video signal processing tasks such as motion compensation suffer from the introduced strong radial distortions. A recently proposed projection-based approach takes the fisheye projection into account to improve fisheye motion compensation. However, the approach does not consider the large field of view of fisheye lenses that requires the consideration of different motion planes in 3D space. We propose a novel viewport-adaptive motion compensation technique that applies the motion vectors in different perspective viewports in order to realize these motion planes. Thereby, some pixels are mapped to so-called virtual image planes and require special treatment to obtain reliable mappings between the perspective viewports and the original fisheye image. While the state-of-the-art ultra wide-angle compensation is sufficiently accurate, we propose a virtual image plane compensation that leads to perfect mappings. All in all, we achieve average gains of +2.40 dB in terms of PSNR compared to the state of the art in fisheye motion compensation.

Submitted 28 February, 2022; originally announced February 2022.

Comments: 5 pages, 5 figures, 2 tables

Journal ref: ICASSP 2021
