SVT-AV1 Encoding Bitrate Estimation Using Motion Search Information

Authors: Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, Christian Herglotz, André Kaup

Abstract: Enabling high compression efficiency while keeping encoding energy consumption low requires prioritizing which videos need more sophisticated encoding techniques. However, the effects vary strongly with the content, so information on how well a video can be compressed is required. This can be measured by estimating the encoded bitstream size prior to encoding. We found that the errors between estimated motion vectors from Motion Search, an algorithm that predicts temporal changes in videos, correlate well with the encoded bitstream size. By combining Motion Search with Random Forests, the encoding bitrate can be estimated with a Pearson correlation above 0.96.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 5 pages, 4 figures, accepted for European Signal Processing Conference (EUSIPCO) 2024

arXiv:2404.13484 [pdf, other]

Joint Quality Assessment and Example-Guided Image Processing by Disentangling Picture Appearance from Content

Authors: Abhinau K. Venkataramanan, Cosmin Stejerean, Ioannis Katsavounidis, Hassene Tmar, Alan C. Bovik

Abstract: The deep learning revolution has strongly impacted low-level image processing tasks such as style/domain transfer, enhancement/restoration, and visual quality assessments. Despite often being treated separately, the aforementioned tasks share a common theme of understanding, editing, or enhancing the appearance of input images without modifying the underlying content. We leverage this observation to develop a novel disentangled representation learning method that decomposes inputs into content and appearance features. The model is trained in a self-supervised manner and we use the learned features to develop a new quality prediction model named DisQUE. We demonstrate through extensive evaluations that DisQUE achieves state-of-the-art accuracy across quality prediction tasks and distortion types. Moreover, we demonstrate that the same features may also be used for image processing tasks such as HDR tone mapping, where the desired output characteristics may be tuned using example input-output pairs.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.13452 [pdf, other]

Cut-FUNQUE: An Objective Quality Model for Compressed Tone-Mapped High Dynamic Range Videos

Authors: Abhinau K. Venkataramanan, Cosmin Stejerean, Ioannis Katsavounidis, Hassene Tmar, Alan C. Bovik

Abstract: High Dynamic Range (HDR) videos have enjoyed a surge in popularity in recent years due to their ability to represent a wider range of contrast and color than Standard Dynamic Range (SDR) videos. Although HDR video capture has seen increasing popularity because of recent flagship mobile phones such as Apple iPhones, Google Pixels, and Samsung Galaxy phones, a broad swath of consumers still utilize legacy SDR displays that are unable to display HDR videos. As a result, HDR videos must be processed, i.e., tone-mapped, before streaming to a large section of SDR-capable video consumers. However, server-side tone-mapping involves automating decisions regarding the choices of tone-mapping operators (TMOs) and their parameters to yield high-fidelity outputs. Moreover, these choices must be balanced against the effects of lossy compression, which is ubiquitous in streaming scenarios. In this work, we develop a novel, efficient model of objective video quality named Cut-FUNQUE that is able to accurately predict the visual quality of tone-mapped and compressed HDR videos. Finally, we evaluate Cut-FUNQUE on a large-scale crowdsourced database of such videos and show that it achieves state-of-the-art accuracy.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2401.16067 [pdf, other]

Encoding Time and Energy Model for SVT-AV1 based on Video Complexity

Authors: Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, Christian Herglotz, André Kaup

Abstract: The share of online video traffic in global carbon dioxide emissions is growing steadily. To comply with the demand for video media, dedicated compression techniques are continuously optimized, but at the expense of increasingly higher computational demands and thus rising energy consumption at the video encoder side. In order to find the best trade-off between compression and energy consumption, modeling encoding energy for a wide range of encoding parameters is crucial. We propose an encoding time and energy model for SVT-AV1 based on empirical relations between the encoding time and video parameters as well as encoder configurations. Furthermore, we model the influence of video content by established content descriptors such as spatial and temporal information. We then use the predicted encoding time to estimate the required energy demand and achieve a prediction error of 19.6 % for encoding time and 20.9 % for encoding energy.

Submitted 30 January, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: 5 pages, 1 figure, accepted for IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2024

arXiv:2312.07780 [pdf, other]

Bitrate Ladder Construction using Visual Information Fidelity

Authors: Krishna Srikar Durbha, Hassene Tmar, Cosmin Stejerean, Ioannis Katsavounidis, Alan C. Bovik

Abstract: Recently proposed perceptually optimized per-title video encoding methods provide better BD-rate savings than fixed bitrate-ladder approaches that have been employed in the past. However, a disadvantage of per-title encoding is that it requires significant time and energy to compute bitrate ladders. Over the past few years, a variety of methods have been proposed to construct optimal bitrate ladders including using low-level features to predict cross-over bitrates, optimal resolutions for each bitrate, predicting visual quality, etc. Here, we deploy features drawn from Visual Information Fidelity (VIF) (VIF features) extracted from uncompressed videos to predict the visual quality (VMAF) of compressed videos. We present multiple VIF feature sets extracted from different scales and subbands of a video to tackle the problem of bitrate ladder construction. Comparisons are made against a fixed bitrate ladder and a bitrate ladder obtained from exhaustive encoding using Bjontegaard delta metrics.

Submitted 28 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: PCS 2024 Camera Ready Submission

arXiv:2307.05208 [pdf, other]

Encoder Complexity Control in SVT-AV1 by Speed-Adaptive Preset Switching

Authors: Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, André Kaup, Christian Herglotz

Abstract: Current developments in video encoding technology lead to continuously improving compression performance but at the expense of increasingly higher computational demands. Given the increase in online video traffic in recent years and the concomitant need for video encoding, encoder complexity control mechanisms are required to restrict the processing time to a sufficient extent in order to find a reasonable trade-off between performance and complexity. We present a complexity control mechanism in SVT-AV1 by using speed-adaptive preset switching to comply with the remaining time budget. This method enables encoding with a user-defined time constraint within the complete preset range with an average precision of 8.9 % without introducing any additional latencies.

Submitted 11 July, 2023; originally announced July 2023.

Comments: 5 pages, 2 figures, accepted for IEEE International Conference on Image Processing (ICIP) 2023

arXiv:1109.0138 [pdf]

Automatic Application Level Set Approach in Detection Calcifications in Mammographic Image

Authors: Atef Boujelben, Hedi Tmar, Jameleddine Mnif, Mohamed Abid

Abstract: Breast cancer is considered one of the major health problems and constitutes the leading cause of mortality among women in the world. In this decade, breast cancer is the second most common type of cancer in terms of frequency of occurrence and the fifth most common cause of cancer-related death. In order to reduce the workload on radiologists, a variety of CAD systems, namely Computer-Aided Diagnosis (CADi) and Computer-Aided Detection (CADe), have been proposed. In this paper, we are interested in a CADe tool to help radiologists detect cancer. The proposed CADe is based on a three-step workflow: detection, analysis, and classification. This paper deals with the problem of automatic detection of the Region Of Interest (ROI) based on a Level Set approach that depends on edge and region criteria. This approach gives good visual information from the radiologist. After that, feature extraction using texture characteristics and vector classification using Multilayer Perceptron (MLP) and k-Nearest Neighbours (KNN) are adopted to distinguish the different ACR (American College of Radiology) classifications. Moreover, we use the Digital Database for Screening Mammography (DDSM) for experiments; the results, with accuracy varying between 60 % and 70 %, are acceptable but must be improved to aid radiologists.

Submitted 1 September, 2011; originally announced September 2011.

Comments: 14 pages, 9 figures

Showing 1–7 of 7 results for author: Tmar, H