Search | arXiv e-print repository

arXiv:2506.19051 [pdf, ps, other]

NIC-RobustBench: A Comprehensive Open-Source Toolkit for Neural Image Compression and Robustness Analysis

Authors: Georgii Bychkov, Khaled Abud, Egor Kovalev, Alexander Gushchin, Dmitriy Vatolin, Anastasia Antsiferova

Abstract: Adversarial robustness of neural networks is an increasingly important area of research, combining studies on computer vision models, large language models (LLMs), and others. With the release of JPEG AI -- the first standard for end-to-end neural image compression (NIC) methods -- the question of evaluating NIC robustness has become critically significant. However, previous research has been limited to a narrow range of codecs and attacks. To address this, we present NIC-RobustBench, the first open-source framework to evaluate NIC robustness and adversarial defenses' efficiency, in addition to comparing Rate-Distortion (RD) performance. The framework includes the largest number of codecs among all known NIC libraries and is easily scalable. The paper demonstrates a comprehensive overview of the NIC-RobustBench framework and employs it to analyze NIC robustness. Our code is available online at https://github.com/msu-video-group/NIC-RobustBench.

Submitted 23 June, 2025; originally announced June 2025.

Comments: arXiv admin note: text overlap with arXiv:2411.11795

arXiv:2411.12575 [pdf, other]

Stochastic BIQA: Median Randomized Smoothing for Certified Blind Image Quality Assessment

Authors: Ekaterina Shumitskaya, Mikhail Pautov, Dmitriy Vatolin, Anastasia Antsiferova

Abstract: Most modern No-Reference Image-Quality Assessment (NR-IQA) metrics are based on neural networks vulnerable to adversarial attacks. Attacks on such metrics lead to incorrect image/video quality predictions, which poses significant risks, especially in public benchmarks. Developers of image processing algorithms may unfairly increase the score of a target IQA metric without improving the actual quality of the adversarial image. Although some empirical defenses for IQA metrics were proposed, they do not provide theoretical guarantees and may be vulnerable to adaptive attacks. This work focuses on developing a provably robust no-reference IQA metric. Our method is based on Median Smoothing (MS) combined with an additional convolution denoiser with ranking loss to improve the SROCC and PLCC scores of the defended IQA metric. Compared with two prior methods on three datasets, our method exhibited superior SROCC and PLCC scores while maintaining comparable certified guarantees.

Submitted 19 November, 2024; originally announced November 2024.

arXiv:2411.11795 [pdf, other]

Exploring adversarial robustness of JPEG AI: methodology, comparison and new methods

Authors: Egor Kovalev, Georgii Bychkov, Khaled Abud, Aleksandr Gushchin, Anna Chistyakova, Sergey Lavrushkin, Dmitriy Vatolin, Anastasia Antsiferova

Abstract: Adversarial robustness of neural networks is an increasingly important area of research, combining studies on computer vision models, large language models (LLMs), and others. With the release of JPEG AI - the first standard for end-to-end neural image compression (NIC) methods - the question of its robustness has become critically significant. JPEG AI is among the first international, real-world applications of neural-network-based models to be embedded in consumer devices. However, research on NIC robustness has been limited to open-source codecs and a narrow range of attacks. This paper proposes a new methodology for measuring NIC robustness to adversarial attacks. We present the first large-scale evaluation of JPEG AI's robustness, comparing it with other NIC models. Our evaluation results and code are publicly available online (link is hidden for a blind review).

Submitted 18 November, 2024; originally announced November 2024.

arXiv:2410.04225 [pdf, other]

AIM 2024 Challenge on Video Super-Resolution Quality Assessment: Methods and Results

Authors: Ivan Molodetskikh, Artem Borisov, Dmitriy Vatolin, Radu Timofte, Jianzhao Liu, Tianwu Zhi, Yabin Zhang, Yang Li, Jingwen Xu, Yiting Liao, Qing Luo, Ao-Xiang Zhang, Peng Zhang, Haibo Lei, Linyan Jiang, Yaqing Li, Yuqin Cao, Wei Sun, Weixia Zhang, Yinan Sun, Ziheng Jia, Yuxin Zhu, Xiongkuo Min, Guangtao Zhai, Weihua Luo, et al. (2 additional authors not shown)

Abstract: This paper presents the Video Super-Resolution (SR) Quality Assessment (QA) Challenge that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2024. The task of this challenge was to develop an objective QA method for videos upscaled 2x and 4x by modern image- and video-SR algorithms. QA methods were evaluated by comparing their output with aggregate subjective scores collected from >150,000 pairwise votes obtained through crowd-sourced comparisons across 52 SR methods and 1124 upscaled videos. The goal was to advance the state-of-the-art in SR QA, which had proven to be a challenging problem with limited applicability of traditional QA methods. The challenge had 29 registered participants, and 5 teams had submitted their final results, all outperforming the current state-of-the-art. All data, including the private test subset, has been made publicly available on the challenge homepage at https://challenges.videoprocessing.ai/challenges/super-resolution-metrics-challenge.html

Submitted 5 October, 2024; originally announced October 2024.

Comments: 18 pages, 7 figures

arXiv:2408.11982 [pdf, other]

AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results

Authors: Maksim Smirnov, Aleksandr Gushchin, Anastasia Antsiferova, Dmitry Vatolin, Radu Timofte, Ziheng Jia, Zicheng Zhang, Wei Sun, Jiaying Qian, Yuqin Cao, Yinan Sun, Yuxin Zhu, Xiongkuo Min, Guangtao Zhai, Kanjar De, Qing Luo, Ao-Xiang Zhang, Peng Zhang, Haibo Lei, Linyan Jiang, Yaqing Li, Wenhui Meng, Zhenzhong Chen, Zhengxue Cheng, Jiahao Xiao, et al. (7 additional authors not shown)

Abstract: Video quality assessment (VQA) is a crucial task in the development of video compression standards, as it directly impacts the viewer experience. This paper presents the results of the Compressed Video Quality Assessment challenge, held in conjunction with the Advances in Image Manipulation (AIM) workshop at ECCV 2024. The challenge aimed to evaluate the performance of VQA methods on a diverse dataset of 459 videos, encoded with 14 codecs of various compression standards (AVC/H.264, HEVC/H.265, AV1, and VVC/H.266) and containing a comprehensive collection of compression artifacts. To measure the methods' performance, we employed traditional correlation coefficients between their predictions and subjective scores, which were collected via large-scale crowdsourced pairwise human comparisons. For training purposes, participants were provided with the Compressed Video Quality Assessment Dataset (CVQAD), a previously developed dataset of 1022 videos. Up to 30 participating teams registered for the challenge, while we report the results of 6 teams, which submitted valid final solutions and code for reproducing the results. Moreover, we calculated and present the performance of state-of-the-art VQA methods on the developed dataset, providing a comprehensive benchmark for future research. The dataset, results, and online leaderboard are publicly available at https://challenges.videoprocessing.ai/challenges/compressedvideo-quality-assessment.html.

Submitted 22 October, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

arXiv:2408.01541 [pdf, other]

Guardians of Image Quality: Benchmarking Defenses Against Adversarial Attacks on Image Quality Metrics

Authors: Alexander Gushchin, Khaled Abud, Georgii Bychkov, Ekaterina Shumitskaya, Anna Chistyakova, Sergey Lavrushkin, Bader Rasheed, Kirill Malyshev, Dmitriy Vatolin, Anastasia Antsiferova

Abstract: In the field of Image Quality Assessment (IQA), the adversarial robustness of the metrics poses a critical concern. This paper presents a comprehensive benchmarking study of various defense mechanisms in response to the rise in adversarial attacks on IQA. We systematically evaluate 25 defense strategies, including adversarial purification, adversarial training, and certified robustness methods. We applied 14 adversarial attack algorithms of various types in both non-adaptive and adaptive settings and tested these defenses against them. We analyze the differences between defenses and their applicability to IQA tasks, considering that they should preserve IQA scores and image quality. The proposed benchmark aims to guide future developments and accepts submissions of new methods, with the latest results available online: https://videoprocessing.ai/benchmarks/iqa-defenses.html.

Submitted 2 August, 2024; originally announced August 2024.

arXiv:2405.20392 [pdf, other]

Can No-Reference Quality-Assessment Methods Serve as Perceptual Losses for Super-Resolution?

Authors: Egor Kashkarov, Egor Chistov, Ivan Molodetskikh, Dmitriy Vatolin

Abstract: Perceptual losses play an important role in constructing deep-neural-network-based methods by increasing the naturalness and realism of processed images and videos. Use of perceptual losses is often limited to LPIPS, a full-reference method. Even though deep no-reference image-quality-assessment methods are excellent at predicting human judgment, little research has examined their incorporation in loss functions. This paper investigates direct optimization of several video-super-resolution models using no-reference image-quality-assessment methods as perceptual losses. Our experimental results show that straightforward optimization of these methods produces artifacts, but a special training procedure can mitigate them.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 4 pages, 3 figures. The first two authors contributed equally to this work

arXiv:2405.04997 [pdf, ps, other]

Bridging the Gap Between Saliency Prediction and Image Quality Assessment

Authors: Kirillov Alexey, Andrey Moskalenko, Dmitriy Vatolin

Abstract: Over the past few years, deep neural models have made considerable advances in image quality assessment (IQA). However, the underlying reasons for their success remain unclear, owing to the complex nature of deep neural networks. IQA aims to describe how the human visual system (HVS) works and to create its efficient approximations. On the other hand, Saliency Prediction task aims to emulate HVS via determining areas of visual interest. Thus, we believe that saliency plays a crucial role in human perception. In this work, we conduct an empirical study that reveals the relation between IQA and Saliency Prediction tasks, demonstrating that the former incorporates knowledge of the latter. Moreover, we introduce a novel SACID dataset of saliency-aware compressed images and conduct a large-scale comparison of classic and neural-based IQA methods. All supplementary code and data will be available at the time of publication.

Submitted 27 June, 2025; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted to EUSIPCO 2025

arXiv:2404.09961 [pdf, other]

Ti-Patch: Tiled Physical Adversarial Patch for no-reference video quality metrics

Authors: Victoria Leonenkova, Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin

Abstract: Objective no-reference image- and video-quality metrics are crucial in many computer vision tasks. However, state-of-the-art no-reference metrics have become learning-based and are vulnerable to adversarial attacks. The vulnerability of quality metrics imposes restrictions on using such metrics in quality control systems and comparing objective algorithms. Also, using vulnerable metrics as a loss for deep learning model training can mislead training to worsen visual quality. Because of that, quality metrics testing for vulnerability is a task of current interest. This paper proposes a new method for testing quality metrics vulnerability in the physical space. To our knowledge, quality metrics were not previously tested for vulnerability to this attack; they were only tested in the pixel space. We applied a physical adversarial Ti-Patch (Tiled Patch) attack to quality metrics and did experiments both in pixel and physical space. We also performed experiments on the implementation of physical adversarial wallpaper. The proposed method can be used as additional quality metrics in vulnerability evaluation, complementing traditional subjective comparison and vulnerability tests in the pixel space. We made our code and adversarial videos available on GitHub: https://github.com/leonenkova/Ti-Patch.

Submitted 15 April, 2024; originally announced April 2024.

Comments: Accepted to WAIT AINL 2024

arXiv:2403.05955 [pdf, other]

IOI: Invisible One-Iteration Adversarial Attack on No-Reference Image- and Video-Quality Metrics

Authors: Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin

Abstract: No-reference image- and video-quality metrics are widely used in video processing benchmarks. The robustness of learning-based metrics under video attacks has not been widely studied. In addition to having success, attacks that can be employed in video processing benchmarks must be fast and imperceptible. This paper introduces an Invisible One-Iteration (IOI) adversarial attack on no reference image and video quality metrics. We compared our method alongside eight prior approaches using image and video datasets via objective and subjective tests. Our method exhibited superior visual quality across various attacked metric architectures while maintaining comparable attack success and speed. We made the code available on GitHub: https://github.com/katiashh/ioi-attack.

Submitted 29 May, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

Comments: Accepted to ICML 2024

arXiv:2310.06958 [pdf, other]

Comparing the Robustness of Modern No-Reference Image- and Video-Quality Metrics to Adversarial Attacks

Authors: Anastasia Antsiferova, Khaled Abud, Aleksandr Gushchin, Ekaterina Shumitskaya, Sergey Lavrushkin, Dmitriy Vatolin

Abstract: Nowadays, neural-network-based image- and video-quality metrics perform better than traditional methods. However, they also became more vulnerable to adversarial attacks that increase metrics' scores without improving visual quality. The existing benchmarks of quality metrics compare their performance in terms of correlation with subjective quality and calculation time. Nonetheless, the adversarial robustness of image-quality metrics is also an area worth researching. This paper analyses modern metrics' robustness to different adversarial attacks. We adapted adversarial attacks from computer vision tasks and compared attacks' efficiency against 15 no-reference image- and video-quality metrics. Some metrics showed high resistance to adversarial attacks, which makes their usage in benchmarks safer than vulnerable metrics. The benchmark accepts submissions of new metrics for researchers who want to make their metrics more robust to attacks or to find such metrics for their needs. The latest results can be found online: https://videoprocessing.ai/benchmarks/metrics-robustness.html.

Submitted 27 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2305.15544 [pdf, other]

Fast Adversarial CNN-based Perturbation Attack on No-Reference Image- and Video-Quality Metrics

Authors: Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin

Abstract: Modern neural-network-based no-reference image- and video-quality metrics exhibit performance as high as full-reference metrics. These metrics are widely used to improve visual quality in computer vision methods and compare video processing methods. However, these metrics are not stable to traditional adversarial attacks, which can cause incorrect results. Our goal is to investigate the boundaries of no-reference metrics applicability, and in this paper, we propose a fast adversarial perturbation attack on no-reference quality metrics. The proposed attack (FACPA) can be exploited as a preprocessing step in real-time video processing and compression algorithms. This research can yield insights to further aid in designing of stable neural-network-based no-reference quality metrics.

Submitted 24 May, 2023; originally announced May 2023.

Comments: ICLR 2023 TinyPapers

arXiv:2305.04844 [pdf, other]

SR+Codec: a Benchmark of Super-Resolution for Video Compression Bitrate Reduction

Authors: Evgeney Bogatyrev, Ivan Molodetskikh, Dmitriy Vatolin

Abstract: In recent years, there has been significant interest in Super-Resolution (SR), which focuses on generating a high-resolution image from a low-resolution input. Deep learning-based methods for super-resolution have been particularly popular and have shown impressive results on various benchmarks. However, research indicates that these methods may not perform as well on strongly compressed videos. We developed a super-resolution benchmark to analyze SR's capacity to upscale compressed videos. Our dataset employed video codecs based on five widely-used compression standards: H.264, H.265, H.266, AV1, and AVS3. We assessed 19 popular SR models using our benchmark and evaluated their ability to restore details and their susceptibility to compression artifacts. To get an accurate perceptual ranking of SR models, we conducted a crowd-sourced side-by-side comparison of their outputs. We found that some SR models, combined with compression, allow us to reduce the video bitrate without significant loss of quality. We also compared a range of image and video quality metrics with subjective scores to evaluate their accuracy on super-resolved compressed videos. The benchmark is publicly available at https://videoprocessing.ai/benchmarks/super-resolution-for-video-compression.html

Submitted 4 December, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

arXiv:2212.05499 [pdf, other]

Applicability limitations of differentiable full-reference image-quality

Authors: Maksim Siniukov, Dmitriy Kulikov, Dmitriy Vatolin

Abstract: Subjective image-quality measurement plays a critical role in the development of image-processing applications. The purpose of a visual-quality metric is to approximate the results of subjective assessment. In this regard, more and more metrics are under development, but little research has considered their limitations. This paper addresses that deficiency: we show how image preprocessing before compression can artificially increase the quality scores provided by the popular metrics DISTS, LPIPS, HaarPSI, and VIF as well as how these scores are inconsistent with subjective-quality scores. We propose a series of neural-network preprocessing models that increase DISTS by up to 34.5%, LPIPS by up to 36.8%, VIF by up to 98.0%, and HaarPSI by up to 22.6% in the case of JPEG-compressed images. A subjective comparison of preprocessed images showed that for most of the metrics we examined, visual quality drops or stays unchanged, limiting the applicability of these metrics.

Submitted 3 January, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

Comments: 10 pages, 4 figures

ACM Class: I.4.0

arXiv:2211.04799 [pdf, other]

Bit-depth enhancement detection for compressed video

Authors: Nickolay Safonov, Dmitriy Vatolin

Abstract: In recent years, display intensity and contrast have increased considerably. Many displays support high dynamic range (HDR) and 10-bit color depth. Since high bit-depth is an emerging technology, video content is still largely shot and transmitted with a bit depth of 8 bits or less per color component. Insufficient bit-depths produce distortions called false contours or banding, and they are visible on high contrast screens. To deal with such distortions, researchers have proposed algorithms for bit-depth enhancement (dequantization). Such techniques convert videos with low bit-depth (LBD) to videos with high bit-depth (HBD). The quality of converted LBD video, however, is usually lower than that of the original HBD video, and many consumers prefer to keep the original HBD versions. In this paper, we propose an algorithm to determine whether a video has undergone conversion before compression. This problem is complex; it involves detecting outcomes of different dequantization algorithms in the presence of compression that strongly affects the least-significant bits (LSBs) in the video frames. Our algorithm can detect bit-depth enhancement and demonstrates good generalization capability, as it is able to determine whether a video has undergone processing by dequantization algorithms absent from the training dataset.

Submitted 9 November, 2022; originally announced November 2022.

arXiv:2203.08923 [pdf, other]

Towards True Detail Restoration for Super-Resolution: A Benchmark and a Quality Metric

Authors: Eugene Lyapustin, Anastasia Kirillova, Viacheslav Meshchaninov, Evgeney Zimin, Nikolai Karetin, Dmitriy Vatolin

Abstract: Super-resolution (SR) has become a widely researched topic in recent years. SR methods can improve overall image and video quality and create new possibilities for further content analysis. But the SR mainstream focuses primarily on increasing the naturalness of the resulting image despite potentially losing context accuracy. Such methods may produce an incorrect digit, character, face, or other structural object even though they otherwise yield good visual quality. Incorrect detail restoration can cause errors when detecting and identifying objects both manually and automatically. To analyze the detail-restoration capabilities of image and video SR models, we developed a benchmark based on our own video dataset, which contains complex patterns that SR models generally fail to correctly restore. We assessed 32 recent SR models using our benchmark and compared their ability to preserve scene context. We also conducted a crowd-sourced comparison of restored details and developed an objective assessment metric that outperforms other quality metrics by correlation with subjective scores for this task. In conclusion, we provide a deep analysis of benchmark results that yields insights for future SR-based work.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2110.09992 [pdf, other]

doi 10.5220/0010780900003124

ERQA: Edge-Restoration Quality Assessment for Video Super-Resolution

Authors: Anastasia Kirillova, Eugene Lyapustin, Anastasia Antsiferova, Dmitry Vatolin

Abstract: Despite the growing popularity of video super-resolution (VSR), there is still no good way to assess the quality of the restored details in upscaled frames. Some SR methods may produce the wrong digit or an entirely different face. Whether a method's results are trustworthy depends on how well it restores truthful details. Image super-resolution can use natural distributions to produce a high-resolution image that is only somewhat similar to the real one. VSR enables exploration of additional information in neighboring frames to restore details from the original scene. The ERQA metric, which we propose in this paper, aims to estimate a model's ability to restore real details using VSR. On the assumption that edges are significant for detail and character recognition, we chose edge fidelity as the foundation for this metric. Experimental validation of our work is based on the MSU Video Super-Resolution Benchmark, which includes the most difficult patterns for detail restoration and verifies the fidelity of details from the original frame. Code for the proposed metric is publicly available at https://github.com/msu-video-group/ERQA.

Submitted 24 January, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

Comments: Accepted for presentation at the International Conference on Computer Vision Theory and Applications (VISAPP) 2022

arXiv:2104.13464 [pdf, other]

doi 10.51130/graphicon-2020-2-4-18

Deep Two-Stage High-Resolution Image Inpainting

Authors: Andrey Moskalenko, Mikhail Erofeev, Dmitriy Vatolin

Abstract: In recent years, the field of image inpainting has developed rapidly; learning-based approaches show impressive results in the task of filling missing parts in an image. But most deep methods are strongly tied to the resolution of the images on which they were trained. A slight resolution increase leads to serious artifacts and unsatisfactory filling quality. These methods are therefore unsuitable for interactive image processing. In this article, we propose a method that solves the problem of inpainting arbitrary-size images. We also describe a way to better restore texture fragments in the filled area. For this, we propose to use information from neighboring pixels by shifting the original image in four directions. Moreover, this approach can work with existing inpainting models, making them almost resolution independent without the need for retraining. We also created a GIMP plugin that implements our technique. The plugin, code, and model weights are available at https://github.com/a-mos/High_Resolution_Image_Inpainting.

Submitted 27 April, 2021; originally announced April 2021.

arXiv:1907.04807 [pdf, other]

Hacking VMAF with Video Color and Contrast Distortion

Authors: Anastasia Zvezdakova, Sergey Zvezdakov, Dmitriy Kulikov, Dmitriy Vatolin

Abstract: Video quality measurement plays an important role in many applications. Full-reference quality metrics, which are usually used in video codec comparisons, are expected to reflect any changes in videos. In this article, we consider different color corrections of compressed videos that increase the values of the full-reference metric VMAF and barely decrease another widely used metric, SSIM. The proposed video contrast enhancement approach shows the metric's inapplicability in some cases for video codec comparisons, as it may be used for cheating in the comparisons via tuning to improve this metric's values.

Submitted 29 August, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

Showing 1–19 of 19 results for author: Vatolin, D