NTIRE 2024 Challenge on Night Photography Rendering

Authors: Egor Ershov, Artyom Panshin, Oleg Karasev, Sergey Korchagin, Shepelev Lev, Alexandr Startsev, Daniil Vladimirov, Ekaterina Zaychenkova, Nikola Banić, Dmitrii Iarchuk, Maria Efimova, Radu Timofte, Arseniy Terekhin, Shuwei Yue, Yuyang Liu, Minchen Wei, Lu Xu, Chao Zhang, Yasi Wang, Furkan Kınlı, Doğa Yılmaz, Barış Özcan, Furkan Kıraç, Shuai Liu, Jingyuan Xiao, et al. (25 additional authors not shown)

Abstract: This paper presents a review of the NTIRE 2024 challenge on night photography rendering. The goal of the challenge was to find solutions that process raw camera images taken in nighttime conditions, and thereby produce photo-quality output images in the standard RGB (sRGB) space. Unlike the previous year's competition, the challenge images were collected with a mobile phone and the speed of algorithms was also measured alongside the quality of their output. To evaluate the results, a sufficient number of viewers were asked to assess the visual quality of the proposed solutions, considering the subjective nature of the task. There were 2 nominations: quality and efficiency. Top 5 solutions in terms of output quality were sorted by evaluation time (see Fig. 1). The top ranking participants' solutions effectively represent the state-of-the-art in nighttime photography rendering. More results can be found at https://nightimaging.org.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 10 pages, 10 figures

arXiv:2311.15908

Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models

Authors: Claudio Rota, Marco Buzzelli, Joost van de Weijer

Abstract: In this paper, we address the problem of enhancing perceptual quality in video super-resolution (VSR) using Diffusion Models (DMs) while ensuring temporal consistency among frames. We present StableVSR, a VSR method based on DMs that can significantly enhance the perceptual quality of upscaled videos by synthesizing realistic and temporally-consistent details. We introduce the Temporal Conditioning Module (TCM) into a pre-trained DM for single image super-resolution to turn it into a VSR method. TCM uses the novel Temporal Texture Guidance, which provides it with spatially-aligned and detail-rich texture information synthesized in adjacent frames. This guides the generative process of the current frame toward high-quality and temporally-consistent results. In addition, we introduce the novel Frame-wise Bidirectional Sampling strategy to encourage the use of information from past to future and vice-versa. This strategy improves the perceptual quality of the results and the temporal consistency across frames. We demonstrate the effectiveness of StableVSR in enhancing the perceptual quality of upscaled videos while achieving better temporal consistency compared to existing state-of-the-art methods for VSR. The project page is available at https://github.com/claudiom4sir/StableVSR.

Submitted 16 July, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: Accepted to ECCV 2024

arXiv:2208.11184

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

Authors: Ren Yang, Radu Timofte, Xin Li, Qi Zhang, Lin Zhang, Fanglong Liu, Dongliang He, Fu li, He Zheng, Weihang Yuan, Pavel Ostyakov, Dmitry Vyal, Magauiya Zhussip, Xueyi Zou, Youliang Yan, Lei Li, Jingzhu Tang, Ming Chen, Shijie Zhao, Yu Zhu, Xiaoran Qin, Chenghua Li, Cong Leng, Jian Cheng, Claudio Rota, et al. (28 additional authors not shown)

Abstract: This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. This challenge includes two tracks. Track 1 aims at the super-resolution of compressed image, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular dataset DIV2K as the training, validation and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, including the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted the final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state-of-the-art of super-resolution on compressed image and video. The proposed LDV 3.0 dataset is available at https://github.com/RenYang-home/LDV_dataset. The homepage of this challenge is at https://github.com/RenYang-home/AIM22_CompressSR.

Submitted 25 August, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Camera-ready version

arXiv:2205.04357

doi 10.1109/THMS.2023.3267898

Unified Framework for Identity and Imagined Action Recognition from EEG patterns

Authors: Marco Buzzelli, Simone Bianco, Paolo Napoletano

Abstract: We present a unified deep learning framework for the recognition of user identity and the recognition of imagined actions, based on electroencephalography (EEG) signals, for application as a brain-computer interface. Our solution exploits a novel shifted subsampling preprocessing step as a form of data augmentation, and a matrix representation to encode the inherent local spatial relationships of multi-electrode EEG signals. The resulting image-like data is then fed to a convolutional neural network to process the local spatial dependencies, and eventually analyzed through a bidirectional long short-term memory module to focus on temporal relationships. Our solution is compared against several methods in the state of the art, showing comparable or superior performance on different tasks. Specifically, we achieve accuracy levels above 90% both for action and user classification tasks. In terms of user identification, we reach 0.39% equal error rate in the case of known users and gestures, and 6.16% in the more challenging case of unknown users and gestures. Preliminary experiments are also conducted in order to direct future works towards everyday applications relying on a reduced set of EEG electrodes.

Submitted 2 May, 2023; v1 submitted 9 May, 2022; originally announced May 2022.

arXiv:2204.08972

Shallow camera pipeline for night photography rendering

Authors: Simone Zini, Claudio Rota, Marco Buzzelli, Simone Bianco, Raimondo Schettini

Abstract: We introduce a camera pipeline for rendering visually pleasing photographs in low light conditions, as part of the NTIRE2022 Night Photography Rendering challenge. Given the nature of the task, where the objective is verbally defined by an expert photographer instead of relying on explicit ground truth images, we design a handcrafted solution, characterized by a shallow structure and by a low parameter count. Our pipeline exploits a local light enhancer as a form of high dynamic range correction, followed by a global adjustment of the image histogram to prevent washed-out results. We proportionally apply image denoising to darker regions, where it is more easily perceived, without losing details on brighter regions. The solution reached the fifth place in the competition, with a preference vote count comparable to those of other entries, based on deep convolutional neural networks. Code is available at www.github.com/AvailableAfterAcceptance.

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2202.07993

Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training

Authors: Simone Zini, Alex Gomez-Villa, Marco Buzzelli, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Abstract: Several recent works on self-supervised learning are trained by mapping different augmentations of the same image to the same feature representation. The data augmentations used are of crucial importance to the quality of learned feature representations. In this paper, we analyze how the color jitter traditionally used in data augmentation negatively impacts the quality of the color features in learned feature representations. To address this problem, we propose a more realistic, physics-based color data augmentation - which we call Planckian Jitter - that creates realistic variations in chromaticity and produces a model robust to illumination changes that can be commonly observed in real life, while maintaining the ability to discriminate image content based on color information. Experiments confirm that such a representation is complementary to the representations learned with the currently-used color jitter augmentation and that a simple concatenation leads to significant performance gains on a wide range of downstream datasets. In addition, we present a color sensitivity analysis that documents the impact of different training methods on model neurons and shows that the performance of the learned features is robust with respect to illuminant variations.

Submitted 2 February, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

Comments: Accepted at Eleventh International Conference on Learning Representations (ICLR 2023)

arXiv:2012.15779

Illumination Estimation Challenge: experience of past two years

Authors: Egor Ershov, Alex Savchik, Ilya Semenkov, Nikola Banić, Karlo Koscević, Marko Subašić, Alexander Belokopytov, Zhihao Li, Arseniy Terekhin, Daria Senshina, Artem Nikonorov, Yanlin Qian, Marco Buzzelli, Riccardo Riva, Simone Bianco, Raimondo Schettini, Sven Lončarić, Dmitry Nikolaev

Abstract: Illumination estimation is the essential step of computational color constancy, one of the core parts of various image processing pipelines of modern digital cameras. Having an accurate and reliable illumination estimation is important for reducing the illumination influence on the image colors. To motivate the generation of new ideas and the development of new algorithms in this field, the 2nd Illumination estimation challenge (IEC#2) was conducted. The main advantage of testing a method on a challenge over testing it on some of the known datasets is the fact that the ground-truth illuminations for the challenge test images are unknown up until the results have been submitted, which prevents any potential hyperparameter tuning that may be biased. The challenge had several tracks: general, indoor, and two-illuminant with each of them focusing on different parameters of the scenes. Other main features of it are a new large dataset of images (about 5000) taken with the same camera sensor model, a manual markup accompanying each image, diverse content with scenes taken in numerous countries under a huge variety of illuminations extracted by using the SpyderCube calibration object, and a contest-like markup for the images from the Cube+ dataset that was used in IEC#1. This paper focuses on the description of the past two challenges, algorithms which won in each track, and the conclusions that were drawn based on the results obtained during the 1st and 2nd challenge that can be useful for similar future developments.

Submitted 31 December, 2020; originally announced December 2020.

arXiv:1805.09264

Learning Illuminant Estimation from Object Recognition

Authors: Marco Buzzelli, Joost van de Weijer, Raimondo Schettini

Abstract: In this paper we present a deep learning method to estimate the illuminant of an image. Our model is not trained with illuminant annotations, but with the objective of improving performance on an auxiliary task such as object recognition. To the best of our knowledge, this is the first example of a deep learning architecture for illuminant estimation that is trained without ground truth illuminants. We evaluate our solution on standard datasets for color constancy, and compare it with state of the art methods. Our proposal is shown to outperform most deep learning methods in a cross-dataset evaluation setup, and to present competitive results in a comparison with parametric solutions.

Submitted 23 May, 2018; originally announced May 2018.

Comments: Accepted at ICIP 2018

arXiv:1701.02620

doi 10.1016/j.neucom.2017.03.051

Deep Learning for Logo Recognition

Authors: Simone Bianco, Marco Buzzelli, Davide Mazzini, Raimondo Schettini

Abstract: In this paper we propose a method for logo recognition using deep learning. Our recognition pipeline is composed of a logo region proposal followed by a Convolutional Neural Network (CNN) specifically trained for logo classification, even if they are not precisely localized. Experiments are carried out on the FlickrLogos-32 database, and we evaluate the effect on recognition performance of synthetic versus real data augmentation, and image pre-processing. Moreover, we systematically investigate the benefits of different training choices such as class-balancing, sample-weighting and explicit modeling of the background class (i.e. no-logo regions). Experimental results confirm the feasibility of the proposed method, which outperforms the methods in the state of the art.

Submitted 3 May, 2017; v1 submitted 10 January, 2017; originally announced January 2017.

Comments: Preprint accepted in Neurocomputing

Journal ref: Neurocomputing 245, 23-30 (2017)

Showing 1–9 of 9 results for author: Buzzelli, M