Search | arXiv e-print repository

NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results

Authors: Xin Li, Kun Yuan, Bingchen Li, Fengbin Guan, Yizhen Shao, Zihao Yu, Xijun Wang, Yiting Lu, Wei Luo, Suhang Yao, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Yabin Zhang, Ao-Xiang Zhang, Tianwu Zhi, Jianzhao Liu, Yang Li, Jingwen Xu, Yiting Liao, Yushen Zuo, Mingyang Wu, Renjie Li, Shengyun Zhong , et al. (88 additional authors not shown)

Abstract: This paper presents a review for the NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement. The challenge comprises two tracks: (i) Efficient Video Quality Assessment (KVQ), and (ii) Diffusion-based Image Super-Resolution (KwaiSR). Track 1 aims to advance the development of lightweight and efficient video quality assessment (VQA) models, with an emphasis on eliminating re… ▽ More This paper presents a review for the NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement. The challenge comprises two tracks: (i) Efficient Video Quality Assessment (KVQ), and (ii) Diffusion-based Image Super-Resolution (KwaiSR). Track 1 aims to advance the development of lightweight and efficient video quality assessment (VQA) models, with an emphasis on eliminating reliance on model ensembles, redundant weights, and other computationally expensive components in the previous IQA/VQA competitions. Track 2 introduces a new short-form UGC dataset tailored for single image super-resolution, i.e., the KwaiSR dataset. It consists of 1,800 synthetically generated S-UGC image pairs and 1,900 real-world S-UGC images, which are split into training, validation, and test sets using a ratio of 8:1:1. The primary objective of the challenge is to drive research that benefits the user experience of short-form UGC platforms such as Kwai and TikTok. This challenge attracted 266 participants and received 18 valid final submissions with corresponding fact sheets, significantly contributing to the progress of short-form UGC VQA and image superresolution. The project is publicly available at https://github.com/lixinustc/KVQE- ChallengeCVPR-NTIRE2025. △ Less

Submitted 17 April, 2025; originally announced April 2025.

Comments: Challenge Report of NTIRE 2025; Methods from 18 Teams; Accepted by CVPR Workshop; 21 pages

arXiv:2501.01483 [pdf, other]

Embedding Similarity Guided License Plate Super Resolution

Authors: Abderrezzaq Sendjasni, Mohamed-Chaker Larabi

Abstract: Super-resolution (SR) techniques play a pivotal role in enhancing the quality of low-resolution images, particularly for applications such as security and surveillance, where accurate license plate recognition is crucial. This study proposes a novel framework that combines pixel-based loss with embedding similarity learning to address the unique challenges of license plate super-resolution (LPSR).… ▽ More Super-resolution (SR) techniques play a pivotal role in enhancing the quality of low-resolution images, particularly for applications such as security and surveillance, where accurate license plate recognition is crucial. This study proposes a novel framework that combines pixel-based loss with embedding similarity learning to address the unique challenges of license plate super-resolution (LPSR). The introduced pixel and embedding consistency loss (PECL) integrates a Siamese network and applies contrastive loss to force embedding similarities to improve perceptual and structural fidelity. By effectively balancing pixel-wise accuracy with embedding-level consistency, the framework achieves superior alignment of fine-grained features between high-resolution (HR) and super-resolved (SR) license plates. Extensive experiments on the CCPD dataset validate the efficacy of the proposed framework, demonstrating consistent improvements over state-of-the-art methods in terms of PSNR_RGB, PSNR_Y and optical character recognition (OCR) accuracy. These results highlight the potential of embedding similarity learning to advance both perceptual quality and task-specific performance in extreme super-resolution scenarios. △ Less

Submitted 8 January, 2025; v1 submitted 2 January, 2025; originally announced January 2025.

Comments: Submitted to Neurocomputing

arXiv:2412.06003 [pdf, other]

Enhancing Content Representation for AR Image Quality Assessment Using Knowledge Distillation

Authors: Aymen Sekhri, Seyed Ali Amirshahi, Mohamed-Chaker Larabi

Abstract: Augmented Reality (AR) is a major immersive media technology that enriches our perception of reality by overlaying digital content (the foreground) onto physical environments (the background). It has far-reaching applications, from entertainment and gaming to education, healthcare, and industrial training. Nevertheless, challenges such as visual confusion and classical distortions can result in us… ▽ More Augmented Reality (AR) is a major immersive media technology that enriches our perception of reality by overlaying digital content (the foreground) onto physical environments (the background). It has far-reaching applications, from entertainment and gaming to education, healthcare, and industrial training. Nevertheless, challenges such as visual confusion and classical distortions can result in user discomfort when using the technology. Evaluating AR quality of experience becomes essential to measure user satisfaction and engagement, facilitating the refinement necessary for creating immersive and robust experiences. Though, the scarcity of data and the distinctive characteristics of AR technology render the development of effective quality assessment metrics challenging. This paper presents a deep learning-based objective metric designed specifically for assessing image quality for AR scenarios. The approach entails four key steps, (1) fine-tuning a self-supervised pre-trained vision transformer to extract prominent features from reference images and distilling this knowledge to improve representations of distorted images, (2) quantifying distortions by computing shift representations, (3) employing cross-attention-based decoders to capture perceptual quality features, and (4) integrating regularization techniques and label smoothing to address the overfitting problem. To validate the proposed approach, we conduct extensive experiments on the ARIQA dataset. The results showcase the superior performance of our proposed approach across all model variants, namely TransformAR, TransformAR-KD, and TransformAR-KD+ in comparison to existing state-of-the-art methods. △ Less

Submitted 8 December, 2024; originally announced December 2024.

Comments: Submitted to the IEEE Transactions on Circuits and Systems for Video Technology

arXiv:2103.12547 [pdf, ps, other]

A new public Alsat-2B dataset for single-image super-resolution

Authors: Achraf Djerida, Khelifa Djerriri, Moussa Sofiane Karoui, Mohammed El Amin larabi

Abstract: Currently, when reliable training datasets are available, deep learning methods dominate the proposed solutions for image super-resolution. However, for remote sensing benchmarks, it is very expensive to obtain high spatial resolution images. Most of the super-resolution methods use down-sampling techniques to simulate low and high spatial resolution pairs and construct the training samples. To so… ▽ More Currently, when reliable training datasets are available, deep learning methods dominate the proposed solutions for image super-resolution. However, for remote sensing benchmarks, it is very expensive to obtain high spatial resolution images. Most of the super-resolution methods use down-sampling techniques to simulate low and high spatial resolution pairs and construct the training samples. To solve this issue, the paper introduces a novel public remote sensing dataset (Alsat2B) of low and high spatial resolution images (10m and 2.5m respectively) for the single-image super-resolution task. The high-resolution images are obtained through pan-sharpening. Besides, the performance of some super-resolution methods on the dataset is assessed based on common criteria. The obtained results reveal that the proposed scheme is promising and highlight the challenges in the dataset which shows the need for advanced methods to grasp the relationship between the low and high-resolution patches. △ Less

Submitted 21 March, 2021; originally announced March 2021.

Comments: This paper has been Accepted for publication in the International Geoscience and Remote Sensing Symposium (IGARSS 2021)

Showing 1–4 of 4 results for author: Larabi, M