-
VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance
Authors:
Mohammad Reza Taesiri,
Abhijay Ghildyal,
Saman Zadtootaghaj,
Nabajeet Barman,
Cor-Paul Bezemer
Abstract:
With video games now generating the highest revenues in the entertainment industry, optimizing game development workflows has become essential for the sector's sustained growth. Recent advancements in Vision-Language Models (VLMs) offer considerable potential to automate and enhance various aspects of game development, particularly Quality Assurance (QA), which remains one of the industry's most labor-intensive processes with limited automation options. To accurately evaluate the performance of VLMs in video game QA tasks and determine their effectiveness in handling real-world scenarios, there is a clear need for standardized benchmarks, as existing benchmarks are insufficient to address the specific requirements of this domain. To bridge this gap, we introduce VideoGameQA-Bench, a comprehensive benchmark that covers a wide array of game QA activities, including visual unit testing, visual regression testing, needle-in-a-haystack tasks, glitch detection, and bug report generation for both images and videos of various games. Code and data are available at: https://asgaardlab.github.io/videogameqa-bench/
Submitted 21 May, 2025;
originally announced May 2025.
-
DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module
Authors:
Krish Sharma,
Niyar R Barman,
Akshay Chaturvedi,
Nicholas Asher
Abstract:
We look at reasoning on GSM8k, a dataset of short texts presenting primary-school math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the data set may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160%. Even for models that have most likely memorized the data set, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out-of-distribution examples.
Submitted 7 March, 2025; v1 submitted 6 March, 2025;
originally announced March 2025.
-
Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities
Authors:
Abhijay Ghildyal,
Yuanhan Chen,
Saman Zadtootaghaj,
Nabajeet Barman,
Alan C. Bovik
Abstract:
The advent of AI has influenced many aspects of human life, from self-driving cars and intelligent chatbots to text-based image and video generation models capable of creating realistic images and videos based on user prompts (text-to-image, image-to-image, and image-to-video). AI-based methods for image and video super resolution, video frame interpolation, denoising, and compression have already gathered significant attention and interest in the industry and some solutions are already being implemented in real-world products and services. However, to achieve widespread integration and acceptance, AI-generated and enhanced content must be visually accurate, adhere to intended use, and maintain high visual quality to avoid degrading the end user's quality of experience (QoE).
One way to monitor and control the visual "quality" of AI-generated and -enhanced content is by deploying Image Quality Assessment (IQA) and Video Quality Assessment (VQA) models. However, most existing IQA and VQA models measure visual fidelity in terms of "reconstruction" quality against a pristine reference content and were not designed to assess the quality of "generative" artifacts. To address this, newer metrics and models have recently been proposed, but their performance evaluation and overall efficacy have been limited by datasets that were too small or otherwise lack representative content and/or distortion capacity; and by performance measures that can accurately report the success of an IQA/VQA model for "GenAI". This paper examines the current shortcomings and possibilities presented by AI-generated and enhanced image and video content, with a particular focus on end-user perceived quality. Finally, we discuss open questions and make recommendations for future work on the "GenAI" quality assessment problems, towards further progressing on this interesting and relevant field of research.
Submitted 19 October, 2024; v1 submitted 11 October, 2024;
originally announced October 2024.
-
AIM 2024 Challenge on UHD Blind Photo Quality Assessment
Authors:
Vlad Hosu,
Marcos V. Conde,
Lorenzo Agnolucci,
Nabajeet Barman,
Saman Zadtootaghaj,
Radu Timofte
Abstract:
We introduce the AIM 2024 UHD-IQA Challenge, a competition to advance the No-Reference Image Quality Assessment (NR-IQA) task for modern, high-resolution photos. The challenge is based on the recently released UHD-IQA Benchmark Database, which comprises 6,073 UHD-1 (4K) images annotated with perceptual quality ratings from expert raters. Unlike previous NR-IQA datasets, UHD-IQA focuses on highly aesthetic photos of superior technical quality, reflecting the ever-increasing standards of digital photography. This challenge aims to develop efficient and effective NR-IQA models. Participants are tasked with creating novel architectures and training strategies to achieve high predictive performance on UHD-1 images within a computational budget of 50G MACs. This enables model deployment on edge devices and scalable processing of extensive image collections. Winners are determined based on a combination of performance metrics, including correlation measures (SRCC, PLCC, KRCC), absolute error metrics (MAE, RMSE), and computational efficiency (G MACs). To excel in this challenge, participants leverage techniques like knowledge distillation, low-precision inference, and multi-scale training. By pushing the boundaries of NR-IQA for high-resolution photos, the UHD-IQA Challenge aims to stimulate the development of practical models that can keep pace with the rapidly evolving landscape of digital photography. The innovative solutions emerging from this competition will have implications for various applications, from photo curation and enhancement to image compression.
Submitted 24 September, 2024;
originally announced September 2024.
-
Foundation Models Boost Low-Level Perceptual Similarity Metrics
Authors:
Abhijay Ghildyal,
Nabajeet Barman,
Saman Zadtootaghaj
Abstract:
For full-reference image quality assessment (FR-IQA) using deep-learning approaches, the perceptual similarity score between a distorted image and a reference image is typically computed as a distance measure between features extracted from a pretrained CNN or more recently, a Transformer network. Often, these intermediate features require further fine-tuning or processing with additional neural network layers to align the final similarity scores with human judgments. So far, most IQA models based on foundation models have primarily relied on the final layer or the embedding for the quality score estimation. In contrast, this work explores the potential of utilizing the intermediate features of these foundation models, which have largely been unexplored so far in the design of low-level perceptual similarity metrics. We demonstrate that the intermediate features are comparatively more effective. Moreover, without requiring any training, these metrics can outperform both traditional and state-of-the-art learned metrics by utilizing distance measures between the features.
Submitted 12 January, 2025; v1 submitted 11 September, 2024;
originally announced September 2024.
-
LAR-IQA: A Lightweight, Accurate, and Robust No-Reference Image Quality Assessment Model
Authors:
Nasim Jamshidi Avanaki,
Abhijay Ghildyal,
Nabajeet Barman,
Saman Zadtootaghaj
Abstract:
Recent advancements in the field of No-Reference Image Quality Assessment (NR-IQA) using deep learning techniques demonstrate high performance across multiple open-source datasets. However, such models are typically very large and complex, making them poorly suited to real-world deployment, especially on resource- and battery-constrained mobile devices. To address this limitation, we propose a compact, lightweight NR-IQA model that achieves state-of-the-art (SOTA) performance on the ECCV AIM UHD-IQA challenge validation and test datasets while also being nearly 5.7 times faster than the fastest SOTA model. Our model features a dual-branch architecture, with each branch separately trained on synthetically and authentically distorted images, which enhances the model's generalizability across different distortion types. To improve robustness under diverse real-world visual conditions, we additionally incorporate multiple color spaces during the training process. We also demonstrate the higher accuracy of recently proposed Kolmogorov-Arnold Networks (KANs) for final quality regression as compared to the conventional Multi-Layer Perceptrons (MLPs). Our evaluation considering various open-source datasets highlights the practical, high-accuracy, and robust performance of our proposed lightweight model. Code: https://github.com/nasimjamshidi/LAR-IQA.
Submitted 6 September, 2024; v1 submitted 30 August, 2024;
originally announced August 2024.
-
MSLIQA: Enhancing Learning Representations for Image Quality Assessment through Multi-Scale Learning
Authors:
Nasim Jamshidi Avanaki,
Abhijay Ghildyal,
Nabajeet Barman,
Saman Zadtootaghaj
Abstract:
No-Reference Image Quality Assessment (NR-IQA) remains a challenging task due to the diversity of distortions and the lack of large annotated datasets. Many studies have attempted to tackle these challenges by developing more accurate NR-IQA models, often employing complex and computationally expensive networks, or by bridging the domain gap between various distortions to enhance performance on test datasets. In our work, we improve the performance of a generic lightweight NR-IQA model by introducing a novel augmentation strategy that boosts its performance by almost 28%. This augmentation strategy enables the network to better discriminate between different distortions in various parts of the image by zooming in and out. Additionally, the inclusion of test-time augmentation further enhances performance, making our lightweight network's results comparable to the current state-of-the-art models, simply through the use of augmentations.
Submitted 6 September, 2024; v1 submitted 29 August, 2024;
originally announced August 2024.
-
The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks
Authors:
Niyar R Barman,
Krish Sharma,
Ashhar Aziz,
Shashwat Bajpai,
Shwetangshu Biswas,
Vasu Sharma,
Vinija Jain,
Aman Chadha,
Amit Sheth,
Amitava Das
Abstract:
The rapid advancement of text-to-image generation systems, exemplified by models like Stable Diffusion, Midjourney, Imagen, and DALL-E, has heightened concerns about their potential misuse. In response, companies like Meta and Google have intensified their efforts to implement watermarking techniques on AI-generated images to curb the circulation of potentially misleading visuals. However, in this paper, we argue that current image watermarking methods are fragile and susceptible to being circumvented through visual paraphrase attacks. The proposed visual paraphraser operates in two steps. First, it generates a caption for the given image using KOSMOS-2, one of the latest state-of-the-art image captioning systems. Second, it passes both the original image and the generated caption to an image-to-image diffusion system. During the denoising step of the diffusion pipeline, the system generates a visually similar image that is guided by the text caption. The resulting image is a visual paraphrase and is free of any watermarks. Our empirical findings demonstrate that visual paraphrase attacks can effectively remove watermarks from images. This paper provides a critical assessment, empirically revealing the vulnerability of existing watermarking techniques to visual paraphrase attacks. While we do not propose solutions to this issue, this paper serves as a call to action for the scientific community to prioritize the development of more robust watermarking techniques. Our first-of-its-kind visual paraphrase dataset and accompanying code are publicly available.
Submitted 19 August, 2024;
originally announced August 2024.
-
AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results
Authors:
Marcos V. Conde,
Saman Zadtootaghaj,
Nabajeet Barman,
Radu Timofte,
Chenlong He,
Qi Zheng,
Ruoxi Zhu,
Zhengzhong Tu,
Haiqiang Wang,
Xiangguang Chen,
Wenhui Meng,
Xiang Pan,
Huiying Shi,
Han Zhu,
Xiaozhong Xu,
Lei Sun,
Zhenzhong Chen,
Shan Liu,
Zicheng Zhang,
Haoning Wu,
Yingjie Zhou,
Chunyi Li,
Xiaohong Liu,
Weisi Lin,
Guangtao Zhai
, et al. (11 additional authors not shown)
Abstract:
This paper reviews the AIS 2024 Video Quality Assessment (VQA) Challenge, focused on User-Generated Content (UGC). The aim of this challenge is to gather deep learning-based methods capable of estimating the perceptual quality of UGC videos. The user-generated videos from the YouTube UGC Dataset include diverse content (sports, games, lyrics, anime, etc.), quality and resolutions. The proposed methods must process 30 FHD frames under 1 second. In the challenge, a total of 102 participants registered, and 15 submitted code and models. The performance of the top-5 submissions is reviewed and provided here as a survey of diverse deep models for efficient video quality assessment of user-generated content.
Submitted 24 April, 2024;
originally announced April 2024.
-
Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution, Challenges, and Recommendations
Authors:
Nabajeet Barman,
Maria G. Martini,
Yuriy Reznik
Abstract:
The Bjøntegaard Delta (BD) method proposed in 2001 has become a popular tool for comparing video codec compression efficiency. It was initially proposed to compute bitrate and quality differences between two Rate-Distortion curves using PSNR as a distortion metric. Over the years, many works have calculated and reported BD results using other objective quality metrics such as SSIM, VMAF and, in some cases, even subjective ratings (mean opinion scores). However, the lack of consolidated literature explaining the metric, its evolution over the years, and a systematic evaluation of the same under different test conditions can result in a wrong interpretation of the BD results thus obtained.
Towards this end, this paper presents a detailed tutorial describing the BD method and example cases where the metric might fail. We also provide a detailed history of its evolution, including a discussion of various proposed improvements and variations over the last 20 years. In addition, we evaluate the various BD methods and their open-source implementations, considering different objective quality metrics and subjective ratings taking into account different RD characteristics. Based on our results, we present a set of recommendations on using existing BD metrics and various insights for possible exploration towards developing more effective tools for codec compression efficiency evaluation and comparison.
Submitted 8 January, 2024;
originally announced January 2024.
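To make the BD calculation described in the abstract above more concrete, here is a minimal sketch of the classic BD-rate as it is commonly described: fit a cubic to log-bitrate as a function of PSNR for each Rate-Distortion curve, integrate both fits over the overlapping PSNR range, and convert the average log-rate difference into a percentage. The function names, the restriction to exactly four RD points per curve, and the sample numbers are assumptions for illustration; this is not the tutorial's own implementation, nor any of the improved variants it reviews.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial through (xs, ys) at x (xs must be distinct)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rates_anchor, psnr_anchor, rates_test, psnr_test):
    """Average bitrate difference (%) of the test curve versus the anchor.

    Negative values mean the test codec needs less bitrate for the same PSNR.
    Assumes exactly four RD points per curve, so each interpolant is a cubic.
    """
    for seq in (rates_anchor, psnr_anchor, rates_test, psnr_test):
        if len(seq) != 4:
            raise ValueError("this sketch expects exactly four RD points")
    log_a = [math.log10(r) for r in rates_anchor]
    log_t = [math.log10(r) for r in rates_test]

    # Integrate only over the PSNR range covered by both curves.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    if hi <= lo:
        raise ValueError("RD curves do not overlap in PSNR")

    int_a = _simpson(lambda q: _lagrange(psnr_anchor, log_a, q), lo, hi)
    int_t = _simpson(lambda q: _lagrange(psnr_test, log_t, q), lo, hi)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_log_diff - 1.0) * 100.0


# Hypothetical RD points (kbps, dB); the test codec uses half the bitrate.
anchor_rates = [1000.0, 2000.0, 4000.0, 8000.0]
psnr = [32.0, 35.0, 38.0, 41.0]
test_rates = [r / 2.0 for r in anchor_rates]
print(bd_rate(anchor_rates, psnr, test_rates, psnr))
```

Working in the log-rate domain keeps the high-bitrate points from dominating the average. The sensitivity of the result to the choice of interpolation and to the overlap range is one of the situations in which, as the abstract notes, BD results can be misread.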
-
Counter Turing Test CT^2: AI-Generated Text Detection is Not as Easy as You May Think -- Introducing AI Detectability Index
Authors:
Megha Chakraborty,
S. M Towhidul Islam Tonmoy,
S M Mehedi Zaman,
Krish Sharma,
Niyar R Barman,
Chandan Gupta,
Shreya Gautam,
Tanay Kumar,
Vinija Jain,
Aman Chadha,
Amit P. Sheth,
Amitava Das
Abstract:
With the rise of prolific ChatGPT, the risk and consequences of AI-generated text have increased alarmingly. To address the inevitable question of ownership attribution for AI-generated artifacts, the US Copyright Office released a statement stating that 'If a work's traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it'. Furthermore, both the US and the EU governments have recently drafted their initial proposals regarding the regulatory framework for AI. Given this cynosural spotlight on generative AI, AI-generated text detection (AGTD) has emerged as a topic that has already received immediate attention in research, with some initial methods having been proposed, soon followed by the emergence of techniques to bypass detection. This paper introduces the Counter Turing Test (CT^2), a benchmark consisting of techniques aiming to offer a comprehensive evaluation of the robustness of existing AGTD techniques. Our empirical findings unequivocally highlight the fragility of the proposed AGTD methods under scrutiny. Amidst the extensive deliberations on policy-making for regulating AI development, it is of utmost importance to assess the detectability of content generated by LLMs. Thus, to establish a quantifiable spectrum facilitating the evaluation and ranking of LLMs according to their detectability levels, we propose the AI Detectability Index (ADI). We conduct a thorough examination of 15 contemporary LLMs, empirically demonstrating that larger LLMs tend to have a higher ADI, indicating they are less detectable compared to smaller LLMs. We firmly believe that ADI holds significant value as a tool for the wider NLP community, with the potential to serve as a rubric in AI-related policy-making.
Submitted 23 October, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
A Subjective Dataset for Multi-Screen Video Streaming Applications
Authors:
Nabajeet Barman,
Yuriy Reznik,
Maria G. Martini
Abstract:
In modern-era video streaming systems, videos are streamed and displayed on a wide range of devices. Such devices vary from large-screen UHD and HDTVs to medium-screen Desktop PCs and Laptops to smaller-screen devices such as mobile phones and tablets. It is well known that a video is perceived differently when displayed on different devices. The viewing experience for a particular video on smaller screen devices such as smartphones and tablets, which have high pixel density, will be different with respect to the case where the same video is played on a large screen device such as a TV or PC monitor. Being able to model such relative differences in perception effectively can help in the design of better quality metrics and in the design of more efficient and optimized encoding profiles, leading to lower storage, encoding, and transmission costs. This paper presents a new, open-source dataset consisting of subjective ratings for various encoded video sequences of different resolutions and bitrates (quality) when viewed on three devices of varying screen sizes: TV, Tablet, and Mobile. Along with the subjective scores, an evaluation of some of the most famous and commonly used open-source objective quality metrics is also presented. It is observed that the performance of the metrics varies a lot across different device types, with the recently standardized ITU-T P.1204.3 Model, on average, outperforming their full-reference counterparts. The dataset consisting of the videos, along with their subjective and objective scores, is available freely on Github at https://github.com/NabajeetBarman/Multiscreen-Dataset.
Submitted 22 June, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
Datasheet for Subjective and Objective Quality Assessment Datasets
Authors:
Nabajeet Barman,
Yuriy Reznik,
Maria Martini
Abstract:
Over the years, many subjective and objective quality assessment datasets have been created and made available to the research community. However, there is no standard process for documenting the various aspects of the dataset, such as details about the source sequences, number of test subjects, test methodology, encoding settings, etc. Such information is often of great importance to the users of the dataset as it can help them get a quick understanding of the motivation and scope of the dataset. Without such a template, it is left to each reader to collate the information from the relevant publication or website, which is a tedious and time-consuming process. In some cases, the absence of a template to guide the documentation process can result in an unintentional omission of some important information.
This paper addresses this simple but significant gap by proposing a datasheet template for documenting various aspects of subjective and objective quality assessment datasets for multimedia data. The contributions presented in this work aim to simplify the documentation process for existing and new datasets and improve their reproducibility. The proposed datasheet template is available on GitHub, along with a few sample datasheets of a few open-source audiovisual subjective and objective datasets.
Submitted 3 May, 2023;
originally announced May 2023.
-
Codec Compression Efficiency Evaluation of MPEG-5 part 2 (LCEVC) using Objective and Subjective Quality Assessment
Authors:
Nabajeet Barman,
Steven Schmidt,
Saman Zadtootaghaj,
Maria G Martini
Abstract:
With the increasing advancements in video compression efficiency achieved by newer codecs such as HEVC, AV1, and VVC, and intelligent encoding strategies, as well as improved bandwidth availability, there has been a proliferation and acceptance of newer services such as Netflix, Twitch, etc. However, such higher compression efficiencies are achieved at the cost of higher complexity and encoding delay, while many applications are delay sensitive. Hence, there is a requirement for faster, more efficient codecs to achieve higher encoding efficiency without significant trade-off in terms of both complexity and speed. We present in this work an evaluation of the latest MPEG-5 Part 2 Low Complexity Enhancement Video Coding (LCEVC) for live gaming video streaming applications. The results are presented in terms of bitrate savings using both subjective and objective quality measures as well as a comparison of the encoding speeds. Our results indicate that, for the encoding settings used in this work, LCEVC outperforms both x264 and x265 codecs in terms of bitrate savings using VMAF by approximately 42% and 38%, respectively. Using subjective results, it is found that LCEVC outperforms the respective base codecs, especially for low bitrates. This effect is more evident for x264 than for x265, i.e., for the latter the absolute improvement of quality scores is smaller. The objective and subjective results as well as sample video sequences are made available as part of an open dataset, LCEVC-LiveGaming at https://github.com/NabajeetBarman/LCEVC-LiveGaming.
Submitted 12 April, 2022;
originally announced April 2022.
-
User Generated HDR Gaming Video Streaming: Dataset, Codec Comparison and Challenges
Authors:
Nabajeet Barman,
Maria G Martini
Abstract:
Gaming video streaming services have grown tremendously in the past few years, with higher resolutions, higher frame rates and HDR gaming videos getting increasingly adopted among the gaming community. Since gaming content as such is different from non-gaming content, it is imperative to evaluate the performance of the existing encoders to help understand the bandwidth requirements of such services, as well as further improve the compression efficiency of such encoders. Towards this end, we present in this paper GamingHDRVideoSET, a dataset consisting of eighteen 10-bit UHD-HDR gaming videos and encoded video sequences using four different codecs, together with their objective evaluation results. The dataset is available online at [to be added after paper acceptance]. Additionally, the paper discusses the codec compression efficiency of the most widely used practical encoders, i.e., x264 (H.264/AVC), x265 (H.265/HEVC) and libvpx (VP9), as well as the recently proposed encoder libaom (AV1), on 10-bit UHD-HDR gaming content. Our results show that the latest compression standard AV1 results in the best compression efficiency, followed by HEVC, H.264, and VP9.
Submitted 3 March, 2021;
originally announced March 2021.
-
QoE Management of Multimedia Streaming Services in Future Networks: A Tutorial and Survey
Authors:
Alcardo Alex Barakabitze,
Nabajeet Barman,
Arslan Ahmad,
Saman Zadtootaghaj,
Lingfen Sun,
Maria G. Martini,
Luigi Atzori
Abstract:
We provide in this paper a tutorial and a comprehensive survey of QoE management solutions in current and future networks. We start with a high-level description of QoE management for multimedia services, which integrates QoE modelling, monitoring, and optimization. This is followed by a discussion of HTTP Adaptive Streaming (HAS) solutions as the dominant technique for streaming videos over the best-effort Internet. We then summarize the key elements in SDN/NFV along with an overview of ongoing research projects, standardization activities and use cases related to SDN, NFV, and other emerging applications. We provide a survey of the state-of-the-art of QoE management techniques categorized into three different groups: a) QoE-aware/driven strategies using SDN and/or NFV; b) QoE-aware/driven approaches for adaptive streaming over emerging architectures such as multi-access edge computing, cloud/fog computing, and information-centric networking; and c) extended QoE management approaches in new domains such as immersive augmented and virtual reality, mulsemedia and video gaming applications. Based on the review, we present a list of identified future QoE management challenges regarding emerging multimedia applications, network management and orchestration, network slicing and collaborative service management in softwarized networks. Finally, we provide a discussion on future research directions with a focus on emerging research areas in QoE management, such as QoE-oriented business models, QoE-based big data strategies, and scalability issues in QoE optimization.
Submitted 28 December, 2019;
originally announced December 2019.
-
A Novel Approach of Harris Corner Detection of Noisy Images using Adaptive Wavelet Thresholding Technique
Authors:
Nilanjan Dey,
Pradipti Nandi,
Nilanjana Barman
Abstract:
In this paper, we propose a corner detection method for obtaining the features required to track and recognize objects within a noisy image. Corner detection in noisy images is a challenging task in image processing. Natural images often get corrupted by noise during acquisition and transmission. Because corner detection on such noisy images does not give the desired results, de-noising is required. An adaptive wavelet thresholding approach is applied for this purpose.
Submitted 13 September, 2012;
originally announced September 2012.
-
A Comparative Study between Moravec and Harris Corner Detection of Noisy Images Using Adaptive Wavelet Thresholding Technique
Authors:
Nilanjan Dey,
Pradipti Nandi,
Nilanjana Barman,
Debolina Das,
Subhabrata Chakraborty
Abstract:
In this paper, a comparative study of Moravec and Harris corner detection is carried out for obtaining the features required to track and recognize objects within a noisy image. Corner detection in noisy images is a challenging task in image processing. Natural images often get corrupted by noise during acquisition and transmission. Because corner detection on such noisy images does not give the desired results, de-noising is required. An adaptive wavelet thresholding approach is applied for this purpose.
Submitted 7 September, 2012;
originally announced September 2012.