-
Analysing the Robustness of Vision-Language-Models to Common Corruptions
Authors:
Muhammad Usama,
Syeda Aishah Asim,
Syed Bilal Ali,
Syed Talal Wasim,
Umair Bin Mansoor
Abstract:
Vision-language models (VLMs) have demonstrated impressive capabilities in understanding and reasoning about visual and textual content. However, their robustness to common image corruptions remains under-explored. In this work, we present the first comprehensive analysis of VLM robustness across 19 corruption types from the ImageNet-C benchmark, spanning four categories: noise, blur, weather, and digital distortions. We introduce two new benchmarks, TextVQA-C and GQA-C, to systematically evaluate how corruptions affect scene text understanding and object-based reasoning, respectively. Our analysis reveals that transformer-based VLMs exhibit distinct vulnerability patterns across tasks: text recognition deteriorates most severely under blur and snow corruptions, while object reasoning shows higher sensitivity to corruptions such as frost and impulse noise. We connect these observations to the frequency-domain characteristics of different corruptions, revealing how transformers' inherent bias toward low-frequency processing explains their differential robustness patterns. Our findings provide valuable insights for developing more corruption-robust vision-language models for real-world applications.
Submitted 21 April, 2025; v1 submitted 18 April, 2025;
originally announced April 2025.
-
Experimental Evaluation of an SDN Controller for Open Optical-circuit-switched Networks
Authors:
Kazuya Anazawa,
Takeru Inoue,
Toru Mano,
Hiroshi Ou,
Hirotaka Ujikawa,
Dmitrii Briantcev,
Sumaiya Binte Ali,
Devika Dass,
Hideki Nishizawa,
Yoshiaki Sone,
Eoin Kenny,
Marco Ruffini,
Daniel Kilper,
Eiji Oki,
Koichi Takasugi
Abstract:
Open optical networks are considered important for building and operating networks cost-effectively. Recently, optical circuit switches (OCSes) have attracted attention from industry and academia because of their cost efficiency and higher capacity compared with traditional electrical packet switches (EPSes) and reconfigurable optical add-drop multiplexers (ROADMs). Although open interfaces and control planes for traditional ROADMs and transponders have been defined by several standard-defining organizations (SDOs), those for OCSes have not. Considering that several OCSes have already been installed in production datacenter networks (DCNs) and several OCS products are on the market, bringing openness and interoperability to OCS-based networks has become important. Motivated by this, this paper investigates a software-defined networking (SDN) controller for open optical-circuit-switched networks. To this end, we identified the use cases of OCSes and derived the controller requirements for supporting them. We then proposed a multi-vendor (MV) OCS controller framework that satisfies the derived requirements; it was designed to operate fiber paths quickly and consistently upon receiving operation requests. We validated our controller by implementing it and evaluating its performance on actual MV-OCS networks. It satisfied all the requirements, and fiber paths could be configured within 1.0 second using our controller.
Submitted 29 April, 2025; v1 submitted 28 January, 2025;
originally announced January 2025.
-
A Stochastic Rounding-Enabled Low-Precision Floating-Point MAC for DNN Training
Authors:
Sami Ben Ali,
Silviu-Ioan Filip,
Olivier Sentieys
Abstract:
Training Deep Neural Networks (DNNs) can be computationally demanding, particularly when dealing with large models. Recent work has aimed to mitigate this computational challenge by introducing 8-bit floating-point (FP8) formats for multiplication. However, accumulations are still done in either half (16-bit) or single (32-bit) precision arithmetic. In this paper, we investigate lowering accumulator word length while maintaining the same model accuracy. We present a multiply-accumulate (MAC) unit with FP8 multiplier inputs and FP12 accumulations, which leverages an optimized stochastic rounding (SR) implementation to mitigate swamping errors that commonly arise during low-precision accumulations. We investigate the hardware implications and accuracy impact associated with varying the number of random bits used for rounding operations. We additionally attempt to reduce MAC area and power by proposing a new scheme to support SR in floating-point MAC and by removing support for subnormal values. Our optimized eager SR unit significantly reduces delay and area when compared to a classic lazy SR design. Moreover, when compared to MACs utilizing single- or half-precision adders, our design showcases notable savings in all metrics. Furthermore, our approach consistently maintains near-baseline accuracy across a diverse range of computer vision tasks, making it a promising alternative for low-precision DNN training.
Submitted 26 September, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
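The swamping effect named in the abstract above can be illustrated with a small software simulation. The sketch below is not the paper's hardware design: the function names, the 7-bit significand, and the unbounded exponent range are illustrative assumptions, and round-to-nearest here breaks ties upward rather than to even. It only shows why stochastic rounding keeps small addends from being lost when the accumulator has few significand bits.

```python
import math
import random


def round_to_precision(x, mantissa_bits, rng=None):
    """Round x to a value with `mantissa_bits` significand bits.

    Hypothetical helper for illustration only.
    rng is None  -> round to nearest (ties rounded up, for simplicity).
    rng is given -> stochastic rounding: round up with probability equal
                    to the fractional distance to the lower neighbour.
    The exponent range is left unbounded (no overflow or subnormals).
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                 # x == m * 2**e with 0.5 <= |m| < 1
    scaled = m * (1 << mantissa_bits)    # significand expressed in units of one ulp
    lower = math.floor(scaled)
    frac = scaled - lower                # in [0, 1)
    if rng is None:
        round_up = frac >= 0.5
    else:
        round_up = rng.random() < frac
    return math.ldexp(lower + (1 if round_up else 0), e - mantissa_bits)


def accumulate(values, mantissa_bits, rng=None):
    """Sum `values`, rounding the running total after every addition."""
    acc = 0.0
    for v in values:
        acc = round_to_precision(acc + v, mantissa_bits, rng)
    return acc


if __name__ == "__main__":
    # One large value followed by many small ones; the nominal sum is 11.
    values = [1.0] + [0.001] * 10000
    print("round to nearest:", accumulate(values, 7))
    print("stochastic      :", accumulate(values, 7, random.Random(0)))
```

With round-to-nearest, every small addend falls below half a unit in the last place of the accumulator and is discarded, so the total stalls at the first value. With stochastic rounding each addition is unbiased in expectation, so the total tracks the true sum up to random fluctuation.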
-
State-of-the-art Models for Object Detection in Various Fields of Application
Authors:
Syed Ali John Naqvi,
Syed Bazil Ali
Abstract:
We present a list of datasets and their best models with the goal of advancing the state of the art in object detection by placing the question of object recognition in the context of the two types of state-of-the-art methods: one-stage methods and two-stage methods. We provide an in-depth statistical analysis of five top datasets (COCO minival, COCO test, Pascal VOC 2007, ADE20K, and ImageNet) in light of recent developments in granulated deep learning models. The datasets were handpicked after closely comparing them with the rest in terms of diversity, data quality, minimal bias, labeling quality, etc. More importantly, our work identifies the best combinations of these datasets with the models that have emerged in the last two years. It lists the top models and their optimal use cases for each of the respective datasets. We provide a comprehensive overview of a variety of both generic and specific object detection models, listing comparative results such as inference time and box average precision (AP) at different Intersection over Union (IoU) thresholds and for objects of different sizes. The qualitative and quantitative analysis will allow experts to achieve new performance records using the best combination of datasets and models.
Submitted 1 November, 2022;
originally announced November 2022.
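The Intersection over Union measure mentioned in the abstract above is the overlap criterion used to decide whether a predicted box matches a ground-truth box. A minimal sketch follows, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` corner tuples; the function name `box_iou` and the box layout are illustrative choices, not taken from the paper.

```python
def box_iou(a, b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])

    # Width and height are clamped at zero so disjoint boxes give no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes are treated as not overlapping.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection is typically counted as correct when its IoU with a ground-truth box meets a chosen threshold, which is why average precision is reported at several thresholds.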
-
Improved lung segmentation based on U-Net architecture and morphological operations
Authors:
S Ali John Naqvi,
Abdullah Tauqeer,
Rohaib Bhatti,
S Bazil Ali
Abstract:
An essential stage in computer-aided diagnosis of chest X-rays is automated lung segmentation. Due to rib cages and the unique modalities of each person's lungs, it is essential to construct an effective automated lung segmentation model. This paper presents a reliable model for the segmentation of lungs in chest radiographs. Our model overcomes these challenges by learning to ignore unimportant areas in the source chest radiograph and to emphasize features that are important for lung segmentation. We evaluate our model on two public datasets, Montgomery and Shenzhen. The proposed model achieves a Dice coefficient of 98.1 percent, which demonstrates its reliability.
Submitted 19 October, 2022;
originally announced October 2022.
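The Dice coefficient reported in the abstract above measures the overlap between a predicted mask P and a ground-truth mask T as 2|P ∩ T| / (|P| + |T|). The sketch below computes it for flattened binary masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Foreground pixel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

A value of 1 means the two masks coincide and 0 means they share no foreground pixels, so a score of 98.1 percent indicates near-complete agreement with the reference segmentation.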
-
A quality model for evaluating and choosing a stream processing framework architecture
Authors:
Youness Dendane,
Fabio Petrillo,
Hamid Mcheick,
Souhail Ben Ali
Abstract:
Today, we have to deal with large volumes of data (big data), and we need to choose an architectural framework to analyze data coming from different areas. Processing these data becomes problematic, and even more so when the data are continuous. To process data, you first have to receive it, store it, and then query it. This is what we call batch processing. It works well when you process large amounts of data, but it reaches its limits when you need fast (or real-time) results, as with financial trades, sensors, user session activity, etc. The solution to this problem is stream processing. In stream processing, data arrive record by record and are processed directly rather than being stored first. Results are therefore needed immediately, with a latency that may vary in real time.
In this paper, we propose a quality assessment model to evaluate and choose stream processing frameworks. We briefly describe different architectural frameworks that address stream processing, such as Kafka, Spark Streaming, and Flink. Using our quality model, we present a decision tree that helps engineers choose a framework according to the quality aspects. Finally, we evaluate our model through a case study of Twitter and Netflix streaming.
Submitted 25 January, 2019;
originally announced January 2019.