Showing 1–14 of 14 results for author: Shoman, M

  1. arXiv:2508.21080  [pdf, ps, other]

    cs.CV cs.RO

    2COOOL: 2nd Workshop on the Challenge Of Out-Of-Label Hazards in Autonomous Driving

    Authors: Ali K. AlShami, Ryan Rabinowitz, Maged Shoman, Jianwu Fang, Lukas Picek, Shao-Yuan Lo, Steve Cruz, Khang Nhut Lam, Nachiket Kamod, Lei-Lei Li, Jugal Kalita, Terrance E. Boult

    Abstract: As the computer vision community advances autonomous driving algorithms, integrating vision-based insights with sensor data remains essential for improving perception, decision making, planning, prediction, simulation, and control. Yet we must ask: Why don't we have entirely safe self-driving cars yet? A key part of the answer lies in addressing novel scenarios, one of the most critical barriers t…

    Submitted 18 August, 2025; originally announced August 2025.

    Comments: 11 pages, 2 figures, Accepted to ICCV 2025 Workshop on Out-of-Label Hazards in Autonomous Driving (2COOOL)

    MSC Class: 68T45 (Machine vision and scene understanding) ACM Class: I.2.10; I.4.8

  2. arXiv:2507.00951  [pdf, ps, other]

    cs.AI

    Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

    Authors: Rizwan Qureshi, Ranjan Sapkota, Abbas Shah, Amgad Muneer, Anas Zafar, Ashmal Vayani, Maged Shoman, Abdelrahman B. M. Eldaly, Kai Zhang, Ferhat Sadak, Shaina Raza, Xinqi Fan, Ravid Shwartz-Ziv, Hong Yan, Vinija Jain, Aman Chadha, Manoj Karkee, Jia Wu, Seyedali Mirjalili

    Abstract: Can machines truly think, reason and act in domains like humans? This enduring question continues to shape the pursuit of Artificial General Intelligence (AGI). Despite the growing capabilities of models such as GPT-4.5, DeepSeek, Claude 3.5 Sonnet, Phi-4, and Grok 3, which exhibit multimodal fluency and partial reasoning, these systems remain fundamentally limited by their reliance on token-level…

    Submitted 11 July, 2025; v1 submitted 1 July, 2025; originally announced July 2025.

  3. arXiv:2506.07055  [pdf, ps, other]

    cs.CV

    A Layered Self-Supervised Knowledge Distillation Framework for Efficient Multimodal Learning on the Edge

    Authors: Tarique Dahri, Zulfiqar Ali Memon, Zhenyu Yu, Mohd. Yamani Idna Idris, Sheheryar Khan, Sadiq Ahmad, Maged Shoman, Saddam Aziz, Rizwan Qureshi

    Abstract: We introduce the Layered Self-Supervised Knowledge Distillation (LSSKD) framework for training compact deep learning models. Unlike traditional methods that rely on pre-trained teacher networks, our approach appends auxiliary classifiers to intermediate feature maps, generating diverse self-supervised knowledge and enabling one-to-one transfer across different network stages. Our method achieves an av…

    Submitted 8 June, 2025; originally announced June 2025.

  4. arXiv:2502.08650  [pdf, ps, other]

    cs.CY

    Who is Responsible? The Data, Models, Users or Regulations? A Comprehensive Survey on Responsible Generative AI for a Sustainable Future

    Authors: Shaina Raza, Rizwan Qureshi, Anam Zahid, Safiullah Kamawal, Ferhat Sadak, Joseph Fioresi, Muhammaed Saeed, Ranjan Sapkota, Aditya Jain, Anas Zafar, Muneeb Ul Hassan, Aizan Zafar, Hasan Maqbool, Ashmal Vayani, Jia Wu, Maged Shoman

    Abstract: Generative AI is moving rapidly from research into real-world deployment across sectors, which elevates the need for responsible development, deployment, evaluation, and governance. To address this pressing challenge, in this study, we synthesize the landscape of responsible generative AI across methods, benchmarks, and policies, and connect governance expectations to concrete engineering practic…

    Submitted 24 September, 2025; v1 submitted 15 January, 2025; originally announced February 2025.

    Comments: under review

  5. arXiv:2501.18648  [pdf, other]

    cs.CV

    Multimodal Large Language Models for Image, Text, and Speech Data Augmentation: A Survey

    Authors: Ranjan Sapkota, Shaina Raza, Maged Shoman, Achyut Paudel, Manoj Karkee

    Abstract: In the past five years, research has shifted from traditional Machine Learning (ML) and Deep Learning (DL) approaches to leveraging Large Language Models (LLMs), including multimodality, for data augmentation to enhance generalization and combat overfitting in training deep convolutional neural networks. However, while existing surveys predominantly focus on ML and DL techniques or limited modal…

    Submitted 21 March, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

    Comments: 52 pages

  6. YOLO advances to its genesis: a decadal and comprehensive review of the You Only Look Once (YOLO) series

    Authors: Ranjan Sapkota, Marco Flores Calero, Rizwan Qureshi, Chetan Badgujar, Upesh Nepal, Alwin Poulose, Peter Zeno, Uday Bhanu Prakash Vaddevolu, Sheheryar Khan, Maged Shoman, Hong Yan, Manoj Karkee

    Abstract: This review systematically examines the progression of the You Only Look Once (YOLO) object detection algorithms from YOLOv1 to the recently unveiled YOLOv12. Employing a reverse chronological analysis, this study examines the advancements introduced by YOLO algorithms, beginning with YOLOv12 and progressing through YOLO11 (or YOLOv11), YOLOv10, YOLOv9, YOLOv8, and subsequent versions to explore e…

    Submitted 13 June, 2025; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Published in Artificial Intelligence Review as https://doi.org/10.1007/s10462-025-11253-3

    Journal ref: Artificial Intelligence Review, SpringerNature, 2025

  7. arXiv:2404.10078  [pdf, other]

    cs.CV

    Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets

    Authors: Dai Quoc Tran, Armstrong Aboah, Yuntae Jeon, Maged Shoman, Minsoo Park, Seunghee Park

    Abstract: This study addresses the evolving challenges in urban traffic monitoring detection systems based on fisheye lens cameras by proposing a framework that improves the efficacy and accuracy of these systems. In the context of urban infrastructure and transportation management, advanced traffic monitoring systems have become critical for managing the complexities of urbanization and increasing vehicle…

    Submitted 15 April, 2024; originally announced April 2024.

  8. arXiv:2404.08229  [pdf, other]

    cs.CV

    Enhancing Traffic Safety with Parallel Dense Video Captioning for End-to-End Event Analysis

    Authors: Maged Shoman, Dongdong Wang, Armstrong Aboah, Mohamed Abdel-Aty

    Abstract: This paper introduces our solution for Track 2 in AI City Challenge 2024. The task aims to solve traffic safety description and analysis with the dataset of Woven Traffic Safety (WTS), a real-world Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding. Our solution mainly focuses on the following points: 1) To solve dense video captioning, we leverage the framewo…

    Submitted 12 April, 2024; originally announced April 2024.

  9. arXiv:2311.05120  [pdf]

    cs.CL

    Quranic Conversations: Developing a Semantic Search tool for the Quran using Arabic NLP Techniques

    Authors: Yasser Shohoud, Maged Shoman, Sarah Abdelazim

    Abstract: The Holy Book of Quran is believed to be the literal word of God (Allah) as revealed to the Prophet Muhammad (PBUH) over a period of approximately 23 years. It is the book where God provides guidance on how to live a righteous and just life, emphasizing principles like honesty, compassion, charity and justice, as well as providing rules for personal conduct, family matters, business ethics and muc…

    Submitted 8 November, 2023; originally announced November 2023.

  10. arXiv:2305.07454  [pdf]

    cs.DC

    Accelerating Statewide Connected Vehicles Big (Sensor Fusion) Data ETL Pipelines on GPUs

    Authors: Abdul Rashid Mussah, Maged Shoman, Mark Amo-Boateng, Yaw Adu-Gyamfi

    Abstract: Real-time traffic and sensor data from connected vehicles have the potential to provide insights that will lead to the immediate benefit of efficient management of the transportation infrastructure and related adjacent services. However, the growth of electric vehicles (EVs) and connected vehicles (CVs) has generated an abundance of CV data and sensor data that has put a strain on the processing c…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: Presented at the 102nd Transportation Research Board (TRB) Annual Meeting

  11. arXiv:2211.08541  [pdf, other]

    cs.CV eess.SP

    GC-GRU-N for Traffic Prediction using Loop Detector Data

    Authors: Maged Shoman, Armstrong Aboah, Abdulateef Daud, Yaw Adu-Gyamfi

    Abstract: Because traffic characteristics display stochastic nonlinear spatiotemporal dependencies, traffic prediction is a challenging task. In this paper, we develop a graph convolution gated recurrent unit (GC-GRU-N) network to extract the essential spatiotemporal features. We use Seattle loop detector data aggregated over 15 minutes and reframe the problem through space and time. The model performance is c…

    Submitted 13 November, 2022; originally announced November 2022.

  12. arXiv:2204.08584  [pdf]

    cs.CV

    A Region-Based Deep Learning Approach to Automated Retail Checkout

    Authors: Maged Shoman, Armstrong Aboah, Alex Morehead, Ye Duan, Abdulateef Daud, Yaw Adu-Gyamfi

    Abstract: Automating the product checkout process at conventional retail stores is a task poised to have a large impact on society. Towards this end, reliable deep learning models that enable automated product counting for fast customer checkout can make this goal a reality. In this work, we propose a novel, region-based deep learning approach to automate product counting using a customize…

    Submitted 18 April, 2022; originally announced April 2022.

  13. Exploring Preferences for Transportation Modes in the City of Munich after the Recent Incorporation of Ride-Hailing Companies

    Authors: Maged Shoman, Ana Tsui Moreno

    Abstract: The growth of ridehailing (RH) companies over the past few years has affected urban mobility in numerous ways. Despite widespread claims about the benefits of such services, limited research has been conducted on the topic. This paper assesses the willingness of Munich transportation users to pay for RH services. Realizing the difficulty of obtaining data directly from RH companies, a stated prefe…

    Submitted 28 January, 2022; originally announced January 2022.

    Report number: Vol. 2675(5) 329--338

    Journal ref: Transportation Research Record 2021

  14. arXiv:2104.06856  [pdf]

    cs.CV

    A Vision-based System for Traffic Anomaly Detection using Deep Learning and Decision Trees

    Authors: Armstrong Aboah, Maged Shoman, Vishal Mandal, Sayedomidreza Davami, Yaw Adu-Gyamfi, Anuj Sharma

    Abstract: Any intelligent traffic monitoring system must be able to detect anomalies such as traffic accidents in real time. In this paper, we propose a Decision-Tree-enabled approach powered by Deep Learning for extracting anomalies from traffic cameras while accurately estimating the start and end time of the anomalous event. Our approach included creating a detection model, followed by anomaly detectio…

    Submitted 14 April, 2021; originally announced April 2021.