
Showing 1–9 of 9 results for author: Bastola, A

  1. arXiv:2501.17329  [pdf, other]

    cs.MA cs.AI cs.LG

    Anomaly Detection in Cooperative Vehicle Perception Systems under Imperfect Communication

    Authors: Ashish Bastola, Hao Wang, Abolfazl Razi

    Abstract: Anomaly detection is a critical requirement for ensuring safety in autonomous driving. In this work, we leverage Cooperative Perception to share information across nearby vehicles, enabling more accurate identification and consensus of anomalous behaviors in complex traffic scenarios. To account for the real-world challenge of imperfect communication, we propose a cooperative-perception-based anom…

    Submitted 28 January, 2025; originally announced January 2025.

    Comments: 10 pages

  2. arXiv:2501.00944  [pdf, other]

    cs.CV eess.IV

    Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

    Authors: Hao Wang, Xiwen Chen, Ashish Bastola, Jiayou Qin, Abolfazl Razi

    Abstract: The emergence of generative AI and controllable diffusion has made image-to-image synthesis increasingly practical and efficient. However, when input images exhibit low entropy and sparsity, the inherent characteristics of diffusion models often result in limited diversity. This constraint significantly interferes with data augmentation. To address this, we propose Diffusion Prism, a training-free f…

    Submitted 10 January, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

  3. arXiv:2411.13040  [pdf, other]

    cs.CV

    RobustFormer: Noise-Robust Pre-training for images and videos

    Authors: Ashish Bastola, Nishant Luitel, Hao Wang, Danda Pani Paudel, Roshani Poudel, Abolfazl Razi

    Abstract: While deep learning models are powerful tools that revolutionized many areas, they are also vulnerable to noise as they rely heavily on learning patterns and features from the exact details of the clean data. Transformers, which have become the backbone of modern vision models, are no exception. Current Discrete Wavelet Transforms (DWT) based methods do not benefit from masked autoencoder (MAE) pr…

    Submitted 20 November, 2024; originally announced November 2024.

    Comments: 13 pages

  4. Motor Focus: Fast Ego-Motion Prediction for Assistive Visual Navigation

    Authors: Hao Wang, Jiayou Qin, Xiwen Chen, Ashish Bastola, John Suchanek, Zihao Gong, Abolfazl Razi

    Abstract: Assistive visual navigation systems for visually impaired individuals have become increasingly popular thanks to the rise of mobile computing. Most of these devices work by translating visual information into voice commands. In complex scenarios where multiple objects are present, it is imperative to prioritize object detection and provide immediate notifications for key entities in specific direc…

    Submitted 12 October, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  5. arXiv:2403.17331  [pdf, other]

    cs.DC cs.IT

    FedMIL: Federated-Multiple Instance Learning for Video Analysis with Optimized DPP Scheduling

    Authors: Ashish Bastola, Hao Wang, Xiwen Chen, Abolfazl Razi

    Abstract: Many AI platforms, including traffic monitoring systems, use Federated Learning (FL) for decentralized sensor data processing for learning-based applications while preserving privacy and ensuring secured information transfer. On the other hand, applying supervised learning to large data samples, like high-resolution images, requires intensive human labor to label different parts of a data sample. M…

    Submitted 25 March, 2024; originally announced March 2024.

  6. arXiv:2403.12415  [pdf, other]

    cs.CV cs.HC

    VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation

    Authors: Hao Wang, Jiayou Qin, Ashish Bastola, Xiwen Chen, John Suchanek, Zihao Gong, Abolfazl Razi

    Abstract: This paper explores the potential of Large Language Models (LLMs) in zero-shot anomaly detection for safe visual navigation. With the assistance of the state-of-the-art real-time open-world object detection model Yolo-World and specialized prompts, the proposed framework can identify anomalies within camera-captured frames that include any possible obstacles, then generate concise, audio-delivered…

    Submitted 18 March, 2024; originally announced March 2024.

  7. arXiv:2403.03463  [pdf, other]

    cs.CV

    FLAME Diffuser: Wildfire Image Synthesis using Mask Guided Diffusion

    Authors: Hao Wang, Sayed Pedram Haeri Boroujeni, Xiwen Chen, Ashish Bastola, Huayu Li, Wenhui Zhu, Abolfazl Razi

    Abstract: Wildfires are a significant threat to ecosystems and human infrastructure, leading to widespread destruction and environmental degradation. Recent advancements in deep learning and generative models have enabled new methods for wildfire detection and monitoring. However, the scarcity of annotated wildfire images limits the development of robust models for these tasks. In this work, we present the…

    Submitted 30 September, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  8. arXiv:2401.14571  [pdf, other]

    cs.HC cs.AI cs.CY

    Driving Towards Inclusion: A Systematic Review of AI-powered Accessibility Enhancements for People with Disability in Autonomous Vehicles

    Authors: Ashish Bastola, Hao Wang, Sayed Pedram Haeri Boroujeni, Julian Brinkley, Ata Jahangir Moshayedi, Abolfazl Razi

    Abstract: This paper provides a comprehensive and, to our knowledge, the first review of inclusive human-computer interaction (HCI) within autonomous vehicles (AVs) and human-driven cars with partial autonomy, emphasizing accessibility and user-centered design principles. We explore the current technologies and HCI systems designed to enhance passenger experience, particularly for individuals with accessibi…

    Submitted 9 January, 2025; v1 submitted 25 January, 2024; originally announced January 2024.

  9. arXiv:2306.11980  [pdf, other]

    cs.HC

    LLM-based Smart Reply (LSR): Enhancing Collaborative Performance with ChatGPT-mediated Smart Reply System

    Authors: Ashish Bastola, Hao Wang, Judsen Hembree, Pooja Yadav, Zihao Gong, Emma Dixon, Abolfazl Razi, Nathan McNeese

    Abstract: Interactive user interfaces have increasingly explored AI's role in enhancing communication efficiency and productivity in collaborative tasks. The emergence of Large Language Models (LLMs) such as ChatGPT has revolutionized conversational agents, employing advanced deep learning techniques to generate context-aware, coherent, and personalized responses. Consequently, LLM-based AI assistants provi…

    Submitted 4 March, 2024; v1 submitted 20 June, 2023; originally announced June 2023.