
Showing 1–50 of 281 results for author: Joshi, A

Searching in archive cs.
  1. arXiv:2507.05146  [pdf, ps, other]

    cs.CV cs.LG

    VERITAS: Verification and Explanation of Realness in Images for Transparency in AI Systems

    Authors: Aadi Srivastava, Vignesh Natarajkumar, Utkarsh Bheemanaboyna, Devisree Akashapu, Nagraj Gaonkar, Archit Joshi

    Abstract: The widespread and rapid adoption of AI-generated content, created by models such as Generative Adversarial Networks (GANs) and Diffusion Models, has revolutionized the digital media landscape by allowing efficient and creative content generation. However, these models also blur the difference between real images and AI-generated synthetic images, raising concerns regarding content authenticity an…

    Submitted 7 July, 2025; originally announced July 2025.

  2. arXiv:2507.04775  [pdf, ps, other]

    cs.CR

    FIDESlib: A Fully-Fledged Open-Source FHE Library for Efficient CKKS on GPUs

    Authors: Carlos Agulló-Domingo, Óscar Vera-López, Seyda Guzelhan, Lohit Daksha, Aymane El Jerari, Kaustubh Shivdikar, Rashmi Agrawal, David Kaeli, Ajay Joshi, José L. Abellán

    Abstract: Word-wise Fully Homomorphic Encryption (FHE) schemes, such as CKKS, are gaining significant traction due to their ability to provide post-quantum-resistant, privacy-preserving approximate computing; an especially desirable feature in Machine-Learning-as-a-Service (MLaaS) cloud-computing paradigms. OpenFHE is a leading CPU-based FHE library with robust CKKS operations, but its server-side performan…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: Presented as poster paper at 2025 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

  3. arXiv:2507.00981  [pdf, ps, other]

    cs.CV

    Evaluating Robustness of Monocular Depth Estimation with Procedural Scene Perturbations

    Authors: Jack Nugent, Siyang Wu, Zeyu Ma, Beining Han, Meenal Parakh, Abhishek Joshi, Lingjie Mei, Alexander Raistrick, Xinyuan Li, Jia Deng

    Abstract: Recent years have witnessed substantial progress on monocular depth estimation, particularly as measured by the success of large models on standard benchmarks. However, performance on standard benchmarks does not offer a complete assessment, because most evaluate accuracy but not robustness. In this work, we introduce PDE (Procedural Depth Evaluation), a new benchmark which enables systematic robu…

    Submitted 2 July, 2025; v1 submitted 1 July, 2025; originally announced July 2025.

    Comments: Fixing display of figure on Safari browsers

  4. arXiv:2506.22922  [pdf, ps, other]

    cs.DS

    Global Predecessor Indexing: Avoiding Binary Search in Weighted Job Scheduling

    Authors: Amit Joshi

    Abstract: We present an improved solution to the Weighted Job Scheduling (WJS) problem. While the classical dynamic programming (DP) solution runs in $O(n \log(n))$ time due to comparison-based sorting and per-job binary search, we eliminate the binary search bottleneck. In its place, we introduce a novel multi-phase preprocessing technique called Global Predecessor Indexing (GPI), which computes the latest…

    Submitted 28 June, 2025; originally announced June 2025.

    Comments: 6 pages, 9 figures including tables. Short theoretical and practical paper on improved dynamic programming for weighted job scheduling with linear-time preprocessing

  5. arXiv:2506.12576  [pdf, ps, other]

    cs.CL cs.AI

    Enabling Precise Topic Alignment in Large Language Models Via Sparse Autoencoders

    Authors: Ananya Joshi, Celia Cintas, Skyler Speakman

    Abstract: Recent work shows that Sparse Autoencoders (SAE) applied to large language model (LLM) layers have neurons corresponding to interpretable concepts. These SAE neurons can be modified to align generated outputs, but only towards pre-identified topics and with some parameter tuning. Our approach leverages the observational and modification properties of SAEs to enable alignment for any topic. This me…

    Submitted 28 June, 2025; v1 submitted 14 June, 2025; originally announced June 2025.

  6. arXiv:2506.10164  [pdf, ps, other]

    cs.HC

    Mastery Learning Improves Performance on Complex Tasks on PCP Literacy Test

    Authors: Chandana Srinivas, Elif E. Firat, Robert S. Laramee, Alark Joshi

    Abstract: Developing literacy with unfamiliar data visualization techniques such as Parallel Coordinate Plots (PCPs) can be a significant challenge for students. We adopted the Revised Bloom's taxonomy to instruct students on Parallel Coordinate Plots (PCPs) using Mastery Learning in the classroom. To evaluate Mastery Learning's impact, we conducted an intervention in a Data Visualization course to teach st…

    Submitted 11 June, 2025; originally announced June 2025.

  7. arXiv:2506.04429  [pdf, ps, other]

    cs.AI

    An AI-Based Public Health Data Monitoring System

    Authors: Ananya Joshi, Nolan Gormley, Richa Gadgil, Tina Townes, Roni Rosenfeld, Bryan Wilder

    Abstract: Public health experts need scalable approaches to monitor large volumes of health data (e.g., cases, hospitalizations, deaths) for outbreaks or data quality issues. Traditional alert-based monitoring systems struggle with modern public health data monitoring systems for several reasons, including that alerting thresholds need to be constantly reset and the data volumes may cause application lag. I…

    Submitted 4 June, 2025; originally announced June 2025.

  8. arXiv:2505.17586  [pdf, ps, other]

    cs.CR

    Large Language Models in the IoT Ecosystem -- A Survey on Security Challenges and Applications

    Authors: Kushal Khatiwada, Jayden Hopper, Joseph Cheatham, Ayan Joshi, Sabur Baidya

    Abstract: The Internet of Things (IoT) and Large Language Models (LLMs) have been two major emerging players in the information technology era. Although there has been significant coverage of their individual capabilities, our literature survey sheds some light on the integration and interaction of LLMs and IoT devices - a mutualistic relationship in which both parties leverage the capabilities of the other…

    Submitted 23 May, 2025; originally announced May 2025.

  9. arXiv:2505.15095  [pdf, ps, other]

    cs.CL cs.AI

    Nek Minit: Harnessing Pragmatic Metacognitive Prompting for Explainable Sarcasm Detection of Australian and Indian English

    Authors: Ishmanbir Singh, Dipankar Srirag, Aditya Joshi

    Abstract: Sarcasm is a challenge to sentiment analysis because of the incongruity between stated and implied sentiment. The challenge is exacerbated when the implication may be relevant to a specific country or geographical region. Pragmatic metacognitive prompting (PMP) is a cognition-inspired technique that has been used for pragmatic reasoning. In this paper, we harness PMP for explainable sarcasm detect…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: Under review. 4 pages + references

  10. arXiv:2505.10755  [pdf, other]

    cs.RO cs.GR

    Infinigen-Sim: Procedural Generation of Articulated Simulation Assets

    Authors: Abhishek Joshi, Beining Han, Jack Nugent, Yiming Zuo, Jonathan Liu, Hongyu Wen, Stamatis Alexandropoulos, Tao Sun, Alexander Raistrick, Gaowen Liu, Yi Shao, Jia Deng

    Abstract: We introduce Infinigen-Sim, a toolkit which enables users to create diverse and realistic articulated object procedural generators. These tools are composed of high-level utilities for use creating articulated assets in Blender, as well as an export pipeline to integrate the resulting assets into common robotics simulators. We demonstrate our system by creating procedural generators for 5 common a…

    Submitted 19 May, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

  11. arXiv:2505.02915  [pdf, other]

    cs.RO

    Zero-shot Sim2Real Transfer for Magnet-Based Tactile Sensor on Insertion Tasks

    Authors: Beining Han, Abhishek Joshi, Jia Deng

    Abstract: Tactile sensing is an important sensing modality for robot manipulation. Among different types of tactile sensors, magnet-based sensors, like u-skin, balance well between high durability and tactile density. However, the large sim-to-real gap of tactile sensors prevents robots from acquiring useful tactile-based manipulation skills from simulation data, a recipe that has been successful for achiev…

    Submitted 5 May, 2025; originally announced May 2025.

  12. arXiv:2504.18799  [pdf, other]

    cs.MM cs.SD eess.AS

    A Survey on Multimodal Music Emotion Recognition

    Authors: Rashini Liyanarachchi, Aditya Joshi, Erik Meijering

    Abstract: Multimodal music emotion recognition (MMER) is an emerging discipline in music information retrieval that has experienced a surge in interest in recent years. This survey provides a comprehensive overview of the current state-of-the-art in MMER. Discussing the different approaches and techniques used in this field, the paper introduces a four-stage MMER framework, including multimodal data selecti…

    Submitted 26 April, 2025; originally announced April 2025.

  13. arXiv:2504.17902  [pdf, other]

    cs.CV cs.CL

    CAMU: Context Augmentation for Meme Understanding

    Authors: Girish A. Koushik, Diptesh Kanojia, Helen Treharne, Aditya Joshi

    Abstract: Social media memes are a challenging domain for hate detection because they intertwine visual and textual cues into culturally nuanced messages. We introduce a novel framework, CAMU, which leverages large vision-language models to generate more descriptive captions, a caption-scoring neural network to emphasise hate-relevant content, and parameter-efficient fine-tuning of CLIP's text encoder for a…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: Under review at ACM MM 2025

  14. arXiv:2504.12276  [pdf, other]

    cs.CV

    The Tenth NTIRE 2025 Image Denoising Challenge Report

    Authors: Lei Sun, Hang Guo, Bin Ren, Luc Van Gool, Radu Timofte, Yawei Li, Xiangyu Kong, Hyunhee Park, Xiaoxuan Yu, Suejin Han, Hakjae Jeon, Jia Li, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Jingyu Ma, Zhijuan Huang, Huiyuan Fu, Hongyuan Yu, Boqi Zhang, Jiawei Shi, Heng Zhang, Huadong Ma, Deepak Kumar Tyagi, et al. (69 additional authors not shown)

    Abstract: This paper presents an overview of the NTIRE 2025 Image Denoising Challenge (σ = 50), highlighting the proposed methodologies and corresponding results. The primary objective is to develop a network architecture capable of achieving high-quality denoising performance, quantitatively evaluated using PSNR, without constraints on computational complexity or model size. The task assumes independent ad…

    Submitted 16 April, 2025; originally announced April 2025.

  15. arXiv:2504.10077  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Quantifying Commonsense Reasoning with Mechanistic Insights

    Authors: Abhinav Joshi, Areeb Ahmad, Divyaksh Shukla, Ashutosh Modi

    Abstract: Commonsense reasoning deals with the implicit knowledge that is well understood by humans and typically acquired via interactions with the world. In recent times, commonsense reasoning and understanding of various LLMs have been evaluated using text-based tasks. In this work, we argue that a proxy of this understanding can be maintained as a graphical structure that can further help to perform a r…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted at NAACL 2025; 28 pages (9 pages + 7 pages references + 12 pages appendix)

  16. arXiv:2504.07989  [pdf, other]

    cs.CL cs.AI

    Regional Tiny Stories: Using Small Models to Compare Language Learning and Tokenizer Performance

    Authors: Nirvan Patil, Malhar Abhay Inamdar, Agnivo Gosai, Guruprasad Pathak, Anish Joshi, Aryan Sagavekar, Anish Joshirao, Raj Dandekar, Rajat Dandekar, Sreedath Panat

    Abstract: Small Language Models (SLMs) offer efficient alternatives to LLMs for specific domains. The 2023 TinyStories study developed an English dataset that allows SLMs with 1 to 10 million parameters to produce coherent outputs. Our research expands this framework by translating the original dataset into Indian languages and creating synthetic data using LLMs. We focus on Hindi, Marathi, and Bengali, eva…

    Submitted 22 April, 2025; v1 submitted 7 April, 2025; originally announced April 2025.

    Comments: 34 pages, 24 figures, 16 tables

  17. arXiv:2504.07757  [pdf, other]

    cs.AI cs.LG

    Search-contempt: a hybrid MCTS algorithm for training AlphaZero-like engines with better computational efficiency

    Authors: Ameya Joshi

    Abstract: AlphaZero in 2017 was able to master chess and other games without human knowledge by playing millions of games against itself (self-play), with a computation budget running in the tens of millions of dollars. It used a variant of the Monte Carlo Tree Search (MCTS) algorithm, known as PUCT. This paper introduces search-contempt, a novel hybrid variant of the MCTS algorithm that fundamentally alter…

    Submitted 10 April, 2025; originally announced April 2025.

  18. arXiv:2504.01081  [pdf, other]

    cs.CV cs.CL eess.IV

    ShieldGemma 2: Robust and Tractable Image Content Moderation

    Authors: Wenjun Zeng, Dana Kurniawan, Ryan Mullins, Yuchi Liu, Tamoghna Saha, Dirichi Ike-Njoku, Jindong Gu, Yiwen Song, Cai Xu, Jingjing Zhou, Aparna Joshi, Shravan Dheep, Mani Malek, Hamid Palangi, Joon Baek, Rick Pereira, Karthik Narasimhan

    Abstract: We introduce ShieldGemma 2, a 4B parameter image content moderation model built on Gemma 3. This model provides robust safety risk predictions across the following key harm categories: Sexually Explicit, Violence & Gore, and Dangerous Content for synthetic images (e.g. output of any image generation model) and natural images (e.g. any image input to a Vision-Language Model). We evaluated on both…

    Submitted 8 April, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

  19. arXiv:2503.20068  [pdf, other]

    cs.CV

    iNatAg: Multi-Class Classification Models Enabled by a Large-Scale Benchmark Dataset with 4.7M Images of 2,959 Crop and Weed Species

    Authors: Naitik Jain, Amogh Joshi, Mason Earles

    Abstract: Accurate identification of crop and weed species is critical for precision agriculture and sustainable farming. However, it remains a challenging task due to a variety of factors -- a high degree of visual similarity among species, environmental variability, and a continued lack of large, agriculture-specific image data. We introduce iNatAg, a large-scale image dataset which contains over 4.7 mill…

    Submitted 25 March, 2025; originally announced March 2025.

  20. arXiv:2503.17605  [pdf, other]

    cs.IR cs.LG

    Explainable identification of similarities between entities for discovery in large text

    Authors: Akhil Joshi, Sai Teja Erukude, Lior Shamir

    Abstract: With the availability of a virtually infinite number of text documents in digital format, automatic comparison of textual data is essential for extracting meaningful insights that are difficult to identify manually. Many existing tools, including AI and large language models, struggle to provide precise and explainable insights into textual similarities. In many cases they determine the similarity betw…

    Submitted 21 March, 2025; originally announced March 2025.

    Comments: Future Internet, accepted

  21. arXiv:2503.12858  [pdf, other]

    cs.CL cs.LG

    Harnessing Test-time Adaptation for NLU tasks Involving Dialects of English

    Authors: Duke Nguyen, Aditya Joshi, Flora Salim

    Abstract: Test-time adaptation (TTA) is an excellent method which helps generalize models across domains, tasks, and distributions without the use of labeled datasets. Thus, TTA is very useful in natural language processing (NLP) in the dialectal setting, since oftentimes, models are trained on Standard American English (SAE), evaluated on Indian English or Nigerian English, of which distribution differs si…

    Submitted 17 March, 2025; originally announced March 2025.

  22. arXiv:2503.09636  [pdf, other]

    cs.RO

    Real-Time Neuromorphic Navigation: Guiding Physical Robots with Event-Based Sensing and Task-Specific Reconfigurable Autonomy Stack

    Authors: Sourav Sanyal, Amogh Joshi, Adarsh Kosta, Kaushik Roy

    Abstract: Neuromorphic vision, inspired by biological neural systems, has recently gained significant attention for its potential in enhancing robotic autonomy. This paper presents a systematic exploration of a proposed Neuromorphic Navigation framework that uses event-based neuromorphic vision to enable efficient, real-time navigation in robotic systems. We discuss the core concepts of neuromorphic vision…

    Submitted 11 March, 2025; originally announced March 2025.

  23. arXiv:2503.02326  [pdf, other]

    cs.GT physics.soc-ph

    A differential model of $N$ player games concerning ethical dilemmas

    Authors: Ramkrishna Joshi, Aniruddha Joshi

    Abstract: Ethics play an important role in determining the behavior of an individual under certain circumstances. Ethical or unethical behavior can be treated as a strategy of a player in a pay-off game. In this paper, we present two analytical solutions to studying time evolution of behavior of an individual from ethics perspective. We also present the effect of a third player as a perturbation to a two pl…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 21 pages

  24. Deep Learning based approach to detect Customer Age, Gender and Expression in Surveillance Video

    Authors: Earnest Paul Ijjina, Goutham Kanahasabai, Aniruddha Srinivas Joshi

    Abstract: In the current information era, customer analytics play a key role in the success of any business. Since customer demographics primarily dictate their preferences, identification and utilization of age & gender information of customers in sales forecasting, may maximize retail sales. In this work, we propose a computer vision based approach to age and gender prediction in surveillance video. The p…

    Submitted 1 March, 2025; originally announced March 2025.

    Journal ref: Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT)

  25. Customer Analytics using Surveillance Video

    Authors: Earnest Paul Ijjina, Aniruddha Srinivas Joshi, Goutham Kanahasabai, Keerthi Priyanka P

    Abstract: The analysis of sales information, is a vital step in designing an effective marketing strategy. This work proposes a novel approach to analyse the shopping behaviour of customers to identify their purchase patterns. An extended version of the Multi-Cluster Overlapping k-Means Extension (MCOKE) algorithm with weighted k-Means algorithm is utilized to map customers to the garments of interest. The…

    Submitted 1 March, 2025; originally announced March 2025.

    Journal ref: Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT)

  26. Detection of Customer Interested Garments in Surveillance Video using Computer Vision

    Authors: Earnest Paul Ijjina, Aniruddha Srinivas Joshi, Goutham Kanahasabai

    Abstract: One of the basic requirements of humans is clothing and this approach aims to identify the garments selected by customer during shopping, from surveillance video. The existing approaches to detect garments were developed on western wear using datasets of western clothing. They do not address Indian garments due to the increased complexity. In this work, we propose a computer vision based framework…

    Submitted 1 March, 2025; originally announced March 2025.

    Journal ref: Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT)

  27. arXiv:2502.05938  [pdf, other]

    cs.RO

    Energy-Efficient Autonomous Aerial Navigation with Dynamic Vision Sensors: A Physics-Guided Neuromorphic Approach

    Authors: Sourav Sanyal, Amogh Joshi, Manish Nagaraj, Rohan Kumar Manna, Kaushik Roy

    Abstract: Vision-based object tracking is a critical component for achieving autonomous aerial navigation, particularly for obstacle avoidance. Neuromorphic Dynamic Vision Sensors (DVS) or event cameras, inspired by biological vision, offer a promising alternative to conventional frame-based cameras. These cameras can detect changes in intensity asynchronously, even in challenging lighting conditions, with…

    Submitted 23 April, 2025; v1 submitted 9 February, 2025; originally announced February 2025.

    Comments: This work has been accepted for presentation at the 2025 IEEE International Joint Conference on Neural Networks (IJCNN), June 30 - July 5, 2025, Rome, Italy

  28. arXiv:2501.19259  [pdf, other]

    cs.RO cs.CV cs.LG cs.NE eess.SY

    Neuro-LIFT: A Neuromorphic, LLM-based Interactive Framework for Autonomous Drone FlighT at the Edge

    Authors: Amogh Joshi, Sourav Sanyal, Kaushik Roy

    Abstract: The integration of human-intuitive interactions into autonomous systems has been limited. Traditional Natural Language Processing (NLP) systems struggle with context and intent understanding, severely restricting human-robot interaction. Recent advancements in Large Language Models (LLMs) have transformed this dynamic, allowing for intuitive and high-level communication through speech and text, an…

    Submitted 26 April, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

    Comments: Accepted for publication at the International Joint Conference on Neural Networks (IJCNN) 2025

  29. arXiv:2501.17123  [pdf, other]

    cs.CR cs.NE

    Hybrid Deep Learning Model for Multiple Cache Side Channel Attacks Detection: A Comparative Analysis

    Authors: Tejal Joshi, Aarya Kawalay, Anvi Jamkhande, Amit Joshi

    Abstract: Cache side channel attacks are a sophisticated and persistent threat that exploit vulnerabilities in modern processors to extract sensitive information. These attacks leverage weaknesses in shared computational resources, particularly the last level cache, to infer patterns in data access and execution flows, often bypassing traditional security defenses. Such attacks are especially dangerous as t…

    Submitted 28 January, 2025; originally announced January 2025.

    Comments: 8 pages, 4 figures. Accepted in IEEE's 2nd International Conference on Computational Intelligence and Network Systems

  30. arXiv:2501.15464  [pdf, other]

    cs.CV cs.AI

    TractoGPT: A GPT architecture for White Matter Segmentation

    Authors: Anoushkrit Goel, Simroop Singh, Ankita Joshi, Ranjeet Ranjan Jha, Chirag Ahuja, Aditya Nigam, Arnav Bhavsar

    Abstract: White matter bundle segmentation is crucial for studying brain structural connectivity, neurosurgical planning, and neurological disorders. White Matter Segmentation remains challenging due to structural similarity in streamlines, subject variability, symmetry in 2 hemispheres, etc. To address these challenges, we propose TractoGPT, a GPT-based architecture trained on streamline, cluster, and fusi…

    Submitted 21 February, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

    Comments: Accepted as a conference paper at 23rd IEEE International Symposium on Biomedical Imaging 2025. IEEE holds the copyright for this publication

  31. arXiv:2501.12482  [pdf, other]

    cs.CV cs.ET cs.LG cs.NE cs.RO

    TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking

    Authors: Adarsh Kumar Kosta, Amogh Joshi, Arjun Roy, Rohan Kumar Manna, Manish Nagaraj, Kaushik Roy

    Abstract: Object detection and tracking is an essential perception task for enabling fully autonomous navigation in robotic systems. Edge robot systems such as small drones need to execute complex maneuvers at high-speeds with limited resources, which places strict constraints on the underlying algorithms and hardware. Traditionally, frame-based cameras are used for vision-based perception due to their rich…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: 8 pages, 6 figures, 4 tables

  32. arXiv:2501.11440  [pdf, other]

    cs.CL

    RACCOON: A Retrieval-Augmented Generation Approach for Location Coordinate Capture from News Articles

    Authors: Jonathan Lin, Aditya Joshi, Hye-young Paik, Tri Dung Doung, Deepti Gurdasani

    Abstract: Geocoding involves automatic extraction of location coordinates of incidents reported in news articles, and can be used for epidemic intelligence or disaster management. This paper introduces Retrieval-Augmented Coordinate Capture Of Online News articles (RACCOON), an open-source geocoding approach that extracts geolocations from news articles. RACCOON uses a retrieval-augmented generation (RAG) a…

    Submitted 20 January, 2025; originally announced January 2025.

    Comments: Accepted at WWW 2025 as a short paper. 4 pages with references

  33. arXiv:2501.08552  [pdf, other]

    cs.AI cs.GR cs.HC cs.LG

    Reinforcement Learning-Enhanced Procedural Generation for Dynamic Narrative-Driven AR Experiences

    Authors: Aniruddha Srinivas Joshi

    Abstract: Procedural Content Generation (PCG) is widely used to create scalable and diverse environments in games. However, existing methods, such as the Wave Function Collapse (WFC) algorithm, are often limited to static scenarios and lack the adaptability required for dynamic, narrative-driven applications, particularly in augmented reality (AR) games. This paper presents a reinforcement learning-enhanced…

    Submitted 13 March, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

    Comments: Published in Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - GRAPP 2025 https://www.scitepress.org/PublicationsDetail.aspx?ID=LfPv9Lfiya8=&t=1

  34. arXiv:2501.06918  [pdf]

    stat.ME cs.CV

    Driver Age and Its Effect on Key Driving Metrics: Insights from Dynamic Vehicle Data

    Authors: Aparna Joshi, Kojo Adugyamfi, Jennifer Merickel, Pujitha Gunaratne, Anuj Sharma

    Abstract: By 2030, the senior population aged 65 and older is expected to increase by over 50%, significantly raising the number of older drivers on the road. Drivers over 70 face higher crash death rates compared to those in their forties and fifties, underscoring the importance of developing more effective safety interventions for this demographic. Although the impact of aging on driving behavior has been…

    Submitted 12 January, 2025; originally announced January 2025.

    Comments: 21 pages, 9 figures, 4 Tables, 104th TRB Annual Meeting 2025, Washington DC

  35. A Temporal Convolutional Network-based Approach for Network Intrusion Detection

    Authors: Rukmini Nazre, Rujuta Budke, Omkar Oak, Suraj Sawant, Amit Joshi

    Abstract: Network intrusion detection is critical for securing modern networks, yet the complexity of network traffic poses significant challenges to traditional methods. This study proposes a Temporal Convolutional Network (TCN) model featuring a residual block architecture with dilated convolutions to capture dependencies in network traffic data while ensuring training stability. The TCN's ability to proce…

    Submitted 23 December, 2024; originally announced December 2024.

    Comments: Paper presented at IEEE 2nd International Conference on Integrated Intelligence and Communication Systems (ICIICS) 2024

  36. arXiv:2412.16197  [pdf, other]

    eess.IV cs.CE cs.CV cs.LG

    Generalizable Representation Learning for fMRI-based Neurological Disorder Identification

    Authors: Wenhui Cui, Haleh Akrami, Anand A. Joshi, Richard M. Leahy

    Abstract: Despite the impressive advances achieved using deep learning for functional brain activity analysis, the heterogeneity of functional patterns and the scarcity of imaging data still pose challenges in tasks such as identifying neurological disorders. For functional Magnetic Resonance Imaging (fMRI), while data may be abundantly available from healthy controls, clinical data is often scarce, especia…

    Submitted 28 May, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: Accepted by TMLR

  37. Identifying Bias in Deep Neural Networks Using Image Transforms

    Authors: Sai Teja Erukude, Akhil Joshi, Lior Shamir

    Abstract: CNNs have become one of the most commonly used computational tools in the past two decades. One of the primary downsides of CNNs is that they work as a "black box", where the user cannot necessarily know how the image data are analyzed, and therefore needs to rely on empirical evaluation to test the efficacy of a trained CNN. This can lead to hidden biases that affect the performance evaluation of…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: Computers, published

    Journal ref: Computers 2024, 13(12), 341

  38. arXiv:2412.04726  [pdf, ps, other]

    cs.CL cs.AI

    BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English

    Authors: Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia

    Abstract: Despite large language models (LLMs) being known to exhibit bias against non-standard language varieties, there are no known labelled datasets for sentiment analysis of English. To address this gap, we introduce BESSTIE, a benchmark for sentiment and sarcasm classification for three varieties of English: Australian (en-AU), Indian (en-IN), and British (en-UK). We collect datasets for these languag…

    Submitted 17 June, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

    Comments: Findings of ACL: ACL 2025

  39. arXiv:2412.01671  [pdf, other]

    cs.CR

    Verified Foundations for Differential Privacy

    Authors: Markus de Medeiros, Muhammad Naveed, Tancrède Lepoint, Temesghen Kahsai, Tristan Ravitch, Stefan Zetzsche, Anjali Joshi, Joseph Tassarotti, Aws Albarghouthi, Jean-Baptiste Tristan

    Abstract: Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but implementing it correctly has proven challenging. Prior work has focused on verifying DP at a high level, assuming the foundations are correct and a perfect source of randomness is available. However, the underlying theory of differential privacy can be very complex and subtle. Flaws in basic mechanism…

    Submitted 29 April, 2025; v1 submitted 2 December, 2024; originally announced December 2024.

  40. arXiv:2411.19500  [pdf, other]

    cs.CL cs.AI cs.LG

    COLD: Causal reasOning in cLosed Daily activities

    Authors: Abhinav Joshi, Areeb Ahmad, Ashutosh Modi

    Abstract: Large Language Models (LLMs) have shown state-of-the-art performance in a variety of tasks, including arithmetic and reasoning; however, to gauge the intellectual capabilities of LLMs, causal reasoning has become a reliable proxy for validating a general understanding of the mechanics and intricacies of the world similar to humans. Previous works in natural language processing (NLP) have either fo…

    Submitted 29 November, 2024; originally announced November 2024.

    Comments: Paper accepted at NeurIPS 2024; Total 37 Pages

  41. arXiv:2411.15477  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Towards Robust Evaluation of Unlearning in LLMs via Data Transformations

    Authors: Abhinav Joshi, Shaswati Saha, Divyaksh Shukla, Sriram Vema, Harsh Jhamtani, Manas Gaur, Ashutosh Modi

    Abstract: Large Language Models (LLMs) have shown to be a great success in a wide range of applications ranging from regular NLP-based use cases to AI agents. LLMs have been trained on a vast corpus of texts from various sources; despite the best efforts during the data pre-processing stage while training the LLMs, they may pick some undesirable information such as personally identifiable information (PII).…

    Submitted 23 November, 2024; originally announced November 2024.

    Comments: Accepted at EMNLP 2024 Findings; 21 pages (5 page main content + references + appendix)

  42. arXiv:2411.10730  [pdf, other]

    cs.CL cs.CR

    Comparison of Multilingual and Bilingual Models for Satirical News Detection of Arabic and English

    Authors: Omar W. Abdalla, Aditya Joshi, Rahat Masood, Salil S. Kanhere

    Abstract: Satirical news is real news combined with a humorous comment or exaggerated content, and it often mimics the format and style of real news. However, satirical news is often misunderstood as misinformation, especially by individuals from different cultural and social backgrounds. This research addresses the challenge of distinguishing satire from truthful news by leveraging multilingual satire dete…

    Submitted 16 November, 2024; originally announced November 2024.

    Comments: ALTA 2024 (Selected for publication)

  43. TractoEmbed: Modular Multi-level Embedding framework for white matter tract segmentation

    Authors: Anoushkrit Goel, Bipanjit Singh, Ankita Joshi, Ranjeet Ranjan Jha, Chirag Ahuja, Aditya Nigam, Arnav Bhavsar

    Abstract: White matter tract segmentation is crucial for studying brain structural connectivity and neurosurgical planning. However, segmentation remains challenging due to issues like class imbalance between major and minor tracts, structural similarity, subject variability, symmetric streamlines between hemispheres etc. To address these challenges, we propose TractoEmbed, a modular multi-level embedding f…

    Submitted 12 November, 2024; originally announced November 2024.

    Comments: Accepted at 27th International Conference on Pattern Recognition (ICPR), 2024; 15 pages, 2 figures

  44. arXiv:2411.05757  [pdf, other]

    cs.LG

    Tract-RLFormer: A Tract-Specific RL policy based Decoder-only Transformer Network

    Authors: Ankita Joshi, Ashutosh Sharma, Anoushkrit Goel, Ranjeet Ranjan Jha, Chirag Ahuja, Arnav Bhavsar, Aditya Nigam

    Abstract: Fiber tractography is a cornerstone of neuroimaging, enabling the detailed mapping of the brain's white matter pathways through diffusion MRI. This is crucial for understanding brain connectivity and function, making it a valuable tool in neurological applications. Despite its importance, tractography faces challenges due to its complexity and susceptibility to false positives, misrepresenting vit…

    Submitted 14 November, 2024; v1 submitted 8 November, 2024; originally announced November 2024.

    Comments: Accepted at 27th International Conference on Pattern Recognition (ICPR), 2024

  45. arXiv:2411.00238  [pdf, other]

    cs.AI cs.CV cs.LG q-bio.NC

    Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

    Authors: Declan Campbell, Sunayana Rane, Tyler Giallanza, Nicolò De Sabbata, Kia Ghods, Amogh Joshi, Alexander Ku, Steven M. Frankland, Thomas L. Griffiths, Jonathan D. Cohen, Taylor W. Webb

    Abstract: Recent work has documented striking heterogeneity in the performance of state-of-the-art vision language models (VLMs), including both multimodal language models and text-to-image models. These models are able to describe and generate a diverse array of complex, naturalistic images, yet they exhibit surprising failures on basic multi-object reasoning tasks -- such as counting, localization, and si…

    Submitted 16 April, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

  46. arXiv:2410.11230  [pdf, other]

    cs.CL

    "Is Hate Lost in Translation?": Evaluation of Multilingual LGBTQIA+ Hate Speech Detection

    Authors: Fai Leui Chan, Duke Nguyen, Aditya Joshi

    Abstract: This paper explores the challenges of detecting LGBTQIA+ hate speech of large language models across multiple languages, including English, Italian, Chinese and (code-switched) English-Tamil, examining the impact of machine translation and whether the nuances of hate speech are preserved across translation. We examine the hate speech detection ability of zero-shot and fine-tuned GPT. Our findings…

    Submitted 23 October, 2024; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: Under review

  47. arXiv:2410.11216  [pdf, other]

    cs.CL

    Experiences from Creating a Benchmark for Sentiment Classification for Varieties of English

    Authors: Dipankar Srirag, Jordan Painter, Aditya Joshi, Diptesh Kanojia

    Abstract: Existing benchmarks often fail to account for linguistic diversity, like language variants of English. In this paper, we share our experiences from our ongoing project of building a sentiment classification benchmark for three variants of English: Australian (en-AU), Indian (en-IN), and British (en-UK) English. Using Google Places reviews, we explore the effects of various sampling techniques base…

    Submitted 12 November, 2024; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: Under review

  48. arXiv:2410.06385  [pdf, other]

    eess.IV cs.AI cs.CV

    Skin Cancer Machine Learning Model Tone Bias

    Authors: James Pope, Md Hassanuzzaman, William Chapman, Huw Day, Mingmar Sherpa, Omar Emara, Nirmala Adhikari, Ayush Joshi

    Abstract: Background: Many open-source skin cancer image datasets are the result of clinical trials conducted in countries with lighter skin tones. Due to this tone imbalance, machine learning models derived from these datasets can perform well at detecting skin cancer for lighter skin tones. Any tone bias in these models could introduce fairness concerns and reduce public trust in the artificial intelligen…

    Submitted 19 March, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

  49. Neural Light Spheres for Implicit Image Stitching and View Synthesis

    Authors: Ilya Chugunov, Amogh Joshi, Kiran Murthy, Francois Bleibel, Felix Heide

    Abstract: Challenging to capture, and challenging to display on a cellphone screen, the panorama paradoxically remains both a staple and underused feature of modern mobile camera applications. In this work we address both of these challenges with a spherical neural light field model for implicit panoramic image stitching and re-rendering; able to accommodate for depth parallax, view-dependent lighting, and…

    Submitted 26 March, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: Project site: https://light.princeton.edu/publication/neuls/

  50. arXiv:2409.17315  [pdf, other]

    cs.LG cs.AI cs.CR

    KIPPS: Knowledge infusion in Privacy Preserving Synthetic Data Generation

    Authors: Anantaa Kotal, Anupam Joshi

    Abstract: The integration of privacy measures, including differential privacy techniques, ensures a provable privacy guarantee for the synthetic data. However, challenges arise for Generative Deep Learning models when tasked with generating realistic data, especially in critical domains such as Cybersecurity and Healthcare. Generative Models optimized for continuous data struggle to model discrete and non-G…

    Submitted 25 September, 2024; originally announced September 2024.