Showing 1–50 of 1,333 results for author: Kumar, S

Searching in archive cs.
  1. arXiv:2507.01778  [pdf]

    cs.IT cs.CV

    A Hybrid Ensemble Learning Framework for Image-Based Solar Panel Classification

    Authors: Vivek Tetarwal, Sandeep Kumar

    Abstract: The installation of solar energy systems is on the rise, and therefore, appropriate maintenance techniques are required to be used in order to maintain maximum performance levels. One of the major challenges is the automated discrimination between clean and dirty solar panels. This paper presents a novel Dual Ensemble Neural Network (DENN) to classify solar panels using image-based features. The s…

    Submitted 2 July, 2025; originally announced July 2025.

    Comments: 6 pages

  2. arXiv:2507.01350  [pdf, ps, other]

    eess.SY cs.MA cs.RO

    Cooperative Target Capture in 3D Engagements over Switched Dynamic Graphs

    Authors: Abhinav Sinha, Shashi Ranjan Kumar

    Abstract: This paper presents a leaderless cooperative guidance strategy for simultaneous time-constrained interception of a stationary target when the interceptors exchange information over switched dynamic graphs. We specifically focus on scenarios when the interceptors lack radial acceleration capabilities, relying solely on their lateral acceleration components. This consideration aligns with their inhe…

    Submitted 2 July, 2025; originally announced July 2025.

  3. arXiv:2507.00644  [pdf, ps, other]

    cs.RO

    Parallel Transmission Aware Co-Design: Enhancing Manipulator Performance Through Actuation-Space Optimization

    Authors: Rohit Kumar, Melya Boukheddimi, Dennis Mronga, Shivesh Kumar, Frank Kirchner

    Abstract: In robotics, structural design and behavior optimization have long been considered separate processes, resulting in the development of systems with limited capabilities. Recently, co-design methods have gained popularity, where bi-level formulations are used to simultaneously optimize the robot design and behavior for specific tasks. However, most implementations assume a serial or tree-type model…

    Submitted 1 July, 2025; originally announced July 2025.

  4. arXiv:2506.22517  [pdf]

    cs.CV

    Container damage detection using advanced computer vision model Yolov12 vs Yolov11 vs RF-DETR A comparative analysis

    Authors: Subhadip Kumar

    Abstract: Containers are an integral part of the logistics industry and act as a barrier for cargo. A typical service life for a container is more than 20 years. However, over time containers suffer various types of damage due to mechanical as well as natural factors. A damaged container is a safety hazard for the employees handling it and a liability for the logistics company. Therefore, a timely inspect…

    Submitted 26 June, 2025; originally announced June 2025.

  5. arXiv:2506.21931  [pdf, ps, other]

    cs.IR cs.AI cs.CL cs.MA

    ARAG: Agentic Retrieval Augmented Generation for Personalized Recommendation

    Authors: Reza Yousefi Maragheh, Pratheek Vadla, Priyank Gupta, Kai Zhao, Aysenur Inan, Kehui Yao, Jianpeng Xu, Praveen Kanumala, Jason Cho, Sushant Kumar

    Abstract: Retrieval-Augmented Generation (RAG) has shown promise in enhancing recommendation systems by incorporating external context into large language model prompts. However, existing RAG-based approaches often rely on static retrieval heuristics and fail to capture nuanced user preferences in dynamic recommendation scenarios. In this work, we introduce ARAG, an Agentic Retrieval-Augmented Generation fr…

    Submitted 27 June, 2025; originally announced June 2025.

    ACM Class: I.2.11; I.2.7; H.3.3

  6. arXiv:2506.19548  [pdf, ps, other]

    cs.CL cs.IR

    Health Sentinel: An AI Pipeline For Real-time Disease Outbreak Detection

    Authors: Devesh Pant, Rishi Raj Grandhe, Vipin Samaria, Mukul Paul, Sudhir Kumar, Saransh Khanna, Jatin Agrawal, Jushaan Singh Kalra, Akhil VSSG, Satish V Khalikar, Vipin Garg, Himanshu Chauhan, Pranay Verma, Neha Khandelwal, Soma S Dhavala, Minesh Mathew

    Abstract: Early detection of disease outbreaks is crucial to ensure timely intervention by the health authorities. Due to the challenges associated with traditional indicator-based surveillance, monitoring informal sources such as online media has become increasingly popular. However, owing to the number of online articles getting published every day, manual screening of the articles is impractical. To addre…

    Submitted 24 June, 2025; originally announced June 2025.

  7. arXiv:2506.19087  [pdf, ps, other]

    cs.CV cs.AI

    RareSpot: Spotting Small and Rare Wildlife in Aerial Imagery with Multi-Scale Consistency and Context-Aware Augmentation

    Authors: Bowen Zhang, Jesse T. Boulerice, Nikhil Kuniyil, Charvi Mendiratta, Satish Kumar, Hila Shamon, B. S. Manjunath

    Abstract: Automated detection of small and rare wildlife in aerial imagery is crucial for effective conservation, yet remains a significant technical challenge. Prairie dogs exemplify this issue: their ecological importance as keystone species contrasts sharply with their elusive presence--marked by small size, sparse distribution, and subtle visual features--which undermines existing detection approaches.…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: Accepted to the CVPR 2025 Workshop on Computer Vision for Animal Behavior Tracking and Modeling (CV4Animals)

  8. arXiv:2506.18535  [pdf, ps, other]

    cs.CL cs.IR

    When Fine-Tuning Fails: Lessons from MS MARCO Passage Ranking

    Authors: Manu Pande, Shahil Kumar, Anay Yatin Damle

    Abstract: This paper investigates the counterintuitive phenomenon where fine-tuning pre-trained transformer models degrades performance on the MS MARCO passage ranking task. Through comprehensive experiments involving five model variants, including full parameter fine-tuning and parameter-efficient LoRA adaptations, we demonstrate that all fine-tuning approaches underperform the base sentence-transformers/all…

    Submitted 23 June, 2025; originally announced June 2025.

  9. arXiv:2506.18297  [pdf, ps, other]

    cs.IR

    Comparative Analysis of Lion and AdamW Optimizers for Cross-Encoder Reranking with MiniLM, GTE, and ModernBERT

    Authors: Shahil Kumar, Manu Pande, Anay Yatin Damle

    Abstract: Modern information retrieval systems often employ a two-stage pipeline: an efficient initial retrieval stage followed by a computationally intensive reranking stage. Cross-encoders have shown strong effectiveness for reranking due to their deep analysis of query-document pairs. This paper studies the impact of the Lion optimizer, a recent alternative to AdamW, during fine-tuning of cross-encoder r…

    Submitted 23 June, 2025; originally announced June 2025.

  10. arXiv:2506.17765  [pdf, ps, other]

    cs.IR cs.AI

    CARTS: Collaborative Agents for Recommendation Textual Summarization

    Authors: Jiao Chen, Kehui Yao, Reza Yousefi Maragheh, Kai Zhao, Jianpeng Xu, Jason Cho, Evren Korpeoglu, Sushant Kumar, Kannan Achan

    Abstract: Current recommendation systems often require some form of textual data summarization, such as generating concise and coherent titles for product carousels or other grouped item displays. While large language models have shown promise in NLP domains for textual summarization, these approaches do not directly apply to recommendation systems, where explanations must be highly relevant to the core fea…

    Submitted 1 July, 2025; v1 submitted 21 June, 2025; originally announced June 2025.

  11. arXiv:2506.17349  [pdf, ps, other]

    cs.CR

    AndroIDS : Android-based Intrusion Detection System using Federated Learning

    Authors: Akarsh K Nair, Shanik Hubert Satheesh Kumar, Deepti Gupta

    Abstract: The exponential growth of android-based mobile IoT systems has significantly increased the susceptibility of devices to cyberattacks, particularly in smart homes, UAVs, and other connected mobile environments. This article presents a federated learning-based intrusion detection framework called AndroIDS that leverages system call traces as a personalized and privacy-preserving data source. Unlike…

    Submitted 19 June, 2025; originally announced June 2025.

  12. arXiv:2506.16698  [pdf, ps, other]

    cs.LG

    SIDE: Semantic ID Embedding for effective learning from sequences

    Authors: Dinesh Ramasamy, Shakti Kumar, Chris Cadonic, Jiaxin Yang, Sohini Roychowdhury, Esam Abdel Rhman, Srihari Reddy

    Abstract: Sequence-based recommendation models are driving the state-of-the-art for industrial ad-recommendation systems. Such systems typically deal with user histories or sequence lengths ranging in the order of O(10^3) to O(10^4) events. While adding embeddings at this scale is manageable in pre-trained models, incorporating them into real-time prediction models is challenging due to both storage and in…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: 7 pages, 4 images, 6 tables

    Journal ref: KDD workshop, 2025

  13. arXiv:2506.15906  [pdf, ps, other]

    stat.ML cs.LG

    From Local Interactions to Global Operators: Scalable Gaussian Process Operator for Physical Systems

    Authors: Sawan Kumar, Tapas Tripura, Rajdip Nayek, Souvik Chakraborty

    Abstract: Operator learning offers a powerful paradigm for solving parametric partial differential equations (PDEs), but scaling probabilistic neural operators such as the recently proposed Gaussian Processes Operators (GPOs) to high-dimensional, data-intensive regimes remains a significant challenge. In this work, we introduce a novel, scalable GPO, which capitalizes on sparsity, locality, and structural i…

    Submitted 18 June, 2025; originally announced June 2025.

  14. arXiv:2506.15751  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts

    Authors: Kartik Sharma, Yiqiao Jin, Vineeth Rakesh, Yingtong Dou, Menghai Pan, Mahashweta Das, Srijan Kumar

    Abstract: As large language models (LLMs) are deployed in safety-critical settings, it is essential to ensure that their responses comply with safety standards. Prior research has revealed that LLMs often fail to grasp the notion of safe behaviors, resulting in either unjustified refusals to harmless prompts or the generation of harmful content. While substantial efforts have been made to improve their robu…

    Submitted 18 June, 2025; originally announced June 2025.

  15. arXiv:2506.15488  [pdf, ps, other]

    cs.DC

    Minimizing Communication for Parallel Symmetric Tensor Times Same Vector Computation

    Authors: Hussam Al Daas, Grey Ballard, Laura Grigori, Suraj Kumar, Kathryn Rouse, Mathieu Vérité

    Abstract: In this article, we focus on the parallel communication cost of multiplying the same vector along two modes of a $3$-dimensional symmetric tensor. This is a key computation in the higher-order power method for determining eigenpairs of a $3$-dimensional symmetric tensor and in gradient-based methods for computing a symmetric CP decomposition. We establish communication lower bounds that determine…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: 19 pages, 1 figure

  16. arXiv:2506.14821  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    Reinforcing VLMs to Use Tools for Detailed Visual Reasoning Under Resource Constraints

    Authors: Sunil Kumar, Bowen Zhao, Leo Dirac, Paulina Varshavskaya

    Abstract: Despite tremendous recent advances in large model reasoning ability, vision-language models (VLMs) still struggle with detailed visual reasoning, especially when compute resources are limited. To address this challenge, we draw inspiration from methods like Deepseek-r1 for VLMs and train smaller-scale models with Group Relative Policy Optimization (GRPO) to use external tools such as zoom. The gre…

    Submitted 10 June, 2025; originally announced June 2025.

  17. arXiv:2506.14434  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Unifying Streaming and Non-streaming Zipformer-based ASR

    Authors: Bidisha Sharma, Karthik Pandia Durai, Shankar Venkatesan, Jeena J Prakash, Shashi Kumar, Malolan Chetlur, Andreas Stolcke

    Abstract: There has been increasing interest in unifying streaming and non-streaming automatic speech recognition (ASR) models to reduce development, training, and deployment costs. We present a unified framework that trains a single end-to-end ASR model for both streaming and non-streaming applications, leveraging future context information. We propose to use dynamic right-context through the chunked atten…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted in ACL2025 Industry track

  18. arXiv:2506.13800  [pdf, ps, other]

    cs.SE cs.AI

    Enhancing Clinical Decision Support and EHR Insights through LLMs and the Model Context Protocol: An Open-Source MCP-FHIR Framework

    Authors: Abul Ehtesham, Aditi Singh, Saket Kumar

    Abstract: Enhancing clinical decision support (CDS), reducing documentation burdens, and improving patient health literacy remain persistent challenges in digital health. This paper presents an open-source, agent-based framework that integrates Large Language Models (LLMs) with HL7 FHIR data via the Model Context Protocol (MCP) for dynamic extraction and reasoning over electronic health records (EHRs). Buil…

    Submitted 13 June, 2025; originally announced June 2025.

  19. arXiv:2506.13432  [pdf, ps, other]

    cs.RO

    Adaptive Model-Base Control of Quadrupeds via Online System Identification using Kalman Filter

    Authors: Jonas Haack, Franek Stark, Shubham Vyas, Frank Kirchner, Shivesh Kumar

    Abstract: Many real-world applications require legged robots to be able to carry variable payloads. Model-based controllers such as model predictive control (MPC) have become the de facto standard in research for controlling these systems. However, most model-based control architectures use fixed plant models, which limits their applicability to different tasks. In this paper, we present a Kalman filter (KF…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 6 pages, 5 figures, 1 table, accepted for IEEE IROS 2025

  20. arXiv:2506.13028  [pdf, ps, other]

    cs.OS cs.AI

    NaSh: Guardrails for an LLM-Powered Natural Language Shell

    Authors: Bimal Raj Gyawali, Saikrishna Achalla, Konstantinos Kallas, Sam Kumar

    Abstract: We explore how a shell that uses an LLM to accept natural language input might be designed differently from the shells of today. As LLMs may produce unintended or unexplainable outputs, we argue that a natural language shell should provide guardrails that empower users to recover from such errors. We concretize some ideas for doing so by designing a new shell called NaSh, identify remaining open p…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 7 pages, 3 figures

  21. arXiv:2506.12655  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Beyond Sin-Squared Error: Linear-Time Entrywise Uncertainty Quantification for Streaming PCA

    Authors: Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar

    Abstract: We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja's algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing works on streaming PCA focus on providing sharp sin-squared error guarantees. Recently, there has been some interest in uncertainty quantification for the sin-s…

    Submitted 14 June, 2025; originally announced June 2025.

  22. arXiv:2506.11030  [pdf, ps, other]

    cs.LG cs.AI

    Forward Target Propagation: A Forward-Only Approach to Global Error Credit Assignment via Local Losses

    Authors: Nazmus Saadat As-Saquib, A N M Nafiz Abeer, Hung-Ta Chien, Byung-Jun Yoon, Suhas Kumar, Su-in Yi

    Abstract: Training neural networks has traditionally relied on backpropagation (BP), a gradient-based algorithm that, despite its widespread success, suffers from key limitations in both biological and hardware perspectives. These include backward error propagation by symmetric weights, non-local credit assignment, and frozen activity during backward passes. We propose Forward Target Propagation (FTP), a bi…

    Submitted 20 May, 2025; originally announced June 2025.

  23. arXiv:2506.10999  [pdf]

    cs.SE cs.AI

    Automated Validation of COBOL to Java Transformation

    Authors: Atul Kumar, Diptikalyan Saha, Toshikai Yasue, Kohichi Ono, Saravanan Krishnan, Sandeep Hans, Fumiko Satoh, Gerald Mitchell, Sachin Kumar

    Abstract: Recent advances in Large Language Model (LLM) based Generative AI techniques have made it feasible to translate enterprise-level code from legacy languages such as COBOL to modern languages such as Java or Python. While the results of LLM-based automatic transformation are encouraging, the resulting code cannot be trusted to correctly translate the original code. We propose a framework and a tool t…

    Submitted 14 April, 2025; originally announced June 2025.

    Comments: arXiv admin note: text overlap with arXiv:2504.10548

    Journal ref: ASE 2024

  24. arXiv:2506.10486  [pdf, ps, other]

    cs.CL

    Table-Text Alignment: Explaining Claim Verification Against Tables in Scientific Papers

    Authors: Xanh Ho, Sunisth Kumar, Yun-Ang Wu, Florian Boudin, Atsuhiro Takasu, Akiko Aizawa

    Abstract: Scientific claim verification against tables typically requires predicting whether a claim is supported or refuted given a table. However, we argue that predicting the final label alone is insufficient: it reveals little about the model's reasoning and offers limited interpretability. To address this, we reframe table-text alignment as an explanation task, requiring models to identify the table ce…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 8 pages; code and data are available at https://github.com/Alab-NII/SciTabAlign

  25. arXiv:2506.09661  [pdf, ps, other]

    eess.IV cs.CV q-bio.TO

    A Cytology Dataset for Early Detection of Oral Squamous Cell Carcinoma

    Authors: Garima Jain, Sanghamitra Pati, Mona Duggal, Amit Sethi, Abhijeet Patil, Gururaj Malekar, Nilesh Kowe, Jitender Kumar, Jatin Kashyap, Divyajeet Rout, Deepali, Hitesh, Nishi Halduniya, Sharat Kumar, Heena Tabassum, Rupinder Singh Dhaliwal, Sucheta Devi Khuraijam, Sushma Khuraijam, Sharmila Laishram, Simmi Kharb, Sunita Singh, K. Swaminadtan, Ranjana Solanki, Deepika Hemranjani, Shashank Nath Singh, et al. (12 additional authors not shown)

    Abstract: Oral squamous cell carcinoma (OSCC) is a major global health burden, particularly in several regions across Asia, Africa, and South America, where it accounts for a significant proportion of cancer cases. Early detection dramatically improves outcomes, with stage I cancers achieving up to 90 percent survival. However, traditional diagnosis based on histopathology has limited accessibility in low-res…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 7 pages, 2 figures

  26. arXiv:2506.06644  [pdf, ps, other]

    cs.LG stat.ML

    Spark Transformer: Reactivating Sparsity in FFN and Attention

    Authors: Chong You, Kan Wu, Zhipeng Jia, Lin Chen, Srinadh Bhojanapalli, Jiaxian Guo, Utku Evci, Jan Wassenberg, Praneeth Netrapalli, Jeremiah J. Willcock, Suvinay Subramanian, Felix Chern, Alek Andreev, Shreya Pathak, Felix Yu, Prateek Jain, David E. Culler, Henry M. Levy, Sanjiv Kumar

    Abstract: The discovery of the lazy neuron phenomenon in trained Transformers, where the vast majority of neurons in their feed-forward networks (FFN) are inactive for each token, has spurred tremendous interest in activation sparsity for enhancing large model efficiency. While notable progress has been made in translating such sparsity to wall-time benefits, modern Transformers have moved away from the Re…

    Submitted 6 June, 2025; originally announced June 2025.

  27. arXiv:2506.06324  [pdf]

    cs.AI

    Mapping Human-Agent Co-Learning and Co-Adaptation: A Scoping Review

    Authors: Shruti Kumar, Xiaoyu Chen, Xiaomei Wang

    Abstract: Several papers have delved into the challenges of human-AI-robot co-learning and co-adaptation. It has been noted that the terminology used to describe this collaborative relationship in existing studies needs to be more consistent. For example, the prefix "co" is used interchangeably to represent both "collaborative" and "mutual," and the terms "co-learning" and "co-adaptation" are sometimes used…

    Submitted 29 May, 2025; originally announced June 2025.

    Comments: Abstract accepted to HFES 2024 Annual Meeting

  28. Optimizing Recall or Relevance? A Multi-Task Multi-Head Approach for Item-to-Item Retrieval in Recommendation

    Authors: Jiang Zhang, Sumit Kumar, Wei Chang, Yubo Wang, Feng Zhang, Weize Mao, Hanchao Yu, Aashu Singh, Min Li, Qifan Wang

    Abstract: The task of item-to-item (I2I) retrieval is to identify a set of relevant and highly engaging items based on a given trigger item. It is a crucial component in modern recommendation systems, where users' previously engaged items serve as trigger items to retrieve relevant content for future engagement. However, existing I2I retrieval models in industry are primarily built on co-engagement data and…

    Submitted 6 June, 2025; originally announced June 2025.

    Journal ref: KDD 2025

  29. arXiv:2506.06067  [pdf, ps, other]

    cs.OS

    Efficient Memory Tiering in a Virtual Machine

    Authors: Chandra Prakash, Aravinda Prasad, Sandeep Kumar, Sreenivas Subramoney

    Abstract: Memory tiering is the norm to effectively tackle the increasing server memory total cost of ownership (TCO) and the growing data demands of modern data center workloads. However, the host-based state-of-the-art memory tiering solutions can be inefficient for a virtualized environment when (i) the frequently accessed data are scattered across the guest physical address space or (ii) the accesses to…

    Submitted 6 June, 2025; originally announced June 2025.

  30. arXiv:2506.05836  [pdf, ps, other]

    cs.SE

    Analysis of cost-efficiency of serverless approaches

    Authors: Nakhat Syeda, Harsh Shah, Rajvinder Singh, Suraj Jaju, Sumedha Kumar, Gourav Chhabra, Maria Spichkova

    Abstract: In this paper, we present a survey of research studies related to the cost-effectiveness of the serverless approach and corresponding cost savings. We conducted a systematic literature review using the Google Scholar search engine, covering the period from 2010 to 2024. We identified 34 related studies, from which we extracted 17 parameters that might influence the relative cost savings of applying the se…

    Submitted 6 June, 2025; originally announced June 2025.

  31. arXiv:2506.04981  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering

    Authors: Andres Carofilis, Pradeep Rangappa, Srikanth Madikeri, Shashi Kumar, Sergio Burdisso, Jeena Prakash, Esau Villatoro-Tello, Petr Motlicek, Bidisha Sharma, Kadri Hacioglu, Shankar Venkatesan, Saurabh Vyas, Andreas Stolcke

    Abstract: Fine-tuning pretrained ASR models for specific domains is challenging when labeled data is scarce. But unlabeled audio and labeled data from related domains are often available. We propose an incremental semi-supervised learning pipeline that first integrates a small in-domain labeled set and an auxiliary dataset from a closely related domain, achieving a relative improvement of 4% over no auxilia…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025, Netherlands

  32. Building a Few-Shot Cross-Domain Multilingual NLU Model for Customer Care

    Authors: Saurabh Kumar, Sourav Bansal, Neeraj Agrawal, Priyanka Bhatt

    Abstract: Customer care is an essential pillar of the e-commerce shopping experience with companies spending millions of dollars each year, employing automation and human agents, across geographies (like US, Canada, Mexico, Chile), channels (like Chat, Interactive Voice Response (IVR)), and languages (like English, Spanish). SOTA pre-trained models like multilingual-BERT, fine-tuned on annotated data have s…

    Submitted 4 June, 2025; originally announced June 2025.

    Journal ref: ECAI 2023. IOS Press, 2023. 3212-3217

  33. Hierarchical Text Classification Using Contrastive Learning Informed Path Guided Hierarchy

    Authors: Neeraj Agrawal, Saurabh Kumar, Priyanka Bhatt, Tanishka Agarwal

    Abstract: Hierarchical Text Classification (HTC) has recently gained traction given the ability to handle complex label hierarchy. This has found applications in domains like E-commerce, customer care and medicine industry among other real-world applications. Existing HTC models either encode label hierarchy separately and mix it with text encoding or guide the label hierarchy structure in the text encoder…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: arXiv admin note: text overlap with arXiv:2203.03825 by other authors

    Journal ref: ECAI 2023, pp. 19-26. IOS Press, 2023

  34. arXiv:2506.03697  [pdf, ps, other]

    quant-ph cs.LG

    RhoDARTS: Differentiable Quantum Architecture Search with Density Matrix Simulations

    Authors: Swagat Kumar, Jan-Nico Zaech, Colin Michael Wilmott, Luc Van Gool

    Abstract: Variational Quantum Algorithms (VQAs) are a promising approach for leveraging powerful Noisy Intermediate-Scale Quantum (NISQ) computers. When applied to machine learning tasks, VQAs give rise to NISQ-compatible Quantum Neural Networks (QNNs), which have been shown to outperform classical neural networks with a similar number of trainable parameters. While the quantum circuit structures of VQAs fo…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: 24 pages, 16 figures

  35. arXiv:2506.03681  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering

    Authors: Pradeep Rangappa, Andres Carofilis, Jeena Prakash, Shashi Kumar, Sergio Burdisso, Srikanth Madikeri, Esau Villatoro-Tello, Bidisha Sharma, Petr Motlicek, Kadri Hacioglu, Shankar Venkatesan, Saurabh Vyas, Andreas Stolcke

    Abstract: Fine-tuning pretrained ASR models for specific domains is challenging for small organizations with limited labeled data and computational resources. Here, we explore different data selection pipelines and propose a robust approach that improves ASR adaptation by filtering pseudo-labels generated using Whisper (encoder-decoder) and Zipformer (transducer) models. Our approach integrates multiple sel…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025, Netherlands

  36. arXiv:2506.00145  [pdf]

    cs.CL cs.SD eess.AS

    Vedavani: A Benchmark Corpus for ASR on Vedic Sanskrit Poetry

    Authors: Sujeet Kumar, Pretam Ray, Abhinay Beerukuri, Shrey Kamoji, Manoj Balaji Jagadeeshan, Pawan Goyal

    Abstract: Sanskrit, an ancient language with a rich linguistic heritage, presents unique challenges for automatic speech recognition (ASR) due to its phonemic complexity and the phonetic transformations that occur at word junctures, similar to the connected speech found in natural conversations. Due to these complexities, there has been limited exploration of ASR in Sanskrit, particularly in the context of…

    Submitted 30 May, 2025; originally announced June 2025.

  37. arXiv:2505.24253  [pdf, ps, other]

    cs.CV cs.AI cs.MM

    Interactive Video Generation via Domain Adaptation

    Authors: Ishaan Rawal, Suryansh Kumar

    Abstract: Text-conditioned diffusion models have emerged as powerful tools for high-quality video generation. However, enabling Interactive Video Generation (IVG), where users control motion elements such as object trajectory, remains challenging. Recent training-free approaches introduce attention masking to guide trajectory, but this often degrades perceptual quality. We identify two key failure modes in…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: Preprint. Under Review

  38. arXiv:2505.20422  [pdf, ps, other]

    cs.CL cs.AI

    SEMMA: A Semantic Aware Knowledge Graph Foundation Model

    Authors: Arvindh Arun, Sumit Kumar, Mojtaba Nayyeri, Bo Xiong, Ponnurangam Kumaraguru, Antonio Vergari, Steffen Staab

    Abstract: Knowledge Graph Foundation Models (KGFMs) have shown promise in enabling zero-shot reasoning over unseen graphs by learning transferable patterns. However, most existing KGFMs rely solely on graph structure, overlooking the rich semantic signals encoded in textual attributes. We introduce SEMMA, a dual-module KGFM that systematically integrates transferable textual semantics alongside structure. S…

    Submitted 26 May, 2025; originally announced May 2025.

  39. arXiv:2505.20329  [pdf]

    cs.CY

    Generative AI in Computer Science Education: Accelerating Python Learning with ChatGPT

    Authors: Ian McCulloh, Pedro Rodriguez, Srivaths Kumar, Manu Gupta, Viplove Raj Sharma, Benjamin Johnson, Anthony N. Johnson

    Abstract: The increasing demand for digital literacy and artificial intelligence (AI) fluency in the workforce has highlighted the need for scalable, efficient programming instruction. This study evaluates the effectiveness of integrating generative AI, specifically OpenAI's ChatGPT, into a self-paced Python programming module embedded within a sixteen-week professional training course on applied generative…

    Submitted 23 May, 2025; originally announced May 2025.

  40. arXiv:2505.20323  [pdf, other]

    cs.CL cs.AI cs.LG

    PMOA-TTS: Introducing the PubMed Open Access Textual Times Series Corpus

    Authors: Shahriar Noroozizadeh, Sayantan Kumar, George H. Chen, Jeremy C. Weiss

    Abstract: Understanding temporal dynamics in clinical narratives is essential for modeling patient trajectories, yet large-scale temporally annotated resources remain limited. We present PMOA-TTS, the first openly available dataset of 124,699 PubMed Open Access (PMOA) case reports, each converted into structured (event, time) timelines via a scalable LLM-based pipeline. Our approach combines heuristic filte…

    Submitted 23 May, 2025; originally announced May 2025.

  41. arXiv:2505.20189  [pdf, ps, other]

    cs.DS cs.CR cs.LG stat.ML

    Private Geometric Median in Nearly-Linear Time

    Authors: Syamantak Kumar, Daogao Liu, Kevin Tian, Chutong Yang

    Abstract: Estimating the geometric median of a dataset is a robust counterpart to mean estimation, and is a fundamental problem in computational geometry. Recently, [HSU24] gave an $(\varepsilon, \delta)$-differentially private algorithm obtaining an $\alpha$-multiplicative approximation to the geometric median objective, $\frac 1 n \sum_{i \in [n]} \|\cdot - \mathbf{x}_i\|$, given a dataset…

    Submitted 26 May, 2025; originally announced May 2025.

  42. arXiv:2505.18220  [pdf, ps, other]

    cs.CY cs.AI

    Navigating Pitfalls: Evaluating LLMs in Machine Learning Programming Education

    Authors: Smitha Kumar, Michael A. Lones, Manuel Maarek, Hind Zantout

    Abstract: The rapid advancement of Large Language Models (LLMs) has opened new avenues in education. This study examines the use of LLMs in supporting learning in machine learning education; in particular, it focuses on the ability of LLMs to identify common errors of practice (pitfalls) in machine learning code, and their ability to provide feedback that can guide learning. Using a portfolio of code sample…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 29 pages

  43. arXiv:2505.15936  [pdf]

    cs.ET cond-mat.mtrl-sci physics.app-ph

    Self-heating electrochemical memory for high-precision analog computing

    Authors: Adam L. Gross, Sangheon Oh, François Léonard, Wyatt Hodges, T. Patrick Xiao, Joshua D. Sugar, Jacklyn Zhu, Sritharini Radhakrishnan, Sangyong Lee, Jolie Wang, Adam Christensen, Sam Lilak, Patrick S. Finnegan, Patrick Crandall, Christopher H. Bennett, William Wahby, Robin Jacobs-Gedrim, Matthew J. Marinella, Suhas Kumar, Sapan Agarwal, Yiyang Li, A. Alec Talin, Elliot J. Fuller

    Abstract: Analog computers hold promise to significantly reduce the energy consumption of artificial intelligence algorithms, but commercialization has been hampered by a fundamental scientific challenge - how to reliably store and process analog information with high precision. We present an approach based upon metal oxide memory cells that undergo controlled self-heating during programming with a newly de…

    Submitted 1 July, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  44. arXiv:2505.15842  [pdf, ps, other]

    cs.SI cs.LG

    AH-UGC: Adaptive and Heterogeneous-Universal Graph Coarsening

    Authors: Mohit Kataria, Shreyash Bhilwade, Sandeep Kumar, Jayadeva

    Abstract: $\textbf{Graph Coarsening (GC)}$ is a prominent graph reduction technique that compresses large graphs to enable efficient learning and inference. However, existing GC methods generate only one coarsened graph per run and must recompute from scratch for each new coarsening ratio, resulting in unnecessary overhead. Moreover, most prior approaches are tailored to $\textit{homogeneous}…

    Submitted 18 May, 2025; originally announced May 2025.

  45. arXiv:2505.14716  [pdf]

    eess.IV cs.CV cs.ET cs.LG

    A Hybrid Quantum Classical Pipeline for X Ray Based Fracture Diagnosis

    Authors: Sahil Tomar, Rajeshwar Tripathi, Sandeep Kumar

    Abstract: Bone fractures are a leading cause of morbidity and disability worldwide, imposing significant clinical and economic burdens on healthcare systems. Traditional X ray interpretation is time consuming and error prone, while existing machine learning and deep learning solutions often demand extensive feature engineering, large, annotated datasets, and high computational resources. To address these ch…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 8 pages

  46. arXiv:2505.14469  [pdf, ps, other]

    cs.CL cs.AI

    Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations

    Authors: Somnath Banerjee, Pratyush Chatterjee, Shanu Kumar, Sayan Layek, Parag Agrawal, Rima Hazra, Animesh Mukherjee

    Abstract: Recent advancements in LLMs have raised significant safety concerns, particularly when dealing with code-mixed inputs and outputs. Our study systematically investigates the increased susceptibility of LLMs to produce unsafe outputs from code-mixed prompts compared to monolingual English prompts. Utilizing explainability methods, we dissect the internal attribution shifts causing model's harmful be…

    Submitted 20 May, 2025; originally announced May 2025.

  47. arXiv:2505.13550  [pdf, ps, other]

    cs.IR cs.AI

    JIR-Arena: The First Benchmark Dataset for Just-in-time Information Recommendation

    Authors: Ke Yang, Kevin Ros, Shankar Kumar Senthil Kumar, ChengXiang Zhai

    Abstract: Just-in-time Information Recommendation (JIR) is a service designed to deliver the most relevant information precisely when users need it, addressing their knowledge gaps with minimal effort and boosting decision-making and efficiency in daily life. Advances in device-efficient deployment of foundation models and the growing use of intelligent wearable devices have made always-on JIR assistants…

    Submitted 19 May, 2025; originally announced May 2025.

  48. arXiv:2505.13496  [pdf, other]

    cs.AI cs.CL cs.LG

    ADALog: Adaptive Unsupervised Anomaly detection in Logs with Self-attention Masked Language Model

    Authors: Przemek Pospieszny, Wojciech Mormul, Karolina Szyndler, Sanjeev Kumar

    Abstract: Modern software systems generate extensive heterogeneous log data with dynamic formats, fragmented event sequences, and varying temporal patterns, making anomaly detection both crucial and challenging. To address these complexities, we propose ADALog, an adaptive, unsupervised anomaly detection framework designed for practical applicability across diverse real-world environments. Unlike traditiona…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: Conference paper accepted at ICMLT 2025; to appear in the IEEE Conference Proceedings

    ACM Class: I.2.6; I.2.7; I.5.1; C.2.4

  49. arXiv:2505.13033  [pdf, ps, other]

    cs.LG cs.AI

    TSPulse: Dual Space Tiny Pre-Trained Models for Rapid Time-Series Analysis

    Authors: Vijay Ekambaram, Subodh Kumar, Arindam Jati, Sumanta Mukherjee, Tomoya Sakai, Pankaj Dayama, Wesley M. Gifford, Jayant Kalagnanam

    Abstract: The rise of time-series pre-trained models has advanced temporal representation learning, but current state-of-the-art models are often large-scale, requiring substantial compute. We introduce TSPulse, ultra-compact time-series pre-trained models with only 1M parameters, specialized to perform strongly across classification, anomaly detection, imputation, and retrieval tasks. TSPulse introduces in…

    Submitted 25 June, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

  50. arXiv:2505.12323  [pdf, ps, other]

    cs.LG

    GraphFLEx: Structure Learning Framework for Large Expanding Graphs

    Authors: Mohit Kataria, Nikita Malik, Sandeep Kumar, Jayadeva

    Abstract: Graph structure learning is a core problem in graph-based machine learning, essential for uncovering latent relationships and ensuring model interpretability. However, most existing approaches are ill-suited for large-scale and dynamically evolving graphs, as they often require complete re-learning of the structure upon the arrival of new nodes and incur substantial computational and memory costs.…

    Submitted 18 May, 2025; originally announced May 2025.