Showing 1–50 of 1,053 results for author: Arun

Searching in archive cs.
  1. arXiv:2506.16507  [pdf, ps, other]

    cs.LG

    Robust Reward Modeling via Causal Rubrics

    Authors: Pragya Srivastava, Harman Singh, Rahul Madhavan, Gandharv Patil, Sravanti Addepalli, Arun Suggala, Rengarajan Aravamudhan, Soumya Sharma, Anirban Laha, Aravindan Raghuveer, Karthikeyan Shanmugam, Doina Precup

    Abstract: Reward models (RMs) are fundamental to aligning Large Language Models (LLMs) via human feedback, yet they often suffer from reward hacking. They tend to latch on to superficial or spurious attributes, such as response length or formatting, mistaking these cues learned from correlations in training data for the true causal drivers of quality (e.g., factuality, relevance). This occurs because standa…

    Submitted 19 June, 2025; originally announced June 2025.

  2. arXiv:2506.14515  [pdf, ps, other]

    cs.LG cs.CV

    Train Once, Forget Precisely: Anchored Optimization for Efficient Post-Hoc Unlearning

    Authors: Prabhav Sanga, Jaskaran Singh, Arun K. Dubey

    Abstract: As machine learning systems increasingly rely on data subject to privacy regulation, selectively unlearning specific information from trained models has become essential. In image classification, this involves removing the influence of particular training samples, semantic classes, or visual styles without full retraining. We introduce Forget-Aligned Model Reconstruction (FAMR), a theoret…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted at ICML MUGen'25

  3. arXiv:2506.12627  [pdf, ps, other]

    eess.AS cs.SD

    Towards Neural Audio Codec Source Parsing

    Authors: Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Arun Balaji Buduru, Rajesh Sharma

    Abstract: A new class of audio deepfakes, codecfakes (CFs), has recently caught attention, synthesized by Audio Language Models that leverage neural audio codecs (NACs) in the backend. In response, the community has introduced dedicated benchmarks and tailored detection strategies. As the field advances, efforts have moved beyond binary detection toward source attribution, including open-set attribution, whic…

    Submitted 14 June, 2025; originally announced June 2025.

  4. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  5. arXiv:2506.11007  [pdf, other]

    cs.SE cs.AI

    Impact of Comments on LLM Comprehension of Legacy Code

    Authors: Rock Sabetto, Emily Escamilla, Devesh Agarwal, Sujay Kandwal, Justin F. Brunelle, Scott Rosen, Nitin Naik, Samruddhi Thaker, Eric O. Scott, Jacob Zimmer, Amit Madan, Arun Sridharan, Doug Wendt, Michael Doyle, Christopher Glasz, Jasper Phillips, William Macke, Colin Diggs, Michael Bartholf, Zachary Robin, Paul Ursino

    Abstract: Large language models (LLMs) have been increasingly integrated into software engineering and maintenance tasks due to their high performance with software engineering tasks and robust understanding of modern programming languages. However, the ability of LLMs to comprehend code written with legacy languages remains a research gap challenged by real-world legacy systems lacking or containing inaccu…

    Submitted 23 April, 2025; originally announced June 2025.

  6. arXiv:2506.08201  [pdf, ps, other]

    cs.LG cs.CR

    Correlated Noise Mechanisms for Differentially Private Learning

    Authors: Krishna Pillutla, Jalaj Upadhyay, Christopher A. Choquette-Choo, Krishnamurthy Dvijotham, Arun Ganesh, Monika Henzinger, Jonathan Katz, Ryan McKenna, H. Brendan McMahan, Keith Rush, Thomas Steinke, Abhradeep Thakurta

    Abstract: This monograph explores the design and analysis of correlated noise mechanisms for differential privacy (DP), focusing on their application to private training of AI and machine learning models via the core primitive of estimation of weighted prefix sums. While typical DP mechanisms inject independent noise into each step of a stochastic gradient (SGD) learning algorithm in order to protect the pr…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 212 pages

  7. arXiv:2506.06999  [pdf, ps, other]

    cs.LG cs.AI cs.CV stat.ML

    Towards Physics-informed Diffusion for Anomaly Detection in Trajectories

    Authors: Arun Sharma, Mingzhou Yang, Majid Farhadloo, Subhankar Ghosh, Bharat Jayaprakash, Shashi Shekhar

    Abstract: Given trajectory data, a domain-specific study area, and a user-defined threshold, we aim to find anomalous trajectories indicative of possible GPS spoofing (e.g., fake trajectory). The problem is societally important to curb illegal activities in international waters, such as unauthorized fishing and illicit oil transfers. The problem is challenging due to advances in AI generated in deep fakes g…

    Submitted 14 June, 2025; v1 submitted 8 June, 2025; originally announced June 2025.

  8. arXiv:2506.03378  [pdf, ps, other]

    eess.AS cs.CV cs.MM

    SNIFR : Boosting Fine-Grained Child Harmful Content Detection Through Audio-Visual Alignment with Cascaded Cross-Transformer

    Authors: Orchid Chetia Phukan, Mohd Mujtaba Akhtar, Girish, Swarup Ranjan Behera, Abu Osama Siddiqui, Sarthak Jain, Priyabrata Mallick, Jaya Sai Kiran Patibandla, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma

    Abstract: As video-sharing platforms have grown over the past decade, child viewership has surged, increasing the need for precise detection of harmful content like violence or explicit scenes. Malicious users exploit moderation systems by embedding unsafe content in minimal frames to evade detection. While prior research has focused on visual cues and advanced such fine-grained detection, audio features re…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  9. arXiv:2506.03364  [pdf, ps, other]

    eess.AS cs.MM cs.SD

    Towards Source Attribution of Singing Voice Deepfake with Multimodal Foundation Models

    Authors: Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Priyabrata Mallick, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we introduce the task of singing voice deepfake source attribution (SVDSA). We hypothesize that multimodal foundation models (MMFMs) such as ImageBind and LanguageBind will be most effective for SVDSA as they are better equipped for capturing subtle source-specific characteristics, such as unique timbre, pitch manipulation, or synthesis artifacts of each singing voice deepfake source due…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  10. arXiv:2506.02258  [pdf, ps, other]

    eess.AS cs.SD

    Are Mamba-based Audio Foundation Models the Best Fit for Non-Verbal Emotion Recognition?

    Authors: Mohd Mujtaba Akhtar, Orchid Chetia Phukan, Girish, Swarup Ranjan Behera, Ananda Chandra Nayak, Sanjib Kumar Nayak, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we focus on non-verbal vocal sounds emotion recognition (NVER). We investigate Mamba-based audio foundation models (MAFMs) for the first time for NVER and hypothesize that MAFMs will outperform attention-based audio foundation models (AAFMs) for NVER by leveraging their state-space modeling to capture intrinsic emotional structures more effectively. Unlike AAFMs, which may amplify irre…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to EUSIPCO 2025

  11. arXiv:2506.02232  [pdf, ps, other]

    eess.AS cs.SD

    Investigating the Reasonable Effectiveness of Speaker Pre-Trained Models and their Synergistic Power for SingMOS Prediction

    Authors: Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this study, we focus on Singing Voice Mean Opinion Score (SingMOS) prediction. Previous research has shown the performance benefit of using state-of-the-art (SOTA) pre-trained models (PTMs). However, it has not explored speaker recognition speech PTMs (SPTMs) such as x-vector and ECAPA, and we hypothesize that these will be the most effective for SingMOS prediction. We believe that due to th…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  12. arXiv:2506.02230  [pdf, ps, other]

    eess.AS cs.SD

    Towards Machine Unlearning for Paralinguistic Speech Processing

    Authors: Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Shubham Singh, Swarup Ranjan Behera, Vandana Rajan, Muskaan Singh, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we pioneer the study of Machine Unlearning (MU) for Paralinguistic Speech Processing (PSP). We focus on two key PSP tasks: Speech Emotion Recognition (SER) and Depression Detection (DD). To this end, we propose SISA++, a novel extension to the previous state-of-the-art (SOTA) MU method, SISA, by merging models trained on different shards with weight-averaging. With such modifications, we…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  13. arXiv:2506.01497  [pdf, ps, other]

    cs.NE cs.AR cs.LG

    SpiceMixer -- Netlist-Level Circuit Evolution

    Authors: Stefan Uhlich, Andrea Bonetti, Arun Venkitaraman, Chia-Yu Hsieh, Mustafa Emre Gürsoy, Ryoga Matsuo, Lorenzo Servadei

    Abstract: This paper introduces SpiceMixer, a genetic algorithm developed to synthesize novel analog circuits by evolving SPICE netlists. Unlike conventional methods, SpiceMixer operates directly on netlist lines, enabling compatibility with any component or subcircuit type and supporting general-purpose genetic operations. By using a normalized netlist format, the algorithm enhances the effectiveness of it…

    Submitted 2 June, 2025; originally announced June 2025.

    ACM Class: B.7.0

  14. arXiv:2506.01157  [pdf, ps, other]

    eess.AS cs.SD

    Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations

    Authors: Girish, Mohd Mujtaba Akhtar, Orchid Chetia Phukan, Drishti Singh, Swarup Ranjan Behera, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we focus on source tracing of synthetic speech generation systems (STSGS). Each source embeds distinctive paralinguistic features--such as pitch, tone, rhythm, and intonation--into their synthesized speech, reflecting the underlying design of the generation model. While previous research has explored representations from speech pre-trained models (SPTMs), the use of representations f…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted to EUSIPCO 2025

  15. arXiv:2506.01148  [pdf, ps, other]

    eess.AS cs.SD

    Towards Fusion of Neural Audio Codec-based Representations with Spectral for Heart Murmur Classification via Bandit-based Cross-Attention Mechanism

    Authors: Orchid Chetia Phukan, Girish, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Priyabrata Mallick, Santanu Roy, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this study, we focus on heart murmur classification (HMC) and hypothesize that combining neural audio codec representations (NACRs) such as EnCodec with spectral features (SFs), such as MFCC, will yield superior performance. We believe such fusion will trigger their complementary behavior as NACRs excel at capturing fine-grained acoustic patterns such as rhythm changes, spectral features focus…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  16. arXiv:2506.01138  [pdf, other]

    eess.AS cs.SD

    PARROT: Synergizing Mamba and Attention-based SSL Pre-Trained Models via Parallel Branch Hadamard Optimal Transport for Speech Emotion Recognition

    Authors: Orchid Chetia Phukan, Mohd Mujtaba Akhtar, Girish, Swarup Ranjan Behera, Jaya Sai Kiran Patibandla, Arun Balaji Buduru, Rajesh Sharma

    Abstract: The emergence of Mamba as an alternative to attention-based architectures has led to the development of Mamba-based self-supervised learning (SSL) pre-trained models (PTMs) for speech and audio processing. Recent studies suggest that these models achieve comparable or superior performance to state-of-the-art (SOTA) attention-based PTMs for speech emotion recognition (SER). Motivated by prior work…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  17. arXiv:2506.00450  [pdf, ps, other]

    cs.IR cs.LG

    DV365: Extremely Long User History Modeling at Instagram

    Authors: Wenhan Lyu, Devashish Tyagi, Yihang Yang, Ziwei Li, Ajay Somani, Karthikeyan Shanmugasundaram, Nikola Andrejevic, Ferdi Adeputra, Curtis Zeng, Arun K. Singh, Maxime Ransan, Sagar Jain

    Abstract: Long user history is a highly valuable signal for recommendation systems, but effectively incorporating it often comes with high cost in terms of data center power consumption and GPU. In this work, we chose offline embedding over end-to-end sequence length optimization methods to enable extremely long user sequence modeling as a cost-effective solution, and propose a new user embedding learning str…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: SIGKDD 2025 accepted

  18. arXiv:2506.00201  [pdf, ps, other]

    cs.CR

    Hush! Protecting Secrets During Model Training: An Indistinguishability Approach

    Authors: Arun Ganesh, Brendan McMahan, Milad Nasr, Thomas Steinke, Abhradeep Thakurta

    Abstract: We consider the problem of secret protection, in which a business or organization wishes to train a model on their own data, while attempting to not leak secrets potentially contained in that data via the model. The standard method for training models to avoid memorization of secret information is via differential privacy (DP). However, DP requires a large loss in utility or a large dataset to ach…

    Submitted 30 May, 2025; originally announced June 2025.

  19. arXiv:2505.24214  [pdf, ps, other]

    cs.CV cs.AI

    Benchmarking Foundation Models for Zero-Shot Biometric Tasks

    Authors: Redwan Sony, Parisa Farmanifard, Hamzeh Alzwairy, Nitish Shukla, Arun Ross

    Abstract: The advent of foundation models, particularly Vision-Language Models (VLMs) and Multi-modal Large Language Models (MLLMs), has redefined the frontiers of artificial intelligence, enabling remarkable generalization across diverse tasks with minimal or no supervision. Yet, their potential in biometric recognition and analysis remains relatively underexplored. In this work, we introduce a comprehensi…

    Submitted 30 May, 2025; originally announced May 2025.

  20. arXiv:2505.23720  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    COBRA: Contextual Bandit Algorithm for Ensuring Truthful Strategic Agents

    Authors: Arun Verma, Indrajit Saha, Makoto Yokoo, Bryan Kian Hsiang Low

    Abstract: This paper considers a contextual bandit problem involving multiple agents, where a learner sequentially observes the contexts and the agent's reported arms, and then selects the arm that maximizes the system's overall reward. Existing work in contextual bandits assumes that agents truthfully report their arms, which is unrealistic in many real-life applications. For instance, consider an online p…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: This paper proposes a contextual bandit algorithm that prevents strategic agents from misreporting while having approximate incentive compatibility and a sub-linear regret guarantee

  21. arXiv:2505.20422  [pdf, ps, other]

    cs.CL cs.AI

    SEMMA: A Semantic Aware Knowledge Graph Foundation Model

    Authors: Arvindh Arun, Sumit Kumar, Mojtaba Nayyeri, Bo Xiong, Ponnurangam Kumaraguru, Antonio Vergari, Steffen Staab

    Abstract: Knowledge Graph Foundation Models (KGFMs) have shown promise in enabling zero-shot reasoning over unseen graphs by learning transferable patterns. However, most existing KGFMs rely solely on graph structure, overlooking the rich semantic signals encoded in textual attributes. We introduce SEMMA, a dual-module KGFM that systematically integrates transferable textual semantics alongside structure. S…

    Submitted 26 May, 2025; originally announced May 2025.

  22. arXiv:2505.19241  [pdf, ps, other]

    cs.LG cs.AI

    ActiveDPO: Active Direct Preference Optimization for Sample-Efficient Alignment

    Authors: Xiaoqiang Lin, Arun Verma, Zhongxiang Dai, Daniela Rus, See-Kiong Ng, Bryan Kian Hsiang Low

    Abstract: The recent success of using human preferences to align large language models (LLMs) has significantly improved their performance in various downstream tasks like question answering, mathematical reasoning, and code generation. However, achieving effective LLM alignment depends on high-quality human preference datasets. Collecting these datasets requires human preference annotation, which is costl…

    Submitted 25 May, 2025; originally announced May 2025.

  23. arXiv:2505.18179  [pdf, ps, other]

    cs.LG cs.AI

    GAIA: A Foundation Model for Operational Atmospheric Dynamics

    Authors: Ata Akbari Asanjan, Olivia Alexander, Tom Berg, Clara Zhang, Matt Yang, Jad Makki, Disha Shidham, Srija Chakraborty, William Bender, Stephen Peng, Arun Ravindran, Olivier Raiman, David Potere, David Bell

    Abstract: We present the GAIA (Geospatial Artificial Intelligence for Atmospheres) Foundation Model, a novel model that combines masked autoencoders (MAE) and self-DIstillation with NO labels (DINO) for analyzing global atmospheric patterns in satellite imagery. By integrating these complementary self-supervised learning approaches, our model simultaneously captures both local features and global dependenci…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 14 pages, 7 figures

  24. arXiv:2505.17300  [pdf, ps, other]

    stat.ML cs.LG stat.CO stat.ME

    Statistical Inference for Online Algorithms

    Authors: Selina Carter, Arun K Kuchibhotla

    Abstract: Construction of confidence intervals and hypothesis tests for functionals based on asymptotically normal estimators is a classical topic in statistical inference. The simplest and in many cases optimal inference procedure is the Wald interval or the likelihood ratio test, both of which require an estimator and an estimate of the asymptotic variance of the estimator. Estimators obtained from online…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: Although SGD is the most commonly mentioned method in machine learning, our simulations show that the performance of SGD is highly sensitive to the choice of tuning parameters of the algorithm. We could not find a simple remedy that improves performance and also makes the asymptotic properties manageable. We hope that our article acts as a word of caution to anyone using online algorithms blindly

  25. arXiv:2505.14527  [pdf, ps, other]

    cs.CV

    diffDemorph: Extending Reference-Free Demorphing to Unseen Faces

    Authors: Nitish Shukla, Arun Ross

    Abstract: A face morph is created by combining two face images corresponding to two identities to produce a composite that successfully matches both the constituent identities. Reference-free (RF) demorphing reverses this process using only the morph image, without the need for additional reference images. Previous RF demorphing methods are overly constrained, as they rely on assumptions about the distribut…

    Submitted 6 June, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

    Journal ref: IEEE International Conference on Image Processing (ICIP), 2025

  26. arXiv:2505.12830  [pdf, ps, other]

    cs.ET cs.AR eess.SY

    2T1R Regulated Memristor Conductance Control Array Architecture for Neuromorphic Computing using 28nm CMOS Technology

    Authors: Neethu Kuriakose, Arun Ashok, Christian Grewing, André Zambanini, Stefan van Waasen

    Abstract: Memristors are promising devices for scalable and low power, in-memory computing to improve the energy efficiency of a rising computational demand. The crossbar array architecture with memristors is used for vector matrix multiplication (VMM) and acts as kernels in neuromorphic computing. The analog conductance control in a memristor is achieved by applying voltage or current through it. A basic 1…

    Submitted 19 May, 2025; originally announced May 2025.

  27. arXiv:2505.12688  [pdf, ps, other]

    cs.CR

    Shielding Latent Face Representations From Privacy Attacks

    Authors: Arjun Ramesh Kaushik, Bharat Chandra Yalavarthi, Arun Ross, Vishnu Boddeti, Nalini Ratha

    Abstract: In today's data-driven analytics landscape, deep learning has become a powerful tool, with latent representations, known as embeddings, playing a central role in several applications. In the face analytics domain, such embeddings are commonly used for biometric recognition (e.g., face identification). However, these embeddings, or templates, can inadvertently expose sensitive attributes such as ag…

    Submitted 19 May, 2025; originally announced May 2025.

  28. arXiv:2505.12143  [pdf, ps, other]

    cs.LG cs.AI

    Structured Representation

    Authors: Arun Kumar, Paul Schrater

    Abstract: Invariant representations are core to representation learning, yet a central challenge remains: uncovering invariants that are stable and transferable without suppressing task-relevant signals. This raises fundamental questions, requiring further inquiry, about the appropriate level of abstraction at which such invariants should be defined, and which aspects of a system they should characterize. I…

    Submitted 17 May, 2025; originally announced May 2025.

  29. arXiv:2505.11009  [pdf, ps, other]

    cs.ET

    Concept of a System-on-Chip Research Platform Benchmarking Interaction of Memristor-based Bio-inspired Computing Paradigms

    Authors: Christian Grewing, Arun Ashok, Sabitha Kusuma, Michael Schiek, Andre Zambanini, Stefan van Waasen

    Abstract: A system architecture is suggested for a System on Chip that will combine several different memristor-based, bio-inspired computation arrays with inter- and intra-chip communication. It will serve as a benchmark system for future developments. The architecture takes the special requirements into account which are caused by the memristor co-integration on commercial CMOS structures in a post proces…

    Submitted 16 May, 2025; originally announced May 2025.

  30. arXiv:2505.10248  [pdf, ps, other]

    cs.ET cs.AR

    Scalable 28nm IC implementation of coupled oscillator network featuring tunable topology and complexity

    Authors: S. Y. Neyaz, A. Ashok, M. Schiek, C. Grewing, A. Zambanini, S. van Waasen

    Abstract: Integrated circuit implementations of coupled oscillator networks have recently gained increased attention. The focus is usually on using these networks for analogue computing, for example for solving computational optimization tasks. For use within analog computing, these networks are run close to critical dynamics. On the other hand, such networks are also used as an analogy of transport network…

    Submitted 15 May, 2025; originally announced May 2025.

  31. arXiv:2505.07672  [pdf, other]

    cs.CL cs.AI cs.LG

    OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit

    Authors: Arun S. Maiya

    Abstract: We present OnPrem.LLM, a Python-based toolkit for applying large language models (LLMs) to sensitive, non-public data in offline or restricted environments. The system is designed for privacy-preserving use cases and provides prebuilt pipelines for document processing and storage, retrieval-augmented generation (RAG), information extraction, summarization, classification, and prompt/output proce…

    Submitted 12 May, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

    Comments: 6 pages

  32. arXiv:2505.04736  [pdf, other]

    cs.AI

    The Promise and Limits of LLMs in Constructing Proofs and Hints for Logic Problems in Intelligent Tutoring Systems

    Authors: Sutapa Dey Tithi, Arun Kumar Ramesh, Clara DiMarco, Xiaoyi Tian, Nazia Alam, Kimia Fazeli, Tiffany Barnes

    Abstract: Intelligent tutoring systems have demonstrated effectiveness in teaching formal propositional logic proofs, but their reliance on template-based explanations limits their ability to provide personalized student feedback. While large language models (LLMs) offer promising capabilities for dynamic feedback generation, they risk producing hallucinations or pedagogically unsound explanations. We evalu…

    Submitted 7 May, 2025; originally announced May 2025.

  33. arXiv:2505.04616  [pdf, other]

    cs.CV

    Person Recognition at Altitude and Range: Fusion of Face, Body Shape and Gait

    Authors: Feng Liu, Nicholas Chimitt, Lanqing Guo, Jitesh Jain, Aditya Kane, Minchul Kim, Wes Robbins, Yiyang Su, Dingqiang Ye, Xingguang Zhang, Jie Zhu, Siddharth Satyakam, Christopher Perry, Stanley H. Chan, Arun Ross, Humphrey Shi, Zhangyang Wang, Anil Jain, Xiaoming Liu

    Abstract: We address the problem of whole-body person recognition in unconstrained environments. This problem arises in surveillance scenarios such as those in the IARPA Biometric Recognition and Identification at Altitude and Range (BRIAR) program, where biometric data is captured at long standoff distances, elevated viewing angles, and under adverse atmospheric conditions (e.g., turbulence and high wind v…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 18 pages, 12 figures

  34. arXiv:2505.03770  [pdf, other]

    cs.AI

    Proceedings of 1st Workshop on Advancing Artificial Intelligence through Theory of Mind

    Authors: Mouad Abrini, Omri Abend, Dina Acklin, Henny Admoni, Gregor Aichinger, Nitay Alon, Zahra Ashktorab, Ashish Atreja, Moises Auron, Alexander Aufreiter, Raghav Awasthi, Soumya Banerjee, Joe M. Barnby, Rhea Basappa, Severin Bergsmann, Djallel Bouneffouf, Patrick Callaghan, Marc Cavazza, Thierry Chaminade, Sonia Chernova, Mohamed Chetouan, Moumita Choudhury, Axel Cleeremans, Jacek B. Cywinski, Fabio Cuzzolin, et al. (83 additional authors not shown)

    Abstract: This volume includes a selection of papers presented at the Workshop on Advancing Artificial Intelligence through Theory of Mind, held at AAAI 2025 in Philadelphia, US, on 3rd March 2025. The purpose of this volume is to provide an open-access, curated anthology for the ToM and AI research community.

    Submitted 28 April, 2025; originally announced May 2025.

    Comments: workshop proceedings

  35. Embedding based retrieval for long tail search queries in ecommerce

    Authors: Akshay Kekuda, Yuyang Zhang, Arun Udayashankar

    Abstract: In this abstract we present a series of optimizations we performed on the two-tower model architecture [14], and on the training and evaluation datasets, to implement semantic product search at Best Buy. Search queries on bestbuy.com follow the Pareto distribution whereby a minority of them account for most searches. This leaves us with a long tail of search queries that have low frequency of issuance. Th…

    Submitted 25 May, 2025; v1 submitted 3 May, 2025; originally announced May 2025.

    Comments: Published at RecSys '24: Proceedings of the 18th ACM Conference on Recommender Systems

    Journal ref: In Proceedings of the 18th ACM Conference on Recommender Systems (pp. 771-774) 2024

  36. arXiv:2505.00949  [pdf, other]

    cs.CL cs.AI cs.LG

    Llama-Nemotron: Efficient Reasoning Models

    Authors: Akhiad Bercovich, Itay Levy, Izik Golan, Mohammad Dabbah, Ran El-Yaniv, Omri Puny, Ido Galil, Zach Moshe, Tomer Ronen, Najeeb Nabwani, Ido Shahaf, Oren Tropp, Ehud Karpas, Ran Zilberstein, Jiaqi Zeng, Soumye Singhal, Alexander Bukharin, Yian Zhang, Tugrul Konuk, Gerald Shen, Ameya Sunil Mahabaleshwarkar, Bilal Kartal, Yoshi Suhara, Olivier Delalleau, Zijia Chen, et al. (109 additional authors not shown)

    Abstract: We introduce the Llama-Nemotron series of models, an open family of heterogeneous reasoning models that deliver exceptional reasoning capabilities, inference efficiency, and an open license for enterprise use. The family comes in three sizes -- Nano (8B), Super (49B), and Ultra (253B) -- and performs competitively with state-of-the-art reasoning models such as DeepSeek-R1 while offering superior i…

    Submitted 14 May, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

  37. arXiv:2504.20129  [pdf, other]

    physics.ao-ph cs.LG

    A Physically Driven Long Short Term Memory Model for Estimating Snow Water Equivalent over the Continental United States

    Authors: Arun M. Saranathan, Mahmoud Saeedimoghaddam, Brandon Smith, Deepthi Raghunandan, Grey Nearing, Craig Pelissier

    Abstract: Snow is an essential input for various land surface models. Seasonal snow estimates are available as snow water equivalent (SWE) from process-based reanalysis products or locally from in situ measurements. While the reanalysis products are computationally expensive and available at only fixed spatial and temporal resolutions, the in situ measurements are highly localized and sparse. To address the…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: Preprint of journal paper in preparation. Details: 24 pages, 8 figures

  38. arXiv:2504.18786  [pdf, ps, other]

    cs.NI

    Contracts: A unified lens on congestion control robustness, fairness, congestion, and generality

    Authors: Anup Agarwal, Venkat Arun, Srinivasan Seshan

    Abstract: Congestion control algorithms (CCAs) operate in partially observable environments, lacking direct visibility into link capacities, or competing flows. To ensure fair sharing of network resources, CCAs communicate their fair share through observable signals. For instance, Reno's fair share is encoded as $\propto 1/\sqrt{\texttt{loss rate}}$. We call such communication mechanisms contracts. W…

    Submitted 6 June, 2025; v1 submitted 25 April, 2025; originally announced April 2025.

  39. arXiv:2504.18749  [pdf, other]

    math.OC cs.DS

    Optimization of Next-Day Delivery Coverage using Constraint Programming and Random Key Optimizers

    Authors: Kyle Brubaker, Kyle E. C. Booth, Martin J. A. Schuetz, Philipp Loick, Jian Shen, Arun Ramamurthy, Georgios Paschos

    Abstract: We consider the logistics network of an e-commerce retailer, specifically the so-called "middle mile" network, that routes inventory from supply warehouses to distribution stations to be ingested into the terminal ("last mile") delivery network. The speed of packages through this middle mile network is a key determinant for the ultimate delivery speed to the end user. An important target for a ret…

    Submitted 25 April, 2025; originally announced April 2025.

    Comments: 16 pages, 3 figures, 2 algorithms, 2 tables

  40. arXiv:2504.18649  [pdf, other]

    cs.DC

    Raptr: Prefix Consensus for Robust High-Performance BFT

    Authors: Andrei Tonkikh, Balaji Arun, Zhuolun Xiang, Zekun Li, Alexander Spiegelman

    Abstract: In this paper, we present Raptr--a Byzantine fault-tolerant state machine replication (BFT SMR) protocol that combines strong robustness with high throughput, while attaining near-optimal theoretical latency. Raptr delivers exceptionally low latency and high throughput under favorable conditions, and it degrades gracefully in the presence of Byzantine faults and network attacks. Existing high-th…

    Submitted 29 April, 2025; v1 submitted 25 April, 2025; originally announced April 2025.

  41. arXiv:2504.12535  [pdf, other]

    cs.CV cs.AI

    Decision-based AI Visual Navigation for Cardiac Ultrasounds

    Authors: Andy Dimnaku, Dominic Yurk, Zhiyuan Gao, Arun Padmanabhan, Mandar Aras, Yaser Abu-Mostafa

    Abstract: Ultrasound imaging of the heart (echocardiography) is widely used to diagnose cardiac diseases. However, obtaining an echocardiogram requires an expert sonographer and a high-quality ultrasound imaging device, which are generally only available in hospitals. Recently, AI-based navigation models and algorithms have been used to aid novice sonographers in acquiring the standardized cardiac views nec…

    Submitted 16 April, 2025; originally announced April 2025.

  42. arXiv:2504.12016  [pdf, other]

    cs.LG

    Active Human Feedback Collection via Neural Contextual Dueling Bandits

    Authors: Arun Verma, Xiaoqiang Lin, Zhongxiang Dai, Daniela Rus, Bryan Kian Hsiang Low

    Abstract: Collecting human preference feedback is often expensive, leading recent works to develop principled algorithms to select them more efficiently. However, these works assume that the underlying reward function is linear, an assumption that does not hold in many real-life applications, such as online recommendation and LLM alignment. To address this limitation, we propose Neural-ADB, an algorithm bas…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: Accepted at ICLR 2025 Workshop on Bidirectional Human-AI Alignment (BiAlign)

  43. arXiv:2504.11259  [pdf, ps, other]

    cs.DB

    The Cambridge Report on Database Research

    Authors: Anastasia Ailamaki, Samuel Madden, Daniel Abadi, Gustavo Alonso, Sihem Amer-Yahia, Magdalena Balazinska, Philip A. Bernstein, Peter Boncz, Michael Cafarella, Surajit Chaudhuri, Susan Davidson, David DeWitt, Yanlei Diao, Xin Luna Dong, Michael Franklin, Juliana Freire, Johannes Gehrke, Alon Halevy, Joseph M. Hellerstein, Mark D. Hill, Stratos Idreos, Yannis Ioannidis, Christoph Koch, Donald Kossmann, Tim Kraska, et al. (21 additional authors not shown)

    Abstract: On October 19 and 20, 2023, the authors of this report convened in Cambridge, MA, to discuss the state of the database research field, its recent accomplishments and ongoing challenges, and future directions for research and community engagement. This gathering continues a long standing tradition in the database community, dating back to the late 1980s, in which researchers meet roughly every five…

    Submitted 15 April, 2025; originally announced April 2025.

  44. $π$-MPPI: A Projection-based Model Predictive Path Integral Scheme for Smooth Optimal Control of Fixed-Wing Aerial Vehicles

    Authors: Edvin Martin Andrejev, Amith Manoharan, Karl-Eerik Unt, Arun Kumar Singh

    Abstract: Model Predictive Path Integral (MPPI) is a popular sampling-based Model Predictive Control (MPC) algorithm for nonlinear systems. It optimizes trajectories by sampling control sequences and averaging them. However, a key issue with MPPI is the non-smoothness of the optimal control sequence, leading to oscillations in systems like fixed-wing aerial vehicles (FWVs). Existing solutions use post-hoc s…

    Submitted 16 April, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

    Comments: 8 pages, 4 figures, submitted to IEEE RA-L

    Journal ref: IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 10, NO. 6, JUNE 2025

  45. arXiv:2504.08626  [pdf, other]

    cs.LG cs.AI cs.CV

    Task-conditioned Ensemble of Expert Models for Continuous Learning

    Authors: Renu Sharma, Debasmita Pal, Arun Ross

    Abstract: One of the major challenges in machine learning is maintaining the accuracy of the deployed model (e.g., a classifier) in a non-stationary environment. The non-stationary environment results in distribution shifts and, consequently, a degradation in accuracy. Continuous learning of the deployed model with new data could be one remedy. However, the question arises as to how we should update the mod…

    Submitted 14 April, 2025; v1 submitted 11 April, 2025; originally announced April 2025.

    Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, USA, June 2025

  46. arXiv:2504.08176  [pdf, other]

    cs.CR

    GenXSS: an AI-Driven Framework for Automated Detection of XSS Attacks in WAFs

    Authors: Vahid Babaey, Arun Ravindran

    Abstract: The increasing reliance on web services has led to a rise in cybersecurity threats, particularly Cross-Site Scripting (XSS) attacks, which target client-side layers of web applications by injecting malicious scripts. Traditional Web Application Firewalls (WAFs) struggle to detect highly obfuscated and complex attacks, as their rules require manual updates. This paper presents a novel generative AI…

    Submitted 10 April, 2025; originally announced April 2025.

  47. arXiv:2504.06268  [pdf]

    cs.DL

    Assessment of FAIR (Findability, Accessibility, Interoperability, and Reusability) data implementation frameworks: a parametric approach

    Authors: Ranjeet Kumar Singh, Akanksha Nagpal, Arun Jadhav, Devika P. Madalli

    Abstract: The open science movement has established reproducibility, transparency, and validation of research outputs as essential norms for conducting scientific research. It advocates for open access to research outputs, especially research data, to enable verification of published findings and its optimum reuse. The FAIR (Findable, Accessible, Interoperable, and Reusable) data principles support the philosop…

    Submitted 27 December, 2024; originally announced April 2025.

  48. arXiv:2504.05687  [pdf, other]

    cs.DS math.OC

    Radial Isotropic Position via an Implicit Newton's Method

    Authors: Arun Jambulapati, Jonathan Li, Kevin Tian

    Abstract: Placing a dataset $A = \{\mathbf{a}_i\}_{i \in [n]} \subset \mathbb{R}^d$ in radial isotropic position, i.e., finding an invertible $\mathbf{R} \in \mathbb{R}^{d \times d}$ such that the unit vectors $\{(\mathbf{R} \mathbf{a}_i) \|\mathbf{R} \mathbf{a}_i\|_2^{-1}\}_{i \in [n]}$ are in isotropic position, is a powerful tool with applications in functional analysis, communication complexity, coding…

    Submitted 8 April, 2025; originally announced April 2025.

  49. arXiv:2503.23275  [pdf, other]

    cs.CV cs.AI

    Improved Ear Verification with Vision Transformers and Overlapping Patches

    Authors: Deeksha Arun, Kagan Ozturk, Kevin W. Bowyer, Patrick Flynn

    Abstract: Ear recognition has emerged as a promising biometric modality due to the relative stability in appearance during adulthood. Although Vision Transformers (ViTs) have been widely used in image recognition tasks, their efficiency in ear recognition has been hampered by a lack of attention to overlapping patches, which is crucial for capturing intricate ear features. In this study, we evaluate ViT-Tin…

    Submitted 29 March, 2025; originally announced March 2025.

  50. arXiv:2503.20698  [pdf, other]

    cs.CV cs.IR

    MMMORRF: Multimodal Multilingual Modularized Reciprocal Rank Fusion

    Authors: Saron Samuel, Dan DeGenaro, Jimena Guallar-Blasco, Kate Sanders, Oluwaseun Eisape, Tanner Spendlove, Arun Reddy, Alexander Martin, Andrew Yates, Eugene Yang, Cameron Carpenter, David Etter, Efsun Kayi, Matthew Wiesner, Kenton Murray, Reno Kriz

    Abstract: Videos inherently contain multiple modalities, including visual events, text overlays, sounds, and speech, all of which are important for retrieval. However, state-of-the-art multimodal language models like VAST and LanguageBind are built on vision-language models (VLMs), and thus overly prioritize visual signals. Retrieval benchmarks further reinforce this bias by focusing on visual queries and n…

    Submitted 9 May, 2025; v1 submitted 26 March, 2025; originally announced March 2025.