Showing 1–50 of 619 results for author: Ankur

Searching in archive cs.

  1. arXiv:2506.08669  [pdf, ps, other]

    cs.LG cs.AI

    Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search

    Authors: Dongge Han, Menglin Xia, Daniel Madrigal Diaz, Samuel Kessler, Ankur Mallick, Xuchao Zhang, Mirian Del Carmen Hipolito Garcia, Jin Xu, Victor Rühle, Saravan Rajmohan

    Abstract: Small language models (SLMs) offer promising and efficient alternatives to large language models (LLMs). However, SLMs' limited capacity restricts their reasoning capabilities and makes them sensitive to prompt variations. To address these challenges, we propose a novel framework that enhances SLM reasoning capabilities through LLM generated blueprints. The blueprints provide structured, high-leve…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: TTODLer-FM Workshop@ICML'25 (Tiny Titans: The next wave of On-Device Learning for Foundational Models)

  2. arXiv:2506.04049  [pdf, ps, other]

    cs.PF cs.AI

    WANDER: An Explainable Decision-Support Framework for HPC

    Authors: Ankur Lahiry, Banooqa Banday, Tanzima Z. Islam

    Abstract: High-performance computing (HPC) systems expose many interdependent configuration knobs that impact runtime, resource usage, power, and variability. Existing predictive tools model these outcomes, but do not support structured exploration, explanation, or guided reconfiguration. We present WANDER, a decision-support framework that synthesizes alternate configurations using counterfactual analysis…

    Submitted 4 June, 2025; originally announced June 2025.

  3. arXiv:2506.03425  [pdf, ps, other]

    eess.AS cs.AI cs.LG

    A Data-Driven Diffusion-based Approach for Audio Deepfake Explanations

    Authors: Petr Grinberg, Ankur Kumar, Surya Koppisetti, Gaurav Bharaj

    Abstract: Evaluating explainability techniques, such as SHAP and LRP, in the context of audio deepfake detection is challenging due to lack of clear ground truth annotations. In the cases when we are able to obtain the ground truth, we find that these methods struggle to provide accurate explanations. In this work, we propose a novel data-driven approach to identify artifact regions in deepfake audio. We co…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: 5 pages, 3 figures, accepted at Interspeech 2025

  4. arXiv:2505.19173  [pdf, other]

    cs.AI

    Investigating Pedagogical Teacher and Student LLM Agents: Genetic Adaptation Meets Retrieval Augmented Generation Across Learning Style

    Authors: Debdeep Sanyal, Agniva Maiti, Umakanta Maharana, Dhruv Kumar, Ankur Mali, C. Lee Giles, Murari Mandal

    Abstract: Effective teaching requires adapting instructional strategies to accommodate the diverse cognitive and behavioral profiles of students, a persistent challenge in education and teacher training. While Large Language Models (LLMs) offer promise as tools to simulate such complex pedagogical environments, current simulation frameworks are limited in two key respects: (1) they often reduce students to…

    Submitted 25 May, 2025; originally announced May 2025.

    Comments: 38 Pages

  5. arXiv:2505.16014  [pdf, ps, other]

    cs.CL

    Ranking Free RAG: Replacing Re-ranking with Selection in RAG for Sensitive Domains

    Authors: Yash Saxena, Ankur Padia, Mandar S Chaudhary, Kalpa Gunaratna, Srinivasan Parthasarathy, Manas Gaur

    Abstract: Traditional Retrieval-Augmented Generation (RAG) pipelines rely on similarity-based retrieval and re-ranking, which depend on heuristics such as top-k, and lack explainability, interpretability, and robustness against adversarial content. To address this gap, we propose a novel method METEORA that replaces re-ranking in RAG with a rationale-driven selection approach. METEORA operates in two stages…

    Submitted 2 June, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  6. arXiv:2505.14635  [pdf, ps, other]

    cs.LG

    Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning

    Authors: Benjamin Prada, Shion Matsumoto, Abdul Malik Zekri, Ankur Mali

    Abstract: We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-c…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 24 pages, 2 figures

  7. arXiv:2505.10526  [pdf, other]

    cs.LG cs.CL cs.CV

    MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models

    Authors: Mugilan Ganesan, Shane Segal, Ankur Aggarwal, Nish Sinnadurai, Sean Lie, Vithursan Thangarasa

    Abstract: Speculative decoding significantly accelerates language model inference by enabling a lightweight draft model to propose multiple tokens that a larger target model verifies simultaneously. However, applying this technique to vision-language models (VLMs) presents two fundamental challenges: small language models that could serve as efficient drafters lack the architectural components to process vi…

    Submitted 17 May, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

    Comments: Main paper: 11 pages, 4 figures, 3 tables. Supplementary: 1 page

  8. arXiv:2505.07851  [pdf, other]

    eess.IV cs.AI cs.CV cs.RO

    Pose Estimation for Intra-cardiac Echocardiography Catheter via AI-Based Anatomical Understanding

    Authors: Jaeyoung Huh, Ankur Kapoor, Young-Ho Kim

    Abstract: Intra-cardiac Echocardiography (ICE) plays a crucial role in Electrophysiology (EP) and Structural Heart Disease (SHD) interventions by providing high-resolution, real-time imaging of cardiac structures. However, existing navigation methods rely on electromagnetic (EM) tracking, which is susceptible to interference and position drift, or require manual adjustments based on operator expertise. To o…

    Submitted 7 May, 2025; originally announced May 2025.

  9. arXiv:2505.05518  [pdf, other]

    eess.IV cs.CV cs.RO

    Guidance for Intra-cardiac Echocardiography Manipulation to Maintain Continuous Therapy Device Tip Visibility

    Authors: Jaeyoung Huh, Ankur Kapoor, Young-Ho Kim

    Abstract: Intra-cardiac Echocardiography (ICE) plays a critical role in Electrophysiology (EP) and Structural Heart Disease (SHD) interventions by providing real-time visualization of intracardiac structures. However, maintaining continuous visibility of the therapy device tip remains a challenge due to frequent adjustments required during manual ICE catheter manipulation. To address this, we propose an AI-…

    Submitted 7 May, 2025; originally announced May 2025.

  10. arXiv:2505.03742  [pdf, other]

    cs.CR

    Hardware-Enabled Mechanisms for Verifying Responsible AI Development

    Authors: Aidan O'Gara, Gabriel Kulp, Will Hodgkins, James Petrie, Vincent Immler, Aydin Aysu, Kanad Basu, Shivam Bhasin, Stjepan Picek, Ankur Srivastava

    Abstract: Advancements in AI capabilities, driven in large part by scaling up computing resources used for AI training, have created opportunities to address major global challenges but also pose risks of misuse. Hardware-enabled mechanisms (HEMs) can support responsible AI development by enabling verifiable reporting of key properties of AI training activities such as quantity of compute used, training clu…

    Submitted 2 April, 2025; originally announced May 2025.

  11. arXiv:2504.16871  [pdf, other]

    cs.LG

    Exploring How LLMs Capture and Represent Domain-Specific Knowledge

    Authors: Mirian Hipolito Garcia, Camille Couturier, Daniel Madrigal Diaz, Ankur Mallick, Anastasios Kyrillidis, Robert Sim, Victor Ruhle, Saravan Rajmohan

    Abstract: We study whether Large Language Models (LLMs) inherently capture domain-specific nuances in natural language. Our experiments probe the domain sensitivity of LLMs by examining their ability to distinguish queries from different domains using hidden states generated during the prefill phase. We reveal latent domain-related trajectories that indicate the model's internal recognition of query domains…

    Submitted 24 April, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

  12. Streaming Democratized: Ease Across the Latency Spectrum with Delayed View Semantics and Snowflake Dynamic Tables

    Authors: Daniel Sotolongo, Daniel Mills, Tyler Akidau, Anirudh Santhiar, Attila-Péter Tóth, Ilaria Battiston, Ankur Sharma, Botong Huang, Boyuan Zhang, Dzmitry Pauliukevich, Enrico Sartorello, Igor Belianski, Ivan Kalev, Lawrence Benson, Leon Papke, Ling Geng, Matt Uhlar, Nikhil Shah, Niklas Semmler, Olivia Zhou, Saras Nowak, Sasha Lionheart, Till Merker, Vlad Lifliand, Wendy Grus , et al. (2 additional authors not shown)

    Abstract: Streaming data pipelines remain challenging and expensive to build and maintain, despite significant advancements in stronger consistency, event time semantics, and SQL support over the last decade. Persistent obstacles continue to hinder usability, such as the need for manual incrementalization, semantic discrepancies across SQL implementations, and the lack of enterprise-grade operational featur…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: 12 pages, 6 figures, to be published in SIGMOD 2025

  13. arXiv:2504.10369  [pdf, other]

    cs.AR cs.AI cs.LG cs.PL

    SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning

    Authors: Yiting Wang, Wanghao Ye, Ping Guo, Yexiao He, Ziyao Wang, Yexiao He, Bowei Tian, Shwai He, Guoheng Sun, Zheyu Shen, Sihan Chen, Ankur Srivastava, Qingfu Zhang, Gang Qu, Ang Li

    Abstract: Optimizing Register Transfer Level (RTL) code is crucial for improving the power, performance, and area (PPA) of digital circuits in the early stages of synthesis. Manual rewriting, guided by synthesis feedback, can yield high-quality results but is time-consuming and error-prone. Most existing compiler-based approaches have difficulty handling complex design constraints. Large Language Model (LLM…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: 16 pages, 8 figures, 7 tables. Under Review

  14. arXiv:2504.05573  [pdf, other]

    cs.DB cs.AI cs.IR

    MicroNN: An On-device Disk-resident Updatable Vector Database

    Authors: Jeffrey Pound, Floris Chabert, Arjun Bhushan, Ankur Goswami, Anil Pacaci, Shihabur Rahman Chowdhury

    Abstract: Nearest neighbour search over dense vector collections has important applications in information retrieval, retrieval augmented generation (RAG), and content ranking. Performing efficient search over large vector collections is a well studied problem with many existing approaches and open source implementations. However, most state-of-the-art systems are generally targeted towards scenarios using…

    Submitted 7 April, 2025; originally announced April 2025.

  15. arXiv:2503.22069  [pdf, other]

    cs.CV cs.AI

    Contrasting Low and High-Resolution Features for HER2 Scoring using Deep Learning

    Authors: Ekansh Chauhan, Anila Sharma, Amit Sharma, Vikas Nishadham, Asha Ghughtyal, Ankur Kumar, Gurudutt Gupta, Anurag Mehta, C. V. Jawahar, P. K. Vinod

    Abstract: Breast cancer, the most common malignancy among women, requires precise detection and classification for effective treatment. Immunohistochemistry (IHC) biomarkers like HER2, ER, and PR are critical for identifying breast cancer subtypes. However, traditional IHC classification relies on pathologists' expertise, making it labor-intensive and subject to significant inter-observer variability. To ad…

    Submitted 27 March, 2025; originally announced March 2025.

  16. arXiv:2503.22020  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models

    Authors: Qingqing Zhao, Yao Lu, Moo Jin Kim, Zipeng Fu, Zhuoyang Zhang, Yecheng Wu, Zhaoshuo Li, Qianli Ma, Song Han, Chelsea Finn, Ankur Handa, Ming-Yu Liu, Donglai Xiang, Gordon Wetzstein, Tsung-Yi Lin

    Abstract: Vision-language-action models (VLAs) have shown potential in leveraging pretrained vision-language models and diverse robot demonstrations for learning generalizable sensorimotor control. While this paradigm effectively utilizes large-scale data from both robotic and non-robotic sources, current VLAs primarily focus on direct input--output mappings, lacking the intermediate reasoning steps crucial…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: Project website: https://cot-vla.github.io/

    Journal ref: CVPR 2025

  17. arXiv:2503.19786  [pdf, other]

    cs.CL cs.AI

    Gemma 3 Technical Report

    Authors: Gemma Team, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Rivière, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Etienne Pot, Ivo Penchev, Gaël Liu, Francesco Visin, Kathleen Kenealy, Lucas Beyer, Xiaohai Zhai, Anton Tsitsulin , et al. (191 additional authors not shown)

    Abstract: We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, a wider coverage of languages and longer context - at least 128K tokens. We also change the architecture of the model to reduce the KV-cache memory that tends to explode with long context. This is achie…

    Submitted 25 March, 2025; originally announced March 2025.

  18. arXiv:2503.13507  [pdf, other]

    cs.CL cs.AI

    NeurIPS 2023 LLM Efficiency Fine-tuning Competition

    Authors: Mark Saroufim, Yotam Perlitz, Leshem Choshen, Luca Antiga, Greg Bowyer, Christian Puhrsch, Driss Guessous, Supriya Rao, Geeta Chauhan, Ashvini Kumar Jindal, Pawan Kumar Rajpoot, Ankur Parikh, Joe Isaacson, Weiwei Yang

    Abstract: Our analysis of the NeurIPS 2023 large language model (LLM) fine-tuning competition revealed the following trend: top-performing models exhibit significant overfitting on benchmark datasets, mirroring the broader issue of benchmark overfitting on popular leaderboards and that data curation is essential in order to get a high performing LLM. The competition, which consisted of two stages - an open…

    Submitted 13 March, 2025; originally announced March 2025.

    Comments: 11 pages, 10 figures

  19. Privacy Ethics Alignment in AI: A Stakeholder-Centric Based Framework for Ethical AI

    Authors: Ankur Barthwal, Molly Campbell, Ajay Kumar Shrestha

    Abstract: The increasing integration of Artificial Intelligence (AI) in digital ecosystems has reshaped privacy dynamics, particularly for young digital citizens navigating data-driven environments. This study explores evolving privacy concerns across three key stakeholder groups, digital citizens (ages 16-19), parents/educators, and AI professionals, and assesses differences in data ownership, trust, trans…

    Submitted 20 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: Submitted to peer-reviewed venue

    Journal ref: Systems 2025, 13, 455

  20. arXiv:2503.11947  [pdf]

    cs.CY cs.AI cs.LG

    Ethical AI for Young Digital Citizens: A Call to Action on Privacy Governance

    Authors: Austin Shouli, Ankur Barthwal, Molly Campbell, Ajay Kumar Shrestha

    Abstract: The rapid expansion of Artificial Intelligence (AI) in digital platforms used by youth has created significant challenges related to privacy, autonomy, and data protection. While AI-driven personalization offers enhanced user experiences, it often operates without clear ethical boundaries, leaving young users vulnerable to data exploitation and algorithmic biases. This paper presents a call to act…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: Preprint Version | To be submitted to peer-reviewed venue

  21. arXiv:2503.08933  [pdf, other]

    cs.CV

    PromptGAR: Flexible Promptive Group Activity Recognition

    Authors: Zhangyu Jin, Andrew Feng, Ankur Chemburkar, Celso M. De Melo

    Abstract: We present PromptGAR, a novel framework that addresses the limitations of current Group Activity Recognition (GAR) approaches by leveraging multi-modal prompts to achieve both input flexibility and high recognition accuracy. The existing approaches suffer from limited real-world applicability due to their reliance on full prompt annotations, the lack of long-term actor consistency, and under-explo…

    Submitted 11 March, 2025; originally announced March 2025.

  22. arXiv:2503.08918  [pdf, other]

    cs.LG hep-lat stat.ML

    Multilevel Generative Samplers for Investigating Critical Phenomena

    Authors: Ankur Singha, Elia Cellini, Kim A. Nicoli, Karl Jansen, Stefan Kühn, Shinichi Nakajima

    Abstract: Investigating critical phenomena or phase transitions is of high interest in physics and chemistry, for which Monte Carlo (MC) simulations, a crucial tool for numerically analyzing macroscopic properties of given systems, are often hindered by an emerging divergence of correlation length -- known as scale invariance at criticality (SIC) in the renormalization group theory. SIC causes the system to…

    Submitted 13 March, 2025; v1 submitted 11 March, 2025; originally announced March 2025.

    Comments: 10 pages, 4 figures (main text); 13th International Conference on Learning Representations (ICLR 2025)

  23. arXiv:2503.07522  [pdf, other]

    eess.AS cs.CL

    Building English ASR model with regional language support

    Authors: Purvi Agrawal, Vikas Joshi, Bharati Patidar, Ankur Gupta, Rupesh Kumar Mehta

    Abstract: In this paper, we present a novel approach to developing an English Automatic Speech Recognition (ASR) system that can effectively handle Hindi queries, without compromising its performance on English. We propose a novel acoustic model (AM), referred to as the SplitHead with Attention (SHA) model, which features shared hidden layers across languages and language-specific projection layers combined via a sel…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 5 pages, 3 figures

  24. arXiv:2503.01069  [pdf, other]

    cs.AI cs.MA

    Multi-Agent Reinforcement Learning with Long-Term Performance Objectives for Service Workforce Optimization

    Authors: Kareem Eissa, Rayal Prasad, Sarith Mohan, Ankur Kapoor, Dorin Comaniciu, Vivek Singh

    Abstract: Workforce optimization plays a crucial role in efficient organizational operations where decision-making may span several different administrative and time scales. For instance, dispatching personnel to immediate service requests while managing talent acquisition with various expertise sets up a highly dynamic optimization problem. Existing work focuses on specific sub-problems such as resource al…

    Submitted 2 March, 2025; originally announced March 2025.

  25. arXiv:2502.18471  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG q-fin.ST

    FinBloom: Knowledge Grounding Large Language Model with Real-time Financial Data

    Authors: Ankur Sinha, Chaitanya Agarwal, Pekka Malo

    Abstract: Large language models (LLMs) excel at generating human-like responses but often struggle with interactive tasks that require access to real-time information. This limitation poses challenges in finance, where models must access up-to-date information, such as recent news or price movements, to support decision-making. To address this, we introduce Financial Agent, a knowledge-grounding approach fo…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: 27 pages, 9 tables

  26. arXiv:2502.14617  [pdf, other]

    cs.DC

    Serving Models, Fast and Slow: Optimizing Heterogeneous LLM Inferencing Workloads at Scale

    Authors: Shashwat Jaiswal, Kunal Jain, Yogesh Simmhan, Anjaly Parayil, Ankur Mallick, Rujia Wang, Renee St. Amant, Chetan Bansal, Victor Rühle, Anoop Kulkarni, Steve Kofsky, Saravan Rajmohan

    Abstract: Large Language Model (LLM) inference workloads handled by global cloud providers can include both latency-sensitive and insensitive tasks, creating a diverse range of Service Level Agreement (SLA) requirements. Managing these mixed workloads is challenging due to the complexity of the inference stack, which includes multiple LLMs, hardware configurations, and geographic distributions. Current opti…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: 15 pages, 17 figures, 2 tables

  27. arXiv:2502.01695  [pdf]

    eess.IV cs.CV

    A Novel Real-Time Full-Color 3D Holographic (Diffractive) Video Capture, Processing, and Transmission Pipeline Using Off-The-Shelf Hardware

    Authors: Ankur Samanta, Gregor Mackenzie, Tyler Rathkamp, Adrian Cable, Darran Milne, Andrzej Kaczorowski, Ronjon Nag

    Abstract: This paper details the world's first live 3D holographic (diffractive) video call using off-the-shelf hardware. We introduce a novel pipeline that facilitates the capture, processing, and transmission of RGBZ data, using an iPhone for image and depth capture with VividQ's SDK for hologram generation and hardware for display.

    Submitted 2 February, 2025; originally announced February 2025.

    Comments: Published and Presented at Session 63: Emerging Approaches for AR/VR/MR, SID Display Week 2022. 4 pages, 9 figures

    ACM Class: H.5.1; H.5.2

    Journal ref: SID Symposium Digest of Technical Papers 53(1): 833-836 (2022)

  28. arXiv:2502.01184  [pdf, other]

    cs.LG cs.AI physics.chem-ph q-bio.QM

    FragmentNet: Adaptive Graph Fragmentation for Graph-to-Sequence Molecular Representation Learning

    Authors: Ankur Samanta, Rohan Gupta, Aditi Misra, Christian McIntosh Clarke, Jayakumar Rajadas

    Abstract: Molecular property prediction uses molecular structure to infer chemical properties. Chemically interpretable representations that capture meaningful intramolecular interactions enhance the usability and effectiveness of these predictions. However, existing methods often rely on atom-based or rule-based fragment tokenization, which can be chemically suboptimal and lack scalability. We introduce Fr…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: 22 pages, 13 figures, 5 tables

  29. arXiv:2502.00027  [pdf]

    cs.AR cs.AI cs.NE

    Analysis of a Memcapacitor-Based for Neural Network Accelerator Framework

    Authors: Ankur Singh, Dowon Kim, Byung-Geun Lee

    Abstract: Data-intensive computing tasks, such as training neural networks, are crucial for artificial intelligence applications but often come with high energy demands. One promising solution is to develop specialized hardware that directly maps neural networks, utilizing arrays of memristive devices to perform parallel multiply-accumulate operations. In our research, we introduce a novel CMOS-based memcap…

    Submitted 21 January, 2025; originally announced February 2025.

    Comments: 11 pages, 7 figures

  30. What Does an Audio Deepfake Detector Focus on? A Study in the Time Domain

    Authors: Petr Grinberg, Ankur Kumar, Surya Koppisetti, Gaurav Bharaj

    Abstract: Adding explanations to audio deepfake detection (ADD) models will boost their real-world application by providing insight on the decision making process. In this paper, we propose a relevancy-based explainable AI (XAI) method to analyze the predictions of transformer-based ADD models. We compare against standard Grad-CAM and SHAP-based methods, using quantitative faithfulness metrics as well as a…

    Submitted 27 January, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

    Comments: Accepted to ICASSP 2025

    Journal ref: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2025, pp. 1-5

  31. Investigation of the Privacy Concerns in AI Systems for Young Digital Citizens: A Comparative Stakeholder Analysis

    Authors: Molly Campbell, Ankur Barthwal, Sandhya Joshi, Austin Shouli, Ajay Kumar Shrestha

    Abstract: The integration of Artificial Intelligence (AI) systems into technologies used by young digital citizens raises significant privacy concerns. This study investigates these concerns through a comparative analysis of stakeholder perspectives. A total of 252 participants were surveyed, with the analysis focusing on 110 valid responses from parents/educators and 100 from AI professionals after data cl…

    Submitted 22 January, 2025; originally announced January 2025.

    Comments: To appear in the 2025 IEEE 14th Annual Computing and Communication Workshop and Conference (CCWC) proceedings

    Journal ref: 2025 IEEE 15th Annual Computing and Communication Workshop and Conference (CCWC)

  32. arXiv:2501.12072  [pdf, other]

    quant-ph cs.IT

    Fault-tolerance of [[6, 1, 3]] non-CSS code family generated using measurements on graph states

    Authors: Harsh Gupta, Pranav Maheshwari, Ankur Raina

    Abstract: We construct and analyze the fault tolerance of [[6,1,3]] non-CSS quantum error correcting code under the anisotropic and depolarizing noise models. This rate-optimized code achieves fault-tolerance using a single ancilla qubit for syndrome measurement under anisotropic noise conditions. This method was called fault-tolerance using bare ancilla by Brown et al. We give explicit constructio…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: 10 pages, 12 figures

  33. arXiv:2501.08036  [pdf, other]

    quant-ph cs.IT

    Decoding Quantum LDPC Codes using Collaborative Check Node Removal

    Authors: Mainak Bhattacharyya, Ankur Raina

    Abstract: The fault tolerance of quantum devices requires on-par contributions from error-correcting codes and suitable decoders. One of the most explored error-correcting codes is the family of Quantum Low-Density Parity Check (QLDPC) codes. Although faster than many of the reported decoders for QLDPC codes, iterative decoders fail due to the colossal degeneracy and short cycles intrinsic to these code…

    Submitted 14 January, 2025; originally announced January 2025.

    Comments: 11 pages, 6 figures

  34. arXiv:2412.20269  [pdf, other]

    cs.LG

    TeLU Activation Function for Fast and Stable Deep Learning

    Authors: Alfredo Fernandez, Ankur Mali

    Abstract: We propose the Hyperbolic Tangent Exponential Linear Unit (TeLU), a neural network hidden activation function defined as TeLU(x) = x · tanh(exp(x)). TeLU's design is grounded in the core principles of key activation functions, achieving strong convergence by closely approximating the identity function in its active region while effectively mitigating the vanishing gradient problem in its saturating reg…

    Submitted 1 January, 2025; v1 submitted 28 December, 2024; originally announced December 2024.

    Comments: Updated version of "Stable and Robust Deep Learning By Hyperbolic Tangent Exponential Linear Unit (TeLU)"
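
    The definition quoted in the abstract above, TeLU(x) = x · tanh(exp(x)), is simple enough to evaluate directly. The following is a minimal scalar sketch in plain Python, not the authors' implementation: the function name `telu` and the clamp constant `_EXP_CLAMP` are assumptions added here for illustration. The clamp only keeps `math.exp` from overflowing on very large inputs, where tanh(exp(x)) already equals 1.0 in double precision.

```python
import math

# Hypothetical helper constant (not from the paper): beyond this point
# tanh(exp(x)) already rounds to exactly 1.0 in double precision, so
# clamping does not change the result but avoids OverflowError in math.exp.
_EXP_CLAMP = 50.0


def telu(x: float) -> float:
    """Evaluate TeLU(x) = x * tanh(exp(x)) for a single float."""
    return x * math.tanh(math.exp(min(x, _EXP_CLAMP)))


if __name__ == "__main__":
    # Print a few sample points: large positive inputs pass through almost
    # unchanged, large negative inputs are pushed toward zero.
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"telu({value:+.1f}) = {telu(value):.6f}")
```

    For positive inputs the factor tanh(exp(x)) is close to 1, so the function stays near the identity; for strongly negative inputs exp(x) approaches 0 and the output shrinks toward 0, which matches the active and saturating regions the abstract describes.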

  35. Real-time classification of EEG signals using Machine Learning deployment

    Authors: Swati Chowdhuri, Satadip Saha, Samadrita Karmakar, Ankur Chanda

    Abstract: The prevailing educational methods predominantly rely on traditional classroom instruction or online delivery, often limiting the teachers' ability to engage effectively with all the students simultaneously. A more intrinsic method of evaluating student attentiveness during lectures can enable the educators to tailor the course materials and their teaching styles in order to better meet the studen…

    Submitted 27 December, 2024; originally announced December 2024.

    Comments: Published in Romanian Journal of Information Technology and Automatic Control

    Journal ref: Vol. 34, No. 4, 7-18, 2024

  36. arXiv:2412.16369  [pdf]

    cs.CY cs.LG

    Navigating AI to Unpack Youth Privacy Concerns: An In-Depth Exploration and Systematic Review

    Authors: Ajay Kumar Shrestha, Ankur Barthwal, Molly Campbell, Austin Shouli, Saad Syed, Sandhya Joshi, Julita Vassileva

    Abstract: This systematic literature review investigates perceptions, concerns, and expectations of young digital citizens regarding privacy in artificial intelligence (AI) systems, focusing on social media platforms, educational technology, gaming systems, and recommendation algorithms. Using a rigorous methodology, the review started with 2,000 papers, narrowed down to 552 after initial screening, and fin…

    Submitted 20 December, 2024; originally announced December 2024.

    Comments: To appear in the 2024 IEEE Annual Information Technology, Electronics and Mobile Communication Conference proceedings

  37. arXiv:2412.13678  [pdf, other]

    cs.CY cs.AI cs.CL cs.CR cs.LG

    Clio: Privacy-Preserving Insights into Real-World AI Use

    Authors: Alex Tamkin, Miles McCain, Kunal Handa, Esin Durmus, Liane Lovitt, Ankur Rathi, Saffron Huang, Alfred Mountfield, Jerry Hong, Stuart Ritchie, Michael Stern, Brian Clarke, Landon Goldberg, Theodore R. Sumers, Jared Mueller, William McEachen, Wes Mitchell, Shan Carter, Jack Clark, Jared Kaplan, Deep Ganguli

    Abstract: How are AI assistants being used in the real world? While model providers in theory have a window into this impact via their users' data, both privacy concerns and practical challenges have made analyzing this data difficult. To address these issues, we present Clio (Claude insights and observations), a privacy-preserving platform that uses AI assistants themselves to analyze and surface aggregate…

    Submitted 18 December, 2024; originally announced December 2024.

  38. arXiv:2412.12122  [pdf, other]

    cs.LG cs.AI eess.SP

    AI-driven Inverse Design of Band-Tunable Mechanical Metastructures for Tailored Vibration Mitigation

    Authors: Tanuj Gupta, Arun Kumar Sharma, Ankur Dwivedi, Vivek Gupta, Subhadeep Sahana, Suryansh Pathak, Ashish Awasthi, Bishakh Bhattacharya

    Abstract: On-demand vibration mitigation in a mechanical system needs the suitable design of multiscale metastructures, involving complex unit cells. In this study, immersing in the world of patterns and examining the structural details of some interesting motifs are extracted from the mechanical metastructure perspective. Nine interlaced metastructures are fabricated using additive manufacturing, and corre…

    Submitted 28 February, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

  39. arXiv:2412.02780  [pdf, other]

    cs.LG cs.AI

    WxC-Bench: A Novel Dataset for Weather and Climate Downstream Tasks

    Authors: Rajat Shinde, Christopher E. Phillips, Kumar Ankur, Aman Gupta, Simon Pfreundschuh, Sujit Roy, Sheyenne Kirkland, Vishal Gaur, Amy Lin, Aditi Sheshadri, Udaysankar Nair, Manil Maskey, Rahul Ramachandran

    Abstract: High-quality machine learning (ML)-ready datasets play a foundational role in developing new artificial intelligence (AI) models or fine-tuning existing models for scientific applications such as weather and climate analysis. Unfortunately, despite the growing development of new deep learning models for weather and climate, there is a scarcity of curated, pre-processed machine learning (ML)-ready…

    Submitted 3 December, 2024; originally announced December 2024.

  40. arXiv:2412.02732  [pdf, other]

    cs.CV

    Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications

    Authors: Daniela Szwarcman, Sujit Roy, Paolo Fraccaro, Þorsteinn Elí Gíslason, Benedikt Blumenstiel, Rinki Ghosal, Pedro Henrique de Oliveira, Joao Lucas de Sousa Almeida, Rocco Sedona, Yanghui Kang, Srija Chakraborty, Sizhe Wang, Carlos Gomes, Ankur Kumar, Myscon Truong, Denys Godwin, Hyunho Lee, Chia-Yu Hsu, Ata Akbari Asanjan, Besart Mujeci, Disha Shidham, Trevor Keenan, Paulo Arevalo, Wenwen Li, Hamed Alemohammad, et al. (10 additional authors not shown)

    Abstract: This technical report presents Prithvi-EO-2.0, a new geospatial foundation model that offers significant improvements over its predecessor, Prithvi-EO-1.0. Trained on 4.2M global time series samples from NASA's Harmonized Landsat and Sentinel-2 data archive at 30m resolution, the new 300M and 600M parameter models incorporate temporal and location embeddings for enhanced performance across various…

    Submitted 3 February, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

  41. arXiv:2412.01935  [pdf, other]

    cs.LG cs.AI

    Cross Domain Adaptation using Adversarial networks with Cyclic loss

    Authors: Manpreet Kaur, Ankur Tomar, Srijan Mishra, Shashwat Verma

    Abstract: Deep Learning methods are highly local and sensitive to the domain of data they are trained with. Even a slight deviation from the domain distribution affects prediction accuracy of deep networks significantly. In this work, we have investigated a set of techniques aimed at increasing accuracy of generator networks which perform translation from one domain to the other in an adversarial setting. I…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: 16 pages, 14 figures

  42. arXiv:2412.01791  [pdf, other]

    cs.RO

    DextrAH-RGB: Visuomotor Policies to Grasp Anything with Dexterous Hands

    Authors: Ritvik Singh, Arthur Allshire, Ankur Handa, Nathan Ratliff, Karl Van Wyk

    Abstract: One of the most important, yet challenging, skills for a dexterous robot is grasping a diverse range of objects. Much of the prior work has been limited by speed, generality, or reliance on depth maps and object poses. In this paper, we introduce DextrAH-RGB, a system that can perform dexterous arm-hand grasping end-to-end from RGB image input. We train a privileged fabric-guided policy (FGP) in s…

    Submitted 1 February, 2025; v1 submitted 27 November, 2024; originally announced December 2024.

  43. arXiv:2411.15997  [pdf, other]

    cs.LG cs.AI cs.DC cs.MA

    Ensuring Fair LLM Serving Amid Diverse Applications

    Authors: Redwan Ibne Seraj Khan, Kunal Jain, Haiying Shen, Ankur Mallick, Anjaly Parayil, Anoop Kulkarni, Steve Kofsky, Pankhuri Choudhary, Renèe St. Amant, Rujia Wang, Yue Cheng, Ali R. Butt, Victor Rühle, Chetan Bansal, Saravan Rajmohan

    Abstract: In a multi-tenant large language model (LLM) serving platform hosting diverse applications, some users may submit an excessive number of requests, causing the service to become unavailable to other users and creating unfairness. Existing fairness approaches do not account for variations in token lengths across applications and multiple LLM calls, making them unsuitable for such platforms. To addre…

    Submitted 24 November, 2024; originally announced November 2024.

  44. arXiv:2411.15221  [pdf, other]

    cs.LG cond-mat.mtrl-sci physics.chem-ph

    Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry

    Authors: Yoel Zimmermann, Adib Bazgir, Zartashia Afzal, Fariha Agbere, Qianxiang Ai, Nawaf Alampara, Alexander Al-Feghali, Mehrad Ansari, Dmytro Antypov, Amro Aswad, Jiaru Bai, Viktoriia Baibakova, Devi Dutta Biswajeet, Erik Bitzek, Joshua D. Bocarsly, Anna Borisova, Andres M Bran, L. Catherine Brinson, Marcel Moran Calderon, Alessandro Canalicchio, Victor Chen, Yuan Chiang, Defne Circi, Benjamin Charmes, Vikrant Chaudhary, et al. (119 additional authors not shown)

    Abstract: Here, we present the outcomes from the second Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry, which engaged participants across global hybrid locations, resulting in 34 team submissions. The submissions spanned seven key application areas and demonstrated the diverse utility of LLMs for applications in (1) molecular and material property prediction; (2) mo…

    Submitted 2 January, 2025; v1 submitted 20 November, 2024; originally announced November 2024.

    Comments: Updating author information, the submission remains largely unchanged. 98 pages total

  45. arXiv:2411.14344  [pdf, ps, other]

    cs.DS cs.LG

    Overcomplete Tensor Decomposition via Koszul-Young Flattenings

    Authors: Pravesh K. Kothari, Ankur Moitra, Alexander S. Wein

    Abstract: Motivated by connections between algebraic complexity lower bounds and tensor decompositions, we investigate Koszul-Young flattenings, which are the main ingredient in recent lower bounds for matrix multiplication. Based on this tool we give a new algorithm for decomposing an $n_1 \times n_2 \times n_3$ tensor as the sum of a minimal number of rank-1 terms, and certifying uniqueness of this decomp…

    Submitted 21 November, 2024; originally announced November 2024.

    Comments: 42 pages

  46. arXiv:2411.12719  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation

    Authors: Praveen Srinivasa Varadhan, Amogh Gulati, Ashwin Sankar, Srija Anand, Anirudh Gupta, Anirudh Mukherjee, Shiva Kumar Marepally, Ankur Bhatia, Saloni Jaju, Suvrat Bhooshan, Mitesh M. Khapra

    Abstract: Despite rapid advancements in TTS models, a consistent and robust human evaluation framework is still lacking. For example, MOS tests fail to differentiate between similar models, and CMOS's pairwise comparisons are time-intensive. The MUSHRA test is a promising alternative for evaluating multiple TTS systems simultaneously, but in this work we show that its reliance on matching human reference sp…

    Submitted 26 May, 2025; v1 submitted 19 November, 2024; originally announced November 2024.

    Comments: Accepted in TMLR

  47. arXiv:2411.07536  [pdf, ps, other]

    cs.LG cs.AI cs.DS stat.ML

    Model Stealing for Any Low-Rank Language Model

    Authors: Allen Liu, Ankur Moitra

    Abstract: Model stealing, where a learner tries to recover an unknown model via carefully chosen queries, is a critical problem in machine learning, as it threatens the security of proprietary models and the privacy of data they are trained on. In recent years, there has been particular interest in stealing large language models (LLMs). In this paper, we aim to build a theoretical understanding of stealing…

    Submitted 11 November, 2024; originally announced November 2024.

  48. arXiv:2411.06037  [pdf, other]

    cs.CL

    Sufficient Context: A New Lens on Retrieval Augmented Generation Systems

    Authors: Hailey Joren, Jianyi Zhang, Chun-Sung Ferng, Da-Cheng Juan, Ankur Taly, Cyrus Rashtchian

    Abstract: Augmenting LLMs with context leads to improved performance across many applications. Despite much research on Retrieval Augmented Generation (RAG) systems, an open question is whether errors arise because LLMs fail to utilize the context from retrieval or the context itself is insufficient to answer the query. To shed light on this, we develop a new notion of sufficient context, along with a metho…

    Submitted 22 April, 2025; v1 submitted 8 November, 2024; originally announced November 2024.

  49. arXiv:2411.01834  [pdf, other]

    cs.CL eess.AS

    Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback

    Authors: Guan-Ting Lin, Prashanth Gurunath Shivakumar, Aditya Gourav, Yile Gu, Ankur Gandhe, Hung-yi Lee, Ivan Bulyko

    Abstract: While textless Spoken Language Models (SLMs) have shown potential in end-to-end speech-to-speech modeling, they still lag behind text-based Large Language Models (LLMs) in terms of semantic coherence and relevance. This work introduces the Align-SLM framework, which leverages preference optimization inspired by Reinforcement Learning with AI Feedback (RLAIF) to enhance the semantic understanding o…

    Submitted 27 May, 2025; v1 submitted 4 November, 2024; originally announced November 2024.

    Comments: Accepted by ACL 2025

  50. arXiv:2411.01643  [pdf, other]

    cs.AI cs.CL

    EcoAct: Economic Agent Determines When to Register What Action

    Authors: Shaokun Zhang, Jieyu Zhang, Dujian Ding, Mirian Hipolito Garcia, Ankur Mallick, Daniel Madrigal, Menglin Xia, Victor Rühle, Qingyun Wu, Chi Wang

    Abstract: Recent advancements have enabled Large Language Models (LLMs) to function as agents that can perform actions using external tools. This requires registering, i.e., integrating tool information into the LLM context prior to taking actions. Current methods indiscriminately incorporate all candidate tools into the agent's context and retain them across multiple reasoning steps. This process remains o…

    Submitted 3 November, 2024; originally announced November 2024.

    Comments: 16 pages, 10 figures