Showing 1–50 of 1,098 results for author: Aayush

  1. arXiv:2506.15338  [pdf, ps, other]

    eess.SP

    Urban RIS-Assisted HAP Networks: Performance Analysis Using Stochastic Geometry

    Authors: Islam M. Tanash, Ayush Kumar Dwivedi, Taneli Riihonen

    Abstract: This paper studies a high-altitude platform (HAP) network supported by reconfigurable intelligent surfaces (RISs). The practical irregular placement of HAPs and RISs is modeled using homogeneous Poisson point processes, while buildings that cause blockages in urban areas are modeled as a Boolean scheme of rectangles. We introduce a novel approach to characterize the statistical channel based on ge…

    Submitted 18 June, 2025; originally announced June 2025.

  2. arXiv:2506.14978  [pdf, ps, other]

    cs.LG

    ODD: Overlap-aware Estimation of Model Performance under Distribution Shift

    Authors: Aayush Mishra, Anqi Liu

    Abstract: Reliable and accurate estimation of the error of an ML model in unseen test domains is an important problem for safe intelligent systems. Prior work uses disagreement discrepancy (DIS^2) to derive practical error bounds under distribution shifts. It optimizes for a maximally disagreeing classifier on the target domain to bound the error of a given source classifier. Although this approach offers a…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted to the 41st Conference on Uncertainty in Artificial Intelligence, 2025

  3. arXiv:2506.13048  [pdf, ps, other]

    cs.LG

    The Space Complexity of Learning-Unlearning Algorithms

    Authors: Yeshwanth Cherapanamjeri, Sumegha Garg, Nived Rajaraman, Ayush Sekhari, Abhishek Shetty

    Abstract: We study the memory complexity of machine unlearning algorithms that provide strong data deletion guarantees to the users. Formally, consider an algorithm for a particular learning task that initially receives a training dataset. Then, after learning, it receives data deletion requests from a subset of users (of arbitrary size), and the goal of unlearning is to perform the task as if the learner n…

    Submitted 15 June, 2025; originally announced June 2025.

  4. arXiv:2506.12347  [pdf, ps, other]

    cs.SE cs.HC

    Sharp Tools: How Developers Wield Agentic AI in Real Software Engineering Tasks

    Authors: Aayush Kumar, Yasharth Bajpai, Sumit Gulwani, Gustavo Soares, Emerson Murphy-Hill

    Abstract: Software Engineering Agents (SWE agents) can autonomously perform development tasks on benchmarks like SWE Bench, but still face challenges when tackling complex and ambiguous real-world tasks. Consequently, SWE agents are often designed to allow interactivity with developers, enabling collaborative problem-solving. To understand how developers collaborate with SWE agents and the communication cha…

    Submitted 17 June, 2025; v1 submitted 14 June, 2025; originally announced June 2025.

  5. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  6. arXiv:2506.12097  [pdf, ps, other]

    cs.CL cs.CR cs.LG stat.ML

    UCD: Unlearning in LLMs via Contrastive Decoding

    Authors: Vinith M. Suriyakumar, Ayush Sekhari, Ashia Wilson

    Abstract: Machine unlearning aims to remove specific information, e.g. sensitive or undesirable content, from large language models (LLMs) while preserving overall performance. We propose an inference-time unlearning algorithm that uses contrastive decoding, leveraging two auxiliary smaller models, one trained without the forget set and one trained with it, to guide the outputs of the original model using t…

    Submitted 12 June, 2025; originally announced June 2025.

  7. arXiv:2506.12003  [pdf]

    cs.NI cs.AI cs.MA

    Upgrade or Switch: Do We Need a New Registry Architecture for the Internet of AI Agents?

    Authors: Ramesh Raskar, Pradyumna Chari, Jared James Grogan, Mahesh Lambe, Robert Lincourt, Raghu Bala, Abhishek Singh, Ayush Chopra, Rajesh Ranjan, Shailja Gupta, Dimitris Stripelis, Maria Gorskikh, Sichao Wang

    Abstract: The emerging Internet of AI Agents challenges existing web infrastructure designed for human-scale, reactive interactions. Unlike traditional web resources, autonomous AI agents initiate actions, maintain persistent state, spawn sub-agents, and negotiate directly with peers: demanding millisecond-level discovery, instant credential revocation, and cryptographic behavioral proofs that exceed curren…

    Submitted 13 June, 2025; originally announced June 2025.

  8. arXiv:2506.11302  [pdf, ps, other]

    cs.CV cs.AI

    TARDIS STRIDE: A Spatio-Temporal Road Image Dataset and World Model for Autonomy

    Authors: Héctor Carrión, Yutong Bai, Víctor A. Hernández Castro, Kishan Panaganti, Ayush Zenith, Matthew Trang, Tony Zhang, Pietro Perona, Jitendra Malik

    Abstract: World models aim to simulate environments and enable effective agent behavior. However, modeling real-world environments presents unique challenges as they dynamically change across both space and, crucially, time. To capture these composed dynamics, we introduce a Spatio-Temporal Road Image Dataset for Exploration (STRIDE) permuting 360-degree panoramic imagery into rich interconnected observatio…

    Submitted 18 June, 2025; v1 submitted 12 June, 2025; originally announced June 2025.

    Comments: Computer Vision, Pattern Recognition, Early-Fusion, Dataset, Data Augmentation

  9. arXiv:2506.10961  [pdf, ps, other]

    astro-ph.HE

    Discovery and Localization of the Swift-Observed FRB 20241228A in a Star-forming Host Galaxy

    Authors: Alice P. Curtin, Shion Andrew, Sunil Simha, Alice Cai, Kenzie Nimmo, Shami Chatterjee, Amanda M. Cook, Fengqiu Adam Dong, Yuxin Dong, Tarraneh Eftekhari, Wen-fai Fong, Emmanuel Fonseca, Jason W. T. Hessels, Ronniy C. Joseph, Victoria Kaspi, Calvin Leung, Robert Main, Kiyoshi W. Masui, Ryan Mckinven, Daniele Michilli, Mason Ng, Ayush Pandhi, Aaron B. Pearlman, Ziggy Pleunis, Mawson W. Sammons, et al. (5 additional authors not shown)

    Abstract: On 2024 December 28, CHIME/FRB detected the thus-far non-repeating FRB 20241228A with a real-time signal-to-noise ratio of $>50$. Approximately 112~s later, the X-ray Telescope onboard the Neil Gehrels Swift Observatory was on source, the fastest follow-up to-date of a non-repeating FRB (Tohuvavohu et al. in prep.). Using CHIME/FRB and two of the three CHIME/FRB Outriggers, we obtained a Very Long…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Submitted to ApJ

  10. arXiv:2506.10955  [pdf, ps, other]

    cs.LG cs.AI cs.CV

    ReGuidance: A Simple Diffusion Wrapper for Boosting Sample Quality on Hard Inverse Problems

    Authors: Aayush Karan, Kulin Shah, Sitan Chen

    Abstract: There has been a flurry of activity around using pretrained diffusion models as informed data priors for solving inverse problems, and more generally around steering these models using reward models. Training-free methods like diffusion posterior sampling (DPS) and its many variants have offered flexible heuristic algorithms for these tasks, but when the reward is not informative enough, e.g., in…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 38 pages, 14 figures

  11. arXiv:2506.09445  [pdf, ps, other]

    cs.CV cs.AI

    TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision

    Authors: Ayush Gupta, Anirban Roy, Rama Chellappa, Nathaniel D. Bastian, Alvaro Velasquez, Susmit Jha

    Abstract: We address the problem of video question answering (video QA) with temporal grounding in a weakly supervised setup, without any temporal annotations. Given a video and a question, we generate an open-ended answer grounded with the start and end time. For this task, we propose TOGA: a vision-language model for Temporally Grounded Open-Ended Video QA with Weak Supervision. We instruct-tune TOGA to j…

    Submitted 11 June, 2025; originally announced June 2025.

  12. arXiv:2506.09108  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    SensorLM: Learning the Language of Wearable Sensors

    Authors: Yuwei Zhang, Kumar Ayush, Siyuan Qiao, A. Ali Heydari, Girish Narayanswamy, Maxwell A. Xu, Ahmed A. Metwally, Shawn Xu, Jake Garrison, Xuhai Xu, Tim Althoff, Yun Liu, Pushmeet Kohli, Jiening Zhan, Mark Malhotra, Shwetak Patel, Cecilia Mascolo, Xin Liu, Daniel McDuff, Yuzhe Yang

    Abstract: We present SensorLM, a family of sensor-language foundation models that enable wearable sensor data understanding with natural language. Despite its pervasive nature, aligning and interpreting sensor data with language remains challenging due to the lack of paired, richly annotated sensor-text descriptions in uncurated, real-world wearable data. We introduce a hierarchical caption generation pipel…

    Submitted 10 June, 2025; originally announced June 2025.

  13. arXiv:2506.08376  [pdf, other]

    astro-ph.HE gr-qc nucl-th

    Revealing Dark Matter's Role in Neutron Stars Anisotropy: A Bayesian Approach Using Multi-messenger Observations

    Authors: Xue-Zhi Liu, Premachand Mahapatra, Chun Huang, Ayush Hazarika, Chiranjeeb Singha, Prasanta Kumar Das

    Abstract: Dark matter (DM) continues to evade direct detection, but neutron stars (NSs) serve as natural laboratories where even a modest DM component can alter their structure. While many studies have examined DM effects on NSs, they often rely on specific choices of equations of state (EOS) models, assume isotropy, and lack a Bayesian statistical framework, limiting their predictive power. In this work, w…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 24 pages, 13 figures. Submitting to PRD. Comments welcome

  14. arXiv:2506.08249  [pdf, other]

    cs.DB cs.CL

    RADAR: Benchmarking Language Models on Imperfect Tabular Data

    Authors: Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, Hong Yu, Mehran Kazemi, Kumar Ayush, A. Ali Heydari, Maxwell A. Xu, Girish Narayanswamy, Yun Liu, Ming-Zher Poh, Yuzhe Yang, Mark Malhotra, Shwetak Patel, Hamid Palangi, Xuhai Xu, Daniel McDuff, Tim Althoff, Xin Liu

    Abstract: Language models (LMs) are increasingly being deployed to perform autonomous data analyses. However, their data awareness -- the ability to recognize, reason over, and appropriately handle data artifacts such as missing values, outliers, and logical inconsistencies -- remains underexplored. These artifacts are especially common in real-world tabular data and, if mishandled, can significantly compro…

    Submitted 9 June, 2025; originally announced June 2025.

  15. arXiv:2506.07259  [pdf, ps, other]

    stat.ML cs.LG

    ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition

    Authors: Daolang Huang, Xinyi Wen, Ayush Bharti, Samuel Kaski, Luigi Acerbi

    Abstract: Many critical applications, from autonomous scientific discovery to personalized medicine, demand systems that can both strategically acquire the most informative data and instantaneously perform inference based upon it. While amortized methods for Bayesian inference and experimental design offer part of the solution, neither approach is optimal in the most general and challenging task, where new…

    Submitted 8 June, 2025; originally announced June 2025.

    Comments: 27 pages, 13 figures

  16. arXiv:2506.06087  [pdf, ps, other]

    stat.ML astro-ph.CO astro-ph.IM cs.LG stat.CO

    Multilevel neural simulation-based inference

    Authors: Yuga Hikida, Ayush Bharti, Niall Jeffrey, François-Xavier Briol

    Abstract: Neural simulation-based inference (SBI) is a popular set of methods for Bayesian inference when models are only available in the form of a simulator. These methods are widely used in the sciences and engineering, where writing down a likelihood can be significantly more challenging than constructing a simulator. However, the performance of neural SBI can suffer when simulators are computationally…

    Submitted 6 June, 2025; originally announced June 2025.

  17. arXiv:2506.06073  [pdf, ps, other]

    cs.LG

    System-Aware Unlearning Algorithms: Use Lesser, Forget Faster

    Authors: Linda Lu, Ayush Sekhari, Karthik Sridharan

    Abstract: Machine unlearning addresses the problem of updating a machine learning model/system trained on a dataset $S$ so that the influence of a set of deletion requests $U \subseteq S$ on the unlearned model is minimized. The gold standard definition of unlearning demands that the updated model, after deletion, be nearly identical to the model obtained by retraining. This definition is designed for a wor…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: ICML 2025

  18. arXiv:2506.05707  [pdf, ps, other]

    q-bio.BM

    A cautious user's guide in applying HMMs to physical systems

    Authors: Max Schweiger, Ayush Saurabh, Steve Pressé

    Abstract: Nature, as far as we know, evolves continuously through space and time. Yet the ubiquitous hidden Markov model (HMM)--originally developed for discrete time and space analysis in natural language processing--remains a central tool in interpreting time series data drawn from physical systems. This raises a fundamental question: What are the implications of applying a discrete-state, discrete-t…

    Submitted 5 June, 2025; originally announced June 2025.

  19. arXiv:2506.05670  [pdf, ps, other]

    cs.CL

    Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait Alignment

    Authors: Priyanka Dey, Yugal Khanter, Aayush Bothra, Jieyu Zhao, Emilio Ferrara

    Abstract: As LLMs become central to interactive applications, ranging from tutoring to mental health, the ability to express personality in culturally appropriate ways is increasingly important. While recent works have explored personality evaluation of LLMs, they largely overlook the interplay between culture and personality. To address this, we introduce CulturalPersonas, the first large-scale benchmark w…

    Submitted 5 June, 2025; originally announced June 2025.

  20. arXiv:2506.05461  [pdf, ps, other]

    cond-mat.stat-mech cond-mat.quant-gas cond-mat.str-el physics.atom-ph

    Emergent Berezinskii-Kosterlitz-Thouless deconfinement in super-Coulombic plasmas

    Authors: Ayush De, Leo Radzihovsky, Snir Gazit

    Abstract: We study the statistical mechanics of two-dimensional "super-Coulombic" plasmas, namely, neutral plasmas with power-law interactions longer-ranged than Coulomb. To that end, we employ numerically exact large-scale Monte Carlo simulations. Contrary to naive energy-entropy arguments, we observe a charge confinement-deconfinement transition as a function of temperature. Remarkably, the transition lie…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: 11 pages, 14 figures

  21. arXiv:2506.05321  [pdf, other]

    cs.LG

    LSM-2: Learning from Incomplete Wearable Sensor Data

    Authors: Maxwell A. Xu, Girish Narayanswamy, Kumar Ayush, Dimitris Spathis, Shun Liao, Shyam A. Tailor, Ahmed Metwally, A. Ali Heydari, Yuwei Zhang, Jake Garrison, Samy Abdel-Ghaffar, Xuhai Xu, Ken Gu, Jacob Sunshine, Ming-Zher Poh, Yun Liu, Tim Althoff, Shrikanth Narayanan, Pushmeet Kohli, Mark Malhotra, Shwetak Patel, Yuzhe Yang, James M. Rehg, Xin Liu, Daniel McDuff

    Abstract: Foundation models, a cornerstone of recent advancements in machine learning, have predominantly thrived on complete and well-structured data. Wearable sensor data frequently suffers from significant missingness, posing a substantial challenge for self-supervised learning (SSL) models that typically assume complete data inputs. This paper introduces the second generation of Large Sensor Model (LSM-…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Xu and Narayanswamy are co-first authors. McDuff and Liu are co-last authors

  22. arXiv:2506.04987  [pdf, ps, other]

    cs.SE cs.AI

    A Multi-Dataset Evaluation of Models for Automated Vulnerability Repair

    Authors: Zanis Ali Khan, Aayush Garg, Qiang Tang

    Abstract: Software vulnerabilities pose significant security threats, requiring effective mitigation. While Automated Program Repair (APR) has advanced in fixing general bugs, vulnerability patching, a security-critical aspect of APR, remains underexplored. This study investigates pre-trained language models, CodeBERT and CodeT5, for automated vulnerability patching across six datasets and four languages. We…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Preprint has been accepted in ARES AI&CCPS (International Workshop on Artificial Intelligence, Cyber and Cyber-Physical Security)

  23. arXiv:2506.04368  [pdf, ps, other]

    cs.DC

    Fully-Distributed Construction of Byzantine-Resilient Dynamic Peer-to-Peer Networks

    Authors: Aayush Gupta, Gopal Pandurangan

    Abstract: We address a fundamental problem in Peer-to-Peer (P2P) networks, namely, constructing and maintaining dynamic P2P overlay network topologies with essential properties such as connectivity, low diameter, and high expansion, that are resilient to continuous high churn and the presence of a large number of malicious (Byzantine) nodes. Our main goal is to construct and maintain a sparse (bounded degre…

    Submitted 4 June, 2025; originally announced June 2025.

  24. arXiv:2506.03148  [pdf, ps, other]

    cs.CV

    Self-Supervised Spatial Correspondence Across Modalities

    Authors: Ayush Shrivastava, Andrew Owens

    Abstract: We present a method for finding cross-modal space-time correspondences. Given two images from different visual modalities, such as an RGB image and a depth map, our model identifies which pairs of pixels correspond to the same physical points in the scene. To solve this problem, we extend the contrastive random walk framework to simultaneously learn cycle-consistent feature representations for bot…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: CVPR 2025. Project link: https://www.ayshrv.com/cmrw . Code: https://github.com/ayshrv/cmrw

  25. arXiv:2506.02556  [pdf, ps, other]

    cs.RO

    Sign Language: Towards Sign Understanding for Robot Autonomy

    Authors: Ayush Agrawal, Joel Loo, Nicky Zimmerman, David Hsu

    Abstract: Signage is a ubiquitous element of human environments, playing a critical role in both scene understanding and navigation. For autonomous systems to fully interpret human environments, effectively parsing and understanding signs is essential. We introduce the task of navigational sign understanding, aimed at extracting navigational cues from signs that convey symbolic spatial information about th…

    Submitted 3 June, 2025; originally announced June 2025.

  26. arXiv:2505.24603  [pdf, ps, other]

    cs.LG

    The Gaussian Mixing Mechanism: Renyi Differential Privacy via Gaussian Sketches

    Authors: Omri Lev, Vishwak Srinivasan, Moshe Shenfeld, Katrina Ligett, Ayush Sekhari, Ashia C. Wilson

    Abstract: Gaussian sketching, which consists of pre-multiplying the data with a random Gaussian matrix, is a widely used technique for multiple problems in data science and machine learning, with applications spanning computationally efficient optimization, coded computing, and federated learning. This operation also provides differential privacy guarantees due to its inherent randomness. In this work, we r…

    Submitted 4 June, 2025; v1 submitted 30 May, 2025; originally announced May 2025.

  27. arXiv:2505.24360  [pdf, ps, other]

    cs.LG

    Interpreting Large Text-to-Image Diffusion Models with Dictionary Learning

    Authors: Stepan Shabalin, Ayush Panda, Dmitrii Kharlapenko, Abdur Raheem Ali, Yixiong Hao, Arthur Conmy

    Abstract: Sparse autoencoders are a promising new approach for decomposing language model activations for interpretation and control. They have been applied successfully to vision transformer image encoders and to small-scale diffusion models. Inference-Time Decomposition of Activations (ITDA) is a recently proposed variant of dictionary learning that takes the dictionary to be a set of data points from the…

    Submitted 2 June, 2025; v1 submitted 30 May, 2025; originally announced May 2025.

    Comments: 10 pages, 10 figures, Mechanistic Interpretability for Vision at CVPR 2025

  28. arXiv:2505.24250  [pdf, ps, other]

    econ.GN

    Winners vs. Losers: Momentum-based Strategies with Intertemporal Choice for ESG Portfolios

    Authors: Ayush Jha, Abootaleb Shirvani, Ali Jaffri, Svetlozar T. Rachev, Frank J. Fabozzi

    Abstract: This paper introduces a state-dependent momentum framework that integrates ESG regime switching with tail-risk-aware reward-risk metrics. Using a dynamic programming approach and solving a finite-horizon Bellman equation, we construct long-short momentum portfolios that adjust to changing ESG sentiment regimes. Unlike traditional momentum strategies based on historical returns, our approach incorp…

    Submitted 30 May, 2025; originally announced May 2025.

  29. arXiv:2505.24063  [pdf]

    cs.CL cs.DB

    TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine

    Authors: Jiacheng Xie, Yang Yu, Ziyang Zhang, Shuai Zeng, Jiaxuan He, Ayush Vasireddy, Xiaoting Tang, Congyu Guo, Lening Zhao, Congcong Jing, Guanghui An, Dong Xu

    Abstract: Traditional Chinese Medicine (TCM), as an effective alternative medicine, has been receiving increasing attention. In recent years, the rapid development of large language models (LLMs) tailored for TCM has underscored the need for an objective and comprehensive evaluation framework to assess their performance on real-world tasks. However, existing evaluation datasets are limited in scope and prim…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 22 pages, 4 figures

  30. arXiv:2505.23678  [pdf, ps, other]

    cs.CV

    Grounded Reinforcement Learning for Visual Reasoning

    Authors: Gabriel Sarch, Snigdha Saha, Naitik Khandelwal, Ayush Jain, Michael J. Tarr, Aviral Kumar, Katerina Fragkiadaki

    Abstract: While reinforcement learning (RL) over chains of thought has significantly advanced language models in tasks such as mathematics and coding, visual reasoning introduces added complexity by requiring models to direct visual attention, interpret perceptual inputs, and ground abstract reasoning in spatial evidence. We introduce ViGoRL (Visually Grounded Reinforcement Learning), a vision-language mode…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: Project website: https://visually-grounded-rl.github.io/

  31. arXiv:2505.22820  [pdf, ps, other]

    cs.LG

    Preference Learning with Response Time

    Authors: Ayush Sawarni, Sahasrajit Sarmasarkar, Vasilis Syrgkanis

    Abstract: This paper investigates the integration of response time data into human preference learning frameworks for more effective reward model elicitation. While binary preference data has become fundamental in fine-tuning foundation models, generative AI systems, and other large-scale models, the valuable temporal information inherent in user decision-making remains largely unexploited. We propose novel…

    Submitted 28 May, 2025; originally announced May 2025.

  32. arXiv:2505.21833  [pdf, ps, other]

    q-bio.QM

    A Graph Completion Method that Jointly Predicts Geometry and Topology Enables Effective Molecule Assembly

    Authors: Rohan V. Koodli, Alexander S. Powers, Ayush Pandit, Chiho Im, Ron O. Dror

    Abstract: A common starting point for drug design is to find small chemical groups or "fragments" that form interactions with distinct subregions in a protein binding pocket. The subsequent challenge is to assemble these fragments into a molecule that has high affinity to the protein, by adding chemical bonds between atoms in different fragments. This "molecule assembly" task is particularly challenging bec…

    Submitted 27 May, 2025; originally announced May 2025.

  33. arXiv:2505.18122  [pdf, ps, other]

    cs.CL

    UNJOIN: Enhancing Multi-Table Text-to-SQL Generation via Schema Simplification

    Authors: Poojah Ganesan, Rajat Aayush Jha, Dan Roth, Vivek Gupta

    Abstract: Recent advances in large language models (LLMs) have greatly improved Text-to-SQL performance for single-table queries. But, it remains challenging in multi-table databases due to complex schema and relational operations. Existing methods often struggle with retrieving the right tables and columns, generating accurate JOINs and UNIONs, and generalizing across diverse schemas. To address these issu…

    Submitted 23 May, 2025; originally announced May 2025.

  34. arXiv:2505.17360  [pdf, ps, other]

    cs.CC cs.DS

    The Quasi-Polynomial Low-Degree Conjecture is False

    Authors: Rares-Darius Buhai, Jun-Ting Hsieh, Aayush Jain, Pravesh K. Kothari

    Abstract: There is a growing body of work on proving hardness results for average-case estimation problems by bounding the low-degree advantage (LDA) - a quantitative estimate of the closeness of low-degree moments - between a null distribution and a related planted distribution. Such hardness results are now ubiquitous not only for foundational average-case problems but also central questions in statistics…

    Submitted 22 May, 2025; originally announced May 2025.

  35. arXiv:2505.16519  [pdf, ps, other]

    cs.NI

    SONIC: Cost-Effective Web Access for Developing Countries

    Authors: Ayush Pandey, Rohail Asim, Jean Louis K. E. Fendji, Talal Rahwan, Matteo Varvello, Yasir Zaki

    Abstract: Over 2.6 billion people remain without access to the Internet in 2025. This phenomenon is especially pronounced in developing regions, where cost and infrastructure limitations are major barriers to connectivity. In response, we design SONIC, a low-cost, scalable data delivery system that builds on existing infrastructures: FM radio for downlink broadcasting, and SMS for personalized uplink. SONIC…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 16 pages, 20 figures

  36. arXiv:2505.16261  [pdf]

    cs.CR

    Interpretable Anomaly Detection in Encrypted Traffic Using SHAP with Machine Learning Models

    Authors: Kalindi Singh, Aayush Kashyap, Aswani Kumar Cherukuri

    Abstract: The widespread adoption of encrypted communication protocols such as HTTPS and TLS has enhanced data privacy but also rendered traditional anomaly detection techniques less effective, as they often rely on inspecting unencrypted payloads. This study aims to develop an interpretable machine learning-based framework for anomaly detection in encrypted network traffic. This study proposes a model-agno…

    Submitted 22 May, 2025; originally announced May 2025.

  37. arXiv:2505.15623  [pdf, ps, other]

    cs.CL cs.LG

    Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning

    Authors: Tiasa Singha Roy, Aditeya Baral, Ayush Rajesh Jhaveri, Yusuf Baig

    Abstract: Large language models (LLMs) demonstrate considerable potential in various natural language tasks but face significant challenges in mathematical reasoning, particularly in executing precise, multi-step logic. However, current evaluation frameworks judge their performance solely based on accuracy, which only accounts for the final answer. This study explores these pitfalls by employing a novel eva…

    Submitted 21 May, 2025; originally announced May 2025.

  38. Physics-Guided Multi-View Graph Neural Network for Schizophrenia Classification via Structural-Functional Coupling

    Authors: Badhan Mazumder, Ayush Kanyal, Lei Wu, Vince D. Calhoun, Dong Hye Ye

    Abstract: Clinical studies reveal disruptions in brain structural connectivity (SC) and functional connectivity (FC) in neuropsychiatric disorders such as schizophrenia (SZ). Traditional approaches might rely solely on SC due to limited functional data availability, hindering comprehension of cognitive and behavioral impairments in individuals with SZ by neglecting the intricate SC-FC interrelationship. To…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: Accepted and presented at the 7th International Workshop on PRedictive Intelligence in MEdicine (Held in Conjunction with MICCAI 2024)

  39. arXiv:2505.15050  [pdf, ps, other]

    cs.CL

    Improving the fact-checking performance of language models by relying on their entailment ability

    Authors: Gaurav Kumar, Debajyoti Mazumder, Ayush Garg, Jasabanta Patro

    Abstract: Automated fact-checking is a crucial task in this digital age. To verify a claim, current approaches mostly follow one of two strategies, i.e., (i) relying on embedded knowledge of language models, and (ii) fine-tuning them with evidence pieces. While the former can make systems hallucinate, the latter has not been very successful to date. The primary reason behind this is that fact verificati…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 44 pages

  40. arXiv:2505.13777  [pdf, ps, other]

    cs.CV cs.AI cs.SD

    Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping

    Authors: Subash Khanal, Srikumar Sastry, Aayush Dhakal, Adeel Ahmad, Nathan Jacobs

    Abstract: We present Sat2Sound, a multimodal representation learning framework for soundscape mapping, designed to predict the distribution of sounds at any location on Earth. Existing methods for this task rely on satellite image and paired geotagged audio samples, which often fail to capture the diversity of sound sources at a given location. To address this limitation, we enhance existing datasets by lev…

    Submitted 19 May, 2025; originally announced May 2025.

  41. arXiv:2505.13297  [pdf, ps, other]

    astro-ph.HE

    The CHIME/FRB Discovery of the Extremely Active Fast Radio Burst Source FRB 20240114A

    Authors: Kaitlyn Shin, Alice Curtin, Maxwell Fine, Ayush Pandhi, Shion Andrew, Mohit Bhardwaj, Shami Chatterjee, Amanda M. Cook, Emmanuel Fonseca, B. M. Gaensler, Jason Hessels, Naman Jain, Victoria M. Kaspi, Bikash Kharel, Adam E. Lanman, Mattias Lazda, Calvin Leung, Robert Main, Kiyoshi W. Masui, Daniele Michilli, Mason Ng, Kenzie Nimmo, Aaron B. Pearlman, Ue-Li Pen, Ziggy Pleunis, et al. (6 additional authors not shown)

    Abstract: Among the thousands of observed fast radio bursts (FRBs), a few sources exhibit exceptionally high burst activity observable by many telescopes across a broad range of radio frequencies. Almost all of these highly active repeaters have been discovered by CHIME/FRB, due to its daily observations of the entire Northern sky as a transit radio telescope. FRB 20240114A is a source discovered and report…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 10 pages, submitted

  42. arXiv:2505.12198  [pdf, other]

    econ.EM q-fin.GN

    Multivariate Affine GARCH with Heavy Tails: A Unified Framework for Portfolio Optimization and Option Valuation

    Authors: Ayush Jha, Abootaleb Shirvani, Ali Jaffri, Svetlozar T. Rachev, Frank J. Fabozzi

    Abstract: This paper develops and estimates a multivariate affine GARCH(1,1) model with Normal Inverse Gaussian innovations that captures time-varying volatility, heavy tails, and dynamic correlation across asset returns. We generalize the Heston-Nandi framework to a multivariate setting and apply it to 30 Dow Jones Industrial Average stocks. The model jointly supports three core financial applications: dyn…

    Submitted 17 May, 2025; originally announced May 2025.

  43. arXiv:2505.12178  [pdf, other]

    math.CO math.DG

    Elementary symmetric polynomials under the fixed point measure

    Authors: Ayush Khaitan, Ishan Mata, Bhargav Narayanan

    Abstract: We identify a surprising inequality satisfied by elementary symmetric polynomials under the action of the fixed point measure of a random permutation. Concretely, for any collection of $n$ non-negative real numbers $a_1, \dots, a_n \in \mathbb{R}_{\geq 0}$, we prove that \[ \frac{1}{n!} \sum_{π\in S_n} \left[\prod_{\{i:i=π(i)\}} a_i\right] \ge \frac{1}{\binom{n}{2}} \sum_{S \in\binom{[n]}{2}}…

    Submitted 17 May, 2025; originally announced May 2025.

    Comments: 14 pages, 0 figures

    MSC Class: 05A20 (Primary) 05E05; 15A15 (Secondary)

  44. arXiv:2505.11954  [pdf, ps, other]

    math.DG math.RA

    Moduli spaces of Hom-Lie algebroid connections

    Authors: Ayush Jaiswal

    Abstract: We study irreducible Hom-Lie algebroid connections for a Hom-bundle and prove that the H-gauge theoretic moduli space has a Hausdorff Hilbert manifold structure. This work generalizes some known results about simple semi-connections and Lie algebroid connections for complex vector bundles on a compact complex manifold.

    Submitted 17 May, 2025; originally announced May 2025.

    Comments: This is a first draft; any comments are most welcome. 21 pages

  45. arXiv:2505.10746  [pdf, ps, other]

    cs.CY cs.AI cs.CR cs.SI

    ChestyBot: Detecting and Disrupting Chinese Communist Party Influence Stratagems

    Authors: Matthew Stoffolano, Ayush Rout, Justin M. Pelletier

    Abstract: Foreign information operations conducted by Russian and Chinese actors exploit the United States' permissive information environment. These campaigns threaten democratic institutions and the broader Westphalian model. Yet, existing detection and mitigation strategies often fail to identify active information campaigns in real time. This paper introduces ChestyBot, a pragmatics-based language model…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: Presented at USCYBERCOM Cyber Recon Symposium 2023 at DreamPort in Columbia, MD on April 20, 2023

  46. arXiv:2505.09805  [pdf, ps, other]

    q-bio.QM cs.AI cs.LG stat.AP

    Contextual Phenotyping of Pediatric Sepsis Cohort Using Large Language Models

    Authors: Aditya Nagori, Ayush Gautam, Matthew O. Wiens, Vuong Nguyen, Nathan Kenya Mugisha, Jerome Kabakyenga, Niranjan Kissoon, John Mark Ansermino, Rishikesan Kamaleswaran

    Abstract: Clustering patient subgroups is essential for personalized care and efficient resource use. Traditional clustering methods struggle with high-dimensional, heterogeneous healthcare data and lack contextual understanding. This study evaluates Large Language Model (LLM) based clustering against classical methods using a pediatric sepsis dataset from a low-income country (LIC), containing 2,686 record…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: 11 pages, 2 Figures, 1 Table

  47. arXiv:2505.08561  [pdf, other]

    cs.CV

    Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection

    Authors: Ayush K. Rai, Kyle Min, Tarun Krishna, Feiyan Hu, Alan F. Smeaton, Noel E. O'Connor

    Abstract: Masked video modeling (MVM) has emerged as a highly effective pre-training strategy for visual foundation models, whereby the model reconstructs masked spatiotemporal tokens using information from visible tokens. However, a key challenge in such approaches lies in selecting an appropriate masking strategy. Previous studies have explored predefined masking techniques, including random and tube-base…

    Submitted 13 May, 2025; originally announced May 2025.

  48. arXiv:2505.08272  [pdf, ps, other]

    astro-ph.GA

    The Polarisation Sky Survey of the Universe's Magnetism (POSSUM): Science Goals and Survey Description

    Authors: B. M. Gaensler, G. H. Heald, N. M. McClure-Griffiths, C. S. Anderson, C. L. Van Eck, J. L. West, A. J. M. Thomson, J. P. Leahy, L. Rudnick, Y. K. Ma, Takuya Akahori, G. Gürkan, T. L. Landecker, S. A. Mao, S. P. O'Sullivan, W. Raja, X. Sun, T. Vernstrom, Lerato Baidoo, Ettore Carretti, A. R. Taylor, A. G. Willis, Erik Osinga, J. D. Livingston, E. L. Alexander, et al. (35 additional authors not shown)

    Abstract: The Australian SKA Pathfinder (ASKAP) offers powerful new capabilities for studying the polarised and magnetised Universe at radio wavelengths. In this paper, we introduce the Polarisation Sky Survey of the Universe's Magnetism (POSSUM), a groundbreaking survey with three primary objectives: (1) to create a comprehensive Faraday rotation measure (RM) grid of up to one million compact extragalactic…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: Accepted for publication in PASA. 32 pages, 9 figures, 1 table

  49. arXiv:2505.06846  [pdf, ps, other]

    econ.TH math.OC

    Utility Maximization Under Endogenous Uncertainty

    Authors: Ayush Gupta

    Abstract: This paper establishes a general existence result for expected utility maximization in settings where the agent's decision affects the uncertainty faced by her. We introduce a continuity condition for choice-dependent probability measures which ensures the upper semi-continuity of expected utility. Our topological proof imposes minimal restrictions on the utility function and the random variable.…

    Submitted 10 June, 2025; v1 submitted 11 May, 2025; originally announced May 2025.

  50. arXiv:2505.06474  [pdf, ps, other]

    physics.ao-ph

    Climate in a Bottle: Towards a Generative Foundation Model for the Kilometer-Scale Global Atmosphere

    Authors: Noah D. Brenowitz, Tao Ge, Akshay Subramaniam, Aayush Gupta, David M. Hall, Morteza Mardani, Arash Vahdat, Karthik Kashinath, Michael S. Pritchard

    Abstract: AI emulators offer a path to compressing, boosting limited ensembles, and improving the latency of interacting with petabyte-scale climate prediction data. However, prevailing auto-regressive paradigms offer limited flexibility, and are challenging to train on climate time horizons due to drifts, instabilities and component-coupling challenges. Conditionally generative models offer an appealing al…

    Submitted 9 May, 2025; originally announced May 2025.