
Showing 1–50 of 331 results for author: Agrawal, S

  1. arXiv:2507.03992  [pdf, ps, other]

    cs.RO eess.SY

    Scalable Learning of High-Dimensional Demonstrations with Composition of Linear Parameter Varying Dynamical Systems

    Authors: Shreenabh Agrawal, Hugo T. M. Kussaba, Lingyun Chen, Allen Emmanuel Binny, Abdalla Swikir, Pushpak Jagtap, Sami Haddadin

    Abstract: Learning from Demonstration (LfD) techniques enable robots to learn and generalize tasks from user demonstrations, eliminating the need for coding expertise among end-users. One established technique to implement LfD in robots is to encode demonstrations in a stable Dynamical System (DS). However, finding a stable dynamical system entails solving an optimization problem with bilinear matrix inequa…

    Submitted 5 July, 2025; originally announced July 2025.

    Comments: Submitted to the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2025)

    MSC Class: 68T40 ACM Class: I.2.9

  2. arXiv:2506.22694  [pdf, ps, other]

    cs.CL

    VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs

    Authors: Raghavv Goel, Sudhanshu Agrawal, Mukul Gagrani, Junyoung Park, Yifan Zao, He Zhang, Tian Liu, Yiping Yang, Xin Yuan, Jiuyan Lu, Chris Lott, Mingu Lee

    Abstract: In this paper, we introduce a simple training-free technique to improve the performance of drafter-based speculative decoding (SpD) methods that incorporate a language modeling head (LM head) during the drafting process. Drafter-based speculative decoding leverages one or more smaller language models, a.k.a. drafters or draft models, to sample a draft sequence or tree consisting of multiple tokens, f…

    Submitted 3 July, 2025; v1 submitted 27 June, 2025; originally announced June 2025.

    Comments: 8 pages, 4 figures, 5 tables, accepted at ICML 2025 workshop on Efficient Systems for Foundational Models

  3. arXiv:2506.19743  [pdf, ps, other]

    cs.IR cs.CL

    NEAR$^2$: A Nested Embedding Approach to Efficient Product Retrieval and Ranking

    Authors: Shenbin Qian, Diptesh Kanojia, Samarth Agrawal, Hadeel Saadany, Swapnil Bhosale, Constantin Orasan, Zhe Wu

    Abstract: E-commerce information retrieval (IR) systems struggle to simultaneously achieve high accuracy in interpreting complex user queries and maintain efficient processing of vast product catalogs. The dual challenge lies in precisely matching user intent with relevant products while managing the computational demands of real-time search across massive inventories. In this paper, we propose a Nested Emb…

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: This paper is accepted to the 2025 SIGIR Workshop on eCommerce

  4. arXiv:2506.15526  [pdf, ps, other]

    hep-th gr-qc

    Null infinity as an inverted extremal horizon: Matching an infinite set of conserved quantities for gravitational perturbations

    Authors: Shreyansh Agrawal, Panagiotis Charalambous, Laura Donnay

    Abstract: Every spacetime that is asymptotically flat near null infinity can be conformally mapped via a spatial inversion onto the geometry around an extremal, non-rotating and non-expanding horizon. We set up a dictionary for this geometric duality, connecting the geometry and physics near null infinity to those near the dual horizon. We then study its physical implications for conserved quantities for ex…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: 46+5 pages, 2 figures

  5. arXiv:2506.02546  [pdf, ps, other]

    cs.CR

    Attention Knows Whom to Trust: Attention-based Trust Management for LLM Multi-Agent Systems

    Authors: Pengfei He, Zhenwei Dai, Xianfeng Tang, Yue Xing, Hui Liu, Jingying Zeng, Qiankun Peng, Shrivats Agrawal, Samarth Varshney, Suhang Wang, Jiliang Tang, Qi He

    Abstract: Large Language Model-based Multi-Agent Systems (LLM-MAS) have demonstrated strong capabilities in solving complex tasks but remain vulnerable when agents receive unreliable messages. This vulnerability stems from a fundamental gap: LLM agents treat all incoming messages equally without evaluating their trustworthiness. While some existing studies approach the trustworthiness, they focus on a singl…

    Submitted 3 June, 2025; originally announced June 2025.

  6. arXiv:2506.02206  [pdf, other]

    cs.RO

    Reinforcement Learning with Data Bootstrapping for Dynamic Subgoal Pursuit in Humanoid Robot Navigation

    Authors: Chengyang Peng, Zhihao Zhang, Shiting Gong, Sankalp Agrawal, Keith A. Redmill, Ayonga Hereid

    Abstract: Safe and real-time navigation is fundamental for humanoid robot applications. However, existing bipedal robot navigation frameworks often struggle to balance computational efficiency with the precision required for stable locomotion. We propose a novel hierarchical framework that continuously generates dynamic subgoals to guide the robot through cluttered environments. Our method comprises a high-…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: 8 pages, 5 figures, 3 tables

  7. SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction

    Authors: Saurabh Agrawal, Raj Gohil, Gopal Kumar Agrawal, Vikram C M, Kushal Verma

    Abstract: Speech quality assessment is a critical process in selecting text-to-speech synthesis (TTS) or voice conversion models. Evaluation of voice synthesis can be done using objective metrics or subjective metrics. Although there are many objective metrics like the Perceptual Evaluation of Speech Quality (PESQ), Perceptual Objective Listening Quality Assessment (POLQA) or Short-Time Objective Intelligib…

    Submitted 2 June, 2025; originally announced June 2025.

    Journal ref: 2024 International Conference on Signal Processing and Communications (SPCOM), 2024, pages 1-5, 10631576

  8. arXiv:2506.00917  [pdf, ps, other]

    cs.LG

    Q-learning with Posterior Sampling

    Authors: Priyank Agrawal, Shipra Agrawal, Azmat Azati

    Abstract: Bayesian posterior sampling techniques have demonstrated superior empirical performance in many exploration-exploitation settings. However, their theoretical analysis remains a challenge, especially in complex settings like reinforcement learning. In this paper, we introduce Q-Learning with Posterior Sampling (PSQL), a simple Q-learning-based algorithm that uses Gaussian posteriors on Q-values for…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 39 Pages

  9. arXiv:2505.21048  [pdf, ps, other]

    hep-ph nucl-th quant-ph

    Entanglement Negativity of Spin-Orbit Correlations in a general Qubit-Qudit Setup

    Authors: Sanskriti Agrawal, Raktim Abir

    Abstract: We present the complete eigenvalue spectrum of the partially transposed density matrix for a pure bipartite quantum state acting on a generic $2 \otimes n$ Hilbert space. The spectrum contains four non-zero eigenvalues, $\lambda_{1,2}=\pm \sqrt{A}$ and $\lambda_{3,4}= \frac{1}{2}(1\pm\sqrt{1-4 A})$, where $A$ is the determinant of the reduced density matrix (traced o…

    Submitted 10 June, 2025; v1 submitted 27 May, 2025; originally announced May 2025.

  10. arXiv:2505.13762   

    cs.RO eess.SY

    From Structural Design to Dynamics Modeling: Control-Oriented Development of a 3-RRR Parallel Ankle Rehabilitation Robot

    Authors: Siyuan Zhang, Yufei Zhang, Junlin Lyu, Sunil K. Agrawal

    Abstract: This paper presents the development of a wearable ankle rehabilitation robot based on a 3-RRR spherical parallel mechanism (SPM) to support multi-DOF recovery through pitch, roll, and yaw motions. The system features a compact, ergonomic structure designed for comfort, safety, and compatibility with ankle biomechanics. A complete design-to-dynamics pipeline has been implemented, including structur…

    Submitted 30 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

    Comments: This paper was originally submitted as a class project and included the name of a faculty member without prior permission. At the instructor's request, I am withdrawing the paper. The work may be resubmitted in the future after further development and testing

  11. arXiv:2504.20106  [pdf, other]

    cs.LG cs.AI

    Adaptive Helpfulness-Harmlessness Alignment with Preference Vectors

    Authors: Ren-Wei Liang, Chin-Ting Hsu, Chan-Hung Yu, Saransh Agrawal, Shih-Cheng Huang, Shang-Tse Chen, Kuan-Hao Huang, Shao-Hua Sun

    Abstract: Ensuring that large language models (LLMs) are both helpful and harmless is a critical challenge, as overly strict constraints can lead to excessive refusals, while permissive models risk generating harmful content. Existing approaches, such as reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO), attempt to balance these trade-offs but suffer from performance…

    Submitted 27 April, 2025; originally announced April 2025.

    Comments: 22 pages, 5 figures, 9 tables

  12. arXiv:2504.19952  [pdf, ps, other]

    math.ST cs.LG stat.ML

    On Stopping Times of Power-one Sequential Tests: Tight Lower and Upper Bounds

    Authors: Shubhada Agrawal, Aaditya Ramdas

    Abstract: We prove two lower bounds for stopping times of sequential tests between general composite nulls and alternatives. The first lower bound is for the setting where the type-1 error level $α$ approaches zero, and equals $\log(1/α)$ divided by a certain infimum KL divergence, termed $\operatorname{KL_{inf}}$. The second lower bound applies to the setting where $α$ is fixed and…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: 36 pages

  13. arXiv:2504.12996  [pdf, other]

    cs.CL cs.AI

    SHA256 at SemEval-2025 Task 4: Selective Amnesia -- Constrained Unlearning for Large Language Models via Knowledge Isolation

    Authors: Saransh Agrawal, Kuan-Hao Huang

    Abstract: Large language models (LLMs) frequently memorize sensitive information during training, posing risks when deploying publicly accessible models. Current machine unlearning methods struggle to selectively remove specific data associations without degrading overall model capabilities. This paper presents our solution to SemEval-2025 Task 4 on targeted unlearning, which introduces a two-stage methodol…

    Submitted 17 April, 2025; originally announced April 2025.

    Comments: 8 pages, In Proceedings of The 19th International Workshop on Semantic Evaluation (SemEval), 2025

  14. arXiv:2504.12140  [pdf, other]

    cs.CL

    Multilingual Contextualization of Large Language Models for Document-Level Machine Translation

    Authors: Miguel Moura Ramos, Patrick Fernandes, Sweta Agrawal, André F. T. Martins

    Abstract: Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation, but scaling to document-level translation remains challenging, particularly in modeling long-range dependencies and discourse phenomena across sentences and paragraphs. In this work, we propose a method to improve LLM-based long-document translation through targeted fine-tuning on high-quality…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 9 pages, work-in-progress

  15. arXiv:2504.11829  [pdf, other]

    cs.CL cs.AI

    Déjà Vu: Multilingual LLM Evaluation through the Lens of Machine Translation Evaluation

    Authors: Julia Kreutzer, Eleftheria Briakou, Sweta Agrawal, Marzieh Fadaee, Kocmi Tom

    Abstract: Generation capabilities and language coverage of multilingual large language models (mLLMs) are advancing rapidly. However, evaluation practices for generative abilities of mLLMs are still lacking comprehensiveness, scientific rigor, and consistent adoption across research labs, which undermines their potential to meaningfully guide mLLM development. We draw parallels with machine translation (MT)…

    Submitted 17 April, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

  16. arXiv:2504.11045  [pdf, other]

    cs.RO cs.AI eess.SY

    Neural Control Barrier Functions from Physics Informed Neural Networks

    Authors: Shreenabh Agrawal, Manan Tayal, Aditya Singh, Shishir Kolathaya

    Abstract: As autonomous systems become increasingly prevalent in daily life, ensuring their safety is paramount. Control Barrier Functions (CBFs) have emerged as an effective tool for guaranteeing safety; however, manually designing them for specific applications remains a significant challenge. With the advent of deep learning techniques, recent research has explored synthesizing CBFs using neural networks…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 8 pages, 5 figures

  17. arXiv:2504.10577  [pdf, ps, other]

    hep-th

    Soft theorems and spontaneous symmetry breaking

    Authors: Shreyansh Agrawal, Kevin Nguyen

    Abstract: The soft photon and soft graviton theorems of Weinberg are known to derive from conservation laws associated with asymptotic symmetries. Within the corresponding classical theories, one often speaks of spontaneous symmetry breaking and vacuum degeneracy, but a genuine quantum description of this phenomenon has largely been lacking. Here we establish spontaneous breaking of asymptotic symmetries an…

    Submitted 26 May, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

    Comments: 7 pages; v2: new discussion of the Goldstone two-point function + added references

  18. arXiv:2504.07583  [pdf, other]

    cs.CL cs.LG

    Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering

    Authors: Patrick Fernandes, Sweta Agrawal, Emmanouil Zaranis, André F. T. Martins, Graham Neubig

    Abstract: Despite the steady progress in machine translation evaluation, existing automatic metrics struggle to capture how well meaning is preserved beyond sentence boundaries. We posit that reliance on a single intrinsic quality score, trained to mimic human judgments, might be insufficient for evaluating translations of long, complex passages, and a more "pragmatic" approach that assesses how accuratel…

    Submitted 11 April, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

  19. arXiv:2503.21400  [pdf, ps, other]

    cs.CC cs.CR

    Lattice Based Crypto breaks in a Superposition of Spacetimes

    Authors: Divesh Aggarwal, Shashwat Agrawal, Rajendra Kumar

    Abstract: We explore the computational implications of a superposition of spacetimes, a phenomenon hypothesized in quantum gravity theories. This was initiated by Shmueli (2024) where the author introduced the complexity class $\mathbf{BQP^{OI}}$ consisting of promise problems decidable by quantum polynomial time algorithms with access to an oracle for computing order interference. In this work, it was show…

    Submitted 1 April, 2025; v1 submitted 27 March, 2025; originally announced March 2025.

  20. arXiv:2503.16481  [pdf, other]

    cs.HC cs.RO

    Pedestrians and Robots: A Novel Dataset for Learning Distinct Social Navigation Forces

    Authors: Subham Agrawal, Nico Ostermann-Myrau, Nils Dengler, Maren Bennewitz

    Abstract: The increasing use of robots in human-centric public spaces such as shopping malls, sidewalks, and hospitals, requires understanding of how pedestrians respond to their presence. However, existing research lacks comprehensive datasets that capture the full range of pedestrian behaviors, e.g., including avoidance, neutrality, and attraction in the presence of robots. Such datasets can be used to ef…

    Submitted 5 March, 2025; originally announced March 2025.

  21. arXiv:2503.11855  [pdf, other]

    cs.RO eess.SY

    Learning-based Estimation of Forward Kinematics for an Orthotic Parallel Robotic Mechanism

    Authors: Jingzong Zhou, Yuhan Zhu, Xiaobin Zhang, Sunil Agrawal, Konstantinos Karydis

    Abstract: This paper introduces a 3D parallel robot with three identical five-degree-of-freedom chains connected to a circular brace end-effector, aimed to serve as an assistive device for patients with cervical spondylosis. The inverse kinematics of the system is solved analytically, whereas learning-based methods are deployed to solve the forward kinematics. The methods considered herein include a Koopman…

    Submitted 14 March, 2025; originally announced March 2025.

  22. arXiv:2503.07970  [pdf, other]

    cs.HC

    Sustaining Human Agency, Attending to Its Cost: An Investigation into Generative AI Design for Non-Native Speakers' Language Use

    Authors: Yimin Xiao, Cartor Hancock, Sweta Agrawal, Nikita Mehandru, Niloufar Salehi, Marine Carpuat, Ge Gao

    Abstract: AI systems and tools today can generate human-like expressions on behalf of people. It raises the crucial question about how to sustain human agency in AI-mediated communication. We investigated this question in the context of machine translation (MT) assisted conversations. Our participants included 45 dyads. Each dyad consisted of one new immigrant in the United States, who leveraged MT for Engl…

    Submitted 10 March, 2025; originally announced March 2025.

  23. arXiv:2503.04828  [pdf, other]

    cs.CL cs.AI cs.LG

    Beyond Next Word Prediction: Developing Comprehensive Evaluation Frameworks for measuring LLM performance on real world applications

    Authors: Vishakha Agrawal, Archie Chaudhury, Shreya Agrawal

    Abstract: While Large Language Models (LLMs) are fundamentally next-token prediction systems, their practical applications extend far beyond this basic function. From natural language processing and text generation to conversational assistants and software use, LLMs have numerous use-cases, and have already acquired a significant degree of enterprise adoption. To evaluate such models, static evaluation data…

    Submitted 5 March, 2025; originally announced March 2025.

  24. arXiv:2503.02128  [pdf, other]

    cs.CV cs.LG

    Aerial Infrared Health Monitoring of Solar Photovoltaic Farms at Scale

    Authors: Isaac Corley, Conor Wallace, Sourav Agrawal, Burton Putrah, Jonathan Lwowski

    Abstract: Solar photovoltaic (PV) farms represent a major source of global renewable energy generation, yet their true operational efficiency often remains unknown at scale. In this paper, we present a comprehensive, data-driven framework for large-scale airborne infrared inspection of North American solar installations. Leveraging high-resolution thermal imagery, we construct and curate a geographically di…

    Submitted 3 March, 2025; originally announced March 2025.

  25. arXiv:2503.01946  [pdf, other]

    astro-ph.EP

    Water dissociation and rotational broadening in the atmosphere of KELT-20 b from high-resolution spectroscopy

    Authors: Luke Finnerty, Yinzi Xin, Jerry W. Xuan, Julie Inglis, Michael P. Fitzgerald, Shubh Agrawal, Ashley Baker, Randall Bartos, Geoffrey A. Blake, Benjamin Calvin, Sylvain Cetre, Jacques-Robert Delorme, Greg Doppmann, Daniel Echeverri, Katelyn Horstman, Chih-Chun Hsu, Nemanja Jovanovic, Joshua Liberman, Ronald A. López, Emily C. Martin, Dimitri Mawet, Evan Morris, Jacklyn Pezzato, Jean-Baptiste Ruffio, Ben Sappey , et al. (7 additional authors not shown)

    Abstract: We present atmospheric retrievals from Keck/KPIC phase II observations of the ultra-hot Jupiter KELT-20/MASCARA-2 b. Previous free retrievals of molecular abundances for ultra-hot Jupiters have been impacted by significant model biases due to variations in vertical abundance profiles, which we address by including molecular dissociation into our retrieval framework as an additional free parameter.…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: 24 pages, 2 tables, 10 figures, accepted in AJ

  26. arXiv:2502.20393  [pdf, other]

    cs.LG cs.AI cs.CV

    Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models

    Authors: Susmit Agrawal, Deepika Vemuri, Sri Siddarth Chakaravarthy P, Vineeth N. Balasubramanian

    Abstract: Concept-based methods have emerged as a promising direction to develop interpretable neural networks in standard supervised settings. However, most works that study them in incremental settings assume either a static concept set across all experiences or assume that each experience relies on a distinct set of concepts. In this work, we study concept-based models in a more realistic, dynamic settin…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 8 pages of main text, 6 figures in main text, 11 pages of Appendix, published in AAAI 2025

  27. arXiv:2502.17112  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Defects in the $β$-Ga$_2$O$_3$($\bar{2}01$)/HfO$_2$ MOS system and the effect of thermal treatments

    Authors: Khushabu. S. Agrawal, Paolo LaTorraca, Jonas Valentijn, Roberta Hawkins, Adam A. Gruszecki, Joy Roy, Vasily Lebedev, Lewys Jones, Robert M. Wallace, Chadwin D. Young, Paul K. Hurley, Karim Cherkaoui

    Abstract: We have investigated the properties of the $β$-Ga$_2$O$_3$($\bar{2}01$)/HfO$_2$/Cr/Au MOS (metal-oxide-semiconductor) system after annealing (450$^\circ$C) in different ambient conditions (forming gas, N$_2$ and O$_2$). Defect properties have been analyzed using an approach combining experimental impedance measurements with physics-based simulations of the capacitance-voltage (C-V) and conductance-v…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: Main article: 23 pages, 6 figures; Supporting information: 7 pages, 5 figures

  28. arXiv:2502.12701  [pdf, other]

    cs.CL cs.AI cs.LG

    Translate Smart, not Hard: Cascaded Translation Systems with Quality-Aware Deferral

    Authors: António Farinhas, Nuno M. Guerreiro, Sweta Agrawal, Ricardo Rei, André F. T. Martins

    Abstract: Larger models often outperform smaller ones but come with high computational costs. Cascading offers a potential solution. By default, it uses smaller models and defers only some instances to larger, more powerful models. However, designing effective deferral rules remains a challenge. In this paper, we propose a simple yet effective approach for machine translation, using existing quality estimat…

    Submitted 18 February, 2025; originally announced February 2025.

    Comments: Preprint

  29. arXiv:2502.12393  [pdf, other]

    stat.ME cs.AI cs.LG stat.ML

    Time Series Treatment Effects Analysis with Always-Missing Controls

    Authors: Juan Shu, Qiyu Han, George Chen, Xihao Cao, Kangming Luo, Dan Pallotta, Shivam Agrawal, Yuping Lu, Xiaoyu Zhang, Jawad Mansoor, Jyoti Anand

    Abstract: Estimating treatment effects in time series data presents a significant challenge, especially when the control group is always unobservable. For example, in analyzing the effects of Christmas on retail sales, we lack direct observation of what would have occurred in late December without the Christmas impact. To address this, we try to recover the control group in the event period while accounting…

    Submitted 17 February, 2025; originally announced February 2025.

  30. arXiv:2502.10295  [pdf, other]

    cs.LG

    Fenchel-Young Variational Learning

    Authors: Sophia Sklaviadis, Sweta Agrawal, Antonio Farinhas, Andre Martins, Mario Figueiredo

    Abstract: From a variational perspective, many statistical learning criteria involve seeking a distribution that balances empirical risk and regularization. In this paper, we broaden this perspective by introducing a new general class of variational methods based on Fenchel-Young (FY) losses, treated as divergences that generalize (and encompass) the familiar Kullback-Leibler divergence at the core of class…

    Submitted 14 February, 2025; originally announced February 2025.

    Comments: Under review

  31. arXiv:2502.02085  [pdf, other]

    cs.DS cs.LG

    A New Rejection Sampling Approach to $k$-$\mathtt{means}$++ With Improved Trade-Offs

    Authors: Poojan Shah, Shashwat Agrawal, Ragesh Jaiswal

    Abstract: The $k$-$\mathtt{means}$++ seeding algorithm (Arthur & Vassilvitskii, 2007) is widely used in practice for the $k$-means clustering problem where the goal is to cluster a dataset $\mathcal{X} \subset \mathbb{R} ^d$ into $k$ clusters. The popularity of this algorithm is due to its simplicity and provable guarantee of being $O(\log k)$ competitive with the optimal solution in expectation. However,…

    Submitted 4 February, 2025; originally announced February 2025.

  32. arXiv:2501.16573  [pdf, other]

    cs.LG math.OC

    Optimization Landscapes Learned: Proxy Networks Boost Convergence in Physics-based Inverse Problems

    Authors: Girnar Goyal, Philipp Holl, Sweta Agrawal, Nils Thuerey

    Abstract: Solving inverse problems in physics is central to understanding complex systems and advancing technologies in various fields. Iterative optimization algorithms, commonly used to solve these problems, often encounter local minima, chaos, or regions with zero gradients. This is due to their overreliance on local information and highly chaotic inverse loss landscapes governed by underlying partial di…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: Ongoing work

  33. arXiv:2501.05078  [pdf, other]

    cs.LG cs.AI

    Analyzing Memorization in Large Language Models through the Lens of Model Attribution

    Authors: Tarun Ram Menta, Susmit Agrawal, Chirag Agarwal

    Abstract: Large Language Models (LLMs) are prevalent in modern applications but often memorize training data, leading to privacy breaches and copyright issues. Existing research has mainly focused on posthoc analyses, such as extracting memorized content or developing memorization metrics, without exploring the underlying architectural factors that contribute to memorization. In this work, we investigate me…

    Submitted 9 January, 2025; originally announced January 2025.

  34. arXiv:2501.04359  [pdf, other]

    eess.AS cs.CL cs.HC cs.LG cs.SD

    Decoding EEG Speech Perception with Transformers and VAE-based Data Augmentation

    Authors: Terrance Yu-Hao Chen, Yulin Chen, Pontus Soederhaell, Sadrishya Agrawal, Kateryna Shapovalenko

    Abstract: Decoding speech from non-invasive brain signals, such as electroencephalography (EEG), has the potential to advance brain-computer interfaces (BCIs), with applications in silent communication and assistive technologies for individuals with speech impairments. However, EEG-based speech decoding faces major challenges, such as noisy data, limited datasets, and poor performance on complex tasks like…

    Submitted 8 January, 2025; originally announced January 2025.

    Comments: 19 pages, 15 figures, 2 tables

    MSC Class: 68T07; 92C55 ACM Class: H.5.2; I.2.6; J.3

  35. arXiv:2412.16540  [pdf, other]

    cs.CV cs.AI

    Prior2Posterior: Model Prior Correction for Long-Tailed Learning

    Authors: S Divakar Bhat, Amit More, Mudit Soni, Surbhi Agrawal

    Abstract: Learning-based solutions for long-tailed recognition face difficulties in generalizing on balanced test datasets. Due to imbalanced data prior, the learned a posteriori distribution is biased toward the most frequent (head) classes, leading to an inferior performance on the least frequent (tail) classes. In general, the performance can be improved by removing such a bias by eliminating th…

    Submitted 21 December, 2024; originally announced December 2024.

    Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2025

  36. arXiv:2412.04552  [pdf, other]

    astro-ph.EP

    True mass and atmospheric composition of the non-transiting hot Jupiter HD 143105 b

    Authors: Luke Finnerty, Yinzi Xin, Jerry W. Xuan, Julie Inglis, Michael P Fitzgerald, Shubh Agrawal, Ashley Baker, Geoffrey A. Blake, Benjamin Calvin, Sylvain Cetre, Jacques-Robert Delorme, Greg Doppman, Daniel Echeverri, Katelyn Horstman, Chih-Chun Hsu, Nemanja Jovanovic, Joshua Liberman, Ronald A. López, Emily C. Martin, Dimitri Mawet, Evan Morris, Jacklyn Pezzato-Rovner, Jean-Baptiste Ruffio, Ben Sappey, Tobias Schofield , et al. (6 additional authors not shown)

    Abstract: We present Keck/KPIC phase II $K$-band observations of the non-transiting hot Jupiter HD 143105 b. Using a cross-correlation approach, we make the first detection of the planetary atmosphere at $K_p = 185^{+11}_{-13}\rm km\ s^{-1}$ and an inferior conjunction time 2.5 hours before the previously-published ephemeris. The retrieved $K_p$ value, in combination with orbital period, mass of the host st…

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: 19 pages, 7 figures, 2 tables. Accepted in AJ

  37. arXiv:2412.04205  [pdf, ps, other]

    cs.CL

    A Context-aware Framework for Translation-mediated Conversations

    Authors: José Pombal, Sweta Agrawal, Patrick Fernandes, Emmanouil Zaranis, André F. T. Martins

    Abstract: Automatic translation systems offer a powerful solution to bridge language barriers in scenarios where participants do not share a common language. However, these systems can introduce errors leading to misunderstandings and conversation breakdown. A key issue is that current systems fail to incorporate the rich contextual information necessary to resolve ambiguities and omitted details, resulting…

    Submitted 29 June, 2025; v1 submitted 5 December, 2024; originally announced December 2024.

  38. Celestial $sw_{1+\infty}$ algebra in Einstein-Yang-Mills theory

    Authors: Shreyansh Agrawal, Panagiotis Charalambous, Laura Donnay

    Abstract: From a study of the subleading structure of the asymptotic equations of motion in Einstein-Yang-Mills theory, we construct charges that are conserved up to quadratic order in non-radiative vacuum. We then show that these higher spin charges obey the celestial $sw_{1+\infty}$ symmetry algebra found earlier from the OPE of positive-helicity conformally soft gluons and gravitons.

    Submitted 30 April, 2025; v1 submitted 2 December, 2024; originally announced December 2024.

    Comments: 31+15 pages

  39. arXiv:2411.14983  [pdf, other]

    stat.CO

    Large sample scaling analysis of the Zig-Zag algorithm for Bayesian inference

    Authors: Sanket Agrawal, Joris Bierkens, Gareth O. Roberts

    Abstract: Piecewise deterministic Markov processes provide scalable methods for sampling from the posterior distributions in big data settings by admitting principled sub-sampling strategies that do not bias the output. An important example is the Zig-Zag process of [Ann. Stats. 47 (2019) 1288 - 1320] where clever sub-sampling has been shown to produce an essentially independent sample at a cost that does n…

    Submitted 22 November, 2024; originally announced November 2024.

    Comments: 47 pages, 7 figures, 1 table

    MSC Class: 62-08; 60F05; 62F15; 65C05

  40. arXiv:2411.13690  [pdf, other]

    cs.LG

    Multi-Agent Best Arm Identification in Stochastic Linear Bandits

    Authors: Sanjana Agrawal, Saúl A. Blanco

    Abstract: We study the problem of collaborative best-arm identification in stochastic linear bandits under a fixed-budget scenario. In our learning model, we first consider multiple agents connected through a star network, interacting with a linear bandit instance in parallel. We then extend our analysis to arbitrary network topologies. The objective of the agents is to collaboratively identify the best arm…

    Submitted 24 May, 2025; v1 submitted 20 November, 2024; originally announced November 2024.

    Comments: Updated algorithms, corrected proofs, fixed typos

    MSC Class: 93E35 ACM Class: I.2.6

  41. arXiv:2411.12497  [pdf, ps, other]

    hep-ph hep-lat nucl-th

    Small-$x$ evolution of dipole amplitude in momentum space: forward--off-forward correspondence

    Authors: Sanskriti Agrawal, Raktim Abir

    Abstract: We have shown that the small-$x$ evolution of the off-forward leading-log dipole scattering amplitudes, both pomeron and odderon, in the momentum space can be completely determined by the evolution of the respective forward amplitudes, with rescaled momenta. In position space, if there is translation symmetry (assumption of a large nucleus), the dipole cross section depends on the positions of qua…

    Submitted 23 November, 2024; v1 submitted 19 November, 2024; originally announced November 2024.

    Journal ref: Phys.Rev.D 111 (2025) 3, 034022

  42. arXiv:2411.11937  [pdf, other]

    cs.LG cs.AI

    Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets

    Authors: Ike Obi, Rohan Pant, Srishti Shekhar Agrawal, Maham Ghazanfar, Aaron Basiletti

    Abstract: LLMs are increasingly fine-tuned using RLHF datasets to align them with human preferences and values. However, very limited research has investigated which specific human values are operationalized through these datasets. In this paper, we introduce Value Imprint, a framework for auditing and classifying the human values embedded within RLHF datasets. To investigate the viability of this framework…

    Submitted 18 November, 2024; originally announced November 2024.

  43. arXiv:2411.05986  [pdf, other]

    cs.CL

    Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings

    Authors: Miguel Moura Ramos, Tomás Almeida, Daniel Vareta, Filipe Azevedo, Sweta Agrawal, Patrick Fernandes, André F. T. Martins

    Abstract: Reinforcement learning (RL) has been proven to be an effective and robust method for training neural machine translation systems, especially when paired with powerful reward models that accurately assess translation quality. However, most research has focused on RL methods that use sentence-level feedback, leading to inefficient learning signals due to the reward sparsity problem -- the model rece…

    Submitted 16 April, 2025; v1 submitted 8 November, 2024; originally announced November 2024.

    Comments: 12 pages, work-in-progress

  44. arXiv:2410.19500  [pdf, other]

    cond-mat.quant-gas cond-mat.str-el

    Microscopy of bosonic charge carriers in staggered magnetic fields

    Authors: Annabelle Bohrdt, David Wei, Daniel Adler, Kritsana Srakaew, Suchita Agrawal, Pascal Weckesser, Immanuel Bloch, Fabian Grusdt, Johannes Zeiher

    Abstract: The interplay of spin and charge degrees of freedom is believed to underlie various unresolved phenomena in strongly correlated systems. Quantum simulators based on neutral atoms provide an excellent testbed for investigating such phenomena and resolving their microscopic origins. Up to now, the majority of experimental and theoretical studies has focused on systems with fermionic exchange statist…

    Submitted 25 October, 2024; originally announced October 2024.

    Comments: 9+3 pages, 4+5 figures

  45. arXiv:2410.18351  [pdf, other]

    cs.CL cs.LG

    AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability

    Authors: Sudhanshu Agrawal, Wonseok Jeon, Mingu Lee

    Abstract: Speculative decoding is a powerful technique that attempts to circumvent the autoregressive constraint of modern Large Language Models (LLMs). The aim of speculative decoding techniques is to improve the average inference time of a large, target model without sacrificing its accuracy, by using a more efficient draft model to propose draft tokens which are then verified in parallel. The number of d…

    Submitted 23 October, 2024; originally announced October 2024.

    Comments: Workshop on Efficient Natural Language and Signal Processing at NeurIPS 2024

  46. arXiv:2410.17709  [pdf, other]

    eess.SY cs.DC

    Deoxys: A Causal Inference Engine for Unhealthy Node Mitigation in Large-scale Cloud Infrastructure

    Authors: Chaoyun Zhang, Randolph Yao, Si Qin, Ze Li, Shekhar Agrawal, Binit R. Mishra, Tri Tran, Minghua Ma, Qingwei Lin, Murali Chintalapati, Dongmei Zhang

    Abstract: The presence of unhealthy nodes in cloud infrastructure signals the potential failure of machines, which can significantly impact the availability and reliability of cloud services, resulting in negative customer experiences. Effectively addressing unhealthy node mitigation is therefore vital for sustaining cloud system performance. This paper introduces Deoxys, a causal inference engine tailored…

    Submitted 23 October, 2024; originally announced October 2024.

  47. arXiv:2410.15930  [pdf, other]

    cs.IR cs.AI

    Centrality-aware Product Retrieval and Ranking

    Authors: Hadeel Saadany, Swapnil Bhosale, Samarth Agrawal, Diptesh Kanojia, Constantin Orasan, Zhe Wu

    Abstract: This paper addresses the challenge of improving user experience on e-commerce platforms by enhancing product ranking relevant to users' search queries. Ambiguity and complexity of user queries often lead to a mismatch between the user's intent and retrieved product titles or documents. Recent approaches have proposed the use of Transformer-based models, which need millions of annotated query-title…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: EMNLP 2024: Industry track

  48. arXiv:2410.11624  [pdf, other]

    cs.CL

    Findings of the WMT 2024 Shared Task on Chat Translation

    Authors: Wafaa Mohammed, Sweta Agrawal, M. Amin Farajian, Vera Cabarrão, Bryan Eikema, Ana C. Farinha, José G. C. de Souza

    Abstract: This paper presents the findings from the third edition of the Chat Translation Shared Task. As with previous editions, the task involved translating bilingual customer support conversations, specifically focusing on the impact of conversation context in translation quality and evaluation. We also include two new language pairs: English-Korean and English-Dutch, in addition to the set of language…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: 12 pages, 5 figures, 13 tables

  49. arXiv:2410.10995  [pdf, ps, other]

    cs.CL

    Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation

    Authors: Emmanouil Zaranis, Giuseppe Attanasio, Sweta Agrawal, André F. T. Martins

    Abstract: Quality estimation (QE), the automatic assessment of translation quality, has recently become crucial across several stages of the translation pipeline, from data curation to training and decoding. While QE metrics have been optimized to align with human judgments, whether they encode social biases has been largely overlooked. Biased QE risks favoring certain demographic groups over others, e.g., by…

    Submitted 2 June, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: ACL 2025

  50. arXiv:2410.07779  [pdf, other]

    cs.CL

    Modeling User Preferences with Automatic Metrics: Creating a High-Quality Preference Dataset for Machine Translation

    Authors: Sweta Agrawal, José G. C. de Souza, Ricardo Rei, António Farinhas, Gonçalo Faria, Patrick Fernandes, Nuno M Guerreiro, Andre Martins

    Abstract: Alignment with human preferences is an important step in developing accurate and safe large language models. This is no exception in machine translation (MT), where better handling of language nuances and context-specific variations leads to improved quality. However, preference data based on human feedback can be very expensive to obtain and curate at a large scale. Automatic metrics, on the othe…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Accepted at EMNLP Main 2024