Showing 1–50 of 113 results for author: Agarwal, A

  1. arXiv:2505.08784  [pdf, ps, other]

    stat.ML cs.LG math.ST stat.ME

    PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework

    Authors: Abhineet Agarwal, Michael Xiao, Rebecca Barter, Omer Ronen, Boyu Fan, Bin Yu

    Abstract: As machine learning (ML) models are increasingly deployed in high-stakes domains, trustworthy uncertainty quantification (UQ) is critical for ensuring the safety and reliability of these models. Traditional UQ methods rely on specifying a true generative model and are not robust to misspecification. On the other hand, conformal inference allows for arbitrary ML models but does not consider model s…

    Submitted 13 May, 2025; originally announced May 2025.

  2. arXiv:2504.01702  [pdf, ps, other]

    econ.EM cs.LG stat.ME

    A Causal Inference Framework for Data Rich Environments

    Authors: Alberto Abadie, Anish Agarwal, Devavrat Shah

    Abstract: We propose a formal model for counterfactual estimation with unobserved confounding in "data-rich" settings, i.e., where there are a large number of units and a large number of measurements per unit. Our model provides a bridge between the structural causal model view of causal inference common in the graphical models literature with that of the latent factor model view common in the potential out…

    Submitted 2 April, 2025; originally announced April 2025.

  3. arXiv:2503.13521  [pdf, other]

    cs.DB cs.CY physics.soc-ph stat.AP

    States of Disarray: Cleaning Data for Gerrymandering Analysis

    Authors: Ananya Agarwal, Fnu Alusi, Arbie Hsu, Arif Syraj, Ellen Veomett

    Abstract: The mathematics of redistricting is an area of study that has exploded in recent years. In particular, many different research groups and expert witnesses in court cases have used outlier analysis to argue that a proposed map is a gerrymander. This outlier analysis relies on having an ensemble of potential redistricting maps against which the proposed map is compared. Arguably the most widely-acce…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: 12 pages, 3 figures

    MSC Class: 51-11 (Primary) 68V35 (Secondary) ACM Class: E.m; J.4

  4. arXiv:2502.02486  [pdf, ps, other]

    stat.ML cs.LG

    Catoni Contextual Bandits are Robust to Heavy-tailed Rewards

    Authors: Chenlu Ye, Yujia Jin, Alekh Agarwal, Tong Zhang

    Abstract: Typical contextual bandit algorithms assume that the rewards at each round lie in some fixed range $[0, R]$, and their regret scales polynomially with this reward range $R$. However, many practical scenarios naturally involve heavy-tailed rewards or rewards where the worst-case range can be substantially larger than the variance. In this paper, we develop an algorithmic approach building on Catoni…

    Submitted 4 February, 2025; originally announced February 2025.

  5. arXiv:2412.03597  [pdf, other]

    cs.CL cs.LG stat.ML

    The Vulnerability of Language Model Benchmarks: Do They Accurately Reflect True LLM Performance?

    Authors: Sourav Banerjee, Ayushi Agarwal, Eishkaran Singh

    Abstract: The pursuit of leaderboard rankings in Large Language Models (LLMs) has created a fundamental paradox: models excel at standardized tests while failing to demonstrate genuine language understanding and adaptability. Our systematic analysis of NLP evaluation frameworks reveals pervasive vulnerabilities across the evaluation spectrum, from basic metrics to complex benchmarks like GLUE and MMLU. Thes…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: 11 pages

  6. arXiv:2410.13381  [pdf, other]

    stat.ML cs.LG

    Learning Counterfactual Distributions via Kernel Nearest Neighbors

    Authors: Kyuseong Choi, Jacob Feitelberg, Caleb Chin, Anish Agarwal, Raaz Dwivedi

    Abstract: Consider a setting with multiple units (e.g., individuals, cohorts, geographic locations) and outcomes (e.g., treatments, times, items), where the goal is to learn a multivariate distribution for each unit-outcome entry, such as the distribution of a user's weekly spend and engagement under a specific mobile app version. A common challenge is the prevalence of missing not at random data, where obs…

    Submitted 2 December, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 39 pages, 8 figures

  7. arXiv:2410.13112  [pdf, other]

    stat.ML cs.LG stat.ME

    Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space

    Authors: Jacob Feitelberg, Kyuseong Choi, Anish Agarwal, Raaz Dwivedi

    Abstract: We introduce the problem of distributional matrix completion: Given a sparsely observed matrix of empirical distributions, we seek to impute the true distributions associated with both observed and unobserved matrix entries. This is a generalization of traditional matrix completion where the observations per matrix entry are scalar valued. To do so, we utilize tools from optimal transport to gener…

    Submitted 16 October, 2024; originally announced October 2024.

  8. arXiv:2409.05746  [pdf, other]

    stat.ML cs.LG

    LLMs Will Always Hallucinate, and We Need to Live With This

    Authors: Sourav Banerjee, Ayushi Agarwal, Saloni Singla

    Abstract: As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically. This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. We demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible…

    Submitted 9 September, 2024; originally announced September 2024.

  9. arXiv:2406.01611  [pdf, other]

    cs.IR cs.LG stat.ML

    System-2 Recommenders: Disentangling Utility and Engagement in Recommendation Systems via Temporal Point-Processes

    Authors: Arpit Agarwal, Nicolas Usunier, Alessandro Lazaric, Maximilian Nickel

    Abstract: Recommender systems are an important part of the modern human experience whose influence ranges from the food we eat to the news we read. Yet, there is still debate as to what extent recommendation platforms are aligned with the user goals. A core issue fueling this debate is the challenge of inferring a user utility based on engagement signals such as likes, shares, watch time etc., which are the…

    Submitted 29 May, 2024; originally announced June 2024.

    Comments: Accepted at FAccT'24

  10. arXiv:2405.20088  [pdf, other]

    stat.AP stat.ME

    Personalized Predictions from Population Level Experiments: A Study on Alzheimer's Disease

    Authors: Dennis Shen, Anish Agarwal, Vishal Misra, Bjoern Schelter, Devavrat Shah, Helen Shiells, Claude Wischik

    Abstract: The purpose of this article is to infer patient level outcomes from population level randomized control trials (RCTs). In this pursuit, we utilize the recently proposed synthetic nearest neighbors (SNN) estimator. At its core, SNN leverages information across patients to impute missing data associated with each patient of interest. We focus on two types of missing data: (i) unrecorded outcomes fro…

    Submitted 30 May, 2024; originally announced May 2024.

  11. arXiv:2405.18621  [pdf, other]

    cs.LG stat.ME stat.ML

    Multi-Armed Bandits with Network Interference

    Authors: Abhineet Agarwal, Anish Agarwal, Lorenzo Masoero, Justin Whitehouse

    Abstract: Online experimentation with interference is a common challenge in modern applications such as e-commerce and adaptive clinical trials in medicine. For example, in online marketplaces, the revenue of a good depends on discounts applied to competing goods. Statistical inference with interference is widely studied in the offline setting, but far less is known about how to adaptively assign treatments…

    Submitted 28 May, 2024; originally announced May 2024.

  12. arXiv:2403.12950  [pdf, other]

    cs.LG stat.ML

    Non-Stationary Dueling Bandits Under a Weighted Borda Criterion

    Authors: Joe Suk, Arpit Agarwal

    Abstract: In $K$-armed dueling bandits, the learner receives preference feedback between arms, and the regret of an arm is defined in terms of its suboptimality to a $\textit{winner}$ arm. The $\textit{non-stationary}$ variant of the problem, motivated by concerns of changing user preferences, has received recent interest (Saha and Gupta, 2022; Buening and Saha, 2023; Suk and Agarwal, 2023). The goal here i…

    Submitted 28 September, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  13. arXiv:2402.11652  [pdf, other]

    econ.EM cs.LG stat.ME stat.ML

    Doubly Robust Inference in Causal Latent Factor Models

    Authors: Alberto Abadie, Anish Agarwal, Raaz Dwivedi, Abhin Shah

    Abstract: This article introduces a new estimator of average treatment effects under unobserved confounding in modern data-rich environments featuring large numbers of units and outcomes. The proposed estimator is doubly robust, combining outcome imputation, inverse probability weighting, and a novel cross-fitting procedure for matrix completion. We derive finite-sample and asymptotic guarantees, and show t…

    Submitted 29 October, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  14. arXiv:2312.16307  [pdf, other]

    econ.EM cs.GT cs.LG stat.ME

    Incentive-Aware Synthetic Control: Accurate Counterfactual Estimation via Incentivized Exploration

    Authors: Daniel Ngo, Keegan Harris, Anish Agarwal, Vasilis Syrgkanis, Zhiwei Steven Wu

    Abstract: We consider the setting of synthetic control methods (SCMs), a canonical approach used to estimate the treatment effect on the treated in a panel data setting. We shed light on a frequently overlooked but ubiquitous assumption made in SCMs of "overlap": a treated unit can be written as some combination -- typically, convex or linear combination -- of the units that remain under control. We show th…

    Submitted 13 February, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

  15. arXiv:2307.01932  [pdf, other]

    stat.ME cs.AI cs.LG stat.ML

    MDI+: A Flexible Random Forest-Based Feature Importance Framework

    Authors: Abhineet Agarwal, Ana M. Kenney, Yan Shuo Tan, Tiffany M. Tang, Bin Yu

    Abstract: Mean decrease in impurity (MDI) is a popular feature importance measure for random forests (RFs). We show that the MDI for a feature $X_k$ in each tree in an RF is equivalent to the unnormalized $R^2$ value in a linear regression of the response on the collection of decision stumps that split on $X_k$. We use this interpretation to propose a flexible feature importance framework called MDI+. Speci…

    Submitted 4 July, 2023; originally announced July 2023.

  16. arXiv:2307.01357  [pdf, other]

    cs.LG econ.EM stat.ME stat.ML

    Adaptive Principal Component Regression with Applications to Panel Data

    Authors: Anish Agarwal, Keegan Harris, Justin Whitehouse, Zhiwei Steven Wu

    Abstract: Principal component regression (PCR) is a popular technique for fixed-design error-in-variables regression, a generalization of the linear regression setting in which the observed covariates are corrupted with random noise. We provide the first time-uniform finite sample guarantees for (regularized) PCR whenever data is collected adaptively. Since the proof techniques for analyzing PCR in the fixe…

    Submitted 4 August, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

  17. arXiv:2306.13681  [pdf, other]

    stat.ME cs.LG econ.EM stat.ML

    Estimating the Value of Evidence-Based Decision Making

    Authors: Alberto Abadie, Anish Agarwal, Guido Imbens, Siwei Jia, James McQueen, Serguei Stepaniants

    Abstract: Business/policy decisions are often based on evidence from randomized experiments and observational studies. In this article we propose an empirical framework to estimate the value of evidence-based decision making (EBDM) and the return on the investment in statistical precision.

    Submitted 9 September, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  18. arXiv:2303.14226  [pdf, other]

    stat.ME cs.LG econ.EM stat.ML

    Synthetic Combinations: A Causal Inference Framework for Combinatorial Interventions

    Authors: Abhineet Agarwal, Anish Agarwal, Suhas Vijaykumar

    Abstract: Consider a setting where there are $N$ heterogeneous units and $p$ interventions. Our goal is to learn unit-specific potential outcomes for any combination of these $p$ interventions, i.e., $N \times 2^p$ causal parameters. Choosing a combination of interventions is a problem that naturally arises in a variety of applications such as factorial design experiments, recommendation engines, combinatio…

    Submitted 15 January, 2024; v1 submitted 24 March, 2023; originally announced March 2023.

  19. arXiv:2303.04100  [pdf, other]

    physics.data-an physics.ins-det stat.AP

    Continuous-Time Modeling and Analysis of Particle Beam Metrology

    Authors: Akshay Agarwal, Minxu Peng, Vivek K. Goyal

    Abstract: Particle beam microscopy (PBM) performs nanoscale imaging by pixelwise capture of scalar values representing noisy measurements of the response from secondary electrons (SEs) integrated over a dwell time. Extended to metrology, goals include estimating SE yield at each pixel and detecting differences in SE yield across pixels; obstacles include shot noise in the particle source as well as lack of…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: 14 pages, 10 figures

    Journal ref: IEEE J. Selected Areas of Information Theory, vol. 4, pp. 61-74, 9 June 2023

  20. arXiv:2303.00821  [pdf, ps, other]

    cs.LG stat.ME

    Learning high-dimensional causal effect

    Authors: Aayush Agarwal, Saksham Bassi

    Abstract: The scarcity of high-dimensional causal inference datasets restricts the exploration of complex deep models. In this work, we propose a method to generate a synthetic causal dataset that is high-dimensional. The synthetic data simulates a causal effect using the MNIST dataset with Bernoulli treatment values. This provides an opportunity to study varieties of models for causal effect estimation. We…

    Submitted 1 March, 2023; originally announced March 2023.

  21. arXiv:2302.06595  [pdf, other]

    cs.LG stat.ML

    When Can We Track Significant Preference Shifts in Dueling Bandits?

    Authors: Joe Suk, Arpit Agarwal

    Abstract: The $K$-armed dueling bandits problem, where the feedback is in the form of noisy pairwise preferences, has been widely studied due to its applications in information retrieval, recommendation systems, etc. Motivated by concerns that user preferences/tastes can evolve over time, we consider the problem of dueling bandits with distribution shifts. Specifically, we study the recent notion of significan…

    Submitted 24 January, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

  22. arXiv:2302.03784  [pdf, ps, other]

    cs.LG stat.ML

    Leveraging User-Triggered Supervision in Contextual Bandits

    Authors: Alekh Agarwal, Claudio Gentile, Teodor V. Marinov

    Abstract: We study contextual bandit (CB) problems, where the user can sometimes respond with the best action in a given context. Such an interaction arises, for example, in text prediction or autocompletion settings, where a poor suggestion is simply ignored and the user enters the desired text instead. Crucially, this extra feedback is user-triggered on only a subset of the contexts. We develop a new fram…

    Submitted 7 February, 2023; originally announced February 2023.

  23. arXiv:2301.13857  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning in POMDPs is Sample-Efficient with Hindsight Observability

    Authors: Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang

    Abstract: POMDPs capture a broad class of decision making problems, but hardness results suggest that learning is intractable even in simple settings due to the inherent partial observability. However, in many realistic problems, more information is either revealed or can be computed during some point of the learning process. Motivated by diverse applications ranging from robotics to data center scheduling,…

    Submitted 3 February, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

  24. arXiv:2212.06069  [pdf, other]

    cs.LG stat.ML

    VO$Q$L: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation

    Authors: Alekh Agarwal, Yujia Jin, Tong Zhang

    Abstract: We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards. We design a new algorithm, Variance-weighted Optimistic $Q$-Learning (VO$Q$L), based on $Q$-learning and bound its regret assuming completeness and bounded Eluder dimension for the regression function class. As a special case, VO$Q$L achieves $\tilde{O}(d\sqrt{HT}+d^6H^{5})$ re…

    Submitted 12 December, 2022; originally announced December 2022.

  25. arXiv:2210.11355  [pdf, other]

    econ.EM cs.LG stat.ME

    Network Synthetic Interventions: A Causal Framework for Panel Data Under Network Interference

    Authors: Anish Agarwal, Sarah H. Cen, Devavrat Shah, Christina Lee Yu

    Abstract: We propose a generalization of the synthetic controls and synthetic interventions methodology to incorporate network interference. We consider the estimation of unit-specific potential outcomes from panel data in the presence of spillover across units and unobserved confounding. Key to our approach is a novel latent factor model that takes into account network interference and generalizes the fact…

    Submitted 11 October, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: 49 pages, 6 figures

  26. arXiv:2210.11003  [pdf, other]

    econ.EM cs.LG stat.ME

    Synthetic Blip Effects: Generalizing Synthetic Controls for the Dynamic Treatment Regime

    Authors: Anish Agarwal, Vasilis Syrgkanis

    Abstract: We propose a generalization of the synthetic control and synthetic interventions methodology to the dynamic treatment regime. We consider the estimation of unit-specific treatment effects from panel data collected via a dynamic treatment regime and in the presence of unobserved confounding. That is, each unit receives multiple treatments sequentially, based on an adaptive policy, which depends on…

    Submitted 20 October, 2022; originally announced October 2022.

  27. arXiv:2209.12108  [pdf, other]

    cs.LG stat.ML

    An Asymptotically Optimal Batched Algorithm for the Dueling Bandit Problem

    Authors: Arpit Agarwal, Rohan Ghuge, Viswanath Nagarajan

    Abstract: We study the $K$-armed dueling bandit problem, a variation of the traditional multi-armed bandit problem in which feedback is obtained in the form of pairwise comparisons. Previous learning algorithms have focused on the $\textit{fully adaptive}$ setting, where the algorithm can make updates after every comparison. The "batched" dueling bandit problem is motivated by large-scale applications like…

    Submitted 24 September, 2022; originally announced September 2022.

  28. arXiv:2206.10770  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

    Authors: Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

    Abstract: We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions. On the positive side, we propose the RFOLIVE (Reward-Free OLIVE) algorithm for sample-efficient reward-free exploration under minimal structural assumptions, which covers the previously studied settings…

    Submitted 22 October, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

  29. arXiv:2206.07659  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity

    Authors: Alekh Agarwal, Tong Zhang

    Abstract: We propose a general framework to design posterior sampling methods for model-based RL. We show that the proposed algorithms can be analyzed by reducing regret to Hellinger distance in conditional probability estimation. We further show that optimistic posterior sampling can control this Hellinger distance, when we measure model error via data likelihood. This technique allows us to design and ana…

    Submitted 16 October, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022 camera ready version

  30. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  31. arXiv:2205.00984  [pdf, ps, other]

    cs.LG stat.ML

    A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits

    Authors: Arpit Agarwal, Sanjeev Khanna, Prathamesh Patil

    Abstract: The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice, however, the number of arms can be very large, resulting in large memory requirements for simultaneously processing them. In this paper we consider a streaming setting where the arms are presented in a stream and the algorith…

    Submitted 2 May, 2022; originally announced May 2022.

  32. arXiv:2202.10660  [pdf, other]

    cs.LG stat.ML

    Batched Dueling Bandits

    Authors: Arpit Agarwal, Rohan Ghuge, Viswanath Nagarajan

    Abstract: The $K$-armed dueling bandit problem, where the feedback is in the form of noisy pairwise comparisons, has been widely studied. Previous works have only focused on the sequential setting where the policy adapts after every comparison. However, in many applications such as search ranking and recommendation systems, it is preferable to perform comparisons in a limited number of parallel batches. We…

    Submitted 21 February, 2022; originally announced February 2022.

  33. arXiv:2202.05436  [pdf, ps, other]

    cs.LG stat.ML

    Minimax Regret Optimization for Robust Machine Learning under Distribution Shift

    Authors: Alekh Agarwal, Tong Zhang

    Abstract: In this paper, we consider learning scenarios where the learned model is evaluated under an unknown test distribution which potentially differs from the training distribution (i.e. distribution shift). The learner has access to a family of weight functions such that the test distribution is a reweighting of the training distribution under one of these functions, a setting typically studied under t…

    Submitted 10 February, 2022; originally announced February 2022.

  34. arXiv:2202.00858  [pdf, other]

    cs.LG cs.AI stat.AP stat.ME stat.ML

    Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods

    Authors: Abhineet Agarwal, Yan Shuo Tan, Omer Ronen, Chandan Singh, Bin Yu

    Abstract: Tree-based models such as decision trees and random forests (RF) are a cornerstone of modern machine-learning practice. To mitigate overfitting, trees are typically regularized by a variety of techniques that modify their structure (e.g. pruning). We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure, and instead regularizes the tree by shrinking th…

    Submitted 1 February, 2022; originally announced February 2022.

  35. arXiv:2201.11931  [pdf, other]

    cs.LG cs.AI stat.AP stat.ME stat.ML

    Fast Interpretable Greedy-Tree Sums

    Authors: Yan Shuo Tan, Chandan Singh, Keyan Nasseri, Abhineet Agarwal, James Duncan, Omer Ronen, Matthew Epland, Aaron Kornblith, Bin Yu

    Abstract: Modern machine learning has achieved impressive prediction performance, but often sacrifices interpretability, a critical consideration in high-stakes domains such as medicine. In such settings, practitioners often use highly interpretable decision tree models, but these suffer from inductive bias against additive structure. To overcome this bias, we propose Fast Interpretable Greedy-Tree Sums (FI…

    Submitted 8 July, 2023; v1 submitted 27 January, 2022; originally announced January 2022.

  36. arXiv:2201.01811  [pdf, other]

    cs.LG cs.AI cs.NI stat.ML

    CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation

    Authors: Abdullah Alomar, Pouya Hamadanian, Arash Nasr-Esfahany, Anish Agarwal, Mohammad Alizadeh, Devavrat Shah

    Abstract: We present CausalSim, a causal framework for unbiased trace-driven simulation. Current trace-driven simulators assume that the interventions being simulated (e.g., a new algorithm) would not affect the validity of the traces. However, real-world traces are often biased by the choices algorithms make during trace collection, and hence replaying traces under an intervention may lead to incorrect res…

    Submitted 5 May, 2023; v1 submitted 5 January, 2022; originally announced January 2022.

    Comments: NSDI'23 Best Paper Award

    Journal ref: 20th USENIX Symposium on Networked Systems Design and Implementation (2023) 1115--1147

  37. arXiv:2110.09626  [pdf, other]

    stat.ML cs.IT cs.LG

    A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds

    Authors: Yan Shuo Tan, Abhineet Agarwal, Bin Yu

    Abstract: Decision trees are important both as interpretable models amenable to high-stakes decision-making, and as building blocks of ensemble methods such as random forests and gradient boosting. Their statistical properties, however, are not well understood. The most cited prior works have focused on deriving pointwise consistency guarantees for CART in a classical nonparametric regression setting. We ta…

    Submitted 18 October, 2021; originally announced October 2021.

  38. arXiv:2109.15154  [pdf, other]

    econ.EM cs.LG math.ST stat.ML

    Causal Matrix Completion

    Authors: Anish Agarwal, Munther Dahleh, Devavrat Shah, Dennis Shen

    Abstract: Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are "missing completely at random" (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of "latent confounders", i.e., unobserved…

    Submitted 30 September, 2021; originally announced September 2021.

  39. arXiv:2107.02780  [pdf, other]

    econ.EM cs.LG math.ST stat.ML

    Causal Inference with Corrupted Data: Measurement Error, Missing Values, Discretization, and Differential Privacy

    Authors: Anish Agarwal, Rahul Singh

    Abstract: The US Census Bureau will deliberately corrupt data sets derived from the 2020 US Census, enhancing the privacy of respondents while potentially reducing the precision of economic analysis. To investigate whether this trade-off is inevitable, we formulate a semiparametric model of causal inference with high dimensional corrupted data. We propose a procedure for data cleaning, estimation, and infer…

    Submitted 12 February, 2024; v1 submitted 6 July, 2021; originally announced July 2021.

    ACM Class: G.3; J.4

  40. arXiv:2106.06926  [pdf, other]

    cs.LG cs.AI stat.ML

    Bellman-consistent Pessimism for Offline Reinforcement Learning

    Authors: Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal

    Abstract: The use of pessimism when reasoning about datasets lacking exhaustive exploration has recently gained prominence in offline reinforcement learning. Despite the robustness it adds to the algorithm, overly pessimistic reasoning can be equally damaging in precluding the discovery of good policies, which is an issue for the popular bonus-based pessimism. In this paper, we introduce the notion of Bell…

    Submitted 23 October, 2023; v1 submitted 13 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 (Oral)

  41. arXiv:2103.11559  [pdf, other]

    cs.LG stat.ML

    Provably Correct Optimization and Exploration with Non-linear Policies

    Authors: Fei Feng, Wotao Yin, Alekh Agarwal, Lin F. Yang

    Abstract: Policy optimization methods remain a powerful workhorse in empirical Reinforcement Learning (RL), with a focus on neural policies that can easily reason over complex and continuous state and/or action spaces. Theoretical understanding of strategic exploration in policy-based methods with non-linear function approximation, however, is largely missing. In this paper, we address this question by desi…

    Submitted 21 March, 2021; originally announced March 2021.

  42. arXiv:2103.10620  [pdf, other]

    math.OC cs.LG stat.ML

    Towards a Dimension-Free Understanding of Adaptive Linear Control

    Authors: Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

    Abstract: We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension. We demonstrate that while sublinear regret requires finite dimensional inputs, the ambient state dimension of the system need not be bounded in order to perform online control. We provide the first regret bounds for LQR which hold for infinite dimensional systems, replac…

    Submitted 15 July, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: presented at COLT 2021

  43. arXiv:2102.07035  [pdf, other]

    cs.LG stat.ML

    Model-free Representation Learning and Exploration in Low-rank MDPs

    Authors: Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

    Abstract: The low rank MDP has emerged as an important model for studying representation learning and exploration in reinforcement learning. With a known representation, several model-free exploration strategies exist. In contrast, all algorithms for the unknown representation setting are model-based, thereby requiring the ability to model the full dynamics. In this work, we present the first model-free rep…

    Submitted 21 June, 2022; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: Changelog v2: Significant reorganization of the paper, added an improved analysis of elliptic planner and updated discussion wrt follow-up work

  44. arXiv:2011.03127  [pdf, other]

    stat.ME

    Causal Imputation via Synthetic Interventions

    Authors: Chandler Squires, Dennis Shen, Anish Agarwal, Devavrat Shah, Caroline Uhler

    Abstract: Consider the problem of determining the effect of a compound on a specific cell type. To answer this question, researchers traditionally need to run an experiment applying the drug of interest to that cell type. This approach is not scalable: given a large number of different actions (compounds) and a large number of different contexts (cell types), it is infeasible to run an experiment for every…

    Submitted 11 June, 2023; v1 submitted 5 November, 2020; originally announced November 2020.

  45. arXiv:2010.14449  [pdf, other]

    math.ST cs.LG stat.ML

    On Model Identification and Out-of-Sample Prediction of Principal Component Regression: Applications to Synthetic Controls

    Authors: Anish Agarwal, Devavrat Shah, Dennis Shen

    Abstract: We analyze principal component regression (PCR) in a high-dimensional error-in-variables setting with fixed design. Under suitable conditions, we show that PCR consistently identifies the unique model with minimum $\ell_2$-norm. These results enable us to establish non-asymptotic out-of-sample prediction guarantees that improve upon the best known rates. In the course of our analysis, we introduce…

    Submitted 25 August, 2023; v1 submitted 27 October, 2020; originally announced October 2020.

  46. arXiv:2007.08459  [pdf, other]

    cs.LG cs.AI stat.ML

    PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

    Authors: Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

    Abstract: Direct policy gradient methods for reinforcement learning are a successful approach for a variety of reasons: they are model free, they directly optimize the performance metric of interest, and they allow for richly parameterized policies. Their primary drawback is that, by being local in nature, they fail to adequately explore the environment. In contrast, while model-based approaches and Q-learn…

    Submitted 13 August, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

  47. arXiv:2007.08202  [pdf, other]

    cs.LG cs.AI stat.ML

    Provably Good Batch Reinforcement Learning Without Great Exploration

    Authors: Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

    Abstract: Batch reinforcement learning (RL) is important to apply RL algorithms to many high stakes tasks. Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies w…

    Submitted 22 July, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: 36 pages, 7 figures

  48. Using LSTM for the Prediction of Disruption in ADITYA Tokamak

    Authors: Aman Agarwal, Aditya Mishra, Priyanka Sharma, Swati Jain, Sutapa Ranjan, Ranjana Manchanda

    Abstract: Major disruptions in tokamak pose a serious threat to the vessel and its surrounding pieces of equipment. The ability of the systems to detect any behavior that can lead to disruption can help in alerting the system beforehand and prevent its harmful effects. Many machine learning techniques have already been in use at large tokamaks like JET and ASDEX, but are not suitable for ADITYA, which is co…

    Submitted 13 July, 2020; originally announced July 2020.

    Comments: 7 pages, 4 figures

    Journal ref: Plasma Physics and Controlled Fusion, Volume 63, Number 11, 2021

  49. arXiv:2007.00795  [pdf, other]

    cs.LG cs.AI stat.ML

    Policy Improvement via Imitation of Multiple Oracles

    Authors: Ching-An Cheng, Andrey Kolobov, Alekh Agarwal

    Abstract: Despite its promise, reinforcement learning's real-world adoption has been hampered by the need for costly exploration to learn a good policy. Imitation learning (IL) mitigates this shortcoming by using an oracle policy during training as a bootstrap to accelerate the learning process. However, in many practical situations, the learner has access to multiple suboptimal oracles, which may provide c…

    Submitted 5 December, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

  50. arXiv:2006.13448  [pdf, other]

    cs.LG stat.ML

    On Multivariate Singular Spectrum Analysis and its Variants

    Authors: Anish Agarwal, Abdullah Alomar, Devavrat Shah

    Abstract: We introduce and analyze a variant of multivariate singular spectrum analysis (mSSA), a popular time series method to impute and forecast a multivariate time series. Under a spatio-temporal factor model we introduce, given $N$ time series and $T$ observations per time series, we establish prediction mean-squared-error for both imputation and out-of-sample forecasting effectively scale as…

    Submitted 19 June, 2022; v1 submitted 23 June, 2020; originally announced June 2020.