
Showing 1–11 of 11 results for author: Pattathil, S

  1. arXiv:2212.13861

    cs.LG math.OC stat.ML

    Offline Reinforcement Learning via Linear-Programming with Error-Bound Induced Constraints

    Authors: Asuman Ozdaglar, Sarath Pattathil, Jiawei Zhang, Kaiqing Zhang

    Abstract: Offline reinforcement learning (RL) aims to find an optimal policy for Markov decision processes (MDPs) using a pre-collected dataset. In this work, we revisit the linear programming (LP) reformulation of Markov decision processes for offline RL, with the goal of developing algorithms with optimal $O(1/\sqrt{n})$ sample complexity, where $n$ is the sample size, under partial data coverage and gene…

    Submitted 9 December, 2024; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: 47 pages; journal extension of the ICML version with new results

  2. arXiv:2210.12812

    math.OC cs.LG cs.MA stat.ML

    Symmetric (Optimistic) Natural Policy Gradient for Multi-agent Learning with Parameter Convergence

    Authors: Sarath Pattathil, Kaiqing Zhang, Asuman Ozdaglar

    Abstract: Multi-agent interactions are increasingly important in the context of reinforcement learning, and the theoretical foundations of policy gradient methods have attracted surging research interest. We investigate the global convergence of natural policy gradient (NPG) algorithms in multi-agent learning. We first show that vanilla NPG may not have parameter convergence, i.e., the convergence of the ve…

    Submitted 20 March, 2023; v1 submitted 23 October, 2022; originally announced October 2022.

    Comments: Initially submitted for publication in January 2022

  3. arXiv:2206.04502

    stat.ML cs.LG math.OC

    What is a Good Metric to Study Generalization of Minimax Learners?

    Authors: Asuman Ozdaglar, Sarath Pattathil, Jiawei Zhang, Kaiqing Zhang

    Abstract: Minimax optimization has served as the backbone of many machine learning (ML) problems. Although the convergence behavior of optimization algorithms has been extensively studied in the minimax settings, their generalization guarantees in stochastic minimax optimization problems, i.e., how the solution trained on empirical data performs on unseen testing data, have been relatively underexplored. A…

    Submitted 20 June, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 34 pages, 2 figures

  4. arXiv:2101.00773

    eess.SY math.DS math.OC physics.soc-ph q-bio.PE

    Optimal adaptive testing for epidemic control: combining molecular and serology tests

    Authors: D. Acemoglu, A. Fallah, A. Giometto, D. Huttenlocher, A. Ozdaglar, F. Parise, S. Pattathil

    Abstract: The COVID-19 crisis highlighted the importance of non-medical interventions, such as testing and isolation of infected individuals, in the control of epidemics. Here, we show how to minimize testing needs while maintaining the number of infected individuals below a desired threshold. We find that the optimal policy is adaptive, with testing rates that depend on the epidemic state. Additionally, we…

    Submitted 3 January, 2021; originally announced January 2021.

    Journal ref: Automatica, Volume 160, February 2024, 111391

  5. arXiv:2010.13724

    cs.LG math.OC

    Tight last-iterate convergence rates for no-regret learning in multi-player games

    Authors: Noah Golowich, Sarath Pattathil, Constantinos Daskalakis

    Abstract: We study the question of obtaining last-iterate convergence rates for no-regret learning algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of $O(1/\sqrt{T})$ with respect to the gap function in smooth monotone games. This result addresses a question of Mertikopoulos & Zhou (2018), who as…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: To appear at NeurIPS 2020. 41 pages

  6. arXiv:2002.05683

    math.OC cs.LG stat.ML

    An Optimal Multistage Stochastic Gradient Method for Minimax Problems

    Authors: Alireza Fallah, Asuman Ozdaglar, Sarath Pattathil

    Abstract: In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent (GDA) method with constant stepsize, and show that it converges to a neighborhood of the solution of the minimax problem. We further provide tight bounds on the…

    Submitted 13 February, 2020; originally announced February 2020.

  7. arXiv:2002.00057

    cs.LG math.OC stat.ML

    Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems

    Authors: Noah Golowich, Sarath Pattathil, Constantinos Daskalakis, Asuman Ozdaglar

    Abstract: In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG converge at a rate of $O(1/T)$ (Nemirovski, 2004). In this paper, we show that the last iterate of EG converges at a rate of $O(1/\sqrt{T})$. To the best of our knowle…

    Submitted 6 July, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

    Comments: 27 pages

  8. arXiv:1910.14380

    math.OC cs.LG stat.ML

    A Decentralized Proximal Point-type Method for Saddle Point Problems

    Authors: Weijie Liu, Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil, Zebang Shen, Nenggan Zheng

    Abstract: In this paper, we focus on solving a class of constrained non-convex non-concave saddle point problems in a decentralized manner by a group of nodes in a network. Specifically, we assume that each node has access to a summand of a global objective function and nodes are allowed to exchange information only with their neighboring nodes. We propose a decentralized variant of the proximal point metho…

    Submitted 31 October, 2019; originally announced October 2019.

    Comments: 18 pages

  9. arXiv:1906.01115

    math.OC cs.LG stat.ML

    Convergence Rate of $\mathcal{O}(1/k)$ for Optimistic Gradient and Extra-gradient Methods in Smooth Convex-Concave Saddle Point Problems

    Authors: Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil

    Abstract: We study the iteration complexity of the optimistic gradient descent-ascent (OGDA) method and the extra-gradient (EG) method for finding a saddle point of a convex-concave unconstrained min-max problem. To do so, we first show that both OGDA and EG can be interpreted as approximate variants of the proximal point method. This is similar to the approach taken in [Nemirovski, 2004] which analyzes EG…

    Submitted 29 September, 2020; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: 19 pages

  10. arXiv:1901.08511

    math.OC cs.LG stat.ML

    A Unified Analysis of Extra-gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach

    Authors: Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil

    Abstract: In this paper we consider solving saddle point problems using two variants of Gradient Descent-Ascent algorithms, Extra-gradient (EG) and Optimistic Gradient Descent Ascent (OGDA) methods. We show that both of these algorithms admit a unified analysis as approximations of the classical proximal point method for solving saddle point problems. This viewpoint enables us to develop a new framework for…

    Submitted 5 September, 2019; v1 submitted 24 January, 2019; originally announced January 2019.

    Comments: 25 pages, 3 figures

  11. arXiv:1806.10798

    math.OC

    Concentration bounds for two time scale stochastic approximation

    Authors: Vivek S. Borkar, Sarath Pattathil

    Abstract: Viewing a two time scale stochastic approximation scheme as a noisy discretization of a singularly perturbed differential equation, we obtain a concentration bound for its iterates that captures its behavior with quantifiable high probability. This uses Alekseev's nonlinear variation of constants formula and a martingale concentration inequality and extends the corresponding results for single tim…

    Submitted 28 June, 2018; originally announced June 2018.

    Comments: 8 pages