Showing 1–17 of 17 results for author: Koppel, A

  1. arXiv:2501.16489   

    stat.ML cs.LG eess.SY

    Nonparametric Sparse Online Learning of the Koopman Operator

    Authors: Boya Hou, Sina Sanjari, Nathan Dahlin, Alec Koppel, Subhonmesh Bose

    Abstract: The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and explore the mis-specified scenari…

    Submitted 4 February, 2025; v1 submitted 27 January, 2025; originally announced January 2025.

    Comments: This work was intended as a replacement of arXiv:2405.07432 and any subsequent updates will appear there

  2. arXiv:2406.13992  [pdf, ps, other]

    cs.MA eess.SY

    Robust Cooperative Multi-Agent Reinforcement Learning: A Mean-Field Type Game Perspective

    Authors: Muhammad Aneeq uz Zaman, Mathieu Laurière, Alec Koppel, Tamer Başar

    Abstract: In this paper, we study the problem of robust cooperative multi-agent reinforcement learning (RL) where a large number of cooperative agents with distributed information aim to learn policies in the presence of \emph{stochastic} and \emph{non-stochastic} uncertainties whose distributions are respectively known and unknown. Focusing on policy optimization that accounts for both types of uncertainti…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted for publication in L4DC 2024

  3. arXiv:2405.07432  [pdf, other]

    stat.ML cs.LG eess.SY

    Nonparametric Sparse Online Learning of the Koopman Operator

    Authors: Boya Hou, Sina Sanjari, Nathan Dahlin, Alec Koppel, Subhonmesh Bose

    Abstract: The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and explore the mis-specified scenari…

    Submitted 4 February, 2025; v1 submitted 12 May, 2024; originally announced May 2024.

    Comments: 49 pages, 6 figures

  4. arXiv:2310.07320  [pdf, ps, other]

    cs.LG eess.SY

    Byzantine-Resilient Decentralized Multi-Armed Bandits

    Authors: Jingxuan Zhu, Alec Koppel, Alvaro Velasquez, Ji Liu

    Abstract: In decentralized cooperative multi-armed bandits (MAB), each agent observes a distinct stream of rewards, and seeks to exchange information with others to select a sequence of arms so as to minimize its regret. Agents in the cooperative setting can outperform a single agent running a MAB method such as Upper-Confidence Bound (UCB) independently. In this work, we study how to recover such salient b…

    Submitted 11 June, 2025; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: add a disclaimer

  5. arXiv:2206.05652  [pdf, other]

    cs.LG cs.RO eess.SY

    Dealing with Sparse Rewards in Continuous Control Robotics via Heavy-Tailed Policies

    Authors: Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Pratap Tokekar, Dinesh Manocha

    Abstract: In this paper, we present a novel Heavy-Tailed Stochastic Policy Gradient (HT-PSG) algorithm to deal with the challenges of sparse rewards in continuous control problems. Sparse reward is common in continuous control robotics tasks such as manipulation and navigation, and makes the learning problem hard due to non-trivial estimation of value functions over the state space. This demands either rewa…

    Submitted 12 June, 2022; originally announced June 2022.

  6. arXiv:2111.10933  [pdf, other]

    cs.LG eess.SY

    Decentralized Upper Confidence Bound Algorithms for Homogeneous Multi-Agent Multi-Armed Bandits

    Authors: Jingxuan Zhu, Ethan Mulle, Christopher S. Smith, Alec Koppel, Ji Liu

    Abstract: This paper studies a decentralized homogeneous multi-armed bandit problem in a multi-agent network. The problem is simultaneously solved by $N$ agents assuming they face a common set of $M$ arms and share the same arms' reward distributions. Each agent can receive information only from its neighbors, where the neighbor relationships among the agents are described by a fixed graph. Two fully decent…

    Submitted 28 December, 2024; v1 submitted 21 November, 2021; originally announced November 2021.

    Comments: [v3] and [v4] are different works with different algorithm designs. A shortened version of [v3] was published in the 63rd IEEE Conference on Decision and Control, and a shortened version of [v4] was accepted in IEEE Transactions on Automatic Control

  7. arXiv:2106.08414  [pdf, other]

    cs.LG cs.AI eess.SY math.OC stat.ML

    On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control

    Authors: Amrit Singh Bedi, Anjaly Parayil, Junyu Zhang, Mengdi Wang, Alec Koppel

    Abstract: Reinforcement learning is a framework for interactive decision-making with incentives sequentially revealed across time without a system dynamics model. Due to its scaling to continuous spaces, we focus on policy search where one iteratively improves a parameterized policy with stochastic policy gradient (PG) updates. In tabular Markov Decision Problems (MDPs), under persistent exploration and sui…

    Submitted 2 January, 2023; v1 submitted 15 June, 2021; originally announced June 2021.

  8. arXiv:2007.01219  [pdf, ps, other]

    eess.SP cs.LG

    Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems

    Authors: Zhan Gao, Alec Koppel, Alejandro Ribeiro

    Abstract: Stochastic gradient descent is a canonical tool for addressing stochastic optimization problems, and forms the bedrock of modern machine learning and statistics. In this work, we seek to balance the fact that attenuating step-size is required for exact asymptotic convergence with the fact that constant step-size learns faster in finite time up to an error. To do so, rather than fixing the mini-bat…

    Submitted 9 July, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

  9. arXiv:2004.04843  [pdf, other]

    cs.LG cs.MA eess.SY math.OC stat.ML

    Policy Gradient using Weak Derivatives for Reinforcement Learning

    Authors: Sujay Bhatt, Alec Koppel, Vikram Krishnamurthy

    Abstract: This paper considers policy search in continuous state-action reinforcement learning problems. Typically, one computes search directions using a classic expression for the policy gradient called the Policy Gradient Theorem, which decomposes the gradient of the value function into two factors: the score function and the Q-function. This paper presents four results: (i) an alternative policy gradient…

    Submitted 9 April, 2020; originally announced April 2020.

    Comments: 1 figure

  10. arXiv:2003.12637  [pdf, other]

    eess.SP cs.IT

    Collaborative Beamforming Under Localization Errors: A Discrete Optimization Approach

    Authors: Erfaun Noorani, Yagiz Savas, Alec Koppel, John Baras, Ufuk Topcu, Brian M. Sadler

    Abstract: We consider a network of agents that locate themselves in an environment through sensor measurements and aim to transmit a message signal to a base station via collaborative beamforming. The agents' sensor measurements result in localization errors, which degrade the quality of service at the base station due to unknown phase offsets that arise in the agents' communication channels. Assuming that…

    Submitted 17 March, 2021; v1 submitted 27 March, 2020; originally announced March 2020.

  11. arXiv:2002.12475  [pdf, other]

    stat.ML cs.AI cs.LG eess.SY math.OC

    Cautious Reinforcement Learning via Distributional Risk in the Dual Domain

    Authors: Junyu Zhang, Amrit Singh Bedi, Mengdi Wang, Alec Koppel

    Abstract: We study the estimation of risk-sensitive policies in reinforcement learning problems defined by a Markov Decision Process (MDPs) whose state and action spaces are countably finite. Prior efforts are predominately afflicted by computational challenges associated with the fact that risk-sensitive MDPs are time-inconsistent. To ameliorate this issue, we propose a new definition of risk, which we cal…

    Submitted 27 February, 2020; originally announced February 2020.

  12. arXiv:1909.11555  [pdf, other]

    eess.SP cs.LG math.OC

    Optimally Compressed Nonparametric Online Learning

    Authors: Alec Koppel, Amrit Singh Bedi, Ketan Rajawat, Brian M. Sadler

    Abstract: Batch training of machine learning models based on neural networks is now well established, whereas to date streaming methods are largely based on linear models. To go beyond linear in the online setting, nonparametric methods are of interest due to their universality and ability to stably incorporate new information via convexity or Bayes' Rule. Unfortunately, when used online, nonparametric meth…

    Submitted 17 January, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

  13. arXiv:1909.05442  [pdf, other]

    math.OC cs.LG eess.SP

    Nonstationary Nonparametric Online Learning: Balancing Dynamic Regret and Model Parsimony

    Authors: Amrit Singh Bedi, Alec Koppel, Ketan Rajawat, Brian M. Sadler

    Abstract: An open challenge in supervised learning is \emph{conceptual drift}: a data point begins as classified according to one label, but over time the notion of that label changes. Beyond linear autoregressive models, transfer and meta learning address drift, but require data that is representative of disparate domains at the outset of training. To relax this requirement, we propose a memory-efficient \…

    Submitted 11 September, 2019; originally announced September 2019.

  14. arXiv:1908.00510  [pdf, ps, other]

    math.OC eess.SP stat.ML

    Adaptive Kernel Learning in Heterogeneous Networks

    Authors: Hrusikesha Pradhan, Amrit Singh Bedi, Alec Koppel, Ketan Rajawat

    Abstract: We consider learning in decentralized heterogeneous networks: agents seek to minimize a convex functional that aggregates data across the network, while only having access to their local data streams. We focus on the case where agents seek to estimate a regression \emph{function} that belongs to a reproducing kernel Hilbert space (RKHS). To incentivize coordination while respecting network heterog…

    Submitted 1 June, 2021; v1 submitted 1 August, 2019; originally announced August 2019.

  15. arXiv:1906.08383  [pdf, other]

    math.OC cs.LG eess.SY math.ST

    Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies

    Authors: Kaiqing Zhang, Alec Koppel, Hao Zhu, Tamer Başar

    Abstract: Policy gradient (PG) methods are a widely used reinforcement learning methodology in many applications such as video games, autonomous driving, and robotics. In spite of its empirical success, a rigorous understanding of the global convergence of PG methods is lacking in the literature. In this work, we close the gap by viewing PG methods from a nonconvex optimization perspective. In particular, w…

    Submitted 28 June, 2020; v1 submitted 19 June, 2019; originally announced June 2019.

    Comments: Initially submitted in Jan. 2019. Accepted to SIAM Journal on Control and Optimization (SICON)

  16. arXiv:1804.07323  [pdf, other]

    cs.LG eess.SY stat.ML

    Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems

    Authors: Alec Koppel, Ekaterina Tolstaya, Ethan Stump, Alejandro Ribeiro

    Abstract: We consider Markov Decision Problems defined over continuous state and action spaces, where an autonomous agent seeks to learn a map from its states to actions so as to maximize its long-term discounted accumulation of rewards. We address this problem by considering Bellman's optimality equation defined over action-value functions, which we reformulate into a nested non-convex stochastic optimizat…

    Submitted 19 April, 2018; originally announced April 2018.

  17. arXiv:1606.05578  [pdf, other]

    cs.MA eess.SY stat.CO

    Proximity Without Consensus in Online Multi-Agent Optimization

    Authors: Alec Koppel, Brian M. Sadler, Alejandro Ribeiro

    Abstract: We consider stochastic optimization problems in multi-agent settings, where a network of agents aims to learn parameters which are optimal in terms of a global objective, while giving preference to locally observed streaming information. To do so, we depart from the canonical decentralized optimization framework where agreement constraints are enforced, and instead formulate a problem where each a…

    Submitted 17 June, 2016; originally announced June 2016.