Showing 1–3 of 3 results for author: Kordabad, A B

  1. arXiv:2203.13854  [pdf, other]

    cs.LG eess.SY

    Quasi-Newton Iteration in Deterministic Policy Gradient

    Authors: Arash Bahari Kordabad, Hossein Nejatbakhsh Esfahani, Wenqi Cai, Sebastien Gros

    Abstract: This paper presents a model-free approximation for the Hessian of the performance of deterministic policies to use in the context of Reinforcement Learning based on Quasi-Newton steps in the policy parameters. We show that the approximate Hessian converges to the exact Hessian at the optimal policy, and allows for a superlinear convergence in the learning, provided that the policy parametrization…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: This paper has been accepted to 2022 American Control Conference (ACC). 6 pages

  2. arXiv:2104.02743  [pdf, other]

    eess.SY cs.LG cs.RO

    Approximate Robust NMPC using Reinforcement Learning

    Authors: Hossein Nejatbakhsh Esfahani, Arash Bahari Kordabad, Sebastien Gros

    Abstract: We present a Reinforcement Learning-based Robust Nonlinear Model Predictive Control (RL-RNMPC) framework for controlling nonlinear systems in the presence of disturbances and uncertainties. An approximate Robust Nonlinear Model Predictive Control (RNMPC) of low computational complexity is used in which the state trajectory uncertainty is modelled via ellipsoids. Reinforcement Learning is then used…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: This paper has been accepted to 2021 European Control Conference (ECC)

  3. arXiv:2104.02411  [pdf, other]

    cs.LG eess.SY

    MPC-based Reinforcement Learning for Economic Problems with Application to Battery Storage

    Authors: Arash Bahari Kordabad, Wenqi Cai, Sebastien Gros

    Abstract: In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presence of unmodelled stochasticity or model error. Whe…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: This paper has been accepted to ECC 2021. 6 pages