-
An Information-state based Approach to the Optimal Output Feedback Control of Nonlinear Systems
Authors:
Raman Goyal,
Ran Wang,
Mohamed Naveed Gul Mohamed,
Aayushman Sharma,
Suman Chakravorty
Abstract:
This paper develops a data-based approach to the closed-loop output feedback control of nonlinear dynamical systems with a partial nonlinear observation model. We propose an information-state-based approach to rigorously transform the partially observed problem into a fully observed problem, where the information state consists of the past several observations and control inputs. We further show the equivalence of the transformed and the initial partially observed optimal control problems and provide the conditions to solve for the deterministic optimal solution. We develop a data-based generalization of the iterative Linear Quadratic Regulator (iLQR) to partially observed systems using a local linear time-varying model of the information state dynamics, approximated by an autoregressive moving average (ARMA) model that is generated using only the input-output data. This open-loop trajectory optimization solution is then used to design a local feedback control law, and the composite law then provides an optimum solution to the partially observed feedback design problem. The efficacy of the developed method is shown by controlling complex high-dimensional nonlinear dynamical systems in the presence of model and sensing uncertainty.
Submitted 5 October, 2023; v1 submitted 16 July, 2021;
originally announced July 2021.
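The idea of an information state built from past observations and inputs, with an ARMA model identified purely from input-output data, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar, time-invariant, noise-free system with hypothetical coefficients, whereas the abstract describes a local linear time-varying model around a trajectory. The function names `build_regressors` and `fit_arma` are made up for this illustration.

```python
import numpy as np

def build_regressors(y, u, q):
    """For each t >= q, stack the past q outputs and past q inputs.

    Each row plays the role of an information state:
    [y[t-1], ..., y[t-q], u[t-1], ..., u[t-q]].
    """
    rows = []
    for t in range(q, len(y)):
        past_y = y[t - q:t][::-1]  # y[t-1], ..., y[t-q]
        past_u = u[t - q:t][::-1]  # u[t-1], ..., u[t-q]
        rows.append(np.concatenate([past_y, past_u]))
    return np.array(rows), y[q:]

def fit_arma(y, u, q):
    """Least-squares fit of y[t] = sum_i a_i y[t-i] + sum_i b_i u[t-i]."""
    Z, target = build_regressors(y, u, q)
    theta, *_ = np.linalg.lstsq(Z, target, rcond=None)
    return theta[:q], theta[q:]

# Hypothetical data-generating system (an assumption for illustration only).
rng = np.random.default_rng(0)
T = 200
u = rng.standard_normal(T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.5 * y[t - 1] - 0.2 * y[t - 2] + 1.0 * u[t - 1] + 0.3 * u[t - 2]

a_hat, b_hat = fit_arma(y, u, q=2)
print("output coefficients:", a_hat)
print("input coefficients:", b_hat)
```

Because the fit uses only recorded inputs and outputs, no internal state or analytical model is needed, which is the property the abstract relies on.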
-
On the Convergence of Reinforcement Learning in Nonlinear Continuous State Space Problems
Authors:
Raman Goyal,
Suman Chakravorty,
Ran Wang,
Mohamed Naveed Gul Mohamed
Abstract:
We consider the problem of Reinforcement Learning (RL) for nonlinear stochastic dynamical systems. We show that in the RL setting, there is an inherent "Curse of Variance" in addition to Bellman's infamous "Curse of Dimensionality". In particular, we show that the variance in the solution grows factorial-exponentially in the order of the approximation. A fundamental consequence is that this precludes the search for anything other than "local" feedback solutions in RL, in order to control the explosive variance growth and thus ensure accuracy. We further show that the deterministic optimal control has a perturbation structure, in that the higher-order terms do not affect the calculation of lower-order terms, which can be utilized in RL to get accurate local solutions.
Submitted 28 July, 2021; v1 submitted 21 November, 2020;
originally announced November 2020.
-
On the Feedback Law in Stochastic Optimal Nonlinear Control
Authors:
Mohamed Naveed Gul Mohamed,
Suman Chakravorty,
Raman Goyal,
Ran Wang
Abstract:
We consider the problem of nonlinear stochastic optimal control. This problem is thought to be fundamentally intractable owing to Bellman's "curse of dimensionality". We present a result that shows that repeatedly solving an open-loop deterministic problem from the current state with progressively shorter horizons, similar to Model Predictive Control (MPC), results in a feedback policy that is $O(ε^4)$ near to the true global stochastic optimal policy, where $ε$ is a perturbation parameter modulating the noise. We also show that the optimal deterministic feedback problem has a perturbation structure such that higher-order terms of the feedback law do not affect lower-order terms and that this structure is lost in the optimal stochastic feedback problem. Consequently, solving the Stochastic Dynamic Programming problem is highly susceptible to noise, even in low dimensional problems, and in practice, the MPC-type feedback law offers superior performance even for high noise levels.
Submitted 10 October, 2024; v1 submitted 1 April, 2020;
originally announced April 2020.
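The feedback scheme described in this abstract, re-solving an open-loop deterministic problem from the current state with a progressively shorter horizon, can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method or experiments: it uses a hypothetical scalar linear system with quadratic cost so that the open-loop problem reduces to a least-squares solve, and all parameter values and function names are made up.

```python
import numpy as np

def solve_open_loop(x0, N, a=1.1, b=1.0, q=1.0, r=0.1, qf=10.0):
    """Minimize sum_{k<N} (q x_k^2 + r u_k^2) + qf x_N^2
    subject to the noise-free dynamics x_{k+1} = a x_k + b u_k."""
    # x_k = a^k x0 + sum_{j<k} a^(k-1-j) b u_j for k = 1..N
    G = np.zeros((N, N))
    for k in range(1, N + 1):
        for j in range(k):
            G[k - 1, j] = a ** (k - 1 - j) * b
    free = np.array([a ** k * x0 for k in range(1, N + 1)])
    w = np.sqrt(np.array([q] * (N - 1) + [qf]))  # state weights for x_1..x_N
    A = np.vstack([w[:, None] * G, np.sqrt(r) * np.eye(N)])
    rhs = np.concatenate([-w * free, np.zeros(N)])
    u, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return u

def shrinking_horizon_mpc(x0, T, eps, rng, a=1.1, b=1.0):
    """Replan from the current state at every step; the horizon shrinks to T - t."""
    x, xs = x0, [x0]
    for t in range(T):
        u = solve_open_loop(x, T - t, a=a, b=b)
        # Apply only the first control; eps scales the process noise.
        x = a * x + b * u[0] + eps * rng.standard_normal()
        xs.append(x)
    return np.array(xs)

rng = np.random.default_rng(0)
x0, T = 1.0, 10
replanned = shrinking_horizon_mpc(x0, T, eps=0.0, rng=rng)

# Without noise, replanning should reproduce the initial open-loop plan.
plan = solve_open_loop(x0, T)
planned = [x0]
for u_k in plan:
    planned.append(1.1 * planned[-1] + 1.0 * u_k)
planned = np.array(planned)

noisy = shrinking_horizon_mpc(x0, T, eps=0.1, rng=rng)
print("final state without noise:", replanned[-1])
print("final state with noise:", noisy[-1])
```

With `eps=0.0` the replanned trajectory coincides with the single open-loop plan; with noise, each replan corrects from the state actually reached, which is what makes the scheme act as a feedback law.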
-
Experiments with Tractable Feedback in Robotic Planning under Uncertainty: Insights over a wide range of noise regimes (Extended Report)
Authors:
Mohamed Naveed Gul Mohamed,
Suman Chakravorty,
Dylan A. Shell
Abstract:
We consider the problem of robotic planning under uncertainty. This problem may be posed as a stochastic optimal control problem, a complete solution to which is fundamentally intractable owing to the infamous curse of dimensionality. We report the results of an extensive simulation study in which we have compared two methods, both of which aim to salvage tractability by using alternative, albeit inexact, means for treating feedback. The first is a recently proposed method based on a near-optimal "decoupling principle" for tractable feedback design, wherein a nominal open-loop problem is solved, followed by a linear feedback design around the open-loop. The second is Model Predictive Control (MPC), a widely employed method that uses repeated re-computation of the nominal open-loop problem during execution to correct for noise, though when interpreted as feedback, this can only be said to be an implicit form. We examine a much wider range of noise levels than has been previously reported, and empirical evidence suggests that the decoupling method allows for tractable planning over a wide range of uncertainty conditions without unduly sacrificing performance.
Submitted 18 July, 2020; v1 submitted 20 February, 2020;
originally announced February 2020.