-
Data-Driven Near-Optimal Control of Nonlinear Systems Over Finite Horizon
Authors:
Vasanth Reddy,
Hoda Eldardiry,
Almuatazbellah Boker
Abstract:
We examine the problem of two-point boundary optimal control of nonlinear systems over finite-horizon time periods with unknown model dynamics by employing reinforcement learning. We use techniques from singular perturbation theory to decompose the control problem over the finite horizon into two sub-problems, each solved over an infinite horizon. In the process, we avoid the need to solve the time-varying Hamilton-Jacobi-Bellman equation. The decomposition makes a policy iteration method feasible, which we use to learn the controller gains of both sub-problems. The overall control is then formed by piecing together the solutions to the two sub-problems. We show that the performance of the proposed closed-loop system approaches the model-based optimal performance as the time horizon grows. Finally, we provide three simulation scenarios to support the paper's claims.
Submitted 8 June, 2023;
originally announced June 2023.
-
Singular Perturbation-based Reinforcement Learning of Two-Point Boundary Optimal Control Systems
Authors:
Vasanth Reddy,
Hoda Eldardiry,
Almuatazbellah Boker
Abstract:
This work presents a technique for learning systems, where the learning process is guided by knowledge of the physics of the system. In particular, we solve the two-point boundary optimal control problem for linear time-varying systems with unknown model dynamics using reinforcement learning. Borrowing techniques from singular perturbation theory, we transform the time-varying optimal control problem into a pair of time-invariant subproblems. This allows the use of an off-policy iteration method to learn the controller gains. We show that the performance of the learning-based controller approximates that of the model-based optimal controller, and that the accuracy of the approximation improves as the time horizon of the control problem increases. Finally, we provide a simulation example to verify the results of the paper.
Submitted 29 April, 2021; v1 submitted 19 April, 2021;
originally announced April 2021.
-
Semi-global Output Feedback Stabilization of Non-Minimum Phase Nonlinear Systems
Authors:
Almuatazbellah M. Boker,
Hassan K. Khalil
Abstract:
We solve the problem of output feedback stabilization of a class of nonlinear systems, which may have unstable zero dynamics. We allow any globally stabilizing full state feedback control scheme to be used, as long as it satisfies a particular ISS condition. We show semi-global stability of the origin of the closed-loop system and also the recovery of the performance of an auxiliary system using a full-order observer. This observer uses an extended high-gain observer to provide estimates of the output and its derivatives, plus a signal used by an extended Kalman filter to provide estimates of the remaining states. Finally, we provide a simulation example that illustrates the design procedure.
Submitted 25 July, 2016;
originally announced July 2016.