Search | arXiv e-print repository

Data-Driven Near-Optimal Control of Nonlinear Systems Over Finite Horizon

Authors: Vasanth Reddy, Hoda Eldardiry, Almuatazbellah Boker

Abstract: We examine the problem of two-point boundary optimal control of nonlinear systems over finite-horizon time periods with unknown model dynamics by employing reinforcement learning. We use techniques from singular perturbation theory to decompose the control problem over the finite horizon into two sub-problems, each solved over an infinite horizon. In the process, we avoid the need to solve the time-varying Hamilton-Jacobi-Bellman equation. Using a policy iteration method, which is made feasible as a result of this decomposition, it is now possible to learn the controller gains of both sub-problems. The overall control is then formed by piecing together the solutions to the two sub-problems. We show that the performance of the proposed closed-loop system approaches that of the model-based optimal performance as the time horizon gets long. Finally, we provide three simulation scenarios to support the paper's claims.

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2104.09652

Singular Perturbation-based Reinforcement Learning of Two-Point Boundary Optimal Control Systems

Authors: Vasanth Reddy, Hoda Eldardiry, Almuatazbellah Boker

Abstract: This work presents a technique for learning systems, where the learning process is guided by knowledge of the physics of the system. In particular, we solve the problem of the two-point boundary optimal control problem of linear time-varying systems with unknown model dynamics using reinforcement learning. Borrowing techniques from singular perturbation theory, we transform the time-varying optimal control problem into a couple of time-invariant subproblems. This allows the utilization of an off-policy iteration method to learn the controller gains. We show that the performance of the learning-based controller approximates that of the model-based optimal controller and the accuracy of the approximation improves as the time horizon of the control problem increases. Finally, we provide a simulation example to verify the results of the paper.

Submitted 29 April, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

Comments: 7 pages, 6 figures

arXiv:1607.07402

Semi-global Output Feedback Stabilization of Non-Minimum Phase Nonlinear Systems

Authors: Almuatazbellah M. Boker, Hassan K. Khalil

Abstract: We solve the problem of output feedback stabilization of a class of nonlinear systems, which may have unstable zero dynamics. We allow for any globally stabilizing full state feedback control scheme to be used as long as it satisfies a particular ISS condition. We show semi-global stability of the origin of the closed-loop system and also the recovery of the performance of an auxiliary system using a full-order observer. This observer is based on the use of an extended high-gain observer to provide estimates of the output and its derivatives plus a signal used by an extended Kalman filter to provide estimates of the remaining states. Finally, we provide a simulation example that illustrates the design procedure.

Submitted 25 July, 2016; originally announced July 2016.

Comments: 9 pages, 1 figure

Showing 1–3 of 3 results for author: Boker, A