Search | arXiv e-print repository

Policy Optimization for PDE Control with a Warm Start

Authors: Xiangyuan Zhang, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar

Abstract: Dimensionality reduction is crucial for controlling nonlinear partial differential equations (PDE) through a "reduce-then-design" strategy, which identifies a reduced-order model and then implements model-based control solutions. However, inaccuracies in the reduced-order modeling can substantially degrade controller performance, especially in PDEs with chaotic behavior. To address this issue, we augment the reduce-then-design procedure with a policy optimization (PO) step. The PO step fine-tunes the model-based controller to compensate for the modeling error from dimensionality reduction. This augmentation shifts the overall strategy into reduce-then-design-then-adapt, where the model-based controller serves as a warm start for PO. Specifically, we study the state-feedback tracking control of PDEs that aims to align the PDE state with a specific constant target subject to a linear-quadratic cost. Through extensive experiments, we show that a few iterations of PO can significantly improve the model-based controller performance. Our approach offers a cost-effective alternative to PDE control using end-to-end reinforcement learning.

Submitted 1 March, 2024; originally announced March 2024.
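
The reduce-then-design-then-adapt idea in the abstract above can be illustrated on a scalar linear-quadratic regulation problem instead of a PDE tracking problem. This is only a sketch under stated assumptions: the plant and model coefficients (`a_true`, `a_model`), the step size, and the finite-difference gradient are illustrative choices and are not taken from the paper.

```python
import numpy as np

def lq_cost(k, a, b=1.0, q=1.0, r=1.0, x0=1.0, horizon=200):
    """Finite-horizon LQ cost of the feedback law u = -k * x on x+ = a*x + b*u."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += q * x**2 + r * u**2
        x = a * x + b * u
    return cost

def model_based_gain(a, b=1.0, q=1.0, r=1.0, iters=500):
    """Scalar discrete-time LQR gain via fixed-point iteration on the Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a**2 * p - (a * b * p) ** 2 / (r + b**2 * p)
    return a * b * p / (r + b**2 * p)

def policy_optimization(k0, a, steps=20, lr=0.05, eps=1e-4):
    """A few gradient steps on the true cost, using a central finite-difference gradient."""
    k = k0
    for _ in range(steps):
        grad = (lq_cost(k + eps, a) - lq_cost(k - eps, a)) / (2 * eps)
        k -= lr * grad
    return k

# Hypothetical numbers: the true plant and an inaccurate reduced model of it.
a_true, a_model = 1.2, 0.9
k_warm = model_based_gain(a_model)             # "design" step on the model
k_tuned = policy_optimization(k_warm, a_true)  # "adapt" step on the true system
k_opt = model_based_gain(a_true)               # reference: gain designed on the true plant

print(lq_cost(k_warm, a_true), lq_cost(k_tuned, a_true), lq_cost(k_opt, a_true))
```

In this toy setting the warm-start gain already stabilizes the true plant, so the gradient steps only have to close the remaining performance gap.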

arXiv:2402.15636 [pdf, other]

Smooth and Sparse Latent Dynamics in Operator Learning with Jerk Regularization

Authors: Xiaoyu Xie, Saviz Mowlavi, Mouhacine Benosman

Abstract: Spatiotemporal modeling is critical for understanding complex systems across various scientific and engineering disciplines, but governing equations are often not fully known or computationally intractable due to inherent system complexity. Data-driven reduced-order models (ROMs) offer a promising approach for fast and accurate spatiotemporal forecasting by computing solutions in a compressed latent space. However, these models often neglect temporal correlations between consecutive snapshots when constructing the latent space, leading to suboptimal compression, jagged latent trajectories, and limited extrapolation ability over time. To address these issues, this paper introduces a continuous operator learning framework that incorporates jerk regularization into the learning of the compressed latent space. This jerk regularization promotes smoothness and sparsity of latent space dynamics, which not only yields enhanced accuracy and convergence speed but also helps identify intrinsic latent space coordinates. Consisting of an implicit neural representation (INR)-based autoencoder and a neural ODE latent dynamics model, the framework allows for inference at any desired spatial or temporal resolution. The effectiveness of this framework is demonstrated through a two-dimensional unsteady flow problem governed by the Navier-Stokes equations, highlighting its potential to expedite high-fidelity simulations in various scientific and engineering applications.

Submitted 23 February, 2024; originally announced February 2024.
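
Jerk is the third time derivative, so on a uniformly sampled latent trajectory a jerk penalty can be approximated with third-order finite differences. The sketch below assumes that form; the function name `jerk_penalty`, the mean-squared aggregation, and the toy trajectories are illustrative and may differ from the regularizer actually used in the paper.

```python
import numpy as np

def jerk_penalty(z, dt):
    """Mean squared third finite difference of a latent trajectory.

    z  : array of shape (T, d), latent states at T uniformly spaced times
    dt : time step between consecutive snapshots
    """
    jerk = np.diff(z, n=3, axis=0) / dt**3  # shape (T - 3, d)
    return float(np.mean(jerk**2))

t = np.linspace(0.0, 1.0, 101)
dt = t[1] - t[0]

# A trajectory that is at most quadratic in time has zero jerk.
smooth = np.stack([t**2, 1.0 - t], axis=1)

# Small snapshot-to-snapshot noise makes the trajectory jagged and the penalty large.
rng = np.random.default_rng(0)
jagged = smooth + 1e-3 * rng.standard_normal(smooth.shape)

print(jerk_penalty(smooth, dt), jerk_penalty(jagged, dt))
```

Adding such a term to a training loss would penalize jagged latent trajectories while leaving slowly varying ones essentially untouched.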

arXiv:2312.11839 [pdf, other]

Dual parametric and state estimation for partial differential equations

Authors: Saviz Mowlavi, Mouhacine Benosman

Abstract: Designing estimation algorithms for systems governed by partial differential equations (PDEs) such as fluid flows is challenging due to the high-dimensional and oftentimes nonlinear nature of the dynamics, as well as their dependence on unobserved physical parameters. In this paper, we propose two different lightweight and effective methodologies for real-time state estimation of PDEs in the presence of parametric uncertainties. Both approaches combine a Kalman filter with a data-driven polytopic linear reduced-order model obtained by dynamic mode decomposition (DMD). Using examples involving the nonlinear Burgers and Navier-Stokes equations, we demonstrate accurate estimation of both the state and the unknown physical parameter along system trajectories corresponding to various physical parameter values.

Submitted 18 December, 2023; originally announced December 2023.

Comments: Presented at IEEE CDC 2023. arXiv admin note: text overlap with arXiv:2302.01189
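
Dynamic mode decomposition, mentioned in the abstract above, fits a linear map between consecutive snapshots. The following is a minimal NumPy sketch of the standard SVD-based (exact) DMD only; it does not reproduce the polytopic model or the Kalman filter from the paper, and the function name, the `rank` argument, and the toy data are illustrative assumptions.

```python
import numpy as np

def dmd(snapshots, rank):
    """Exact DMD of a snapshot matrix whose columns are x_0, ..., x_m.

    Returns the eigenvalues and modes of the rank-`rank` linear map that
    best advances x_k to x_{k+1} in the least-squares sense.
    """
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    V = Vh.conj().T
    # Reduced operator: the best-fit map projected onto the leading left singular vectors.
    A_tilde = U.conj().T @ Y @ V / s      # dividing by s scales columns, i.e. applies S^{-1}
    eigvals, W = np.linalg.eig(A_tilde)
    modes = (Y @ V / s) @ W
    return eigvals, modes

# Toy data: a damped 2-state rotation observed through a random 50-dimensional embedding.
theta, rho = 0.3, 0.95
A = rho * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
rng = np.random.default_rng(0)
lift = rng.standard_normal((50, 2))
x = np.array([1.0, 0.0])
cols = []
for _ in range(40):
    cols.append(lift @ x)
    x = A @ x

eigvals, modes = dmd(np.stack(cols, axis=1), rank=2)
print(np.abs(eigvals), np.abs(np.angle(eigvals)))  # decay rate and rotation angle per step
```

On this noise-free linear example the two DMD eigenvalues recover the decay factor and rotation angle of the underlying 2-state system.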

arXiv:2311.18736 [pdf, other]

Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms

Authors: Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar

Abstract: We introduce controlgym, a library of thirty-six industrial control settings, and ten infinite-dimensional partial differential equation (PDE)-based control problems. Integrated within the OpenAI Gym/Gymnasium (Gym) framework, controlgym allows direct applications of standard reinforcement learning (RL) algorithms like stable-baselines3. Our control environments complement those in Gym with continuous, unbounded action and observation spaces, motivated by real-world control applications. Moreover, the PDE control environments uniquely allow the users to extend the state dimensionality of the system to infinity while preserving the intrinsic dynamics. This feature is crucial for evaluating the scalability of RL algorithms for control. This project serves the learning for dynamics & control (L4DC) community, aiming to explore key questions: the convergence of RL algorithms in learning control policies; the stability and robustness issues of learning-based controllers; and the scalability of RL algorithms to high- and potentially infinite-dimensional systems. We open-source the controlgym project at https://github.com/xiangyuan-zhang/controlgym.

Submitted 23 April, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

Comments: 25 pages, 16 figures

arXiv:2309.04831 [pdf, other]

Global Convergence of Receding-Horizon Policy Search in Learning Estimator Designs

Authors: Xiangyuan Zhang, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar

Abstract: We introduce the receding-horizon policy gradient (RHPG) algorithm, the first PG algorithm with provable global convergence in learning the optimal linear estimator designs, i.e., the Kalman filter (KF). Notably, the RHPG algorithm does not require any prior knowledge of the system for initialization and does not require the target system to be open-loop stable. The key of RHPG is that we integrate vanilla PG (or any other policy search directions) into a dynamic programming outer loop, which iteratively decomposes the infinite-horizon KF problem that is constrained and non-convex in the policy parameter into a sequence of static estimation problems that are unconstrained and strongly-convex, thus enabling global convergence. We further provide fine-grained analyses of the optimization landscape under RHPG and detail the convergence and sample complexity guarantees of the algorithm. This work serves as an initial attempt to develop reinforcement learning algorithms specifically for control applications with performance guarantees by utilizing classic control theory in both algorithmic design and theoretical analyses. Lastly, we validate our theories by deploying the RHPG algorithm to learn the Kalman filter design of a large-scale convection-diffusion model. We open-source the code repository at https://github.com/xiangyuan-zhang/LearningKF.

Submitted 9 September, 2023; originally announced September 2023.

Comments: arXiv admin note: text overlap with arXiv:2301.12624

arXiv:2012.09795 [pdf, ps, other]

Finite-time Newton seeking control

Authors: Martin Guay, Mouhacine Benosman

Abstract: This paper proposes a finite-time Newton seeking control design for systems described by unknown multivariable static maps. The Newton seeking system has an averaged system that implements a Newton continuous-time algorithm. The averaged Newton seeking system is shown to achieve finite-time stability of the unknown optimum of the static map. An averaging analysis demonstrates that the proposed Newton-seeking achieves finite-time practical stability of the optimum. A simulation study is used to demonstrate the effectiveness of the design method.

Submitted 17 December, 2020; originally announced December 2020.

Comments: 16 pages, 4 figures

arXiv:2005.05888 [pdf, other]

Safe Learning-based Observers for Unknown Nonlinear Systems using Bayesian Optimization

Authors: Ankush Chakrabarty, Mouhacine Benosman

Abstract: Data generated from dynamical systems with unknown dynamics enable the learning of state observers that are: robust to modeling error, computationally tractable to design, and capable of operating with guaranteed performance. In this paper, a modular design methodology is formulated, that consists of three design phases: (i) an initial robust observer design that enables one to learn the dynamics without allowing the state estimation error to diverge (hence, safe); (ii) a learning phase wherein the unmodeled components are estimated using Bayesian optimization and Gaussian processes; and, (iii) a re-design phase that leverages the learned dynamics to improve convergence rate of the state estimation error. The potential of our proposed learning-based observer is demonstrated on a benchmark nonlinear system. Additionally, certificates of guaranteed estimation performance are provided.

Submitted 25 June, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

Comments: 23 pages, post-review draft

arXiv:1912.08342 [pdf, other]

Finite-Time Convergence of Continuous-Time Optimization Algorithms via Differential Inclusions

Authors: Orlando Romero, Mouhacine Benosman

Abstract: In this paper, we propose two discontinuous dynamical systems in continuous time with guaranteed prescribed finite-time local convergence to strict local minima of a given cost function. Our approach consists of exploiting a Lyapunov-based differential inequality for differential inclusions, which leads to finite-time stability and thus finite-time convergence with a provable bound on the settling time. In particular, for exact solutions to the aforementioned differential inequality, the settling-time bound is also exact, thus achieving prescribed finite-time convergence. We thus construct a class of discontinuous dynamical systems, of second order with respect to the cost function, that serve as continuous-time optimization algorithms with finite-time convergence and prescribed convergence time. Finally, we illustrate our results on the Rosenbrock function.

Submitted 17 December, 2019; originally announced December 2019.

Comments: Presented at workshop "Beyond First Order Methods in Machine Learning" of NeurIPS 2019
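
To see how a Lyapunov differential inequality can produce a settling-time bound of the kind described in the abstract above, consider the textbook scalar case. The constants $c$ and $\alpha$ are generic placeholders, and the specific inequality used in the paper may differ.

```latex
\dot V(t) \le -c\,V(t)^{\alpha}, \qquad c > 0,\ \alpha \in [0, 1)
\;\Longrightarrow\;
\frac{d}{dt}\, V(t)^{1-\alpha} \le -c\,(1-\alpha) \quad \text{while } V(t) > 0
\;\Longrightarrow\;
V(t) = 0 \ \text{for all } t \ge T, \qquad
T \le \frac{V(0)^{1-\alpha}}{c\,(1-\alpha)} .
```

When the inequality holds with equality, $V(t)^{1-\alpha}$ decreases linearly and reaches zero exactly at $T = V(0)^{1-\alpha} / \bigl(c\,(1-\alpha)\bigr)$, so the bound is attained.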

arXiv:1908.03517 [pdf]

doi 10.1109/TAC.2022.3214795

Fixed-Time Stable Proximal Dynamical System for Solving MVIPs

Authors: Kunal Garg, Mayank Baranwal, Rohit Gupta, Mouhacine Benosman

Abstract: In this paper, a novel modified proximal dynamical system is proposed to compute the solution of a mixed variational inequality problem (MVIP) within a fixed time, where the time of convergence is finite and is uniformly bounded for all initial conditions. Under the assumptions of strong monotonicity and Lipschitz continuity, it is shown that a solution of the modified proximal dynamical system exists, is uniquely determined, and converges to the unique solution of the associated MVIP within a fixed time. Furthermore, the fixed-time stability of the modified projected dynamical system continues to hold, even if the assumption of strong monotonicity is relaxed to that of strong pseudomonotonicity. Finally, it is shown that the solution obtained using the forward-Euler discretization of the proposed modified proximal dynamical system converges to an arbitrarily small neighborhood of the solution of the associated MVIP within a fixed number of time steps, independent of the initial conditions.

Submitted 19 October, 2022; v1 submitted 9 August, 2019; originally announced August 2019.

Comments: 12 pages, 2 figures

arXiv:1604.04586 [pdf, other]

Robust Reduced-Order Model Stabilization for Partial Differential Equations Based on Lyapunov Theory and Extremum Seeking with Application to the 3D Boussinesq Equations

Authors: Mouhacine Benosman, Jeff Borggaard, Boris Kramer

Abstract: We present some results on stabilization for reduced-order models (ROMs) of partial differential equations. The stabilization is achieved using Lyapunov theory to design a new closure model that is robust to parametric uncertainties. The free parameters in the proposed ROM stabilization method are optimized using a model-free multi-parametric extremum seeking (MES) algorithm. The 3D Boussinesq equations provide a challenging numerical test-problem that is used to demonstrate the advantages of the proposed method.

Submitted 15 April, 2016; originally announced April 2016.

Comments: arXiv admin note: text overlap with arXiv:1510.01728

arXiv:1510.02831 [pdf, other]

doi 10.1137/15M104565X

Sparse sensing and DMD based identification of flow regimes and bifurcations in complex flows

Authors: Boris Kramer, Piyush Grover, Petros Boufounos, Mouhacine Benosman, Saleh Nabi

Abstract: We present a sparse sensing framework based on Dynamic Mode Decomposition (DMD) to identify flow regimes and bifurcations in large-scale thermo-fluid systems. Motivated by real-time sensing and control of thermal-fluid flows in buildings and equipment, we apply this method to a Direct Numerical Simulation (DNS) data set of a 2D laterally heated cavity. The resulting flow solutions can be divided into several regimes, ranging from steady to chaotic flow. The DMD modes and eigenvalues capture the main temporal and spatial scales in the dynamics belonging to different regimes. Our proposed classification method is data-driven, robust with respect to measurement noise, and exploits the dynamics extracted from the DMD method. Namely, we construct an augmented DMD basis, with "built-in" dynamics, given by the DMD eigenvalues. This allows us to employ a short time-series of data from sensors, to more robustly classify flow regimes, particularly in the presence of measurement noise. We also exploit the incoherence exhibited among the data generated by different regimes, which persists even if the number of measurements is small compared to the dimension of the DNS data. The data-driven regime identification algorithm can enable robust low-order modeling of flows for state estimation and control.

Submitted 22 August, 2016; v1 submitted 9 October, 2015; originally announced October 2015.

Comments: Expanded discussion. Fixed some typos and figures

MSC Class: 37L65; 37M05; 37N10

Journal ref: SIAM Journal on Applied Dynamical Systems 16(2) (2017)

arXiv:1510.01728 [pdf, other]

doi 10.1109/ACC.2016.7525157

Learning-based Reduced Order Model Stabilization for Partial Differential Equations: Application to the Coupled Burgers Equation

Authors: Mouhacine Benosman, Boris Kramer, Petros Boufounos, Piyush Grover

Abstract: We present results on stabilization for reduced order models (ROM) of partial differential equations using learning. Stabilization is achieved via closure models for ROMs, where we use a model-free extremum seeking (ES) dither-based algorithm to learn the best closure models' parameters, for optimal ROM stabilization. We first propose to auto-tune linear closure models using ES, and then extend the results to a closure model combining linear and nonlinear terms, for better stabilization performance. The coupled Burgers' equation is employed as a test-bed for the proposed tuning method.

Submitted 6 October, 2015; originally announced October 2015.

Showing 1–12 of 12 results for author: Benosman, M