Anytime Safe Reinforcement Learning

Authors: Pol Mestres, Arnau Marzabal, Jorge Cortés

Abstract: This paper considers the problem of solving constrained reinforcement learning problems with anytime guarantees, meaning that the algorithmic solution returns a safe policy regardless of when it is terminated. Drawing inspiration from anytime constrained optimization, we introduce Reinforcement Learning-based Safe Gradient Flow (RL-SGF), an on-policy algorithm which employs estimates of the value functions and their respective gradients associated with the objective and safety constraints for the current policy, and updates the policy parameters by solving a convex quadratically constrained quadratic program. We show that if the estimates are computed with a sufficiently large number of episodes (for which we provide an explicit bound), safe policies are updated to safe policies with a probability higher than a prescribed tolerance. We also show that iterates asymptotically converge to a neighborhood of a KKT point, whose size can be arbitrarily reduced by refining the estimates of the value function and their gradients. We illustrate the performance of RL-SGF in a navigation example.

Submitted 23 April, 2025; originally announced April 2025.

arXiv:2504.00813 [pdf, other]

Feedback Optimization with State Constraints through Control Barrier Functions

Authors: Giannis Delimpaltadakis, Pol Mestres, Jorge Cortés, W. P. M. H. Heemels

Abstract: Recently, there has been a surge of research on a class of methods called feedback optimization. These are methods to steer the state of a control system to an equilibrium that arises as the solution of an optimization problem. Despite the growing literature on the topic, the important problem of enforcing state constraints at all times remains unaddressed. In this work, we present the first feedback-optimization method that enforces state constraints. The method combines a class of dynamics called safe gradient flows with high-order control barrier functions. We provide a number of results on our proposed controller, including well-posedness guarantees, anytime constraint-satisfaction guarantees, equivalence between the closed-loop's equilibria and the optimization problem's critical points, and local asymptotic stability of optima.

Submitted 1 April, 2025; originally announced April 2025.

arXiv:2501.09289 [pdf, other]

Control Barrier Function-Based Safety Filters: Characterization of Undesired Equilibria, Unbounded Trajectories, and Limit Cycles

Authors: Pol Mestres, Yiting Chen, Emiliano Dall'Anese, Jorge Cortés

Abstract: This paper focuses on safety filters designed based on Control Barrier Functions (CBFs): these are modifications of a nominal stabilizing controller typically utilized in safety-critical control applications to render a given subset of states forward invariant. The paper investigates the dynamical properties of the closed-loop systems, with a focus on characterizing undesirable behaviors that may emerge due to the use of CBF-based filters. These undesirable behaviors include unbounded trajectories, limit cycles, and undesired equilibria, which can be locally stable and even form a continuum. Our analysis offers the following contributions: (i) conditions under which trajectories remain bounded and (ii) conditions under which limit cycles do not exist; (iii) we show that undesired equilibria can be characterized by solving an algebraic equation, and (iv) we provide examples that show that asymptotically stable undesired equilibria can exist for a large class of nominal controllers and design parameters of the safety filter (even for convex safe sets). Further, for the specific class of planar systems, (v) we provide explicit formulas for the total number of undesired equilibria and the proportion of saddle points and asymptotically stable equilibria, and (vi) in the case of linear planar systems, we present an exhaustive analysis of their global stability properties. Examples throughout the paper illustrate the results.

Submitted 15 January, 2025; originally announced January 2025.

arXiv:2410.08364 [pdf, other]

Safe and Dynamically-Feasible Motion Planning using Control Lyapunov and Barrier Functions

Authors: Pol Mestres, Carlos Nieto-Granda, Jorge Cortés

Abstract: This paper considers the problem of designing motion planning algorithms for control-affine systems that generate collision-free paths from an initial to a final destination and can be executed using safe and dynamically-feasible controllers. We introduce the C-CLF-CBF-RRT algorithm, which produces paths with such properties and leverages rapidly exploring random trees (RRTs), control Lyapunov functions (CLFs) and control barrier functions (CBFs). We show that C-CLF-CBF-RRT is computationally efficient for linear systems with polytopic and ellipsoidal constraints, and establish its probabilistic completeness. We showcase the performance of C-CLF-CBF-RRT in different simulation and hardware experiments.

Submitted 12 March, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

arXiv:2409.06808 [pdf, other]

Equilibria and Their Stability Do Not Depend on the Control Barrier Function in Safe Optimization-Based Control

Authors: Yiting Chen, Pol Mestres, Jorge Cortes, Emiliano Dall'Anese

Abstract: Control barrier functions (CBFs) play a critical role in the design of safe optimization-based controllers for control-affine systems. Given a CBF associated with a desired "safe" set, the typical approach consists in embedding CBF-based constraints into the optimization problem defining the control law to enforce forward invariance of the safe set. While this approach effectively guarantees safety for a given CBF, the CBF-based control law can introduce undesirable equilibrium points (i.e., points that are not equilibria of the original system); open questions remain on how the choice of CBF influences the number and locations of undesirable equilibria and, in general, the dynamics of the closed-loop system. This paper investigates how the choice of CBF impacts the dynamics of the closed-loop system and shows that: (i) the CBF does not affect the number, location, and (local) stability properties of the equilibria in the interior of the safe set; (ii) undesirable equilibria only appear on the boundary of the safe set; and (iii) the number and location of undesirable equilibria for the closed-loop system do not depend on the choice of the CBF. Additionally, for the well-established safety filters and controllers based on both CBF and control Lyapunov functions (CLFs), we show that the stability properties of equilibria of the closed-loop system are independent of the choice of the CBF and of the associated extended class-K function.

Submitted 10 September, 2024; originally announced September 2024.

arXiv:2408.08398 [pdf, other]

Stabilization of Nonlinear Systems through Control Barrier Functions

Authors: Pol Mestres, Kehan Long, Melvin Leok, Nikolay Atanasov, Jorge Cortes

Abstract: This paper proposes a control design approach for stabilizing nonlinear control systems. Our key observation is that the set of points where the decrease condition of a control Lyapunov function (CLF) is feasible can be regarded as a safe set. By leveraging a nonsmooth version of control barrier functions (CBFs) and a weaker notion of CLF, we develop a control design that forces the system to converge to and remain in the region where the CLF decrease condition is feasible. We characterize the conditions under which our controller asymptotically stabilizes the origin or a small neighborhood around it, even in the cases where it is discontinuous. We illustrate our design in various examples.

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.00958 [pdf, other]

Characterization of the Dynamical Properties of Safety Filters for Linear Planar Systems

Authors: Yiting Chen, Pol Mestres, Emiliano Dall'Anese, Jorge Cortes

Abstract: This paper studies the dynamical properties of closed-loop systems obtained from control barrier function-based safety filters. We provide a necessary and sufficient condition for the existence of undesirable equilibria and show that the Jacobian matrix of the closed-loop system evaluated at an undesirable equilibrium always has a nonpositive eigenvalue. In the special case of linear planar systems and ellipsoidal obstacles, we give a complete characterization of the dynamical properties of the corresponding closed-loop system. We show that for underactuated systems, the safety filter always introduces a single undesirable equilibrium, which is a saddle point. We prove that all trajectories outside the global stable manifold of this equilibrium converge to the origin. In the fully actuated case, we discuss how the choice of nominal controller affects the stability properties of the closed-loop system. Various simulations illustrate our results.

Submitted 15 August, 2024; v1 submitted 1 August, 2024; originally announced August 2024.

arXiv:2406.14823 [pdf, other]

Converse Theorems for Certificates of Safety and Stability

Authors: Pol Mestres, Jorge Cortés

Abstract: Motivated by the key role of control barrier functions (CBFs) in assessing safety and enabling the synthesis of safe controllers in nonlinear control systems, this paper presents a suite of converse results on CBFs. Given any safe set, we first identify a set of general sufficient conditions which guarantee the existence of a CBF. Our technical analysis also enables us to define an extended notion of CBF which is always guaranteed to exist if the set is safe. We next turn our attention to the problem of joint safety and stability, and give conditions under which the notions of control Lyapunov-barrier function (CLBF) and compatible control Lyapunov function (CLF) and CBF pair are guaranteed to exist. Finally, we identify conditions under which a CLBF and a compatible CLF-CBF pair can be constructed from a non-compatible CLF-CBF pair. Throughout the paper, we intersperse different examples and counterexamples to motivate our results and position them within the state of the art.

Submitted 11 February, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

arXiv:2402.06195 [pdf, other]

Distributed Safe Navigation of Multi-Agent Systems using Control Barrier Function-Based Optimal Controllers

Authors: Pol Mestres, Carlos Nieto-Granda, Jorge Cortés

Abstract: This paper proposes a distributed controller synthesis framework for safe navigation of multi-agent systems. We leverage control barrier functions to formulate collision avoidance with obstacles and teammates as constraints on the control input for a state-dependent network optimization problem that encodes team formation and the navigation task. Our algorithmic solution is valid for general nonlinear control dynamics and optimization problems. The resulting controller is distributed, satisfies the safety constraints at all times, and is asymptotically optimal. We illustrate its performance in a team of differential-drive robots in a variety of complex environments, both in simulation and in hardware.

Submitted 1 May, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

arXiv:2311.05813 [pdf, other]

Feasibility Analysis and Regularity Characterization of Distributionally Robust Safe Stabilizing Controllers

Authors: Pol Mestres, Kehan Long, Nikolay Atanasov, Jorge Cortés

Abstract: This paper studies the well-posedness and regularity of safe stabilizing optimization-based controllers for control-affine systems in the presence of model uncertainty. When the system dynamics contain unknown parameters, a finite set of samples can be used to formulate distributionally robust versions of control barrier function and control Lyapunov function constraints. Control synthesis with such distributionally robust constraints can be achieved by solving a (convex) second-order cone program (SOCP). We provide one necessary and two sufficient conditions to check the feasibility of such optimization problems, characterize their computational complexity and numerically show that they are significantly faster to check than direct use of SOCP solvers. Finally, we also analyze the regularity of the resulting control laws.

Submitted 29 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

arXiv:2301.04603 [pdf, other]

Feasibility and Regularity Analysis of Safe Stabilizing Controllers under Uncertainty

Authors: Pol Mestres, Jorge Cortés

Abstract: This paper studies the problem of safe stabilization of control-affine systems under uncertainty. Our starting point is the availability of worst-case or probabilistic error descriptions for the dynamics and a control barrier function (CBF). These descriptions give rise to second-order cone constraints (SOCCs) whose simultaneous satisfaction guarantees safe stabilization. We study the feasibility of such SOCCs and the regularity properties of various controllers satisfying them.

Submitted 2 December, 2023; v1 submitted 11 January, 2023; originally announced January 2023.

arXiv:2203.12550 [pdf, other]

doi: 10.1109/LCSYS.2022.3188934

Optimization-Based Safe Stabilizing Feedback with Guaranteed Region of Attraction

Authors: Pol Mestres, Jorge Cortés

Abstract: This paper proposes an optimization with penalty-based feedback design framework for safe stabilization of control-affine systems. Our starting point is the availability of a control Lyapunov function (CLF) and a control barrier function (CBF) defining affine-in-the-input inequalities that certify, respectively, the stability and safety objectives for the dynamics. Leveraging ideas from penalty methods for constrained optimization, the proposed design framework imposes one of the inequalities as a hard constraint and the other one as a soft constraint. We study the properties of the closed-loop system under the resulting feedback controller and identify conditions on the penalty parameter to eliminate undesired equilibria that might arise. Going beyond the local stability guarantees available in the literature, we are able to provide an inner approximation of the region of attraction of the equilibrium, and identify conditions under which the whole safe set belongs to it. Simulations illustrate our results.

Submitted 25 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: 6 pages, 1 figure, submitted to IEEE Control Systems Letters and 61st IEEE Conference on Decision and Control

MSC Class: 93C10

Journal ref: IEEE Control Systems Letters, vol. 7, pp. 367-372, 2023

Showing 1–12 of 12 results for author: Mestres, P