-
Stability Margins of Neural Network Controllers
Authors:
Neelay Junnarkar,
Murat Arcak,
Peter Seiler
Abstract:
We present a method to train neural network controllers with guaranteed stability margins. The method is applicable to linear time-invariant plants interconnected with uncertainties and nonlinearities that are described by integral quadratic constraints. The type of stability margin we consider is the disk margin. Our training method alternates between a training step to maximize reward and a stability margin-enforcing step. In the stability margin-enforcing step, we solve a semidefinite program to project the controller into the set of controllers for which we can certify the desired disk margin.
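The alternation between a reward-maximizing step and a projection step can be illustrated with a toy projected gradient ascent. This is only a sketch under loud assumptions: the quadratic reward, the `target` vector, and the Euclidean-ball "certified set" are all hypothetical stand-ins. In the paper the projection is a semidefinite program over controllers with a certified disk margin, which is not reproduced here.

```python
import math

def project_onto_ball(theta, radius):
    """Stand-in for the margin-enforcing step: Euclidean projection onto a ball.
    (The paper's projection solves a semidefinite program instead.)"""
    norm = math.sqrt(sum(t * t for t in theta))
    if norm <= radius:
        return list(theta)
    scale = radius / norm
    return [scale * t for t in theta]

def reward_gradient(theta, target):
    """Gradient of the hypothetical reward -||theta - target||^2."""
    return [-2.0 * (t - g) for t, g in zip(theta, target)]

def train(theta, target, radius, step=0.1, iters=200):
    """Alternate a gradient-ascent step on the reward with a projection step."""
    for _ in range(iters):
        grad = reward_gradient(theta, target)
        theta = [t + step * g for t, g in zip(theta, grad)]  # training step
        theta = project_onto_ball(theta, radius)             # enforcing step
    return theta

# The unconstrained maximizer [3, 4] lies outside the unit ball, so the
# iterates settle on the boundary point closest to it.
theta = train([0.0, 0.0], target=[3.0, 4.0], radius=1.0)
```

The design point the sketch tries to convey is that every iterate returned by the loop lies in the feasible set, so the guarantee holds at whatever iteration training is stopped.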
Submitted 13 September, 2024;
originally announced September 2024.
-
Fast Assignment in Asset-Guarding Engagements using Function Approximation
Authors:
Neelay Junnarkar,
Emmanuel Sin,
Peter Seiler,
Douglas Philbrick,
Murat Arcak
Abstract:
This letter considers assignment problems consisting of n pursuers attempting to intercept n targets. We consider stationary targets as well as targets maneuvering toward an asset. The assignment algorithm relies on an n x n cost matrix where entry (i, j) is the minimum time for pursuer i to intercept target j. Each entry of this matrix requires the solution of a nonlinear optimal control problem. This subproblem is computationally intensive and hence the computational cost of the assignment is dominated by the construction of the cost matrix. We propose to use neural networks for function approximation of the minimum time until intercept. The neural networks are trained offline, thus allowing for real-time online construction of cost matrices. Moreover, the function approximators have sufficient accuracy to obtain reasonable solutions to the assignment problem. In most cases, the approximators achieve assignments with optimal worst-case intercept time. The proposed approach is demonstrated on several examples with increasing numbers of pursuers and targets.
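To make the role of the cost matrix concrete, the following is a minimal sketch of an assignment that minimizes the worst-case intercept time. It assumes that objective (the abstract suggests it but does not state the assignment algorithm), uses brute-force search rather than whatever solver the letter uses, and the 3 x 3 matrix holds made-up numbers standing in for the neural network's time estimates.

```python
from itertools import permutations

def bottleneck_assignment(cost):
    """Return (assignment, worst_time) minimizing the largest assigned cost.

    cost[i][j] is the (approximate) minimum time for pursuer i to intercept
    target j. Brute force over all n! permutations: small n only.
    """
    n = len(cost)
    best_perm, best_worst = None, float("inf")
    for perm in permutations(range(n)):
        # Worst-case intercept time if pursuer i takes target perm[i].
        worst = max(cost[i][perm[i]] for i in range(n))
        if worst < best_worst:
            best_perm, best_worst = perm, worst
    return list(best_perm), best_worst

# Hypothetical cost matrix (illustrative values, not from the letter).
cost = [
    [4.0, 9.0, 7.0],
    [8.0, 3.0, 6.0],
    [5.0, 8.0, 2.0],
]
assignment, worst_time = bottleneck_assignment(cost)
# assignment[i] is the target index given to pursuer i.
```

Here pursuer i is matched to target i, with a worst-case time of 4.0. In this sketch, filling `cost` is the only step that would call the trained approximator, which matches the abstract's point that building the matrix dominates the computation.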
Submitted 11 April, 2024;
originally announced April 2024.
-
Grouping of $N-1$ Contingencies for Controller Synthesis: A Study for Power Line Failures
Authors:
Neelay Junnarkar,
Emily Jensen,
Xiaofan Wu,
Suat Gumussoy,
Murat Arcak
Abstract:
The problem of maintaining power system stability and performance after the failure of any single line in a power system (an "N-1 contingency") is investigated. Due to the large number of possible N-1 contingencies for a power network, it is impractical to optimize controller parameters for each possible contingency a priori. A method to partition a set of contingencies into groups of contingencies that are similar to each other from a control perspective is presented. Design of a single controller for each group, rather than for each contingency, provides a computationally tractable method for maintaining stability and performance after element failures. The choice of number of groups tunes a trade-off between computation time and controller performance for a given set of contingencies. Results are simulated on the IEEE 39-bus and 68-bus systems, illustrating that, with controllers designed for a relatively small number of groups, power system stability may be significantly improved after an N-1 contingency compared to continued use of the nominal controller. Furthermore, performance is comparable to that of controllers designed for each contingency individually.
Submitted 10 April, 2024;
originally announced April 2024.
-
Synthesizing Neural Network Controllers with Closed-Loop Dissipativity Guarantees
Authors:
Neelay Junnarkar,
Murat Arcak,
Peter Seiler
Abstract:
In this paper, a method is presented to synthesize neural network controllers such that the feedback system of plant and controller is dissipative, certifying performance requirements such as L2 gain bounds. The class of plants considered is that of linear time-invariant (LTI) systems interconnected with an uncertainty, including nonlinearities treated as an uncertainty for convenience of analysis. The uncertainty of the plant and the nonlinearities of the neural network are both described using integral quadratic constraints (IQCs). First, a dissipativity condition is derived for uncertain LTI systems. Second, this condition is used to construct a linear matrix inequality (LMI) which can be used to synthesize neural network controllers. Finally, this convex condition is used in a projection-based training method to synthesize neural network controllers with dissipativity guarantees. Numerical examples on an inverted pendulum and a flexible rod on a cart are provided to demonstrate the effectiveness of this approach.
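For readers unfamiliar with the term, the standard textbook notion of dissipativity and its link to an $L_2$ gain bound can be written as follows. The symbols $x$, $w$, $z$, $V$, $s$, and $\gamma$ are assumed notation, not taken from the paper. A system with state $x$, input $w$, and output $z$ is dissipative with respect to a supply rate $s$ if there is a storage function $V \ge 0$ such that, along all trajectories and for all $T \ge 0$,

$$V(x(T)) - V(x(0)) \le \int_0^T s(w(t), z(t))\,dt.$$

Choosing $s(w, z) = \gamma^2 \|w\|^2 - \|z\|^2$ and starting from $x(0) = 0$ with $V(0) = 0$ gives

$$\int_0^T \|z(t)\|^2\,dt \le \gamma^2 \int_0^T \|w(t)\|^2\,dt,$$

that is, an $L_2$ gain from $w$ to $z$ of at most $\gamma$.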
Submitted 10 April, 2024;
originally announced April 2024.
-
Exploiting Symmetry in Dynamics for Model-Based Reinforcement Learning with Asymmetric Rewards
Authors:
Yasin Sonmez,
Neelay Junnarkar,
Murat Arcak
Abstract:
Recent work in reinforcement learning has leveraged symmetries in the model to improve sample efficiency in training a policy. A commonly used simplifying assumption is that the dynamics and reward both exhibit the same symmetry; however, in many real-world environments, the dynamical model exhibits symmetry independent of the reward model. In this paper, we assume only the dynamics exhibit symmetry, extending the scope of problems in reinforcement learning and learning in control theory to which symmetry techniques can be applied. We use Cartan's moving frame method to introduce a technique for learning dynamics that, by construction, exhibit specified symmetries. Numerical experiments demonstrate that the proposed method learns a more accurate dynamical model.
Submitted 16 August, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
Certifying Stability and Performance of Uncertain Differential-Algebraic Systems: A Dissipativity Framework
Authors:
Emily Jensen,
Neelay Junnarkar,
Murat Arcak,
Xiaofan Wu,
Suat Gumussoy
Abstract:
This paper presents a novel framework for characterizing dissipativity of uncertain systems whose dynamics evolve according to differential-algebraic equations. Sufficient conditions for dissipativity (specializing to, e.g., stability or $L_2$ gain bounds) are provided in the case that uncertainties are characterized by integral quadratic constraints. For polynomial or linear dynamics, these conditions can be efficiently verified through sum-of-squares or semidefinite programming. Performance analysis of the IEEE 39-bus power network with a set of potential line failures modeled as an uncertainty set provides an illustrative example that highlights the computational tractability of this approach; conservatism introduced in this example is shown to be quite minimal.
Submitted 10 May, 2024; v1 submitted 16 August, 2023;
originally announced August 2023.
-
Synthesis of Stabilizing Recurrent Equilibrium Network Controllers
Authors:
Neelay Junnarkar,
He Yin,
Fangda Gu,
Murat Arcak,
Peter Seiler
Abstract:
We propose a parameterization of a nonlinear dynamic controller based on the recurrent equilibrium network, a generalization of the recurrent neural network. We derive constraints on the parameterization under which the controller guarantees exponential stability of a partially observed dynamical system with sector-bounded nonlinearities. Finally, we present a method to synthesize this controller using projected policy gradient methods to maximize a reward function with arbitrary structure. The projection step involves the solution of convex optimization problems. We demonstrate the proposed method with simulated examples of controlling nonlinear plants, including plants modeled with neural networks.
Submitted 12 September, 2022; v1 submitted 31 March, 2022;
originally announced April 2022.