Best Response Convergence for Zero-sum Stochastic Dynamic Games with Partial and Asymmetric Information

Authors: Yuxiang Guan, Iman Shames, Tyler H. Summers

Abstract: We analyze best response dynamics for finding a Nash equilibrium of an infinite horizon zero-sum stochastic linear quadratic dynamic game (LQDG) with partial and asymmetric information. We derive explicit expressions for each player's best response within the class of pure linear dynamic output feedback control strategies where the internal state dimension of each control strategy is an integer multiple of the system state dimension. With each best response, the players form increasingly higher-order belief states, leading to infinite-dimensional internal states. However, we observe in extensive numerical experiments that the game's value converges after just a few iterations, suggesting that strategies associated with increasingly higher-order belief states eventually provide no benefit. To help explain this convergence, our numerical analysis reveals rapid decay of the controllability and observability Gramian eigenvalues and Hankel singular values in higher-order belief dynamics, indicating that the higher-order belief dynamics become increasingly difficult for both players to control and observe. Consequently, the higher-order belief dynamics can be closely approximated by low-order belief dynamics with bounded error, and thus feedback strategies with limited internal state dimension can closely approximate a Nash equilibrium.

Submitted 10 January, 2025; originally announced January 2025.

arXiv:2410.03106

A Policy Iteration Algorithm for N-player General-Sum Linear Quadratic Dynamic Games

Authors: Yuxiang Guan, Giulio Salizzoni, Maryam Kamgarpour, Tyler H. Summers

Abstract: We present a policy iteration algorithm for infinite-horizon N-player general-sum deterministic linear quadratic dynamic games and compare it to policy gradient methods. We demonstrate that the proposed policy iteration algorithm is distinct from the Gauss-Newton policy gradient method in the N-player game setting, in contrast to the single-player setting where, under a suitable choice of step size, they are equivalent. We illustrate in numerical experiments that the convergence rate of the proposed policy iteration algorithm significantly surpasses that of the Gauss-Newton policy gradient method and other policy gradient variations. Furthermore, our numerical results indicate that, compared to policy gradient methods, the convergence performance of the proposed policy iteration algorithm is less sensitive to the initial policy and changes in the number of players.

Submitted 3 October, 2024; originally announced October 2024.

arXiv:2409.15493

Autonomous Exploration and Semantic Updating of Large-Scale Indoor Environments with Mobile Robots

Authors: Sai Haneesh Allu, Itay Kadosh, Tyler Summers, Yu Xiang

Abstract: We introduce a new robotic system that enables a mobile robot to autonomously explore an unknown environment, build a semantic map of the environment, and subsequently update the semantic map to reflect environment changes, such as location changes of objects. Our system leverages a LiDAR scanner for 2D occupancy grid mapping and an RGB-D camera for object perception. We introduce a semantic map representation that combines a 2D occupancy grid map for geometry with a topological map for object semantics. This map representation enables us to effectively update the semantics by deleting or adding nodes to the topological map. Our system has been tested on a Fetch robot, semantically mapping a 93m x 90m and a 9m x 13m indoor environment and updating their semantic maps once objects are moved in the environments.

Submitted 3 March, 2025; v1 submitted 23 September, 2024; originally announced September 2024.

Comments: 7 pages, 7 figures. Project page is available at https://irvlutd.github.io/SemanticMapping/

arXiv:2403.05466

Grasping Trajectory Optimization with Point Clouds

Authors: Yu Xiang, Sai Haneesh Allu, Rohith Peddi, Tyler Summers, Vibhav Gogate

Abstract: We introduce a new trajectory optimization method for robotic grasping based on a point-cloud representation of robots and task spaces. In our method, robots are represented by 3D points on their link surfaces. The task space of a robot is represented by a point cloud that can be obtained from depth sensors. Using the point-cloud representation, goal reaching in grasping can be formulated as point matching, while collision avoidance can be efficiently achieved by querying the signed distance values of the robot points in the signed distance field of the scene points. Consequently, a constrained nonlinear optimization problem is formulated to solve the joint motion and grasp planning problem. The advantage of our method is that the point-cloud representation is general enough to be used with any robot in any environment. We demonstrate the effectiveness of our method by performing experiments on a tabletop scene and a shelf scene for grasping with a Fetch mobile manipulator and a Franka Panda arm. The project page is available at https://irvlutd.github.io/GraspTrajOpt

Submitted 7 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: Published in IROS 2024

arXiv:2402.16861

Self-Tuning Network Control Architectures with Joint Sensor and Actuator Selection

Authors: Karthik Ganapathy, Iman Shames, Mathias Hudoba de Badyn, Tyler Summers

Abstract: We formulate a mathematical framework for designing a self-tuning network control architecture, and propose a computationally-feasible greedy algorithm for online architecture optimization. In this setting, the locations of active sensors and actuators in the network, as well as the feedback control policy are jointly adapted using all available information about the network states and dynamics to optimize a performance criterion. We show that the case with full-state feedback can be solved with dynamic programming, and in the linear-quadratic setting, the optimal cost functions and policies are piecewise quadratic and piecewise linear, respectively. Our framework is extended for joint sensor and actuator selection for dynamic output feedback control with both control performance and architecture costs. For large networks where exhaustive architecture search is prohibitive, we describe a greedy heuristic for actuator selection and propose a greedy swapping algorithm for joint sensor and actuator selection. Via numerical experiments, we demonstrate a dramatic performance improvement of greedy self-tuning architectures over fixed architectures. Our general formulation provides an extremely rich and challenging problem space with opportunities to apply a wide variety of approximation methods from stochastic control, system identification, reinforcement learning, and static architecture design for practical model-based control.

Submitted 19 January, 2024; originally announced February 2024.

Comments: 12 pages, submitted to IEEE-TCNS. arXiv admin note: text overlap with arXiv:2301.06699

arXiv:2402.15464

doi 10.1109/LRA.2024.3368233

CLIPPER+: A Fast Maximal Clique Algorithm for Robust Global Registration

Authors: Kaveh Fathian, Tyler Summers

Abstract: We present CLIPPER+, an algorithm for finding maximal cliques in unweighted graphs for outlier-robust global registration. The registration problem can be formulated as a graph and solved by finding its maximum clique. This formulation leads to extreme robustness to outliers; however, finding the maximum clique is an NP-hard problem, and therefore approximation is required in practice for large-size problems. The performance of an approximation algorithm is evaluated by its computational complexity (the lower the runtime, the better) and solution accuracy (how close the solution is to the maximum clique). Accordingly, the main contribution of CLIPPER+ is outperforming the state-of-the-art in accuracy while maintaining a relatively low runtime. CLIPPER+ builds on prior work (CLIPPER [1] and PMC [2]) and prunes the graph by removing vertices that have a small core number and cannot be a part of the maximum clique. This will result in a smaller graph, on which the maximum clique can be estimated considerably faster. We evaluate the performance of CLIPPER+ on standard graph benchmarks, as well as synthetic and real-world point cloud registration problems. These evaluations demonstrate that CLIPPER+ has the highest accuracy and can register point clouds in scenarios where over $99\%$ of associations are outliers. Our code and evaluation benchmarks are released at https://github.com/ariarobotics/clipperp.

Submitted 23 February, 2024; originally announced February 2024.

Journal ref: IEEE ROBOTICS AND AUTOMATION LETTERS, 2024
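
The core-number pruning step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the CLIPPER+ implementation: the function names (`build_adjacency`, `core_numbers`, `prune_for_clique`) and the toy graph are hypothetical, and the peeling loop is a simple quadratic-time version chosen for readability. It relies only on the standard fact that every vertex of a clique of size m has core number at least m - 1, so once some clique of size `lower_bound` is known, vertices with a smaller core number cannot belong to a larger clique.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def core_numbers(adj):
    """Compute core numbers by repeatedly peeling a vertex of minimum remaining degree."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # the core level never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def prune_for_clique(adj, lower_bound):
    """Keep only vertices that could lie in a clique larger than lower_bound.

    A vertex in a clique of size lower_bound + 1 has core number >= lower_bound,
    so every vertex with a smaller core number is dropped.
    """
    core = core_numbers(adj)
    keep = {v for v in adj if core[v] >= lower_bound}
    return {v: adj[v] & keep for v in keep}


# Toy graph (assumed for illustration): a 4-clique {0, 1, 2, 3} attached to a
# triangle {4, 5, 6} through the edge (3, 4).
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (5, 6), (4, 6)]
ADJ = build_adjacency(EDGES)
# Suppose a heuristic has already found a clique of size 3 (the triangle).
PRUNED = prune_for_clique(ADJ, lower_bound=3)
```

On this toy graph the triangle vertices have core number 2 and are removed, leaving only the 4-clique for the subsequent (more expensive) clique search.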

arXiv:2309.08821

Distributionally Robust CVaR-Based Safety Filtering for Motion Planning in Uncertain Environments

Authors: Sleiman Safaoui, Tyler H. Summers

Abstract: Safety is a core challenge of autonomous robot motion planning, especially in the presence of dynamic and uncertain obstacles. Many recent results use learning and deep learning-based motion planners and prediction modules to predict multiple possible obstacle trajectories and generate obstacle-aware ego robot plans. However, planners that ignore the inherent uncertainties in such predictions incur collision risks and lack formal safety guarantees. In this paper, we present a computationally efficient safety filtering solution to reduce the collision risk of ego robot motion plans using multiple samples of obstacle trajectory predictions. The proposed approach reformulates the collision avoidance problem by computing safe halfspaces based on obstacle sample trajectories using distributionally robust optimization (DRO) techniques. The safe halfspaces are used in a model predictive control (MPC)-like safety filter to apply corrections to the reference ego trajectory thereby promoting safer planning. The efficacy and computational efficiency of our approach are demonstrated through numerical simulations.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2302.10411

Regret Analysis of Online LQR Control via Trajectory Prediction and Tracking: Extended Version

Authors: Yitian Chen, Timothy L. Molloy, Tyler Summers, Iman Shames

Abstract: In this paper, we propose and analyze a new method for online linear quadratic regulator (LQR) control with a priori unknown time-varying cost matrices. The cost matrices are revealed sequentially with the potential for future values to be previewed over a short window. Our novel method involves using the available cost matrices to predict the optimal trajectory, and a tracking controller to drive the system towards it. We adopt the notion of dynamic regret to measure the performance of this proposed online LQR control method, with our main result being that the (dynamic) regret of our method is upper bounded by a constant. Moreover, the regret upper bound decays exponentially with the preview window length, and is extendable to systems with disturbances. We show in simulations that our proposed method offers improved performance compared to other previously proposed online LQR methods.

Submitted 20 February, 2023; originally announced February 2023.

Comments: Submitted to L4DC2023

MSC Class: 49N10; 49M05

arXiv:2301.06699

doi 10.1109/CDC51059.2022.9992780

Self-Tuning Network Control Architectures

Authors: Tyler Summers, Karthik Ganapathy, Iman Shames, Mathias Hudoba de Badyn

Abstract: We formulate a general mathematical framework for self-tuning network control architecture design. This problem involves jointly adapting the locations of active sensors and actuators in the network and the feedback control policy to all available information about the time-varying network state and dynamics to optimize a performance criterion. We propose a general solution structure analogous to the classical self-tuning regulator from adaptive control. We show that a special case with full-state feedback can be solved in principle with dynamic programming, and in the linear quadratic setting the optimal cost functions and policies are piecewise quadratic and piecewise linear, respectively. For large networks where exhaustive architecture search is prohibitive, we describe a greedy heuristic for joint architecture-policy design. We demonstrate in numerical experiments that self-tuning architectures can provide dramatically improved performance over fixed architectures. Our general formulation provides an extremely rich and challenging problem space with opportunities to apply a wide variety of approximation methods from stochastic control, system identification, reinforcement learning, and static architecture design.

Submitted 16 January, 2023; originally announced January 2023.

Comments: 6 pages, 5 figures

Journal ref: 61st Conference on Decision and Control, pp 5876-5881, 2022

arXiv:2209.08869

Data-driven distributionally robust MPC for systems with uncertain dynamics

Authors: Francesco Micheli, Tyler Summers, John Lygeros

Abstract: We present a novel data-driven distributionally robust Model Predictive Control formulation for unknown discrete-time linear time-invariant systems affected by unknown and possibly unbounded additive uncertainties. We use off-line collected data and an approximate model of the dynamics to formulate a finite-horizon optimization problem. To account for both the uncertainty related to the dynamics and the disturbance acting on the system, we resort to a distributionally robust formulation that optimizes the cost expectation while satisfying Conditional Value-at-Risk constraints with respect to the worst-case probability distributions of the uncertainties within an ambiguity set defined using the Wasserstein metric. Using results from the distributionally robust optimization literature we derive a tractable finite-dimensional convex optimization problem with finite-sample guarantees for the class of convex piecewise affine cost and constraint functions. The performance of the proposed algorithm is demonstrated in closed-loop simulation on a simple numerical example.

Submitted 19 September, 2022; originally announced September 2022.
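
As a rough illustration of the Conditional Value-at-Risk (CVaR) constraints mentioned in the abstract above, the sketch below evaluates the empirical CVaR of a set of loss samples with the Rockafellar-Uryasev formula. It is only the nominal, sample-based quantity: the worst case over a Wasserstein ambiguity set that the paper considers is not modelled here. The function names `empirical_cvar` and `satisfies_cvar_constraint`, and the sample data, are assumptions made for illustration.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha in [0, 1) via the Rockafellar-Uryasev formula.

    CVaR_alpha(L) = min over t of  t + E[max(L - t, 0)] / (1 - alpha).
    The objective is convex and piecewise linear in t with kinks at the
    samples, so it is enough to evaluate it at each sample value.
    """
    n = len(losses)

    def objective(t):
        excess = sum(max(loss - t, 0.0) for loss in losses)
        return t + excess / (n * (1.0 - alpha))

    return min(objective(t) for t in losses)


def satisfies_cvar_constraint(losses, alpha, threshold):
    """Check the sample-based constraint CVaR_alpha(loss) <= threshold."""
    return empirical_cvar(losses, alpha) <= threshold


# Hypothetical constraint-violation samples: the ten losses 1, 2, ..., 10.
SAMPLES = [float(i) for i in range(1, 11)]
# At alpha = 0.8 the CVaR is the average of the worst 20% of samples (9 and 10).
CVAR_80 = empirical_cvar(SAMPLES, alpha=0.8)
```

With `alpha = 0` the quantity reduces to the sample mean, and it grows towards the worst sample as `alpha` approaches 1, which is why a CVaR constraint is more conservative than a constraint on the expected value.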

arXiv:2208.09268

Sparse Structure Design for Stochastic Linear Systems via a Linear Matrix Inequality Approach

Authors: Yi Guo, Ognjen Stanojev, Gabriela Hug, Tyler Summers

Abstract: In this paper, we propose a sparsity-promoting feedback control design for stochastic linear systems with multiplicative noise. The objective is to identify a sparse control architecture that optimizes the closed-loop performance while stabilizing the system in the mean-square sense. The proposed approach approximates the nonconvex combinatorial optimization problem by minimizing various matrix norms subject to the Linear Matrix Inequality (LMI) stability condition. We present two design problems to reduce the number of actuators via the static state-feedback and a low-dimensional output. A regularized linear quadratic regulator with multiplicative noise (LQRm) optimal control problem and its convex relaxation are presented to demonstrate the tradeoff between the suboptimal closed-loop performance and the sparsity degree of control structure. Case studies on power grids for wide-area frequency control show that the proposed sparsity-promoting control can considerably reduce the number of actuators without significant loss in system performance. The sparse control architecture is robust to substantial system-level disturbances while achieving mean-square stability.

Submitted 19 August, 2022; originally announced August 2022.

arXiv:2205.05119

Robust Data-Driven Output Feedback Control via Bootstrapped Multiplicative Noise

Authors: Benjamin Gravell, Iman Shames, Tyler Summers

Abstract: We propose a robust data-driven output feedback control algorithm that explicitly incorporates inherent finite-sample model estimate uncertainties into the control design. The algorithm has three components: (1) a subspace identification nominal model estimator; (2) a bootstrap resampling method that quantifies non-asymptotic variance of the nominal model estimate; and (3) a non-conventional robust control design method comprising a coupled optimal dynamic output feedback filter and controller with multiplicative noise. A key advantage of the proposed approach is that the system identification and robust control design procedures both use stochastic uncertainty representations, so that the actual inherent statistical estimation uncertainty directly aligns with the uncertainty the robust controller is being designed against. Moreover, the control design method accommodates a highly structured uncertainty representation that can capture uncertainty shape more effectively than existing approaches. We show through numerical experiments that the proposed robust data-driven output feedback controller can significantly outperform a certainty equivalent controller on various measures of sample complexity and stability robustness.

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2204.04310

Risk-Bounded Temporal Logic Control of Continuous-Time Stochastic Systems

Authors: Sleiman Safaoui, Lars Lindemann, Iman Shames, Tyler H. Summers

Abstract: Motivated by the recent interest in risk-aware control, we study a continuous-time control synthesis problem to bound the risk that a stochastic linear system violates a given specification. We use risk signal temporal logic as a specification formalism in which distributionally robust risk predicates are considered and equipped with the usual Boolean and temporal operators. Our control approach relies on reformulating these risk predicates as deterministic predicates over mean and covariance states of the system. We then obtain a timed sequence of sets of mean and covariance states from the timed automata representation of the specification. To avoid an explosion in the number of automata states, we propose heuristics to find candidate sequences effectively. To execute and check dynamic feasibility of these sequences, we present a sampled-data control technique based on time discretization and constraint tightening that allows timed transitions to be performed while satisfying the continuous-time constraints.

Submitted 8 April, 2022; originally announced April 2022.

Comments: 8 pages, 4 figures, contributed paper at the 2022 American Control Conference (ACC) in Atlanta, GA

arXiv:2203.17165

Policy Iteration for Multiplicative Noise Output Feedback Control

Authors: Benjamin Gravell, Matilde Gargiani, John Lygeros, Tyler H. Summers

Abstract: We propose a policy iteration algorithm for solving the multiplicative noise linear quadratic output feedback design problem. The algorithm solves a set of coupled Riccati equations for estimation and control arising from a partially observable Markov decision process (POMDP) under a class of linear dynamic control policies. We show in numerical experiments far faster convergence than a value iteration algorithm, formerly the only known algorithm for solving this class of problems. The results suggest promising future research directions for policy optimization algorithms in more general POMDPs, including the potential to develop novel approximate data-driven approaches when model parameters are not available.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:2203.11327

An Online Joint Optimization-Estimation Architecture for Distribution Networks

Authors: Yi Guo, Xinyang Zhou, Changhong Zhao, Lijun Chen, Gabriela Hug, Tyler H. Summers

Abstract: In this paper, we propose an optimal control-estimation architecture for distribution networks, which jointly solves the optimal power flow (OPF) problem and static state estimation (SE) problem through an online gradient-based feedback algorithm. The main objective is to enable a fast and timely interaction between the optimal controllers and state estimators with limited sensor measurements. First, convergence and optimality of the proposed algorithm are analytically established. Then, the proposed gradient-based algorithm is modified by introducing statistical information of the inherent estimation and linearization errors for an improved and robust performance of the online control decisions. Overall, the proposed method eliminates the traditional separation of control and operation, where control and estimation usually operate at distinct layers and different time-scales. Hence, it enables a computationally affordable, efficient and robust online operational framework for distribution networks under time-varying settings.

Submitted 30 August, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

arXiv:2203.08678

Dynamic Programming Through the Lens of Semismooth Newton-Type Methods (Extended Version)

Authors: Matilde Gargiani, Andrea Zanelli, Dominic Liao-McPherson, Tyler Summers, John Lygeros

Abstract: Policy iteration and value iteration are at the core of many (approximate) dynamic programming methods. For Markov Decision Processes with finite state and action spaces, we show that they are instances of semismooth Newton-type methods to solve the Bellman equation. In particular, we prove that policy iteration is equivalent to the exact semismooth Newton method and enjoys a local quadratic convergence rate. This finding is corroborated by extensive numerical evidence in the fields of control and operations research, which confirms that policy iteration generally requires few iterations to achieve convergence even when the number of policies is vast. We then show that value iteration is an instance of the fixed-point iteration method. In this spirit, we develop a novel locally accelerated version of value iteration with global convergence guarantees and negligible extra computational costs.

Submitted 24 June, 2022; v1 submitted 16 March, 2022; originally announced March 2022.
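
The contrast drawn in the abstract above between policy iteration and value iteration can be seen on a very small example. The sketch below is not taken from the paper: the three-state, two-action cost-minimizing MDP, the discount factor, and all function names are hypothetical. Policy evaluation solves the linear system (I - γ P_π) v = c_π exactly, which is the step the abstract identifies with a Newton step on the Bellman equation, while value iteration applies the Bellman operator as a fixed-point map.

```python
# Hypothetical toy MDP: P[a][s][t] is the probability of moving from state s
# to state t under action a, and COST[a][s] is the stage cost to be minimized.
P = [
    [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.1, 0.9]],  # action 0
    [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.0, 0.0, 1.0]],  # action 1
]
COST = [
    [2.0, 1.0, 0.0],  # action 0
    [3.0, 2.0, 0.5],  # action 1
]
GAMMA = 0.9
N_STATES, N_ACTIONS = 3, 2


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def q_value(v, s, a):
    """One-step lookahead cost of taking action a in state s, then following v."""
    return COST[a][s] + GAMMA * sum(P[a][s][t] * v[t] for t in range(N_STATES))


def evaluate(policy):
    """Exact policy evaluation: solve (I - GAMMA * P_pi) v = c_pi."""
    A = [[(1.0 if s == t else 0.0) - GAMMA * P[policy[s]][s][t]
          for t in range(N_STATES)] for s in range(N_STATES)]
    b = [COST[policy[s]][s] for s in range(N_STATES)]
    return solve(A, b)


def policy_iteration():
    """Alternate exact evaluation and greedy improvement until the policy is stable."""
    policy, iters = [0] * N_STATES, 0
    while True:
        v = evaluate(policy)
        greedy = [min(range(N_ACTIONS), key=lambda a: q_value(v, s, a))
                  for s in range(N_STATES)]
        iters += 1
        if greedy == policy:
            return v, policy, iters
        policy = greedy


def value_iteration(tol=1e-10):
    """Apply the Bellman operator repeatedly until successive iterates agree."""
    v, iters = [0.0] * N_STATES, 0
    while True:
        new_v = [min(q_value(v, s, a) for a in range(N_ACTIONS))
                 for s in range(N_STATES)]
        iters += 1
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v, iters
        v = new_v
```

Calling `policy_iteration()` and `value_iteration()` on this toy problem returns matching value functions, with policy iteration stopping after a handful of evaluations while value iteration needs many more sweeps at this discount factor.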

arXiv:2202.12802

Probabilistic Data Association for Semantic SLAM at Scale

Authors: Elad Michael, Tyler Summers, Tony A. Wood, Chris Manzie, Iman Shames

Abstract: With advances in image processing and machine learning, it is now feasible to incorporate semantic information into the problem of simultaneous localisation and mapping (SLAM). Previously, SLAM was carried out using lower level geometric features (points, lines, and planes) which are often view-point dependent and error prone in visually repetitive environments. Semantic information can improve the ability to recognise previously visited locations, as well as maintain sparser maps for long term SLAM applications. However, SLAM in repetitive environments has the critical problem of assigning measurements to the landmarks which generated them. In this paper, we use k-best assignment enumeration to compute marginal assignment probabilities for each measurement landmark pair, in real time. We present numerical studies on the KITTI dataset to demonstrate the effectiveness and speed of the proposed framework.

Submitted 25 February, 2022; originally announced February 2022.

Comments: 6 Pages, 3 figures, submitted to Robotics and Automation Letters and the IROS 2020 conference

MSC Class: 4104 (Primary); 05-08 (Secondary)

arXiv:2202.00308

PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation

Authors: Matilde Gargiani, Andrea Zanelli, Andrea Martinelli, Tyler Summers, John Lygeros

Abstract: Despite their success, policy gradient methods suffer from high variance of the gradient estimate, which can result in unsatisfactory sample complexity. Recently, numerous variance-reduced extensions of policy gradient methods with provably better sample complexity and competitive numerical performance have been proposed. After a compact survey on some of the main variance-reduced REINFORCE-type methods, we propose ProbAbilistic Gradient Estimation for Policy Gradient (PAGE-PG), a novel loopless variance-reduced policy gradient method based on a probabilistic switch between two types of updates. Our method is inspired by the PAGE estimator for supervised learning and leverages importance sampling to obtain an unbiased gradient estimator. We show that PAGE-PG enjoys a $\mathcal{O}\left( ε^{-3} \right)$ average sample complexity to reach an $ε$-stationary solution, which matches the sample complexity of its most competitive counterparts under the same setting. A numerical evaluation confirms the competitive performance of our method on classical control tasks.

Submitted 1 February, 2022; originally announced February 2022.

arXiv:2201.01483 [pdf, other]

Risk Bounded Nonlinear Robot Motion Planning With Integrated Perception & Control

Authors: Venkatraman Renganathan, Sleiman Safaoui, Aadi Kothari, Benjamin Gravell, Iman Shames, Tyler Summers

Abstract: Robust autonomy stacks require tight integration of perception, motion planning, and control layers, but these layers often inadequately incorporate inherent perception and prediction uncertainties, either ignoring them altogether or making questionable assumptions of Gaussianity. Robots with nonlinear dynamics and complex sensing modalities operating in an uncertain environment demand more careful consideration of how uncertainties propagate across stack layers. We propose a framework to integrate perception, motion planning, and control by explicitly incorporating perception and prediction uncertainties into planning so that risks of constraint violation can be mitigated. Specifically, we use a nonlinear model predictive control based steering law coupled with a decorrelation scheme based Unscented Kalman Filter for state and environment estimation to propagate the robot state and environment uncertainties. Subsequently, we use distributionally robust risk constraints to limit the risk in the presence of these uncertainties. Finally, we present a layered autonomy stack consisting of a nonlinear steering-based distributionally robust motion planning module and a reference trajectory tracking module. Our numerical experiments with nonlinear robot models and an urban driving simulator show the effectiveness of our proposed approaches.

Submitted 5 January, 2022; originally announced January 2022.

Comments: arXiv admin note: text overlap with arXiv:2002.02928

arXiv:2112.13932 [pdf, other]

Distributionally Robust Bootstrap Optimization

Authors: Tyler Summers, Maryam Kamgarpour

Abstract: Control architectures and autonomy stacks for complex engineering systems are often divided into layers to decompose a complex problem and solution into distinct, manageable sub-problems. To simplify designs, uncertainties are often ignored across layers, an approach with deep roots in classical notions of separation and certainty equivalence. But to develop robust architectures, especially as interactions between data-driven learning layers and model-based decision-making layers grow more intricate, more sophisticated interfaces between layers are required. We propose a basic architecture that couples a statistical parameter estimation layer with a constrained optimization layer. We show how the layers can be tightly integrated by combining bootstrap resampling with distributionally robust optimization. The approach allows a finite-data out-of-sample safety guarantee and an exact reformulation as a tractable finite-dimensional convex optimization problem.

Submitted 27 December, 2021; originally announced December 2021.

arXiv:2106.16078 [pdf, ps, other]

Identification of Linear Systems with Multiplicative Noise from Multiple Trajectory Data

Authors: Yu Xing, Benjamin Gravell, Xingkang He, Karl Henrik Johansson, Tyler Summers

Abstract: The paper studies identification of linear systems with multiplicative noise from multiple-trajectory data. An algorithm based on the least-squares method and multiple-trajectory data is proposed for joint estimation of the nominal system matrices and the covariance matrix of the multiplicative noise. The algorithm does not need prior knowledge of the noise or stability of the system, but requires only independent inputs with pre-designed first and second moments and relatively small trajectory length. The study of identifiability of the noise covariance matrix shows that there exists an equivalent class of matrices that generate the same second-moment dynamic of system states. It is demonstrated how to obtain the equivalent class based on estimates of the noise covariance. Asymptotic consistency of the algorithm is verified under sufficiently exciting inputs and system controllability conditions. Non-asymptotic performance of the algorithm is also analyzed under the assumption that the system is bounded. The analysis provides high-probability bounds vanishing as the number of trajectories grows to infinity. The results are illustrated by numerical simulations.

Submitted 6 June, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

arXiv:2103.15228 [pdf, other]

doi 10.1109/LCSYS.2021.3134944

Anomaly Detection Under Multiplicative Noise Model Uncertainty

Authors: Venkatraman Renganathan, Benjamin J. Gravell, Justin Ruths, Tyler H. Summers

Abstract: State estimators are crucial components of anomaly detectors that are used to monitor cyber-physical systems. Many frequently-used state estimators are susceptible to model risk as they rely critically on the availability of an accurate state-space model. Modeling errors make it more difficult to distinguish whether deviations from expected behavior are due to anomalies or simply a lack of knowledge about the system dynamics. In this research, we account for model uncertainty through a multiplicative noise framework. Specifically, we propose to use the multiplicative noise LQG based compensator in this setting to hedge against the model uncertainty risk. The size of the residual from the estimator can then be compared against a threshold to detect anomalies. Finally, the proposed detector is validated using numerical simulations. Extension of state-of-the-art anomaly detection in cyber-physical systems to handle model uncertainty represents the main novel contribution of the present work.

Submitted 18 December, 2021; v1 submitted 28 March, 2021; originally announced March 2021.

Journal ref: IEEE Control Systems Letters 2022

arXiv:2103.05572 [pdf, other]

Risk-Averse RRT* Planning with Nonlinear Steering and Tracking Controllers for Nonlinear Robotic Systems Under Uncertainty

Authors: Sleiman Safaoui, Benjamin J. Gravell, Venkatraman Renganathan, Tyler H. Summers

Abstract: We propose a two-phase risk-averse architecture for controlling stochastic nonlinear robotic systems. We present Risk-Averse Nonlinear Steering RRT* (RANS-RRT*) as an RRT* variant that incorporates nonlinear dynamics by solving a nonlinear program (NLP) and accounts for risk by approximating the state distribution and performing a distributionally robust (DR) collision check to promote safe planning. The generated plan is used as a reference for a low-level tracking controller. We demonstrate three controllers: finite horizon linear quadratic regulator (LQR) with linearized dynamics around the reference trajectory, LQR with robustness-promoting multiplicative noise terms, and a nonlinear model predictive control law (NMPC). We demonstrate the effectiveness of our algorithm using unicycle dynamics under heavy-tailed Laplace process noise in a cluttered environment.

Submitted 3 September, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: First three authors contributed equally

arXiv:2101.08829 [pdf, other]

Centralized Collision-free Polynomial Trajectories and Goal Assignment for Aerial Swarms

Authors: Benjamin Gravell, Tyler Summers

Abstract: Computationally tractable methods are developed for centralized goal assignment and planning of collision-free polynomial-in-time trajectories for systems of multiple aerial robots. The method first assigns robots to goals to minimize total time-in-motion based on initial trajectories. By coupling the assignment and trajectory generation, the initial motion plans tend to require only limited collision resolution. The plans are then refined by checking for potential collisions and resolving them using either start time delays or altitude assignment. Numerical experiments using both methods show significant reductions in the total time required for agents to arrive at goals with only modest additional computational effort in comparison to state-of-the-art prior work, enabling planning for thousands of agents.

Submitted 21 January, 2021; originally announced January 2021.

arXiv:2012.05268 [pdf, other]

Revisiting the Water Quality Sensor Placement Problem: Optimizing Network Observability and State Estimation Metrics

Authors: Ahmad F. Taha, Shen Wang, Yi Guo, Tyler H. Summers, Nikolaos Gatsis, Marcio H. Giacomoni, Ahmed A. Abokifa

Abstract: Real-time water quality (WQ) sensors in water distribution networks (WDN) have the potential to enable network-wide observability of water quality indicators, contamination event detection, and closed-loop feedback control of WQ dynamics. To that end, prior research has investigated a wide range of methods that guide the geographic placement of WQ sensors. These methods assign a metric for fixed sensor placement (SP) followed by \textit{metric-optimization} to obtain optimal SP. These metrics include minimizing intrusion detection time, minimizing the expected population and amount of contaminated water affected by an intrusion event. In contrast to the literature, the objective of this paper is to provide a computational method that considers the overlooked metric of state estimation and network-wide observability of the WQ dynamics. This metric finds the optimal WQ sensor placement that minimizes the state estimation error via the Kalman filter for noisy WQ dynamics -- a metric that quantifies WDN observability. To that end, the state-space dynamics of WQ states for an entire WDN are given and the observability-driven sensor placement algorithm is presented. The algorithm takes into account the time-varying nature of WQ dynamics due to changes in the hydraulic profile -- a collection of hydraulic states including heads (pressures) at nodes and flow rates in links which are caused by a demand profile over a certain period of time. Thorough case studies are given, highlighting key findings, observations, and recommendations for WDN operators. Github codes are included for reproducibility.

Submitted 9 December, 2020; originally announced December 2020.

arXiv:2011.14212 [pdf, other]

Approximate Midpoint Policy Iteration for Linear Quadratic Control

Authors: Benjamin Gravell, Iman Shames, Tyler Summers

Abstract: We present a midpoint policy iteration algorithm to solve linear quadratic optimal control problems in both model-based and model-free settings. The algorithm is a variation of Newton's method, and we show that in the model-based setting it achieves cubic convergence, which is superior to standard policy iteration and policy gradient algorithms that achieve quadratic and linear convergence, respectively. We also demonstrate that the algorithm can be approximately implemented without knowledge of the dynamics model by using least-squares estimates of the state-action value function from trajectory data, from which policy improvements can be obtained. With sufficient trajectory data, the policy iterates converge cubically to approximately optimal policies, and this occurs with the same available sample budget as the approximate standard policy iteration. Numerical experiments demonstrate effectiveness of the proposed algorithms.

Submitted 15 February, 2022; v1 submitted 28 November, 2020; originally announced November 2020.

arXiv:2011.01522 [pdf, other]

doi 10.1109/LCSYS.2021.3058269

Higher-Order Moment-Based Anomaly Detection

Authors: Venkatraman Renganathan, Navid Hashemi, Justin Ruths, Tyler H. Summers

Abstract: The identification of anomalies is a critical component of operating complex, and possibly large-scale and geographically distributed cyber-physical systems. While designing anomaly detectors, it is common to assume Gaussian noise models to maintain tractability; however, this assumption can lead to the actual false alarm rate being significantly higher than expected. Here we design a distributionally robust threshold of detection using finite and fixed higher-order moments of the detection measure data such that it guarantees the actual false alarm rate to be upper bounded by the desired one. Further, we bound the states reachable through the action of a stealthy attack and identify the trade-off between this impact of attacks that cannot be detected and the worst-case false alarm rate. Through numerical experiments, we illustrate how knowledge of higher-order moments results in a tightened threshold, thereby restricting an attacker's potential impact.

Submitted 6 February, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

Comments: arXiv admin note: text overlap with arXiv:1909.12506

arXiv:2006.00317 [pdf, other]

doi 10.1109/LCSYS.2020.2998543

Control Design for Risk-Based Signal Temporal Logic Specifications

Authors: Sleiman Safaoui, Lars Lindemann, Dimos V Dimarogonas, Iman Shames, Tyler H Summers

Abstract: We present a general framework for risk semantics on Signal Temporal Logic (STL) specifications for stochastic dynamical systems using axiomatic risk theory. We show that under our recursive risk semantics, risk constraints on STL formulas can be expressed in terms of risk constraints on atomic predicates. We then show how this allows a (stochastic) STL risk constraint to be transformed into a risk-tightened deterministic STL constraint on a related deterministic nominal system, enabling the application of existing STL methods. For affine predicate functions and a (coherent) Distributionally Robust Value at Risk measure, we show how risk constraints on atomic predicates can be reformulated as tightened deterministic affine constraints. We demonstrate the framework using a Model Predictive Control (MPC) design with an STL risk constraint.

Submitted 30 May, 2020; originally announced June 2020.

Comments: 6 pages, 1 figure, to be published in IEEE L-CSS

arXiv:2005.08382 [pdf, other]

Optimal Pump Control for Water Distribution Networks via Data-based Distributional Robustness

Authors: Yi Guo, Shen Wang, Ahmad Taha, Tyler Summers

Abstract: In this paper, we propose a data-based methodology to solve a multi-period stochastic optimal water flow (OWF) problem for water distribution networks (WDNs). The framework explicitly considers the pump schedule and water network head level with limited information of demand forecast errors for an extended period simulation. The objective is to determine the optimal feedback decisions of network-connected components, such as nominal pump schedules and tank head levels and reserve policies, which specify device reactions to forecast errors for accommodation of fluctuating water demand. Instead of assuming the uncertainties across the water network are generated by a prescribed certain distribution, we consider ambiguity sets of distributions centered at an empirical distribution, which is based directly on a finite training data set. We use a distance-based ambiguity set with the Wasserstein metric to quantify the distance between the real unknown data-generating distribution and the empirical distribution. This allows our multi-period OWF framework to trade off system performance and inherent sampling errors in the training dataset. Case studies on a three-tank water distribution network systematically illustrate the tradeoff between pump operational cost, risks of constraint violation, and out-of-sample performance.

Submitted 24 April, 2022; v1 submitted 17 May, 2020; originally announced May 2020.

arXiv:2005.00345 [pdf, other]

Optimal Power Flow with State Estimation In the Loop for Distribution Networks

Authors: Yi Guo, Xinyang Zhou, Changhong Zhao, Lijun Chen, Tyler H. Summers

Abstract: We propose a framework for integrating optimal power flow (OPF) with state estimation (SE) in the loop for distribution networks. Our approach combines a primal-dual gradient-based OPF solver with a SE feedback loop based on a limited set of sensors for system monitoring, instead of assuming exact knowledge of all states. The estimation algorithm reduces uncertainty on unmeasured grid states based on a few appropriate online state measurements and noisy "pseudo-measurements". We analyze the convergence of the proposed algorithm and quantify the statistical estimation errors based on a weighted least squares (WLS) estimator. The numerical results on a 4521-node network demonstrate that this approach can scale to extremely large networks and provide robustness to both large pseudo measurement variability and inherent sensor measurement noise.

Submitted 4 May, 2022; v1 submitted 29 April, 2020; originally announced May 2020.

Comments: arXiv admin note: text overlap with arXiv:1909.12763

arXiv:2004.08019 [pdf, ps, other]

Robust Control Design for Linear Systems via Multiplicative Noise

Authors: Benjamin Gravell, Peyman Mohajerin Esfahani, Tyler Summers

Abstract: Robust stability and stochastic stability have separately seen intense study in control theory for many decades. In this work we establish relations between these properties for discrete-time systems and employ them for robust control design. Specifically, we examine a multiplicative noise framework which models the inherent uncertainty and variation in the system dynamics which arise in model-based learning control methods such as adaptive control and reinforcement learning. We provide results which guarantee robustness margins in terms of perturbations on the nominal dynamics as well as algorithms which generate maximally robust controllers.

Submitted 16 April, 2020; originally announced April 2020.

arXiv:2004.07176 [pdf, other]

Trust-based user-interface design for human-automation systems

Authors: Abraham P. Vinod, Adam J. Thorpe, Philip A. Olaniyi, Tyler H. Summers, Meeko M. K. Oishi

Abstract: We present a method for dynamics-driven, user-interface design for a human-automation system via sensor selection. We define the user-interface to be the output of a MIMO LTI system, and formulate the design problem as one of selecting an output matrix from a given set of candidate output matrices. Sufficient conditions for situation awareness are captured as additional constraints on the selection of the output matrix. These constraints depend upon the level of trust the human has in the automation. We show that the resulting user-interface design problem is a combinatorial, set-cardinality minimization problem with set function constraints. We propose tractable algorithms to compute optimal or sub-optimal solutions with suboptimality bounds. Our approaches exploit monotonicity and submodularity present in the design problem, and rely on constraint programming and submodular maximization. We apply this method to the IEEE 118-bus, to construct correct-by-design interfaces under various operating scenarios.

Submitted 15 April, 2020; originally announced April 2020.

Comments: 20 pages, 8 figures, 4 tables

arXiv:2002.10069 [pdf, other]

Robust Learning-Based Control via Bootstrapped Multiplicative Noise

Authors: Benjamin Gravell, Tyler Summers

Abstract: Despite decades of research and recent progress in adaptive control and reinforcement learning, there remains a fundamental lack of understanding in designing controllers that provide robustness to inherent non-asymptotic uncertainties arising from models estimated with finite, noisy data. We propose a robust adaptive control algorithm that explicitly incorporates such non-asymptotic uncertainties into the control design. The algorithm has three components: (1) a least-squares nominal model estimator; (2) a bootstrap resampling method that quantifies non-asymptotic variance of the nominal model estimate; and (3) a non-conventional robust control design method using an optimal linear quadratic regulator (LQR) with multiplicative noise. A key advantage of the proposed approach is that the system identification and robust control design procedures both use stochastic uncertainty representations, so that the actual inherent statistical estimation uncertainty directly aligns with the uncertainty the robust controller is being designed against. We show through numerical experiments that the proposed robust adaptive controller can significantly outperform the certainty equivalent controller on both expected regret and measures of regret risk.

Submitted 11 August, 2021; v1 submitted 23 February, 2020; originally announced February 2020.

arXiv:2002.06613 [pdf, other]

Linear System Identification Under Multiplicative Noise from Multiple Trajectory Data

Authors: Yu Xing, Ben Gravell, Xingkang He, Karl Henrik Johansson, Tyler Summers

Abstract: The study of multiplicative noise models has a long history in control theory but is re-emerging in the context of complex networked systems and systems with learning-based control. We consider linear system identification with multiplicative noise from multiple state-input trajectory data. We propose exploratory input signals along with a least-squares algorithm to simultaneously estimate nominal system parameters and multiplicative noise covariance matrices. Identifiability of the covariance structure and asymptotic consistency of the least-squares estimator are demonstrated by analyzing first and second moment dynamics of the system. The results are illustrated by numerical simulations.

Submitted 3 July, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

arXiv:2002.02928 [pdf, other]

Towards Integrated Perception and Motion Planning with Distributionally Robust Risk Constraints

Authors: Venkatraman Renganathan, Iman Shames, Tyler H. Summers

Abstract: Safely deploying robots in uncertain and dynamic environments requires a systematic accounting of various risks, both within and across layers in an autonomy stack from perception to motion planning and control. Many widely used motion planning algorithms do not adequately incorporate inherent perception and prediction uncertainties, often ignoring them altogether or making questionable assumptions of Gaussianity. We propose a distributionally robust incremental sampling-based motion planning framework that explicitly and coherently incorporates perception and prediction uncertainties. We design output feedback policies and consider moment-based ambiguity sets of distributions to enforce probabilistic collision avoidance constraints under the worst-case distribution in the ambiguity set. Our solution approach, called Output Feedback Distributionally Robust $RRT^{*}$ (OFDR-$RRT^{*}$), produces asymptotically optimal risk-bounded trajectories for robots operating in dynamic, cluttered, and uncertain environments, explicitly incorporating mapping and localization error, stochastic process disturbances, unpredictable obstacle motion, and uncertain obstacle locations. Numerical experiments illustrate the effectiveness of the proposed algorithm.

Submitted 7 February, 2020; originally announced February 2020.

arXiv:1912.05149 [pdf, other]

doi 10.1109/TAC.2020.3044284

Actuator Placement under Structural Controllability using Forward and Reverse Greedy Algorithms

Authors: Baiwei Guo, Orcun Karaca, Tyler Summers, Maryam Kamgarpour

Abstract: Actuator placement is an active field of research which has received significant attention for its applications in complex dynamical networks. In this paper, we study the problem of finding a set of actuator placements minimizing the metric that measures the average energy consumed for state transfer by the controller, while satisfying a structural controllability requirement and a cardinality constraint on the number of actuators allowed. As no computationally efficient methods are known to solve such combinatorial set function optimization problems, two greedy algorithms, forward and reverse, are proposed to obtain approximate solutions. We first show that the constraint sets these algorithms explore can be characterized by matroids. We then obtain performance guarantees for the forward and reverse greedy algorithms applied to the general class of matroid optimization problems by exploiting properties of the objective function such as the submodularity ratio and the curvature. Finally, we propose feasibility check methods for both algorithms based on maximum flow problems on certain auxiliary graphs originating from the network graph. Our results are verified with case studies over large networks.

Submitted 29 October, 2020; v1 submitted 11 December, 2019; originally announced December 2019.

Journal ref: IEEE Transactions on Automatic Control, 2020

arXiv:1909.12763 [pdf, other]

Solving Optimal Power Flow for Distribution Networks with State Estimation Feedback

Authors: Yi Guo, Xinyang Zhou, Changhong Zhao, Yue Chen, Tyler Summers, Lijun Chen

Abstract: Conventional optimal power flow (OPF) solvers assume full observability of the involved system states. However, in practice, there is a lack of reliable system monitoring devices in the distribution networks. To close the gap between the theoretic algorithm design and practical implementation, this work proposes to solve the OPF problems based on the state estimation (SE) feedback for the distribution networks where only a part of the involved system states are physically measured. The SE feedback increases the observability of the under-measured system and provides more accurate system states monitoring when the measurements are noisy. We analytically investigate the convergence of the proposed algorithm. The numerical results demonstrate that the proposed approach is more robust to large pseudo measurement variability and inherent sensor noise in comparison to the other frameworks without SE feedback.

Submitted 16 March, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

arXiv:1909.12506 [pdf, other]

Distributionally Robust Tuning of Anomaly Detectors in Cyber-Physical Systems with Stealthy Attacks

Authors: Venkatraman Renganathan, Navid Hashemi, Justin Ruths, Tyler H. Summers

Abstract: Designing resilient control strategies for mitigating stealthy attacks is a crucial task in emerging cyber-physical systems. In the design of anomaly detectors, it is common to assume Gaussian noise models to maintain tractability; however, this assumption can lead the actual false alarm rate to be significantly higher than expected. We propose a distributionally robust anomaly detector for noise distributions in moment-based ambiguity sets. We design a detection threshold that guarantees that the actual false alarm rate is upper bounded by the desired one by using generalized Chebyshev inequalities. Furthermore, we highlight an important trade-off between the worst-case false alarm rate and the potential impact of a stealthy attacker by efficiently computing an outer ellipsoidal bound for the attack-reachable states corresponding to the distributionally robust detector threshold. We illustrate this trade-off with a numerical example and compare the proposed approach with a traditional chi-squared detector.
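
Illustrative note (not part of the arXiv record): a moment-based threshold of the kind this abstract describes can be sketched with the simplest multivariate Chebyshev (Markov) bound. This is an assumption-laden sketch and may differ from the paper's exact threshold: it assumes a zero-mean residual r of dimension n with known covariance Σ, so the statistic z = rᵀΣ⁻¹r has mean n and P(z ≥ α) ≤ n/α for any distribution with those moments. The function name `chebyshev_threshold` is hypothetical.

```python
def chebyshev_threshold(dim: int, false_alarm_rate: float) -> float:
    """Return alpha such that P(z >= alpha) <= false_alarm_rate.

    Assumes z = r^T Sigma^{-1} r for a zero-mean residual r with covariance
    Sigma, so E[z] = dim. Markov's inequality then gives P(z >= alpha) <= dim / alpha,
    regardless of the shape of the noise distribution.
    """
    if dim <= 0:
        raise ValueError("dim must be a positive integer")
    if not 0.0 < false_alarm_rate <= 1.0:
        raise ValueError("false_alarm_rate must lie in (0, 1]")
    return dim / false_alarm_rate


# Example: a 2-dimensional residual with a 5% desired false alarm rate.
alpha = chebyshev_threshold(2, 0.05)
```

The resulting threshold is more conservative than a chi-squared quantile for the same rate, which is the price of not assuming Gaussian noise.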

Submitted 27 September, 2019; originally announced September 2019.

Journal ref: 2020 Annual American Control Conference (ACC)

arXiv:1908.09034 [pdf, other]

Stochastic Dynamic Programming for Wind Farm Power Maximization

Authors: Yi Guo, Mario Rotea, Tyler Summers

Abstract: Wind farms can increase annual energy production (AEP) with advanced control algorithms by coordinating the set points of individual turbine controllers across the farm. However, it remains a significant challenge to achieve performance improvements in practice because of the difficulty of utilizing models that capture pertinent complex aerodynamic phenomena while remaining amenable to control design. We formulate a multi-stage stochastic optimal control problem for wind farm power maximization and show that it can be solved analytically via dynamic programming. In particular, our model incorporates state- and input-dependent multiplicative noise whose distributions capture stochastic wind fluctuations. The optimal control policies and value functions explicitly incorporate the moments of these distributions, establishing a connection between wind flow data and optimal feedback control. We illustrate the results with numerical experiments that demonstrate the advantages of our approach over existing methods based on deterministic models.

Submitted 16 March, 2020; v1 submitted 23 August, 2019; originally announced August 2019.

arXiv:1905.13548 [pdf, other]

Sparse optimal control of networks with multiplicative noise via policy gradient

Authors: Benjamin Gravell, Yi Guo, Tyler Summers

Abstract: We give algorithms for designing near-optimal sparse controllers using policy gradient with applications to control of systems corrupted by multiplicative noise, which is increasingly important in emerging complex dynamical networks. Various regularization schemes are examined and incorporated into the optimization by the use of gradient, subgradient, and proximal gradient methods. Numerical experiments on a large networked system show that the algorithms converge to performant sparse mean-square stabilizing controllers.

Submitted 28 May, 2019; originally announced May 2019.

arXiv:1905.13547 [pdf, other]

Learning robust control for LQR systems with multiplicative noise via policy gradient

Authors: Benjamin Gravell, Peyman Mohajerin Esfahani, Tyler Summers

Abstract: The linear quadratic regulator (LQR) problem has reemerged as an important theoretical benchmark for reinforcement learning-based control of complex dynamical systems with continuous state and action spaces. In contrast with nearly all recent work in this area, we consider multiplicative noise models, which are increasingly relevant because they explicitly incorporate inherent uncertainty and variation in the system dynamics and thereby improve robustness properties of the controller. Robustness is a critical and poorly understood issue in reinforcement learning; existing methods which do not account for uncertainty can converge to fragile policies or fail to converge at all. Additionally, intentional injection of multiplicative noise into learning algorithms can enhance robustness of policies, as observed in ad hoc work on domain randomization. Although policy gradient algorithms require optimization of a non-convex cost function, we show that the multiplicative noise LQR cost has a special property called gradient domination, which is exploited to prove global convergence of policy gradient algorithms to the globally optimum control policy with polynomial dependence on problem parameters. Results are provided both in the model-known and model-unknown settings where samples of system trajectories are used to estimate policy gradients.

Submitted 1 May, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

arXiv:1903.08120 [pdf, other]

doi 10.1109/CDC40024.2019.9030204

Actuator Placement for Optimizing Network Performance under Controllability Constraints

Authors: Baiwei Guo, Orcun Karaca, Tyler Summers, Maryam Kamgarpour

Abstract: With the rising importance of large-scale network control, the problem of actuator placement has received increasing attention. Our goal in this paper is to find a set of actuators minimizing the metric that measures the average energy consumption of the control inputs while ensuring structural controllability of the network. As this problem is intractable, a greedy algorithm can be used to obtain an approximate solution. To provide a performance guarantee for this approach, we first define the submodularity ratio for the metric under consideration and then reformulate the structural controllability constraint as a matroid constraint. This shows that the problem under study can be characterized by a matroid optimization involving a weakly submodular objective function. Then, we derive a novel performance guarantee for the greedy algorithm applied to this class of optimization problems. Finally, we show that the matroid feasibility check for the greedy algorithm can be cast as a maximum matching problem in a certain auxiliary bipartite graph related to the network graph.

Submitted 19 March, 2019; originally announced March 2019.

Journal ref: IEEE 58th Conference on Decision and Control (CDC), 2019

arXiv:1903.00635 [pdf, other]

A Performance and Stability Analysis of Low-inertia Power Grids with Stochastic System Inertia

Authors: Yi Guo, Tyler H. Summers

Abstract: Traditional synchronous generators with rotational inertia are being replaced by low-inertia renewable energy resources (RESs) in many power grids and operational scenarios. Due to emerging market mechanisms, inherent variability of RESs, and existing control schemes, the resulting system inertia levels can not only be low but also markedly time-varying. In this paper, we investigate performance and stability of low-inertia power systems with stochastic system inertia. In particular, we consider system dynamics modelled by a linearized stochastic swing equation, where stochastic system inertia is regarded as multiplicative noise. The $\mathcal{H}_2$ norm is used to quantify the performance of the system in the presence of persistent disturbances or transient faults. The performance metric can be computed by solving a generalized Lyapunov equation, which has fundamentally different characteristics from systems with only additive noise. For grids with uniform inertia and damping parameters, we derive closed-form expressions for the $\mathcal{H}_2$ norm of the proposed stochastic swing equation. The analysis gives insights into how the $\mathcal{H}_2$ norm of the stochastic swing equation depends on 1) network topology; 2) system parameters; and 3) distribution parameters of disturbances. A mean-square stability condition is also derived. Numerical results provide additional insights for performance and stability of the stochastic swing equation.

Submitted 2 March, 2019; originally announced March 2019.

arXiv:1812.04771 [pdf, other]

Robust Optimal Design of Energy Efficient Series Elastic Actuators: Application to a Powered Prosthetic Ankle

Authors: Edgar Bolívar, Siavash Rezazadeh, Tyler Summers, Robert D. Gregg

Abstract: Design of robotic systems that safely and efficiently operate in uncertain operational conditions, such as rehabilitation and physical assistance robots, remains an important challenge in the field. Current methods for the design of energy efficient series elastic actuators use an optimization formulation that typically assumes known operational conditions. This approach could lead to actuators that cannot perform in uncertain environments because elongation, speed, or torque requirements may be beyond actuator specifications when the operation deviates from its nominal conditions. Addressing this gap, we propose a convex optimization formulation to design the stiffness of series elastic actuators to minimize energy consumption and satisfy actuator constraints despite uncertainty due to manufacturing of the spring, unmodeled dynamics, efficiency of the transmission, and the kinematics and kinetics of the load. In our formulation, we express energy consumption as a scalar convex-quadratic function of compliance. In the unconstrained case, this quadratic equation provides an analytical solution to the optimal value of stiffness that minimizes energy consumption for arbitrary periodic reference trajectories. As actuator constraints, we consider peak motor torque, peak motor velocity, limitations due to the speed-torque relationship of DC motors, and peak elongation of the spring. As a simulation case study, we apply our formulation to the robust design of a series elastic actuator for a powered prosthetic ankle.
Our simulation results indicate that a small trade-off between energy efficiency and robustness is justified to design actuators that can operate with uncertainty.

Submitted 5 February, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

arXiv:1811.11792 [pdf, ps, other]

Algorithms for Joint Sensor and Control Nodes Selection in Dynamic Networks

Authors: Sebastian A. Nugroho, Ahmad F. Taha, Nikolaos Gatsis, Tyler H. Summers, Ram Krishnan

Abstract: The problem of placing or selecting sensors and control nodes plays a pivotal role in the operation of dynamic networks. This paper proposes optimal algorithms and heuristics to solve the simultaneous sensor and actuator selection problem in linear dynamic networks. In particular, a sufficiency condition of static output feedback stabilizability is used to obtain the minimal set of sensors and control nodes needed to stabilize an unstable network. We show the joint sensor/actuator selection and output feedback control can be written as a mixed-integer nonconvex problem. To solve this nonconvex combinatorial problem, three methods based on (1) mixed-integer nonlinear programming, (2) binary search algorithms, and (3) simple heuristics are proposed. The first method yields optimal solutions to the selection problem, given that some constants are appropriately selected. The second method requires a database of binary sensor/actuator combinations, returns optimal solutions, and necessitates no tuning parameters. The third approach is a heuristic that yields suboptimal solutions but is computationally attractive. The theoretical properties of these methods are discussed and numerical tests on dynamic networks showcase the trade-off between optimality and computational time.

Submitted 7 March, 2019; v1 submitted 28 November, 2018; originally announced November 2018.

arXiv:1809.00093 [pdf, other]

Robust 3D Distributed Formation Control with Application to Quadrotors

Authors: Kaveh Fathian, Sleiman Safaoui, Tyler H. Summers, Nicholas R. Gans

Abstract: We present a distributed control strategy for a team of quadrotors to autonomously achieve a desired 3D formation. Our approach is based on local relative position measurements and does not require global position information or inter-vehicle communication. We assume that quadrotors have a common sense of direction, which is chosen as the direction of gravitational force measured by their onboard IMU sensors. However, this assumption is not crucial, and our approach is robust to inaccuracies and effects of acceleration on gravitational measurements. In particular, convergence to the desired formation is unaffected if each quadrotor has a velocity vector that projects positively onto the desired velocity vector provided by the formation control strategy. We demonstrate the validity of the proposed approach in an experimental setup and show that a team of quadrotors achieves a desired 3D formation.

Submitted 31 August, 2018; originally announced September 2018.

Comments: Extended abstract

arXiv:1807.11058 [pdf, other]

Robust Distributed Planar Formation Control for Higher-Order Holonomic and Nonholonomic Agents

Authors: Kaveh Fathian, Sleiman Safaoui, Tyler H. Summers, Nicholas R. Gans

Abstract: We present a distributed formation control strategy for agents with a variety of dynamics to achieve a desired planar formation. Our approach is based on the barycentric-coordinate-based (BCB) control, which is fully distributed, does not require inter-agent communication or a common sense of orientation, and can be implemented using relative position measurements acquired by agents in their local coordinate frames. This removes the need for global positioning or alignment of local coordinate frames, which are required across several existing strategies. We show how the BCB control for agents with the simplest dynamical model, i.e., the single-integrator dynamics, can be extended to agents with higher-order dynamics such as quadrotors, and nonholonomic agents such as unicycles and cars. Specifically, our extension preserves the desired convergence and robustness guarantees of the BCB approach and is provably robust to saturations in the input and unmodeled linear actuator dynamics for unicycle and car agents. We further show that under our proposed BCB control design, the agents can move along a rotated and scaled control direction without affecting the convergence to the desired formation. This observation is used to design a fully distributed collision avoidance strategy, which is often not considered in the formation control literature. We demonstrate the proposed approach in simulations and further present a distributed robotic platform to test the strategy experimentally.
Our experimental platform consists of off-the-shelf equipment that can be used to test and validate other multi-agent algorithms. The code and implementation instructions for this platform are available online.

Submitted 2 June, 2020; v1 submitted 29 July, 2018; originally announced July 2018.

arXiv:1806.05481 [pdf, ps, other]

Simultaneous Sensor and Actuator Selection/Placement through Output Feedback Control

Authors: Sebastian Nugroho, Ahmad F. Taha, Tyler Summers, Nikolaos Gatsis

Abstract: In most dynamic networks, it is impractical to measure all of the system states; instead, only a subset of the states are measured through sensors. Consequently, and unlike full state feedback controllers, output feedback control utilizes only the measured states to obtain a stable closed-loop performance. This paper explores the interplay between the selection of a minimal number of sensors and actuators (SaA) that yield a stable closed-loop system performance. Through the formulation of the static output feedback control problem, we show that the simultaneous selection of a minimal set of SaA is a combinatorial optimization problem with mixed-integer nonlinear matrix inequality constraints. To address the computational complexity, we develop two approaches: The first approach relies on integer/disjunctive programming principles, while the second approach is a simple algorithm that is akin to binary search routines. The optimality of the two approaches is also discussed. Numerical experiments are included showing the performance of the developed approaches.

Submitted 14 June, 2018; originally announced June 2018.

Comments: 6 pages

Journal ref: In the Proceedings of the 2018 American Control Conference, Milwaukee, Wisconsin

arXiv:1804.06388 [pdf, ps, other]

Data-based Distributionally Robust Stochastic Optimal Power Flow, Part I: Methodologies

Authors: Yi Guo, Kyri Baker, Emiliano Dall'Anese, Zechun Hu, Tyler H. Summers

Abstract: We propose a data-based method to solve a multi-stage stochastic optimal power flow (OPF) problem based on limited information about forecast error distributions. The framework explicitly combines multi-stage feedback policies with any forecasting method and historical forecast error data. The objective is to determine power scheduling policies for controllable devices in a power network to balance operational cost and conditional value-at-risk (CVaR) of device and network constraint violations. These decisions include both nominal power schedules and reserve policies, which specify planned reactions to forecast errors in order to accommodate fluctuating renewable energy sources. Instead of assuming the uncertainties across the networks follow prescribed probability distributions, we consider ambiguity sets of distributions centered around a finite training dataset. By utilizing the Wasserstein metric to quantify differences between the empirical data-based distribution and the real unknown data-generating distribution, we formulate a multi-stage distributionally robust OPF problem to compute optimal control policies that are robust to both forecast errors and sampling errors inherent in the dataset. Two specific data-based distributionally robust stochastic OPF problems are proposed for distribution networks and transmission systems.

Submitted 25 October, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

Comments: arXiv admin note: text overlap with arXiv:1706.04267

arXiv:1804.06384 [pdf, other]

Data-based Distributionally Robust Stochastic Optimal Power Flow, Part II: Case studies

Authors: Yi Guo, Kyri Baker, Emiliano Dall'Anese, Zechun Hu, Tyler H. Summers

Abstract: This is the second part of a two-part paper on data-based distributionally robust stochastic optimal power flow (OPF). The general problem formulation and methodology have been presented in Part I [1]. Here, we present extensive numerical experiments in both distribution and transmission networks to illustrate the effectiveness and flexibility of the proposed methodology for balancing efficiency, constraint violation risk, and out-of-sample performance. On the distribution side, the method mitigates overvoltages due to high photovoltaic penetration using local energy storage devices. On the transmission side, the method reduces N-1 security line flow constraint risks due to high wind penetration using reserve policies for controllable generators. In both cases, the data-based distributionally robust model predictive control (MPC) algorithm explicitly utilizes forecast error training datasets, which can be updated online. The numerical results illustrate inherent tradeoffs between the operational costs, risks of constraints violations, and out-of-sample performance, offering systematic techniques for system operators to balance these objectives.

Submitted 25 October, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

Showing 1–50 of 69 results for author: Summers, T