Search | arXiv e-print repository

Intrinsic Successive Convexification: Trajectory Optimization on Smooth Manifolds

Authors: Spencer Kraisler, Mehran Mesbahi, Behcet Acikmese

Abstract: A fundamental issue at the core of trajectory optimization on smooth manifolds is handling the implicit manifold constraint within the dynamics. The conventional approach is to enforce the dynamic model as a constraint. However, we show this approach leads to significantly redundant operations, as well as being heavily dependent on the state space representation. Specifically, we propose an intrinsic successive convexification methodology for optimal control on smooth manifolds. This so-called iSCvx is then applied to a representative example involving attitude trajectory optimization for a spacecraft subject to non-convex constraints.

Submitted 16 March, 2025; originally announced March 2025.

arXiv:2501.09192 [pdf, other]

Estimation-Aware Trajectory Optimization with Set-Valued Measurement Uncertainties

Authors: Aditya Deole, Mehran Mesbahi

Abstract: In this paper, an optimization-based framework for generating estimation-aware trajectories is presented. In this setup, measurement (output) uncertainties are state-dependent and set-valued. Enveloping ellipsoids are employed to characterize state-dependent uncertainties with unknown distributions. The concept of regularity for set-valued output maps is then introduced, facilitating the formulation of the estimation-aware trajectory generation problem. Specifically, it is demonstrated that for output-regular maps, one can utilize a set-valued observability measure that is concave with respect to the finite horizon state trajectories. By maximizing this measure, estimation-aware trajectories can then be synthesized for a broad class of systems. Trajectory planning routines are also examined in this work, by which the observability measure is optimized for systems with locally linearized dynamics. To illustrate the effectiveness of the proposed approach, representative examples in the context of trajectory planning with vision-based estimation are presented. Moreover, the paper presents estimation-aware planning for an uncooperative Target-Rendezvous problem, where an Ego-satellite employs an onboard machine learning (ML)-based estimation module to realize the rendezvous trajectory.

Submitted 10 May, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

Comments: 40 pages, 9 figures

arXiv:2412.15573 [pdf, other]

Multi Agent Reinforcement Learning for Sequential Satellite Assignment Problems

Authors: Joshua Holder, Natasha Jaques, Mehran Mesbahi

Abstract: Assignment problems are a classic combinatorial optimization problem in which a group of agents must be assigned to a group of tasks such that maximum utility is achieved while satisfying assignment constraints. Given the utility of each agent completing each task, polynomial-time algorithms exist to solve a single assignment problem in its simplest form. However, in many modern-day applications such as satellite constellations, power grids, and mobile robot scheduling, assignment problems unfold over time, with the utility for a given assignment depending heavily on the state of the system. We apply multi-agent reinforcement learning to this problem, learning the value of assignments by bootstrapping from a known polynomial-time greedy solver and then learning from further experience. We then choose assignments using a distributed optimal assignment mechanism rather than by selecting them directly. We demonstrate that this algorithm is theoretically justified and avoids pitfalls experienced by other RL algorithms in this setting. Finally, we show that our algorithm significantly outperforms other methods in the literature, even while scaling to realistic scenarios with hundreds of agents and tasks.

Submitted 20 December, 2024; originally announced December 2024.

arXiv:2406.04243 [pdf, other]

Policy Optimization in Control: Geometry and Algorithmic Implications

Authors: Shahriar Talebi, Yang Zheng, Spencer Kraisler, Na Li, Mehran Mesbahi

Abstract: This survey explores the geometric perspective on policy optimization within the realm of feedback control systems, emphasizing the intrinsic relationship between control design and optimization. By adopting a geometric viewpoint, we aim to provide a nuanced understanding of how various ``complete parameterization'' -- referring to the policy parameters together with its Riemannian geometry -- of control design problems, influence stability and performance of local search algorithms. The paper is structured to address key themes such as policy parameterization, the topology and geometry of stabilizing policies, and their implications for various (non-convex) dynamic performance measures. We focus on a few iconic control design problems, including the Linear Quadratic Regulator (LQR), Linear Quadratic Gaussian (LQG) control, and $\mathcal{H}_\infty$ control. In particular, we first discuss the topology and Riemannian geometry of stabilizing policies, distinguishing between their static and dynamic realizations. Expanding on this geometric perspective, we then explore structural properties of the aforementioned performance measures and their interplay with the geometry of stabilizing policies in presence of policy constraints; along the way, we address issues such as spurious stationary points, symmetries of dynamic feedback policies, and (non-)smoothness of the corresponding performance measures. We conclude the survey with algorithmic implications of policy optimization in feedback design.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.16680 [pdf, ps, other]

Six-Degree-of-Freedom Aircraft Landing Trajectory Planning with Runway Alignment

Authors: Taewan Kim, Abhinav G. Kamath, Niyousha Rahimi, Jasper Corleis, Behçet Açıkmeşe, Mehran Mesbahi

Abstract: This paper presents a numerical optimization algorithm for generating approach and landing trajectories for a six-degree-of-freedom (6-DoF) aircraft. We improve on the existing research on aircraft landing trajectory generation by formulating the trajectory optimization problem with additional real-world operational constraints, including 6-DoF aircraft dynamics, runway alignment, constant wind field, and obstacle avoidance, to obtain a continuous-time nonconvex optimal control problem. Particularly, the runway alignment constraint enforces the trajectory of the aircraft to be aligned with the runway only during the final approach phase. This is a novel feature that is essential for preventing an approach that is either too steep or too shallow. The proposed method models the runway alignment constraint through a multi-phase trajectory planning scheme, imposing alignment conditions exclusively during the final approach phase. We compare this formulation with the existing state-triggered constraint formulation for runway alignment. To solve the formulated problem, we design a novel sequential convex programming algorithm called xPTR that extends the penalized trust-region (PTR) algorithm by incorporating an extrapolation step to expedite convergence. We validate the proposed method through extensive numerical simulations, including a Monte Carlo study, to evaluate the robustness of the algorithm to varying initial conditions.

Submitted 10 June, 2025; v1 submitted 26 May, 2024; originally announced May 2024.

Comments: This article has been accepted to JGCD

arXiv:2403.17157 [pdf, other]

doi 10.1109/LCSYS.2024.3414962

Output-feedback Synthesis Orbit Geometry: Quotient Manifolds and LQG Direct Policy Optimization

Authors: Spencer Kraisler, Mehran Mesbahi

Abstract: We consider direct policy optimization for the linear-quadratic Gaussian (LQG) setting. Over the past few years, it has been recognized that the landscape of dynamic output-feedback controllers of relevance to LQG has an intricate geometry, particularly pertaining to the existence of degenerate stationary points, that hinders gradient methods. In order to address these challenges, in this paper, we adopt a system-theoretic coordinate-invariant Riemannian metric for the space of dynamic output-feedback controllers and develop a Riemannian gradient descent for direct LQG policy optimization. We then proceed to prove that the orbit space of such controllers, modulo the coordinate transformation, admits a Riemannian quotient manifold structure. This geometric structure--that is of independent interest--provides an effective approach to derive direct policy optimization algorithms for LQG with a local linear rate convergence guarantee. Subsequently, we show that the proposed approach exhibits significantly faster and more robust numerical performance as compared with ordinary gradient descent.

Submitted 15 August, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Journal ref: IEEE Control Systems Letters, vol. 8, pp. 1577-1582, 2024

arXiv:2311.12230 [pdf, other]

Data-Guided Regulator for Adaptive Nonlinear Control

Authors: Niyousha Rahimi, Mehran Mesbahi

Abstract: This paper addresses the problem of designing a data-driven feedback controller for complex nonlinear dynamical systems in the presence of time-varying disturbances with unknown dynamics. Such disturbances are modeled as the "unknown" part of the system dynamics. The goal is to achieve finite-time regulation of system states through direct policy updates while also generating informative data that can subsequently be used for data-driven stabilization or system identification. First, we expand upon the notion of "regularizability" and characterize this system characteristic for a linear time-varying representation of the nonlinear system with locally-bounded higher-order terms. "Rapid-regularizability" then gauges the extent by which a system can be regulated in finite time, in contrast to its asymptotic behavior. We then propose the Data-Guided Regulation for Adaptive Nonlinear Control (DG-RAN) algorithm, an online iterative synthesis procedure that utilizes discrete time-series data from a single trajectory for regulating system states and identifying disturbance dynamics. The effectiveness of our approach is demonstrated on a 6-DOF power descent guidance problem in the presence of adverse environmental disturbances.

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.10221 [pdf, other]

An Active-Sensing Approach for Bearing-based Target Localization

Authors: Beniamino Pozzan, Giulia Michieletto, Mehran Mesbahi, Angelo Cenedese

Abstract: Characterized by a cross-disciplinary nature, the bearing-based target localization task involves estimating the position of an entity of interest by a group of agents capable of collecting noisy bearing measurements. In this work, this problem is tackled by resting both on the weighted least square estimation approach and on the active-sensing control paradigm. Indeed, we propose an iterative algorithm that provides an estimate of the target position under the assumption of Gaussian noise distribution, which can be considered valid when more specific information is missing. Then, we present a seeker agents control law that aims at minimizing the localization uncertainty by optimizing the covariance matrix associated with the estimated target position. The validity of the designed bearing-based target localization solution is confirmed by the results of an extensive Monte Carlo simulation campaign.

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2308.08054 [pdf, other]

Consensus on Lie groups for the Riemannian Center of Mass

Authors: Spencer Kraisler, Shahriar Talebi, Mehran Mesbahi

Abstract: In this paper, we develop a consensus algorithm for distributed computation of the Riemannian center of mass (RCM) on Lie Groups. The algorithm is built upon a distributed optimization reformulation that allows developing an intrinsic, distributed (without relying on a consensus subroutine), and a computationally efficient protocol for the RCM computation. The novel idea for developing this fast distributed algorithm is to utilize a Riemannian version of distributed gradient flow combined with a gradient tracking technique. We first guarantee that, under certain conditions, the limit point of our algorithm is the RCM point of interest. We then provide a proof of global convergence in the Euclidean setting, that can be viewed as a "geometric" dynamic consensus that converges to the average from arbitrary initial points. Finally, we proceed to showcase the superior convergence properties of the proposed approach as compared with other classes of consensus optimization-based algorithms for the RCM computation.

Submitted 15 August, 2023; originally announced August 2023.

arXiv:2305.17836 [pdf, other]

Data-driven Optimal Filtering for Linear Systems with Unknown Noise Covariances

Authors: Shahriar Talebi, Amirhossein Taghvaei, Mehran Mesbahi

Abstract: This paper examines learning the optimal filtering policy, known as the Kalman gain, for a linear system with unknown noise covariance matrices using noisy output data. The learning problem is formulated as a stochastic policy optimization problem, aiming to minimize the output prediction error. This formulation provides a direct bridge between data-driven optimal control and, its dual, optimal filtering. Our contributions are twofold. Firstly, we conduct a thorough convergence analysis of the stochastic gradient descent algorithm, adopted for the filtering problem, accounting for biased gradients and stability constraints. Secondly, we carefully leverage a combination of tools from linear system theory and high-dimensional statistics to derive bias-variance error bounds that scale logarithmically with problem dimension, and, in contrast to subspace methods, the length of output trajectories only affects the bias term.

Submitted 26 October, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

Comments: arXiv admin note: text overlap with arXiv:2210.14878

arXiv:2303.10504 [pdf, other]

doi 10.1109/LCSYS.2023.3290229

Optimization-based Constrained Funnel Synthesis for Systems with Lipschitz Nonlinearities via Numerical Optimal Control

Authors: Taewan Kim, Purnanand Elango, Taylor P. Reynolds, Behçet Açıkmeşe, Mehran Mesbahi

Abstract: This paper presents a funnel synthesis algorithm for computing controlled invariant sets and feedback control gains around a given nominal trajectory for dynamical systems with locally Lipschitz nonlinearities and bounded disturbances. The resulting funnel synthesis problem involves a differential linear matrix inequality (DLMI) whose solution satisfies a Lyapunov condition that implies invariance and attractivity properties. Due to these properties, the proposed method can balance maximization of initial invariant funnel size, i.e., size of the funnel entry, and minimization of the size of the attractive funnel for attenuating the effect of disturbance. To solve the resulting funnel synthesis problem with the DLMI as constraints, we employ a numerical optimal control approach that uses a multiple shooting method to convert the problem into a finite dimensional semidefinite programming problem. This framework does not require piecewise linear system matrices and funnel parameters, which is typically assumed in recent related work. We illustrate the proposed funnel synthesis method with a numerical example.

Submitted 1 July, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

Comments: 6 pages, 3 figures, accepted to LCSS

arXiv:2210.14878 [pdf, other]

Duality-Based Stochastic Policy Optimization for Estimation with Unknown Noise Covariances

Authors: Shahriar Talebi, Amirhossein Taghvaei, Mehran Mesbahi

Abstract: Duality of control and estimation allows mapping recent advances in data-guided control to the estimation setup. This paper formalizes and utilizes such a mapping to consider learning the optimal (steady-state) Kalman gain when process and measurement noise statistics are unknown. Specifically, building on the duality between synthesizing optimal control and estimation gains, the filter design problem is formalized as direct policy learning. In this direction, the duality is used to extend existing theoretical guarantees of direct policy updates for Linear Quadratic Regulator (LQR) to establish global convergence of the Gradient Descent (GD) algorithm for the estimation problem--while addressing subtle differences between the two synthesis problems. Subsequently, a Stochastic Gradient Descent (SGD) approach is adopted to learn the optimal Kalman gain without the knowledge of noise covariances. The results are illustrated via several numerical examples.

Submitted 6 March, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

arXiv:2210.04810 [pdf, other]

Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

Authors: Bin Hu, Kaiqing Zhang, Na Li, Mehran Mesbahi, Maryam Fazel, Tamer Başar

Abstract: Gradient-based methods have been widely used for system design and optimization in diverse application domains. Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning. This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis, popularized by successes of reinforcement learning. We take an interdisciplinary perspective in our exposition that connects control theory, reinforcement learning, and large-scale optimization. We review a number of recently-developed theoretical results on the optimization landscape, global convergence, and sample complexity of gradient-based methods for various continuous control problems such as the linear quadratic regulator (LQR), $\mathcal{H}_\infty$ control, risk-sensitive control, linear quadratic Gaussian (LQG) control, and output feedback synthesis. In conjunction with these optimization results, we also discuss how direct policy optimization handles stability and robustness concerns in learning-based control, two main desiderata in control engineering. We conclude the survey by pointing out several challenges and opportunities at the intersection of learning and control.

Submitted 10 October, 2022; originally announced October 2022.

Comments: To Appear in Annual Review of Control, Robotics, and Autonomous Systems

arXiv:2209.10786 [pdf, ps, other]

Vector-valued Privacy-Preserving Average Consensus

Authors: Lulu Pan, Haibin Shao, Yang Lu, Mehran Mesbahi, Dewei Li, Yugeng Xi

Abstract: Achieving average consensus without disclosing sensitive information can be a critical concern for multi-agent coordination. This paper examines privacy-preserving average consensus (PPAC) for vector-valued multi-agent networks. In particular, a set of agents with vector-valued states aim to collaboratively reach an exact average consensus of their initial states, while each agent's initial state cannot be disclosed to other agents. We show that the vector-valued PPAC problem can be solved via associated matrix-weighted networks with the higher-dimensional agent state. Specifically, a novel distributed vector-valued PPAC algorithm is proposed by lifting the agent-state to higher-dimensional space and designing the associated matrix-weighted network with dynamic, low-rank, positive semi-definite coupling matrices to both conceal the vector-valued agent state and guarantee that the multi-agent network asymptotically converges to the average consensus. Essentially, the convergence analysis can be transformed into the average consensus problem on switching matrix-weighted networks. We show that the exact average consensus can be guaranteed and the initial agents' states can be kept private if each agent has at least one "legitimate" neighbor. The algorithm, involving only basic matrix operations, is computationally more efficient than cryptography-based approaches and can be implemented in a fully distributed manner without relying on a third party. Numerical simulation is provided to illustrate the effectiveness of the proposed algorithm.

Submitted 22 September, 2022; originally announced September 2022.

arXiv:2208.13223 [pdf, ps, other]

Structural Adaptivity of Directed Networks

Authors: Lulu Pan, Haibin Shao, Mehran Mesbahi, Dewei Li, Yugeng Xi

Abstract: Network structure plays a critical role in functionality and performance of network systems. This paper examines structural adaptivity of diffusively coupled, directed multi-agent networks that are subject to diffusion performance. Inspired by the observation that the link redundancy in a network may degrade its diffusion performance, a distributed data-driven neighbor selection framework is proposed to adaptively adjust the network structure for improving the diffusion performance of exogenous influence over the network. Specifically, each agent is allowed to interact with only a specific subset of neighbors while global reachability from exogenous influence to all agents of the network is maintained. Both continuous-time and discrete-time directed networks are examined. For each of the two cases, we first examine the reachability properties encoded in the eigenvectors of perturbed variants of graph Laplacian or SIA matrix associated with directed networks, respectively. Then, an eigenvector-based rule for neighbor selection is proposed to derive a reduced network, on which the diffusion performance is enhanced. Finally, motivated by the necessity of distributed and data-driven implementation of the neighbor selection rule, quantitative connections between eigenvectors of the perturbed graph Laplacian and SIA matrix and relative rate of change in agent state are established, respectively. These connections immediately enable a data-driven inference of the reduced neighbor set for each agent using only locally accessible data.
As an immediate extension, we further discuss the distributed data-driven construction of directed spanning trees of directed networks using the proposed neighbor selection framework. Numerical simulations are provided to demonstrate the theoretical results.

Submitted 28 August, 2022; originally announced August 2022.

arXiv:2208.08969 [pdf, other]

To charge in-flight or not: an inquiry into parallel-hybrid electric aircraft configurations via optimal control

Authors: Mengyuan Wang, Mehran Mesbahi

Abstract: We examine two configurations for parallel hybrid electric aircraft, one with, and one without, a mechanical connection between the engines and the electric motors. For these two designs, we then review the power allocation problem in the context of aircraft energy management for a 19-seat conceptual Hybrid Electric Aircraft. We then represent the original optimal control problem as a finite-dimensional optimization and validate the second-order sufficient conditions for global optimality of the obtained solution. This is then followed by a sensitivity analysis of the fuel consumption on the initial aircraft weight and flight endurance. Our simulation and theoretical results clarify the limited benefit of charging the battery in-flight for this class of hybrid electric aircraft to reduce $CO_2$ emissions.

Submitted 18 August, 2022; originally announced August 2022.

arXiv:2203.05702 [pdf, other]

Vertiport Selection in Hybrid Air-Ground Transportation Networks via Mathematical Programs with Equilibrium Constraints

Authors: Yue Yu, Mengyuan Wang, Mehran Mesbahi, Ufuk Topcu

Abstract: Urban air mobility is a concept that promotes aerial modes of transport in urban areas. In these areas, the location and capacity of the vertiports--where the travelers embark and disembark the aircraft--not only affect the flight delays of the aircraft, but can also aggravate the congestion of ground vehicles by creating extra ground travel demands. We introduce a mathematical model for selecting the location and capacity of the vertiports that minimizes the traffic congestion in hybrid air-ground transportation networks. Our model is based on a mathematical program with bilinear equilibrium constraints. Furthermore, we show how to compute a global optimal solution of this mathematical program by solving a mixed integer linear program. We demonstrate our results via the Anaheim transportation network model, which contains more than 400 nodes and 900 links.

Submitted 1 July, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

arXiv:2201.11157 [pdf, other]

Policy Optimization over Submanifolds for Linearly Constrained Feedback Synthesis

Authors: Shahriar Talebi, Mehran Mesbahi

Abstract: In this paper, we study linearly constrained policy optimization over the manifold of Schur stabilizing controllers, equipped with a Riemannian metric that emerges naturally in the context of optimal control problems. We provide extrinsic analysis of a generic constrained smooth cost function, that subsequently facilitates subsuming any such constrained problem into this framework. By studying the second order geometry of this manifold, we provide a Newton-type algorithm that does not rely on the exponential mapping nor a retraction, while ensuring local convergence guarantees. The algorithm hinges instead upon the developed stability certificate and the linear structure of the constraints. We then apply our methodology to two well-known constrained optimal control problems. Finally, several numerical examples showcase the performance of the proposed algorithm.

Submitted 26 October, 2023; v1 submitted 26 January, 2022; originally announced January 2022.

arXiv:2109.02347 [pdf, ps, other]

Discrete-Time Linear-Quadratic Regulation via Optimal Transport

Authors: Mathias Hudoba de Badyn, Erik Miehling, Dylan Janak, Behçet Açıkmeşe, Mehran Mesbahi, Tamer Başar, John Lygeros, Roy S. Smith

Abstract: In this paper, we consider a discrete-time stochastic control problem with uncertain initial and target states. We first discuss the connection between optimal transport and stochastic control problems of this form. Next, we formulate a linear-quadratic regulator problem where the initial and terminal states are distributed according to specified probability densities. A closed-form solution for the optimal transport map in the case of linear time-varying systems is derived, along with an algorithm for computing the optimal map. Two numerical examples pertaining to swarm deployment demonstrate the practical applicability of the model and the performance of the numerical method.

Submitted 6 September, 2021; originally announced September 2021.

Comments: 8 pages, 6 figures. To be included in the Proceedings of the 60th Conference on Decision and Control. This version includes proofs

arXiv:2107.12022 [pdf, ps, other]

Distributed Neighbor Selection in Multi-agent Networks

Authors: Haibin Shao, Lulu Pan, Mehran Mesbahi, Yugeng Xi, Dewei Li

Abstract: Achieving consensus via nearest neighbor rules is an important prerequisite for multi-agent networks to accomplish collective tasks. A common assumption in the consensus setup is that each agent interacts with all its neighbors. This paper examines whether network functionality and performance can be maintained, and even enhanced, when agents interact only with a subset of their respective (available) neighbors. As shown in the paper, the answer to this inquiry is affirmative. In this direction, we show that by exploring the monotonicity property of the Laplacian eigenvectors, a neighbor selection rule with guaranteed performance enhancements can be realized for consensus-type networks. For distributed implementation, a quantitative connection between entries of Laplacian eigenvectors and the "relative rate of change" in the state between neighboring agents is further established; this connection facilitates a distributed algorithm for each agent to identify "favorable" neighbors to interact with. Multi-agent networks with and without external influence are examined, as well as extensions to signed networks. This paper underscores the utility of Laplacian eigenvectors in the context of distributed neighbor selection, providing novel insights into distributed data-driven control of multi-agent systems.

Submitted 22 June, 2022; v1 submitted 26 July, 2021; originally announced July 2021.

arXiv:2107.09292 [pdf, ps, other]

Cluster Consensus on Matrix-weighted Switching Networks

Authors: Lulu Pan, Haibin Shao, Mehran Mesbahi, Dewei Li, Yugeng Xi

Abstract: This paper examines the cluster consensus problem of multi-agent systems on matrix-weighted switching networks. Necessary and/or sufficient conditions under which cluster consensus can be achieved are obtained, and a quantitative characterization of the steady-state of the cluster consensus is provided as well. Specifically, if the underlying network switches amongst a finite number of networks, a necessary condition for cluster consensus of multi-agent systems on switching matrix-weighted networks is first presented; it is shown that the steady-state of the system lies in the intersection of the null spaces of the matrix-valued Laplacians corresponding to all switching networks. Second, if the underlying network switches amongst an infinite number of networks, the matrix-weighted integral network is employed to provide sufficient conditions for cluster consensus and the quantitative characterization of the corresponding steady-state of the multi-agent system, using null space analysis of the matrix-valued Laplacian of the integral network associated with the switching networks. In particular, conditions for bipartite consensus under matrix-weighted switching networks are examined. Simulation results are finally provided to demonstrate the theoretical analysis.

Submitted 20 July, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

arXiv:2103.11572 [pdf, other]

Data-Driven Structured Policy Iteration for Homogeneous Distributed Systems

Authors: Siavash Alemzadeh, Shahriar Talebi, Mehran Mesbahi

Abstract: Control of networked systems, comprised of interacting agents, is often achieved through modeling the underlying interactions. Constructing accurate models of such interactions--in the meantime--can become prohibitive in applications. Data-driven control methods avoid such complications by directly synthesizing a controller from the observed data. In this paper, we propose an algorithm referred to as Data-driven Structured Policy Iteration (D2SPI), for synthesizing an efficient feedback mechanism that respects the sparsity pattern induced by the underlying interaction network. In particular, our algorithm uses temporary "auxiliary" communication links in order to enable the required information exchange on a (smaller) sub-network during the "learning phase" -- links that will be removed subsequently for the final distributed feedback synthesis. We then proceed to show that the learned policy results in a stabilizing structured policy for the entire network. Our analysis is then followed by showing the stability and convergence of the proposed distributed policies throughout the learning phase, exploiting a construct referred to as the "Patterned monoid." The performance of D2SPI is then demonstrated using representative simulation scenarios.

Submitted 16 November, 2023; v1 submitted 21 March, 2021; originally announced March 2021.

Comments: S. Alemzadeh and S. Talebi contributed equally to this work

arXiv:2102.02953 [pdf, other]

On Controllability and Persistency of Excitation in Data-Driven Control: Extensions of Willems' Fundamental Lemma

Authors: Yue Yu, Shahriar Talebi, Henk J. van Waarde, Ufuk Topcu, Mehran Mesbahi, Behçet Açıkmeşe

Abstract: Willems' fundamental lemma asserts that all trajectories of a linear time-invariant system can be obtained from a finite number of measured ones, assuming that controllability and a persistency of excitation condition hold. We show that these two conditions can be relaxed. First, we prove that the controllability condition can be replaced by a condition on the controllable subspace, unobservable subspace, and a certain subspace associated with the measured trajectories. Second, we prove that the persistency of excitation requirement can be relaxed if the degree of a certain minimal polynomial is tightly bounded. Our results show that data-driven predictive control using online data is equivalent to model predictive control, even for uncontrollable systems. Moreover, our results significantly reduce the amount of data needed in identifying homogeneous multi-agent systems.

Submitted 9 April, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

arXiv:2007.10960 [pdf, other]

Adaptive Traffic Control with Deep Reinforcement Learning: Towards State-of-the-art and Beyond

Authors: Siavash Alemzadeh, Ramin Moslemi, Ratnesh Sharma, Mehran Mesbahi

Abstract: In this work, we study adaptive data-guided traffic planning and control using Reinforcement Learning (RL). We shift from the plain use of classic methods towards the state of the art in the deep RL community. We embed several recent techniques in our algorithm that improve the original Deep Q-Networks (DQN) for discrete control and discuss the traffic-related interpretations that follow. We propose a novel DQN-based algorithm for Traffic Control (called TC-DQN+) as a tool for fast and more reliable traffic decision-making. We introduce a new form of reward function, which is further discussed using illustrative examples with comparisons to traditional traffic control methods.

Submitted 21 July, 2020; originally announced July 2020.

arXiv:2007.05880 [pdf, other]

Deep Learning-based Resource Allocation for Infrastructure Resilience

Authors: Siavash Alemzadeh, Hesam Talebiyan, Shahriar Talebi, Leonardo Duenas-Osorio, Mehran Mesbahi

Abstract: From an optimization point of view, resource allocation is one of the cornerstones of research for addressing limiting factors commonly arising in applications such as power outages and traffic jams. In this paper, we take a data-driven approach to estimate an optimal nodal restoration sequence for immediate recovery of the infrastructure networks after natural disasters such as earthquakes. We generate data from td-INDP, a high-fidelity simulator of optimal restoration strategies for interdependent networks, and employ deep neural networks to approximate those strategies. Despite the fact that the underlying problem is NP-complete, the restoration sequences obtained by our method are observed to be nearly optimal. In addition, by training multiple models--the so-called estimators--for a variety of resource availability levels, our proposed method balances a trade-off between resource utilization and restoration time. Decision-makers can use our trained models to allocate resources more efficiently after contingencies and, in turn, improve community resilience. Besides their predictive power, such trained estimators unravel the effect of interdependencies among different nodal functionalities in the restoration strategies. We showcase our methodology on the real-world interdependent infrastructure of Shelby County, TN.

Submitted 11 July, 2020; originally announced July 2020.

arXiv:2006.16201 [pdf, ps, other]

Graph-theoretic optimization for edge consensus

Authors: Mathias Hudoba de Badyn, Dillon R. Foight, Daniel Calderone, Mehran Mesbahi, Roy S. Smith

Abstract: We consider network structures that optimize the $\mathcal{H}_2$ norm of weighted, time scaled consensus networks, under a minimal representation of such consensus networks described by the edge Laplacian. We show that a greedy algorithm can be used to find the minimum-$\mathcal{H}_2$ norm spanning tree, as well as how to choose edges to optimize the $\mathcal{H}_2$ norm when edges are added back to a spanning tree. In the case of edge consensus with a measurement model considering all edges in the graph, we show that adding edges between slow nodes in the graph provides the smallest increase in the $\mathcal{H}_2$ norm.

Submitted 29 June, 2020; originally announced June 2020.

Comments: 8 pages, 3 figures. Accepted to the 24th International Symposium on Mathematical Theory of Networks and Systems (MTNS 2020), which has been postponed to August 2021. This version is the extended paper, which includes the proofs that were submitted for review

arXiv:2006.09178 [pdf, other]

Policy Gradient-based Algorithms for Continuous-time Linear Quadratic Control

Authors: Jingjing Bu, Afshin Mesbahi, Mehran Mesbahi

Abstract: We consider the continuous-time Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. The results developed are in parallel to those in Bu et al. [1] for discrete-time LTI systems. In this direction, we characterize several analytical properties (smoothness, coerciveness, quadratic growth) that are crucial in the analysis of gradient-based algorithms. We also point out similarities and distinctive features of the continuous time setup in comparison with its discrete time analogue. First, we examine three types of well-posed flows for direct policy update for LQR: gradient flow, natural gradient flow and the quasi-Newton flow. The coercive property of the corresponding cost function suggests that these flows admit unique solutions, while the gradient dominated property indicates that the underlying Lyapunov functionals decay at an exponential rate; quadratic growth, on the other hand, guarantees that the trajectories of these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent and quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent, guaranteeing that both algorithms converge linearly to the global optima. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a $Q$-quadratic convergence rate--and in the meantime--recovering the Kleinman-Newton iteration. Lastly, we examine LQR state feedback synthesis with a sparsity pattern. In this case, we develop the necessary formalism and insights for projected gradient descent, allowing us to guarantee a sublinear rate of convergence to a first-order stationary point.

Submitted 12 June, 2020; originally announced June 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1907.08921

arXiv:2006.08548 [pdf, ps, other]

A Note on Nesterov's Accelerated Method in Nonconvex Optimization: a Weak Estimate Sequence Approach

Authors: Jingjing Bu, Mehran Mesbahi

Abstract: We present a variant of accelerated gradient descent algorithms, adapted from Nesterov's optimal first-order methods, for weakly-quasi-convex and weakly-quasi-strongly-convex functions. We show that by tweaking the so-called estimate sequence method, the derived algorithm achieves the optimal convergence rate for weakly-quasi-convex and weakly-quasi-strongly-convex functions in terms of oracle complexity. In particular, for a weakly-quasi-convex function with Lipschitz continuous gradient, we require $O(\frac{1}{\sqrt{\varepsilon}})$ iterations to acquire an $\varepsilon$-solution; for weakly-quasi-strongly-convex functions, the iteration complexity is $O\left( \ln\left(\frac{1}{\varepsilon}\right) \right)$. Furthermore, we discuss the implications of these algorithms for the linear quadratic optimal control problem.

Submitted 15 June, 2020; originally announced June 2020.

arXiv:2006.04617 [pdf, ps, other]

doi 10.1109/TCNS.2020.3001835

Performance and design of consensus on matrix-weighted and time scaled graphs

Authors: Dillon R. Foight, Mathias Hudoba de Badyn, Mehran Mesbahi

Abstract: In this paper, we consider the $\mathcal{H}_2$-norm of networked systems with multi-time scale consensus dynamics and vector-valued agent states. This allows us to explore how measurement and process noise affect consensus on matrix-weighted graphs by examining edge-state consensus. In particular, we highlight an interesting case where the influences of the weighting and scaling on the $\mathcal{H}_2$ norm can be separated in the design problem. We then consider optimization algorithms for updating the time scale parameters and matrix weights in order to minimize network response to injected noise. Finally, we present an application to formation control for multi-vehicle systems.

Submitted 24 December, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: 10 pages, 5 figures, accepted to the IEEE Transactions on Control of Network Systems. arXiv admin note: text overlap with arXiv:1909.07864

Journal ref: IEEE Transactions on Control of Network Systems, 2020, vol. 7, no. 4, pp. 1812-1822

arXiv:2006.00870 [pdf, ps, other]

From noisy data to feedback controllers: non-conservative design via a matrix S-lemma

Authors: Henk J. van Waarde, M. Kanat Camlibel, Mehran Mesbahi

Abstract: We propose a new method to obtain feedback controllers of an unknown dynamical system directly from noisy input/state data. The key ingredient of our design is a new matrix S-lemma that will be proven in this paper. We provide both strict and non-strict versions of this S-lemma, which are of interest in their own right. Thereafter, we will apply these results to data-driven control. In particular, we will derive non-conservative design methods for quadratic stabilization, H_2 and H_inf control, all in terms of data-based linear matrix inequalities. In contrast to previous work, the dimensions of our decision variables are independent of the time horizon of the experiment. Our approach thus enables control design from large data sets.

Submitted 9 December, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

arXiv:2006.00125 [pdf, other]

doi 10.1109/TAC.2021.3131148

On Regularizability and its Application to Online Control of Unstable LTI Systems

Authors: Shahriar Talebi, Siavash Alemzadeh, Niyousha Rahimi, Mehran Mesbahi

Abstract: Learning, say through direct policy updates, often requires assumptions such as knowing a priori that the initial policy (gain) is stabilizing, or that persistently exciting (PE) input-output data is available. In this paper, we examine online regulation of (possibly unstable) partially unknown linear systems with prior access to neither an initial stabilizing controller nor PE input-output data; we instead leverage the knowledge of the input matrix for online regulation. First, we introduce and characterize the notion of "regularizability" for linear systems, which gauges the extent to which a system can be regulated in finite-time in contrast to its asymptotic behavior (commonly characterized by stabilizability/controllability). Next, having access only to the input matrix, we propose the Data-Guided Regulation (DGR) synthesis procedure that -- as its name suggests -- regulates the underlying state while also generating informative data that can subsequently be used for data-driven stabilization or system identification. We further improve the computational performance of DGR via a rank-one update and demonstrate its utility in online regulation of the X-29 aircraft.

Submitted 19 January, 2022; v1 submitted 29 May, 2020; originally announced June 2020.

arXiv:2003.05396 [pdf, other]

doi 10.1109/TAC.2020.2979758

$\mathcal{H}_2$ performance of series-parallel networks: A compositional perspective

Authors: Mathias Hudoba de Badyn, Mehran Mesbahi

Abstract: We examine the $\mathcal{H}_2$ norm of matrix-weighted leader-follower consensus on series-parallel networks. By using an extension of electrical network theory on matrix-valued resistances, voltages and currents, we show that the computation of the $\mathcal{H}_2$ norm can be performed efficiently by decomposing the network into atomic elements and composition rules. Lastly, we examine the problem of efficiently adapting the matrix-valued edge weights to optimize the $\mathcal{H}_2$ norm of the network.

Submitted 24 December, 2020; v1 submitted 11 March, 2020; originally announced March 2020.

Comments: Provisionally accepted to the IEEE Transactions on Automatic Control. arXiv admin note: substantial text overlap with arXiv:1903.05325

Journal ref: IEEE Transactions on Automatic Control, 2021, vol. 66, no. 1, pp. 354-361

arXiv:2002.05023 [pdf, other]

Global Convergence of Policy Gradient Algorithms for Indefinite Least Squares Stationary Optimal Control

Authors: Jingjing Bu, Mehran Mesbahi

Abstract: We consider policy gradient algorithms for the indefinite least squares stationary optimal control, e.g., linear-quadratic-regulator (LQR) with indefinite state and input penalization matrices. Such a setup has important applications in control design with conflicting objectives, such as linear quadratic dynamic games. We show the global convergence of gradient, natural gradient and quasi-Newton policies for this class of indefinite least squares problems.

Submitted 10 February, 2020; originally announced February 2020.

Comments: arXiv admin note: text overlap with arXiv:1911.04672

arXiv:2001.11179 [pdf, ps, other]

Consensus on Matrix-weighted Time-varying Networks

Authors: Lulu Pan, Haibin Shao, Mehran Mesbahi, Yugeng Xi, Dewei Li

Abstract: This paper examines the consensus problem on time-varying matrix-weighted undirected networks. First, we introduce the matrix-weighted integral network for the analysis of such networks. Under mild assumptions on the switching pattern of the time-varying network, necessary and/or sufficient conditions under which average consensus can be achieved are then provided in terms of the null space of the matrix-valued Laplacian of the corresponding integral network. In particular, for periodic matrix-weighted time-varying networks, necessary and sufficient conditions for reaching average consensus are obtained from an algebraic perspective. Moreover, we show that if the integral network with period $T>0$ has a positive spanning tree over the time span $[0,T)$, average consensus for the node states is achieved. Simulation results are provided to demonstrate the theoretical analysis.

Submitted 30 January, 2020; originally announced January 2020.

arXiv:2001.04035 [pdf, ps, other]

On the Controllability of Matrix-weighted Networks

Authors: Lulu Pan, Haibin Shao, Mehran Mesbahi, Yugeng Xi, Dewei Li

Abstract: This letter examines the controllability of consensus dynamics on matrix-weighted networks from a graph-theoretic perspective. Unlike in scalar-weighted networks, the rank of the weight matrix introduces additional intricacies into characterizing the dimension of the controllable subspace for such networks. Specifically, we investigate how the definiteness of weight matrices influences the dimension of the controllable subspace. In this direction, graph-theoretic characterizations of the lower and upper bounds on the dimension of the controllable subspace are provided by employing, respectively, distance partition and almost equitable partition of matrix-weighted networks. Furthermore, the structure of an uncontrollable input for such networks is examined. Examples are then provided to demonstrate the theoretical results.

Submitted 12 January, 2020; originally announced January 2020.

arXiv:1912.07671 [pdf, other]

Data-driven parameterizations of suboptimal LQR and H2 controllers

Authors: Henk J. van Waarde, Mehran Mesbahi

Abstract: In this paper we design suboptimal control laws for an unknown linear system on the basis of measured data. We focus on the suboptimal linear quadratic regulator problem and the suboptimal H2 control problem. For both problems, we establish conditions under which a given data set contains sufficient information for controller design. We follow up by providing a data-driven parameterization of all suboptimal controllers. We will illustrate our results by numerical simulations, which will reveal an interesting trade-off between the number of collected data samples and the achieved controller performance.

Submitted 7 May, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

Comments: 6 pages

arXiv:1911.04672 [pdf, ps, other]

Global Convergence of Policy Gradient for Sequential Zero-Sum Linear Quadratic Dynamic Games

Authors: Jingjing Bu, Lillian J. Ratliff, Mehran Mesbahi

Abstract: We propose projection-free sequential algorithms for linear-quadratic dynamic games. These policy gradient based algorithms are akin to the Stackelberg leadership model and can be extended to model-free settings. We show that if the leader performs natural gradient descent/ascent, then the proposed algorithm has a global sublinear convergence to the Nash equilibrium. Moreover, if the leader adopts a quasi-Newton policy, the algorithm enjoys a global quadratic convergence. Along the way, we examine and clarify the intricacies of adopting sequential policy updates for LQ games, namely, issues pertaining to stabilization, indefinite cost structure, and circumventing projection steps.

Submitted 11 November, 2019; originally announced November 2019.

arXiv:1909.07864 [pdf, ps, other]

Time scale design for network resilience

Authors: Dillon R. Foight, Mathias Hudoba de Badyn, Mehran Mesbahi

Abstract: In this paper we consider the $\mathcal{H}_2$-norm of networked systems with multi-time scale consensus dynamics. We develop a general framework for such systems that allows for edge weighting, independent agent-based time scales, as well as measurement and process noise. From this general system description, we highlight an interesting case where the influences of the weighting and scaling can be separated in the design problem. We then consider the design of the time scale parameters for minimizing the $\mathcal{H}_2$-norm for the purpose of network resilience.

Submitted 17 September, 2019; originally announced September 2019.

Comments: 6 pages, accepted to 58th IEEE Conference on Decision and Control

arXiv:1908.11329 [pdf, other]

Augmented State Feedback for Improving Observability of Linear Systems with Nonlinear Measurements

Authors: Atiye Alaeddini, Kristi A. Morgansen, Mehran Mesbahi

Abstract: This paper is concerned with the design of an augmented state feedback controller for finite-dimensional linear systems with nonlinear observation dynamics. Most of the theoretical results in the area of (optimal) feedback design are based on the assumption that the state is available for measurement. In this paper, we focus on finding a feedback control that avoids state trajectories with undesirable observability properties. In particular, we introduce an optimal control problem that specifically considers an index of observability in the control synthesis. The resulting cost functional is a combination of LQR-like quadratic terms and an index of observability. The main contribution of the paper is a control synthesis procedure that, on the one hand, provides closed-loop asymptotic stability and, on the other, addresses the observability of the system as a transient performance criterion.

Submitted 29 August, 2019; originally announced August 2019.

Comments: Accepted in System and Control Letters

arXiv:1908.05732 [pdf, other]

doi 10.1109/CDC40024.2019.9030233

Strong Structural Controllability of Signed Networks

Authors: Shima Sadat Mousavi, Mohammad Haeri, Mehran Mesbahi

Abstract: In this paper, we discuss the controllability of a family of linear time-invariant (LTI) networks defined on a signed graph. In this direction, we introduce the notion of positive and negative signed zero forcing sets for the controllability analysis of positive and negative eigenvalues of system matrices with the same sign pattern. A sufficient combinatorial condition that ensures the strong structural controllability of signed networks is then proposed. Moreover, an upper bound on the maximum multiplicity of positive and negative eigenvalues associated with a signed graph is provided.

Submitted 10 October, 2019; v1 submitted 15 August, 2019; originally announced August 2019.

arXiv:1907.08921 [pdf, other]

LQR through the Lens of First Order Methods: Discrete-time Case

Authors: Jingjing Bu, Afshin Mesbahi, Maryam Fazel, Mehran Mesbahi

Abstract: We consider the Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. Such a setup facilitates examining the implications of a natural initial-state independent formulation of LQR in designing first order algorithms. We show that this cost function is smooth and coercive, and provide an alternate means of noting its gradient dominated property. In the process, we provide a number of analytic observations on the LQR cost when directly analyzed in terms of the feedback gain. We then examine three types of well-posed flows for LQR: gradient flow, natural gradient flow and the quasi-Newton flow. The coercive property suggests that these flows admit unique solutions, while the gradient dominated property indicates that the corresponding Lyapunov functionals decay at an exponential rate; we also prove that these flows are exponentially stable in the sense of Lyapunov. We then discuss the forward Euler discretization of these flows, realized as gradient descent, natural gradient descent and the quasi-Newton iteration. We present stepsize criteria for gradient descent and natural gradient descent, guaranteeing that both algorithms converge linearly to the global optima. An optimal stepsize for the quasi-Newton iteration is also proposed, guaranteeing a $Q$-quadratic convergence rate and, in the meantime, recovering the Hewer algorithm.

Submitted 29 July, 2019; v1 submitted 21 July, 2019; originally announced July 2019.

arXiv:1906.04857 [pdf, other]

Fast Trajectory Optimization via Successive Convexification for Spacecraft Rendezvous with Integer Constraints

Authors: Danylo Malyuta, Taylor P. Reynolds, Michael Szmuk, Behcet Acikmese, Mehran Mesbahi

Abstract: In this paper we present a fast method based on successive convexification for generating fuel-optimized spacecraft rendezvous trajectories in the presence of mixed-integer constraints. A recently developed paradigm of state-triggered constraints allows us to efficiently embed a subset of discrete decision constraints into the continuous optimization framework of successive convexification. As a result, we are able to solve difficult trajectory optimization problems at interactive speeds, as opposed to a mixed-integer programming approach that would require significantly more solution time and computing power. Our method is applied to the real problem of transposition and docking of the Apollo command and service module with the lunar module. We demonstrate that, within seconds, we are able to obtain trajectories that are up to 90 percent more fuel efficient (saving up to 45 kg of fuel) than non-optimization based Apollo-era design targets. Our trajectories take explicit account of minimum thrust pulse width and plume impingement constraints. Both of these constraints are naturally mixed-integer, but we handle them as state-triggered constraints. In its current state, our algorithm will serve as a useful off-line design tool for rapid trajectory trade studies.

Submitted 11 June, 2019; originally announced June 2019.

Comments: 23 pages, 10 figures, submitted to AIAA SciTech 2020

arXiv:1904.09960 [pdf, other]

doi 10.1109/TAC.2020.2992439

Strong Structural Controllability of Networks under Time-Invariant and Time-Varying Topological Perturbations

Authors: Shima Sadat Mousavi, Mohammad Haeri, Mehran Mesbahi

Abstract: This paper investigates the robustness of strong structural controllability for linear time-invariant and linear time-varying directed networks with respect to structural perturbations, including edge deletions and additions. In this direction, we introduce a new construct referred to as a perfect graph associated with a network with a given set of control nodes. Tight upper bounds on the number of edges that can be added to, or removed from, a network while ensuring strong structural controllability are then derived. Moreover, we obtain a characterization of critical edge-sets, the maximal sets of edges any subset of which can be respectively added to, or removed from, a network while preserving strong structural controllability. In addition, procedures for combining networks to obtain strongly structurally controllable network-of-networks are proposed. Finally, controllability conditions are proposed for networks whose edge weights, as well as their structures, can vary over time.

Submitted 21 May, 2020; v1 submitted 22 April, 2019; originally announced April 2019.

arXiv:1904.09248 [pdf, other]

doi 10.2514/1.G004536

Dual Quaternion Based Powered Descent Guidance with State-Triggered Constraints

Authors: Taylor P. Reynolds, Michael Szmuk, Danylo Malyuta, Mehran Mesbahi, Behcet Acikmese, John M. Carson III

Abstract: This paper presents a numerical algorithm for computing 6-degree-of-freedom free-final-time powered descent guidance trajectories. The trajectory generation problem is formulated using a unit dual quaternion representation of the rigid body dynamics, and several standard path constraints. Our formulation also includes a special line of sight constraint that is enforced only within a specified band of slant ranges relative to the landing site, a novel feature that is especially relevant to Terrain and Hazard Relative Navigation. We use the newly introduced state-triggered constraints to formulate these range constraints in a manner that is amenable to real-time implementations. The resulting non-convex optimal control problem is solved iteratively as a sequence of convex second-order cone programs that locally approximate the non-convex problem. Each second-order cone program is solved using a customizable interior point method solver. Also introduced are a scaling method and a new heuristic technique that guide the convergence process towards dynamic feasibility. To demonstrate the capabilities of our algorithm, two numerical case studies are presented. The first studies the effect of including a slant-range-triggered line of sight constraint on the resulting trajectories. The second study performs a Monte Carlo analysis to assess the algorithm's robustness to initial conditions and real-time performance.

Submitted 19 April, 2019; originally announced April 2019.

Comments: Submitted to the AIAA Journal of Guidance, Control, and Dynamics

arXiv:1904.08451 [pdf, other]

On Topological Properties of the Set of Stabilizing Feedback Gains

Authors: Jingjing Bu, Afshin Mesbahi, Mehran Mesbahi

Abstract: This work presents a fairly complete account of various topological and metrical aspects of feedback stabilization for single-input-single-output (SISO) continuous and discrete time linear-time-invariant (LTI) systems. In particular, we prove that the set of stabilizing output feedback gains for a SISO system with $n$ states has at most $\lceil{\frac{n}{2}}\rceil$ connected components. Furthermore, our analysis yields an algorithm for determining intervals of stabilizing gains for general continuous and discrete LTI systems; the proposed algorithm also computes the number of unstable roots in each unstable interval. Along the way, we also make a number of observations on the set of stabilizing state feedback gains for MIMO systems.

Submitted 17 April, 2019; originally announced April 2019.

arXiv:1904.08449 [pdf, ps, other]

Nonlinear Observability via Koopman Analysis: Characterizing the Role of Symmetry

Authors: Afshin Mesbahi, Jingjing Bu, Mehran Mesbahi

Abstract: This paper considers the observability of nonlinear systems from a Koopman operator theoretic perspective and, in particular, the effect of symmetry on observability. We first examine an infinite-dimensional linear system (constructed using independent Koopman eigenfunctions) such that its observability is equivalent to the observability of the original nonlinear system. Next, we derive an analytic relation between symmetry and nonlinear observability; it is shown that symmetry in the nonlinear dynamics is reflected in the symmetry of the corresponding Koopman eigenfunctions, as well as the presence of repeated Koopman eigenvalues. We then proceed to show that the loss of observability in symmetric nonlinear systems can be traced back to the presence of these repeated eigenvalues. In the case where we have a sufficient number of measurements, the nonlinear system remains unobservable when these functions have symmetries that mirror those of the dynamics. The proposed observability framework provides insights into the minimum number of measurements needed to make an unobservable nonlinear system observable. The proposed results are then applied to a network of nano-electromechanical oscillators coupled via a symmetric interaction topology.

Submitted 10 February, 2020; v1 submitted 17 April, 2019; originally announced April 2019.

arXiv:1904.02737 [pdf, ps, other]

On Topological and Metrical Properties of Stabilizing Feedback Gains: the MIMO Case

Authors: Jingjing Bu, Afshin Mesbahi, Mehran Mesbahi

Abstract: In this paper, we discuss various topological and metrical aspects of the set of stabilizing static feedback gains for multiple-input-multiple-output (MIMO) linear-time-invariant (LTI) systems, in both continuous and discrete-time. Recently, connectivity properties of this set (for continuous time) have been reported in the literature, along with a discussion on how this connectivity is affected by restricting the feedback gain to linear subspaces. We show that analogous to the continuous-time case, one can construct instances where the set of stabilizing feedback gains for discrete time LTI systems has exponentially many connected components.

Submitted 4 April, 2019; originally announced April 2019.

Comments: 17 pages

arXiv:1903.05325 [pdf, ps, other]

doi 10.23919/ACC.2019.8814989

Efficient Computation of H2 Performance on Series-Parallel Networks

Authors: Mathias Hudoba de Badyn, Mehran Mesbahi

Abstract: Series-parallel networks are a class of graphs on which many NP-hard problems have tractable solutions. In this paper, we examine performance measures on leader-follower consensus on series-parallel networks. We show that a distributed computation of the $\mathcal{H}_2$ norm can be done efficiently on this system by exploiting a decomposition of the network into atomic elements and composition rules. Lastly, we examine the problem of adaptively re-weighting the network to optimize the $\mathcal{H}_2$ norm, and show that it can be done with similar complexity.

Submitted 25 April, 2020; v1 submitted 13 March, 2019; originally announced March 2019.

Comments: 6 pages, 5 figures. To appear in proceedings of the 2019 American Control Conference

Journal ref: Proc. 2019 American Control Conference, pp. 3364-3369

arXiv:1901.02181 [pdf, other]

Successive Convexification for 6-DoF Powered Descent Guidance with Compound State-Triggered Constraints

Authors: Michael Szmuk, Taylor P. Reynolds, Behcet Acikmese, Mehran Mesbahi, John M. Carson III

Abstract: This paper introduces a continuous formulation for compound state-triggered constraints, which are generalizations of the recently introduced state-triggered constraints. State-triggered constraints are different from ordinary constraints found in optimal control in that they use a state-dependent trigger condition to enable or disable a constraint condition, and can be expressed as continuous functions that are readily handled by successive convexification. Compound state-triggered constraints go a step further, giving designers the ability to compose trigger and constraint conditions using Boolean "and" and "or" operations. Simulations of the 6-degree-of-freedom (DoF) powered descent guidance problem obtained using successive convexification are presented to illustrate the utility of state-triggered and compound state-triggered constraints. The examples employ a velocity-triggered angle of attack constraint to alleviate aerodynamic loads, and a collision avoidance constraint to avoid large geological formations. In particular, the velocity-triggered angle of attack constraint demonstrates the ability of state-triggered constraints to introduce new constraint phases to the solution without resorting to combinatorial techniques.

Submitted 8 January, 2019; originally announced January 2019.

Comments: This paper is a modified version of the one presented at the 2019 AIAA Guidance, Navigation, and Control Conference (SciTech) in San Diego, California (17 pages, 10 figures)

arXiv:1809.08745 [pdf, other]

Distributed Q-Learning for Dynamically Decoupled Systems

Authors: Siavash Alemzadeh, Mehran Mesbahi

Abstract: Control of large-scale networked systems often necessitates the availability of complex models for the interactions amongst the agents. However, in many applications, building accurate models of agents or interactions amongst them might be infeasible or computationally prohibitive due to the curse of dimensionality or the complexity of these interactions. In the meantime, data-guided control methods can circumvent model complexity by directly synthesizing the controller from the observed data. In this paper, we propose a distributed Q-learning algorithm to design a feedback mechanism based on a given underlying graph structure parameterizing the agents' interaction network. We assume that the distributed nature of the system arises from the cost function of the corresponding control problem and show that for the specific case of identical dynamically decoupled systems, the learned controller converges to the optimal Linear Quadratic Regulator (LQR) controller for each subsystem. We provide a convergence analysis and verify the result with an example.

Submitted 19 March, 2019; v1 submitted 24 September, 2018; originally announced September 2018.

Showing 1–50 of 68 results for author: Mesbahi, M