-
Kernel-Based Optimal Control: An Infinitesimal Generator Approach
Authors:
Petar Bevanda,
Nicolas Hoischen,
Tobias Wittmann,
Jan Brüdigam,
Sandra Hirche,
Boris Houska
Abstract:
This paper presents a novel operator-theoretic approach for optimal control of nonlinear stochastic systems within reproducing kernel Hilbert spaces. Our learning framework leverages data samples of system dynamics and stage cost functions, with only control penalties and constraints provided. The proposed method directly learns the infinitesimal generator of a controlled stochastic diffusion in an infinite-dimensional hypothesis space. We demonstrate that our approach seamlessly integrates with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions, enabling a data-driven solution to the optimal control problems. Furthermore, our learning framework includes nonparametric estimators for uncontrolled infinitesimal generators as a special case. Numerical experiments, ranging from synthetic differential equations to simulated robotic systems, showcase the advantages of our approach compared to both modern data-driven and classical nonlinear programming methods for optimal control.
Submitted 25 April, 2025; v1 submitted 2 December, 2024;
originally announced December 2024.
-
Risk-averse learning with delayed feedback
Authors:
Siyi Wang,
Zifan Wang,
Karl Henrik Johansson,
Sandra Hirche
Abstract:
In real-world scenarios, the impacts of decisions may not manifest immediately. Taking these delays into account facilitates accurate assessment and management of risk in real-world environments, thereby ensuring the efficacy of strategies. In this paper, we investigate risk-averse learning using Conditional Value at Risk (CVaR) as risk measure, while incorporating delayed feedback with unknown but bounded delays. We develop two risk-averse learning algorithms that rely on one-point and two-point zeroth-order optimization approaches, respectively. The regret achieved by the algorithms is analyzed in terms of the cumulative delay and the number of total samplings. The results suggest that the two-point risk-averse learning achieves a smaller regret bound than the one-point algorithm. Furthermore, the one-point risk-averse learning algorithm attains sublinear regret under certain delay conditions, and the two-point risk-averse learning algorithm achieves sublinear regret with minimal restrictions on the delay. We provide numerical experiments on a dynamic pricing problem to demonstrate the performance of the proposed algorithms.
Submitted 25 September, 2024;
originally announced September 2024.
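The risk measure named in this abstract, Conditional Value at Risk, can be illustrated with a minimal empirical estimator. This is a sketch under stated assumptions, not the authors' algorithm: the function name `empirical_cvar`, the loss-based sign convention (larger is worse), and the tail-size rounding rule are all illustrative choices.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst (1 - alpha) fraction of sampled losses.

    Assumed convention: larger loss is worse, and alpha in [0, 1) is the
    confidence level, so alpha = 0 reduces to the plain mean.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(losses)
    # Number of tail samples; round first to guard against float noise.
    tail = max(1, math.ceil(round((1.0 - alpha) * n, 9)))
    worst = sorted(losses, reverse=True)[:tail]
    return sum(worst) / tail


sample_losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cvar_80 = empirical_cvar(sample_losses, 0.8)  # mean of the two largest losses
cvar_0 = empirical_cvar(sample_losses, 0.0)   # mean of all losses
```

A risk-averse learner would minimize a quantity like `cvar_80` rather than the mean `cvar_0`, which is why tail samples dominate its feedback.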
-
Data-Driven Optimal Feedback Laws via Kernel Mean Embeddings
Authors:
Petar Bevanda,
Nicolas Hoischen,
Stefan Sosnowski,
Sandra Hirche,
Boris Houska
Abstract:
This paper proposes a fully data-driven approach for optimal control of nonlinear control-affine systems represented by a stochastic diffusion. The focus is on the scenario where both the nonlinear dynamics and stage cost functions are unknown, while only the control penalty function and constraints are provided. Leveraging the theory of reproducing kernel Hilbert spaces, we introduce novel kernel mean embeddings (KMEs) to identify the Markov transition operators associated with controlled diffusion processes. The KME learning approach seamlessly integrates with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions. Thus, unlike traditional dynamic programming methods, our approach exploits the "kernel trick" to break the curse of dimensionality. We demonstrate the effectiveness of our method through numerical examples, highlighting its ability to solve a large class of nonlinear optimal control problems.
Submitted 23 July, 2024;
originally announced July 2024.
-
Consistency of Value of Information: Effects of Packet Loss and Time Delay in Networked Control Systems Tasks
Authors:
Touraj Soleymani,
John S. Baras,
Siyi Wang,
Sandra Hirche,
Karl H. Johansson
Abstract:
In this chapter, we study the consistency of the value of information, a semantic metric that claims to determine the right piece of information in networked control systems tasks, in a lossy and delayed communication regime. Our analysis begins with a focus on state estimation, and subsequently extends to feedback control. To that end, we make a causal tradeoff between the packet rate and the mean square error. Associated with this tradeoff, we demonstrate the existence of an optimal policy profile, comprising a symmetric threshold scheduling policy based on the value of information for the encoder and a non-Gaussian linear estimation policy for the decoder. Our structural results assert that the scheduling policy is expressible in terms of $3d-1$ variables related to the source and the channel, where $d$ is the time delay, and that the estimation policy incorporates no residual related to signaling. We then construct an optimal control policy by exploiting the separation principle.
Submitted 18 March, 2024;
originally announced March 2024.
-
Foundations of Value of Information: A Semantic Metric for Networked Control Systems Tasks
Authors:
Touraj Soleymani,
John S. Baras,
Sandra Hirche,
Karl H. Johansson
Abstract:
In this chapter, we present our recent invention, i.e., the notion of the value of information, a semantic metric that is fundamental for networked control systems tasks. We begin our analysis by formulating a causal tradeoff between the packet rate and the regulation cost, with an encoder and a decoder as two distributed decision makers, and show that the valuation of information is conceivable and quantifiable grounded on this tradeoff. More precisely, we characterize an equilibrium, and quantify the value of information there as the variation in a value function with respect to a piece of sensory measurement that can be communicated from the encoder to the decoder at each time. We prove that, in feedback control of a dynamical process over a noiseless channel, the value of information is a function of the discrepancy between the state estimates at the encoder and the decoder, and that a data packet containing a sensory measurement at each time should be exchanged only if the value of information at that time is nonnegative. Finally, we prove that the characterized equilibrium is in fact globally optimal.
Submitted 18 March, 2024;
originally announced March 2024.
-
Analyzing the Impact of Computation in Adaptive Dynamic Programming for Stochastic LQR Problem
Authors:
Wenhan Cao,
Alexandre Capone,
Sandra Hirche,
Wei Pan
Abstract:
Adaptive dynamic programming (ADP) for stochastic linear quadratic regulation (LQR) demands the precise computation of stochastic integrals during policy iteration (PI). In a fully model-free problem setting, this computation can only be approximated by state samples collected at discrete time points using computational methods such as the canonical Euler-Maruyama method. Our research reveals a critical phenomenon: the sampling period can significantly impact control performance. This impact is due to the fact that computational errors introduced in each step of PI can significantly affect the algorithm's convergence behavior, which in turn influences the resulting control policy. We draw a parallel between PI and Newton's method applied to the Riccati equation to elucidate how the computation impacts control. In this light, the computational error in each PI step manifests itself as an extra error term in each step of Newton's method, with its upper bound proportional to the computational error. Furthermore, we demonstrate that the convergence rate for ADP in stochastic LQR problems using the Euler-Maruyama method is O(h), with h being the sampling period. A sensorimotor control task finally validates these theoretical findings.
Submitted 14 February, 2024;
originally announced February 2024.
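The parallel between policy iteration and Newton's method on the Riccati equation can be seen in a scalar, deterministic, model-based toy case. The sketch below is not the paper's model-free stochastic setting: the system `dx/dt = a*x + b*u` with cost integrand `q*x**2 + r*u**2`, the function names, and the `eval_error` parameter (a constant offset standing in for a computational error in each policy-evaluation step) are all assumptions made for illustration.

```python
import math


def riccati_solution(a, b, q, r):
    """Stabilizing root of the scalar Riccati equation 2*a*p - (b*p)**2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def policy_iteration(a, b, q, r, k0, iterations=10, eval_error=0.0):
    """Kleinman-style policy iteration for the feedback law u = -k*x.

    eval_error is a hypothetical constant added to every value estimate to
    mimic an inexact policy-evaluation step.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must stabilize: a - b*k0 < 0")
    k = k0
    estimates = []
    for _ in range(iterations):
        closed_loop = a - b * k
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k**2 = 0 for p.
        p = (q + r * k * k) / (-2.0 * closed_loop) + eval_error
        # Policy improvement: gain induced by the current value estimate.
        k = b * p / r
        estimates.append(p)
    return estimates


p_star = riccati_solution(1.0, 1.0, 1.0, 1.0)            # 1 + sqrt(2)
exact = policy_iteration(1.0, 1.0, 1.0, 1.0, k0=3.0)
noisy = policy_iteration(1.0, 1.0, 1.0, 1.0, k0=3.0, eval_error=0.01)
```

In this toy case the exact iterates approach `p_star` rapidly, while the perturbed iterates settle at an offset of roughly the size of `eval_error`, which is the kind of persistent error term the abstract describes.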
-
H2 suboptimal containment control of homogeneous and heterogeneous multi-agent systems
Authors:
Yuan Gao,
Junjie Jiao,
Zhongkui Li,
Sandra Hirche
Abstract:
This paper deals with the H2 suboptimal state containment control problem for homogeneous linear multi-agent systems and the H2 suboptimal output containment control problem for heterogeneous linear multi-agent systems. For both problems, given multiple autonomous leaders and a number of followers, we introduce suitable performance outputs and an associated H2 cost functional, respectively. The aim is to design a distributed protocol by dynamic output feedback that achieves state/output containment control while the associated H2 cost is smaller than an a priori given upper bound. To this end, we first show that the H2 suboptimal state/output containment control problem can be equivalently transformed into H2 suboptimal control problems for a set of independent systems. Based on this, design methods are then provided to compute such distributed dynamic output feedback protocols. Simulation examples are provided to illustrate the performance of our proposed protocols.
Submitted 19 November, 2023;
originally announced November 2023.
-
Koopman Kernel Regression
Authors:
Petar Bevanda,
Max Beier,
Armin Lederer,
Stefan Sosnowski,
Eyke Hüllermeier,
Sandra Hirche
Abstract:
Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a beneficial paradigm for addressing this problem by characterizing forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasts into sparse matrix multiplication. Though there exists a variety of learning approaches, they usually lack crucial learning-theoretic guarantees, making the behavior of the obtained models with increasing data and dimensionality unclear. We address the aforementioned by deriving a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation for novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors in RKHS.
Submitted 16 January, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States
Authors:
Robert Lefringhausen,
Supitsana Srithasan,
Armin Lederer,
Sandra Hirche
Abstract:
As control engineering methods are applied to increasingly complex systems, data-driven approaches for system identification appear as a promising alternative to physics-based modeling. While the Bayesian approaches prevalent for safety-critical applications usually rely on the availability of state measurements, the states of a complex system are often not directly measurable. It may then be necessary to jointly estimate the dynamics and the latent state, making the quantification of uncertainties and the design of controllers with formal performance guarantees considerably more challenging. This paper proposes a novel method for the computation of an optimal input trajectory for unknown nonlinear systems with latent states based on a combination of particle Markov chain Monte Carlo methods and scenario theory. Probabilistic performance guarantees are derived for the resulting input trajectory, and an approach to validate the performance of arbitrary control laws is presented. The effectiveness of the proposed method is demonstrated in a numerical simulation.
Submitted 6 August, 2024; v1 submitted 31 March, 2023;
originally announced March 2023.
-
Actuator Scheduling for Linear Systems: A Convex Relaxation Approach
Authors:
Junjie Jiao,
Dipankar Maity,
John S. Baras,
Sandra Hirche
Abstract:
In this letter, we investigate the problem of actuator scheduling for networked control systems. Given a stochastic linear system with a number of actuators, we consider the case that one actuator is activated at each time. This problem is combinatorial in nature and NP-hard to solve. We propose a convex relaxation to the actuator scheduling problem, and use its solution as a reference to design an algorithm for solving the original scheduling problem. Using dynamic programming arguments, we provide a suboptimality bound of our proposed algorithm. Furthermore, we show that our framework can be extended to incorporate the scheduling of multiple actuators at each time as well as actuation costs. A simulation example is provided, which shows that our proposed method outperforms a random selection approach and a greedy selection approach.
Submitted 20 May, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
Towards Data-driven LQR with Koopmanizing Flows
Authors:
Petar Bevanda,
Max Beier,
Shahab Heshmati-Alamdari,
Stefan Sosnowski,
Sandra Hirche
Abstract:
We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics based on a representation of Koopman operators. In general, the operator is infinite-dimensional but, crucially, linear. To utilize it for efficient LTI control design, we learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates. For the latter, we rely on Koopmanizing Flows, a diffeomorphism-based representation of Koopman operators, and extend it to systems with linear control entry. With such a learned model, we can replace the nonlinear optimal control problem with quadratic cost with that of a linear quadratic regulator (LQR), facilitating efficacious optimal control for nonlinear systems. The superior control performance of the proposed method is demonstrated on simulation examples.
Submitted 23 May, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Learning the Koopman Eigendecomposition: A Diffeomorphic Approach
Authors:
Petar Bevanda,
Johannes Kirmayr,
Stefan Sosnowski,
Sandra Hirche
Abstract:
We present a novel data-driven approach for learning linear representations of a class of stable nonlinear systems using Koopman eigenfunctions. By learning the conjugacy map between a nonlinear system and its Jacobian linearization through a Normalizing Flow, one can guarantee the learned function is a diffeomorphism. Using this diffeomorphism, we construct eigenfunctions of the nonlinear system via the spectral equivalence of conjugate systems - allowing the construction of linear predictors for nonlinear systems. The universality of the diffeomorphism learner leads to the universal approximation of the nonlinear system's Koopman eigenfunctions. The developed method is also safe as it guarantees the model is asymptotically stable regardless of the representation accuracy. To the best of our knowledge, this is the first work to close the gap between the operator, system and learning theories. The efficacy of our approach is shown through simulation examples.
Submitted 30 May, 2022; v1 submitted 14 October, 2021;
originally announced October 2021.
-
Distributed Value of Information in Feedback Control over Multi-hop Networks
Authors:
Precious Ugo Abara,
Sandra Hirche
Abstract:
Recent works in the domain of networked control systems have demonstrated that the joint design of medium access control strategies and control strategies for the closed-loop system is beneficial. However, several metrics introduced so far fail in either appropriately representing the network requirements or in capturing how valuable the data is. In this paper, we propose a distributed value of information (dVoI) metric for the joint design of control and schedulers for medium access in a multi-loop system and multi-hop network. We start by providing conditions under which a certainty-equivalent controller is optimal. Then we reformulate the joint control and communication problem as a Bellman-like equation. The corresponding dynamic programming problem is solved in a distributed fashion by the proposed VoI-based scheduling policies for the multi-loop multi-hop networked control system, which outperform the well-known time-triggered periodic sampling policies. Additionally, we show that the dVoI-based scheduling policies are independent of each other, both loop-wise and hop-wise. Finally, we illustrate the results with a numerical example.
Submitted 16 July, 2021;
originally announced July 2021.
-
Value of Information in Feedback Control: Global Optimality
Authors:
Touraj Soleymani,
John S. Baras,
Sandra Hirche,
Karl H. Johansson
Abstract:
The rate-regulation tradeoff, defined between two objective functions, one penalizing the packet rate and one the regulation cost, can express the fundamental performance bound of networked control systems. However, the characterization of the set of globally optimal solutions in this tradeoff for multi-dimensional Gauss-Markov processes has been an open problem. In the present article, we characterize a policy profile that belongs to this set without imposing any restrictions on the information structure or the policy structure. We prove that such a policy profile consists of a symmetric threshold triggering policy based on the value of information and a certainty-equivalent control policy based on a non-Gaussian linear estimator. These policies are deterministic and can be designed separately. Besides, we provide a global optimality analysis for the value of information $\text{VoI}_k$, a semantic metric that emerges from the rate-regulation tradeoff as the difference between the benefit and the cost of a data packet. We prove that it is globally optimal that a data packet containing sensory information at time $k$ be transmitted to the controller only if $\text{VoI}_k$ becomes nonnegative. These results have important implications in the areas of communication and control.
Submitted 4 May, 2022; v1 submitted 25 March, 2021;
originally announced March 2021.
-
Data-driven output synchronization of heterogeneous leader-follower multi-agent systems
Authors:
Junjie Jiao,
Henk J. van Waarde,
Harry L. Trentelman,
M. Kanat Camlibel,
Sandra Hirche
Abstract:
This paper deals with data-driven output synchronization for heterogeneous leader-follower linear multi-agent systems. Given a multi-agent system that consists of one autonomous leader and a number of heterogeneous followers with external disturbances, we provide necessary and sufficient data-based conditions for output synchronization. We also provide a design method for obtaining such output synchronizing protocols directly from data. The results are then extended to the special case that the followers are disturbance-free. Finally, a simulation example is provided to illustrate our results.
Submitted 23 September, 2021; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Koopman Operator Dynamical Models: Learning, Analysis and Control
Authors:
Petar Bevanda,
Stefan Sosnowski,
Sandra Hirche
Abstract:
The Koopman operator allows for handling nonlinear systems through a (globally) linear representation. In general, the operator is infinite-dimensional - necessitating finite approximations - for which there is no overarching framework. Although there are principled ways of learning such finite approximations, they are in many instances overlooked in favor of often ill-posed and unstructured methods. Also, Koopman operator theory has long-standing connections to known system-theoretic and dynamical system notions that are not universally recognized. Given the former and latter realities, this work aims to bridge the gap between various concepts regarding both theory and tractable realizations. Firstly, we review data-driven representations (both unstructured and structured) for Koopman operator dynamical models, categorizing various existing methodologies and highlighting their differences. Furthermore, we provide concise insight into the paradigm's relation to system-theoretic notions and analyze the prospect of using the paradigm for modeling control systems. Additionally, we outline the current challenges and comment on future perspectives.
Submitted 22 December, 2021; v1 submitted 4 February, 2021;
originally announced February 2021.
-
Value of Information in Feedback Control: Quantification
Authors:
Touraj Soleymani,
John S. Baras,
Sandra Hirche
Abstract:
Although transmission of a data packet containing sensory information in a networked control system improves the quality of regulation, it has indeed a price from the communication perspective. It is, therefore, rational that such a data packet be transmitted only if it is valuable in the sense of a cost-benefit analysis. Yet, the fact is that little is known so far about this valuation of information and its connection with traditional event-triggered communication. In the present article, we study this intrinsic property of networked control systems by formulating a rate-regulation tradeoff between the packet rate and the regulation cost with an event trigger and a controller as two distributed decision makers, and show that the valuation of information is conceivable and quantifiable grounded on this tradeoff. In particular, we characterize an equilibrium in the rate-regulation tradeoff, and quantify the value of information $\text{VoI}_k$ there as the variation in a so-called value function with respect to a piece of sensory information that can be communicated to the controller at each time $k$. We prove that, for a multi-dimensional Gauss-Markov process, $\text{VoI}_k$ is a symmetric function of the discrepancy between the state estimates at the event trigger and the controller, and that a data packet containing sensory information at time $k$ should be transmitted to the controller only if $\text{VoI}_k$ is nonnegative. Moreover, we discuss that $\text{VoI}_k$ can be computed with arbitrary accuracy, and that it can be approximated by a closed-form quadratic function with a performance guarantee.
Submitted 2 May, 2022; v1 submitted 18 December, 2018;
originally announced December 2018.
-
Optimal LQG Control under Delay-dependent Costly Information
Authors:
Dipankar Maity,
Mohammad H. Mamduhi,
Sandra Hirche,
Karl Henrik Johansson,
John S. Baras
Abstract:
In the design of closed-loop networked control systems (NCSs), induced transmission delay between sensors and the control station is an often-present issue which compromises control performance and may even cause instability. A very relevant scenario in which network-induced delay needs to be investigated is costly usage of communication resources. More precisely, advanced communication technologies, e.g. 5G, are capable of offering latency-varying information exchange for different prices. Therefore, induced delay becomes a decision variable. It is then a matter of the decision maker's willingness to either pay the required cost to have low-latency access to the communication resource, or delay the access at a reduced price. In this article, we consider an optimal price-based bi-variable decision-making problem for a single-loop NCS with a stochastic linear time-invariant system. Assuming that communication incurs cost such that transmission with shorter delay is more costly, a decision maker determines the switching strategy between communication links of different delays such that an optimal balance between the control performance and the communication cost is maintained. We show that, under mild assumptions on the available information for decision makers, the separation property holds between the optimal link selecting and control policies. As the cost function is decomposable, the optimal policies are efficiently computed.
Submitted 28 June, 2018;
originally announced June 2018.
-
Consensus Driven by the Geometric Mean
Authors:
Herbert Mangesius,
Dong Xue,
Sandra Hirche
Abstract:
Consensus networks are usually understood as arithmetic mean driven dynamical averaging systems. In applications, however, network dynamics often describe inherently non-arithmetic and non-linear consensus processes. In this paper, we propose and study three novel consensus protocols driven by geometric mean averaging: a polynomial, an entropic, and a scaling-invariant protocol, where terminology characterizes the particular non-linearity appearing in the respective differential protocol equation. We prove exponential convergence to consensus for positive initial conditions. For the novel protocols we highlight connections to applied network problems: The polynomial consensus system is structured like a system of chemical kinetics on a graph. The entropic consensus system converges to the weighted geometric mean of the initial condition, which is an immediate extension of the (weighted) average consensus problem. We find that all three protocols generate gradient flows of free energy on the simplex of constant mass distribution vectors albeit in different metrics. On this basis, we propose a novel variational characterization of the geometric mean as the solution of a non-linear constrained optimization problem involving free energy as cost function. We illustrate our findings in numerical simulations.
Submitted 5 August, 2016; v1 submitted 9 November, 2015;
originally announced November 2015.
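What convergence to the geometric mean looks like can be sketched with a simple discrete-time iteration. This is not one of the three protocols proposed in the paper: it is ordinary unweighted average consensus applied to the logarithms of the states, chosen here only because its limit is the geometric mean of positive initial conditions. The function names, the path graph, and the step size are assumptions.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def log_consensus(x0, edges, step=0.3, iterations=200):
    """Average consensus on log-states over an undirected graph.

    Each node moves its log-state toward its neighbors' log-states, so the
    product of all states is preserved and the common limit is their
    geometric mean. step must stay below 1 / (maximum node degree).
    """
    if any(v <= 0 for v in x0):
        raise ValueError("initial conditions must be positive")
    x = list(x0)
    for _ in range(iterations):
        y = [math.log(v) for v in x]
        dy = [0.0] * len(x)
        for i, j in edges:
            dy[i] += y[j] - y[i]
            dy[j] += y[i] - y[j]
        x = [math.exp(y[i] + step * dy[i]) for i in range(len(x))]
    return x


initial = [1.0, 2.0, 4.0, 8.0]
path_graph = [(0, 1), (1, 2), (2, 3)]
final = log_consensus(initial, path_graph)
target = geometric_mean(initial)  # 64 ** 0.25
```

The positivity check mirrors the abstract's restriction to positive initial conditions, since the logarithm is undefined otherwise.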
-
Event-Triggered Estimation of Linear Systems: An Iterative Algorithm and Optimality Properties
Authors:
Adam Molin,
Sandra Hirche
Abstract:
This report investigates the optimal design of event-triggered estimation for first-order linear stochastic systems. The problem is posed as a two-player team problem with a partially nested information pattern. The two players are given by an estimator and an event-trigger. The event-trigger has full state information and decides whether the estimator shall obtain the current state information by transmitting it through a resource-constrained channel. The objective is to find an optimal trade-off between the mean squared estimation error and the expected transmission rate. The proposed iterative algorithm alternates between optimizing one player while fixing the other player. It is shown that the solution of the algorithm converges to a linear predictor and a symmetric threshold policy, if the densities of the initial state and the noise variables are even and radially decreasing functions. The effectiveness of the approach is illustrated on a numerical example. In the case of a multimodal distribution of the noise variables, a significant performance improvement can be achieved compared to a separate design that assumes a linear prediction and a symmetric threshold policy.
Submitted 22 March, 2012;
originally announced March 2012.
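The trade-off between estimation error and transmission rate described in this abstract can be explored with a small simulation of a linear predictor paired with a symmetric threshold trigger. This is a sketch under stated assumptions, not the report's iterative algorithm: the threshold is fixed by hand rather than optimized, and the function name, the system parameter `a`, and the unit-variance Gaussian noise are illustrative.

```python
import random


def simulate_event_trigger(threshold, a=0.9, steps=1000, seed=0):
    """Simulate x[k+1] = a*x[k] + w[k] with a threshold-triggered estimator.

    The trigger sends the state whenever it deviates from the estimator's
    linear prediction by more than threshold. Returns the pair
    (transmission rate, mean squared estimation error).
    """
    rng = random.Random(seed)
    x = 0.0
    x_hat = 0.0
    sent = 0
    squared_error = 0.0
    for _ in range(steps):
        x = a * x + rng.gauss(0.0, 1.0)
        prediction = a * x_hat              # linear predictor at the estimator
        if abs(x - prediction) > threshold:  # symmetric threshold policy
            x_hat = x                        # transmit: estimator learns the state
            sent += 1
        else:
            x_hat = prediction               # no packet: keep predicting
        squared_error += (x - x_hat) ** 2
    return sent / steps, squared_error / steps


always_rate, always_mse = simulate_event_trigger(threshold=-1.0)
never_rate, never_mse = simulate_event_trigger(threshold=float("inf"))
mid_rate, mid_mse = simulate_event_trigger(threshold=1.0)
```

The two extreme thresholds bracket the trade-off: transmitting at every step gives zero error at full rate, and never transmitting gives zero rate at the open-loop error, with intermediate thresholds falling in between.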