-
InterQ: A DQN Framework for Optimal Intermittent Control
Authors:
Shubham Aggarwal,
Dipankar Maity,
Tamer Başar
Abstract:
In this letter, we explore the communication-control co-design of discrete-time stochastic linear systems through reinforcement learning. Specifically, we examine a closed-loop system involving two sequential decision-makers: a scheduler and a controller. The scheduler continuously monitors the system's state but transmits it to the controller intermittently to balance the communication cost and control performance. The controller, in turn, determines the control input based on the intermittently received information. Given the partially nested information structure, we show that the optimal control policy follows a certainty-equivalence form. Subsequently, we analyze the qualitative behavior of the scheduling policy. To develop the optimal scheduling policy, we propose InterQ, a deep reinforcement learning algorithm that uses a deep neural network to approximate the Q-function. Through extensive numerical evaluations, we analyze the scheduling landscape and further compare our approach against two baseline strategies: (a) a multi-period periodic scheduling policy, and (b) an event-triggered policy. The results demonstrate that our proposed method outperforms both baselines. The open-source implementation can be found at https://github.com/AC-sh/InterQ.
Submitted 11 April, 2025;
originally announced April 2025.
-
On Model Protection in Federated Learning against Eavesdropping Attacks
Authors:
Dipankar Maity,
Kushal Chakrabarti
Abstract:
In this study, we investigate the protection offered by federated learning algorithms against eavesdropping adversaries. In our model, the adversary is capable of intercepting model updates transmitted from clients to the server, enabling it to create its own estimate of the model. Unlike previous research, which predominantly focuses on safeguarding client data, our work shifts attention to protecting the client model itself. Through a theoretical analysis, we examine how various factors, such as the probability of client selection, the structure of local objective functions, global aggregation at the server, and the eavesdropper's capabilities, impact the overall level of protection. We further validate our findings through numerical experiments, assessing the protection by evaluating the model accuracy achieved by the adversary. Finally, we compare our results with methods based on differential privacy, underscoring their limitations in this specific context.
Submitted 2 April, 2025;
originally announced April 2025.
-
QSID-MPC: Model Predictive Control with System Identification from Quantized Data
Authors:
Shahab Ataei,
Dipankar Maity,
Debdipta Goswami
Abstract:
Least-squares system identification is widely used for data-driven model-predictive control (MPC) of unknown or partially known systems. This letter investigates how the system identification and the subsequent MPC are affected when the state and input data are quantized. Specifically, we examine the fundamental connection between model error and quantization resolution and how that affects the stability and boundedness of the MPC tracking error. Furthermore, we demonstrate that, with a sufficiently rich dataset, the model error is bounded by a function of quantization resolution and the MPC tracking error is also ultimately bounded similarly. The theory is validated through numerical experiments conducted on two different linear dynamical systems.
Submitted 24 March, 2025;
originally announced March 2025.
-
Communication and Control Co-design in Non-cooperative Games
Authors:
Shubham Aggarwal,
Tamer Başar,
Dipankar Maity
Abstract:
In this article, we revisit a communication-control co-design problem for a class of two-player stochastic differential games on an infinite horizon. Each 'player' represents two active decision makers, namely a scheduler and a remote controller, which cooperate to optimize over a global objective while competing with the other player. Each player's scheduler can only intermittently relay state information to its respective controller due to the associated communication cost or constraint. The scheduler's policy determines the information structure at the controller, thereby affecting the quality of the control inputs. Consequently, it leads to the classical communication-control trade-off problem. A high communication frequency improves the control performance of the player at the expense of a higher communication cost, and vice versa. Under suitable information structures of the players, we first compute the Nash controller policies for both players in terms of the conditional estimate of the state. Consequently, we reformulate the problem of computing Nash scheduler policies (within a class of parametrized randomized policies) into solving for the steady-state solution of a generalized Sylvester equation. Since the above-mentioned reformulation involves an infinite sum of powers of the policy parameters, we provide a projected gradient descent-based algorithm to numerically compute a Nash equilibrium using a truncated polynomial approximation. Finally, we demonstrate the performance of the Nash control and scheduler policies using extensive numerical simulations.
Submitted 28 February, 2025;
originally announced March 2025.
-
Koopman Meets Limited Bandwidth: Effect of Quantization on Data-Driven Linear Prediction and Control of Nonlinear Systems
Authors:
Shahab Ataei,
Dipankar Maity,
Debdipta Goswami
Abstract:
Koopman-based lifted linear identification has been widely used for data-driven prediction and model predictive control (MPC) of nonlinear systems. It has found applications in flow control, soft robotics, and unmanned aerial vehicles (UAVs). For autonomous systems, this system identification method works by embedding the nonlinear system in a higher-dimensional linear space and computing a finite-dimensional approximation of the corresponding Koopman operator with the Extended Dynamic Mode Decomposition (EDMD) algorithm. EDMD is a data-driven algorithm that estimates an approximate linear system by lifting the state data-snapshots via nonlinear dictionary functions. For control systems, EDMD is further modified to utilize both state and control data-snapshots to estimate a lifted linear predictor with control input. This article investigates how the estimation process is affected when the data is quantized. Specifically, we examine the fundamental connection between estimates of the linear predictor matrices obtained from unquantized data and those from quantized data via modified EDMD. Furthermore, using the law of large numbers, we demonstrate that, under a large data regime, the quantized estimate can be considered a regularized version of the unquantized estimate. We also explore the relationship between the two estimates in the finite data regime. We further analyze the effect of nonlinear lifting functions on this regularization due to quantization. The theory is validated through repeated numerical experiments conducted on several control systems. The effect of quantization on the MPC performance is also demonstrated.
Submitted 13 January, 2025;
originally announced January 2025.
-
Cooperative Target Defense under Communication and Sensing Constraints
Authors:
Dipankar Maity,
Arman Pourghorban
Abstract:
We consider a variant of the target defense problem in which a group of defenders is tasked with simultaneously capturing an intruder. The intruder's objective is to reach a target without being simultaneously captured by the defender team. Some of the defenders are sensing-limited and do not have any information regarding the intruder's position or velocity at any time. The defenders may communicate with each other using a connected communication graph. We propose a decentralized feedback strategy for the defenders, which transforms the simultaneous capture problem into a unique nonlinear consensus problem. We derive a sufficient condition for simultaneous capture in terms of the agents' speeds, sensing, and communication capabilities. The proposed decentralized controller is evaluated through extensive numerical simulations.
Submitted 13 December, 2024;
originally announced December 2024.
-
On the Effect of Quantization on Extended Dynamic Mode Decomposition
Authors:
Dipankar Maity,
Debdipta Goswami
Abstract:
Extended Dynamic Mode Decomposition (EDMD) is a widely used data-driven algorithm for estimating the Koopman Operator. EDMD extends Dynamic Mode Decomposition (DMD) by lifting the snapshot data using nonlinear dictionary functions before performing the estimation. This letter investigates how the estimation process is affected when the data is quantized. Specifically, we examine the fundamental connection between estimates of the operator obtained from unquantized data and those from quantized data via EDMD. Furthermore, using the law of large numbers, we demonstrate that, under a large data regime, the quantized estimate can be considered a regularized version of the unquantized estimate. We also explore the relationship between the two estimates in the finite data regime. We further analyze the effect of nonlinear lifting functions on this regularization due to quantization. The theory is validated through repeated numerical experiments conducted on two different dynamical systems.
Submitted 18 September, 2024;
originally announced October 2024.
-
Ensuring System-Level Protection against Eavesdropping Adversaries in Distributed Dynamical Systems
Authors:
Dipankar Maity,
Van Sy Mai
Abstract:
In this work, we address the objective of protecting the states of a distributed dynamical system from eavesdropping adversaries. We prove that state-of-the-art distributed algorithms, which rely on communicating the agents' states, are vulnerable in that the final states can be perfectly estimated by any adversary, including those with arbitrarily small eavesdropping success probability. While existing literature typically adds an extra layer of protection, such as encryption or differential privacy techniques, we demonstrate the emergence of a fundamental protection quotient in distributed systems when innovation signals are communicated instead of the agents' states.
Submitted 21 September, 2024; v1 submitted 14 September, 2024;
originally announced September 2024.
-
Best Response Strategies for Asymmetric Sensing in Linear-Quadratic Differential Games
Authors:
Shubham Aggarwal,
Tamer Başar,
Dipankar Maity
Abstract:
In this paper, we revisit the two-player continuous-time infinite-horizon linear quadratic differential game problem, where one of the players can sample the state of the system only intermittently due to a sensing constraint while the other player can do so continuously. Under these asymmetric sensing limitations between the players, we analyze the optimal sensing and control strategies for the player at a disadvantage while the other player continues to play its security strategy. We derive an optimal sensor policy within the class of stationary randomized policies. Finally, using simulations, we show that the expected cost accrued by the first player approaches its security level as its sensing limitation is relaxed.
Submitted 8 June, 2024;
originally announced June 2024.
-
Deception in Differential Games: Information Limiting Strategy to Induce Dilemma
Authors:
Daigo Shishika,
Alexander Von Moll,
Dipankar Maity,
Michael Dorothy
Abstract:
Can deception exist in differential games? We provide a case study for a Turret-Attacker differential game, where two Attackers seek to score points by reaching a target region while a Turret tries to minimize the score by aligning itself with the Attackers before they reach the target. In contrast to the original problem solved with complete information, we assume that the Turret only has partial information about the maximum speed of the Attackers. We investigate whether there is any incentive for the Attackers to move slower than their maximum speed in order to "deceive" the Turret into taking suboptimal actions. We first describe the existence of a dilemma that the Turret may face. Then we derive a set of initial conditions from which the Attackers can force the Turret into a situation where it must take a guess.
Submitted 13 May, 2024;
originally announced May 2024.
-
On the Effect of Quantization on Dynamic Mode Decomposition
Authors:
Dipankar Maity,
Debdipta Goswami,
Sriram Narayanan
Abstract:
Dynamic Mode Decomposition (DMD) is a widely used data-driven algorithm for estimating the Koopman Operator. This paper investigates how the estimation process is affected when the data is quantized. Specifically, we examine the fundamental connection between estimates of the operator obtained from unquantized data and those from quantized data. Furthermore, using the law of large numbers, we demonstrate that, under a large data regime, the quantized estimate can be considered a regularized version of the unquantized estimate. This key theoretical finding paves the way to accurately recover the unquantized estimate from quantized data. We also explore the relationship between the two estimates in the finite data regime. The theory is validated through repeated numerical experiments conducted on three different dynamical systems.
Submitted 2 April, 2024;
originally announced April 2024.
-
Optimal Evasion from a Sensing-Limited Pursuer
Authors:
Dipankar Maity,
Alexander Von Moll,
Daigo Shishika,
Michael Dorothy
Abstract:
This paper investigates a partial-information pursuit-evasion game in which the Pursuer has a limited-range sensor to detect the Evader. Given a fixed final time, we derive the optimal evasion strategy for the Evader to maximize its distance from the Pursuer at the end. Our analysis reveals that in certain parametric regimes, the optimal evasion strategy involves a 'risky' maneuver, where the Evader's trajectory comes extremely close to the Pursuer's sensing boundary before moving behind the Pursuer. Additionally, we explore a special case in which the Pursuer can choose the final time. In this scenario, we determine a (Nash) equilibrium pair for both the final time and the evasion strategy.
Submitted 23 January, 2024;
originally announced January 2024.
-
Efficient Communication for Pursuit-Evasion Games with Asymmetric Information
Authors:
Dipankar Maity
Abstract:
We consider a class of pursuit-evasion differential games in which the evader has continuous access to the pursuer's location, but not vice versa. There is an immobile sensor (e.g., a ground radar station) that can sense the evader's location and communicate that information intermittently to the pursuer. Transmitting the information from the sensor to the pursuer is costly and only a finite number of transmissions can happen throughout the entire game. The outcome of the game is determined by the control strategies of the players and the communication strategy between the sensor and the pursuer. We obtain the (Nash) equilibrium control strategies for both the players as well as the optimal communication strategy between the static sensor and the pursuer. We discuss a dilemma for the evader that emerges in this game. We also discuss the emergence of implicit communication where the absence of communication from the sensor can also convey some actionable information to the pursuer.
Submitted 3 July, 2023;
originally announced July 2023.
-
Optimal Intermittent Sensing for Pursuit-Evasion Games
Authors:
Dipankar Maity
Abstract:
We consider a class of pursuit-evasion differential games in which the evader has continuous access to the pursuer's location, but not vice versa. There is a remote sensor (e.g., a radar station) that can sense the evader's location upon a request from the pursuer and communicate that sensed location to the pursuer. The pursuer has a budget on the total number of sensing requests. The outcome of the game is determined by the sensing and motion strategies of the players. We obtain an equilibrium sensing strategy for the pursuer and an equilibrium motion strategy for the evader. We quantify the degradation in the pursuer's pay-off due to its sensing limitations.
Submitted 1 July, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Planning Visual Inspection Tours for a 3D Dubins Airplane Model in an Urban Environment
Authors:
Collin Hague,
Andrew Willis,
Dipankar Maity,
Artur Wolek
Abstract:
This paper investigates the problem of planning a minimum-length tour for a three-dimensional Dubins airplane model to visually inspect a series of targets located on the ground or exterior surface of objects in an urban environment. Objects are 2.5D extruded polygons representing buildings or other structures. A visibility volume defines the set of admissible (occlusion-free) viewing locations for each target that satisfy feasible airspace and imaging constraints. The Dubins traveling salesperson problem with neighborhoods (DTSPN) is extended to three dimensions with visibility volumes that are approximated by triangular meshes. Four sampling algorithms are proposed for sampling vehicle configurations within each visibility volume to define vertices of the underlying DTSPN. Additionally, a heuristic approach is proposed to improve computation time by approximating edge costs of the 3D Dubins airplane with a lower bound that is used to solve for a sequence of viewing locations. The viewing locations are then assigned pitch and heading angles based on their relative geometry. The proposed sampling methods and heuristics are compared through a Monte-Carlo experiment that simulates view planning tours over a realistic urban environment.
Submitted 12 January, 2023;
originally announced January 2023.
-
Actuator Scheduling for Linear Systems: A Convex Relaxation Approach
Authors:
Junjie Jiao,
Dipankar Maity,
John S. Baras,
Sandra Hirche
Abstract:
In this letter, we investigate the problem of actuator scheduling for networked control systems. Given a stochastic linear system with a number of actuators, we consider the case in which one actuator is activated at each time. This problem is combinatorial in nature and NP-hard to solve. We propose a convex relaxation of the actuator scheduling problem, and use its solution as a reference to design an algorithm for solving the original scheduling problem. Using dynamic programming arguments, we provide a suboptimality bound for our proposed algorithm. Furthermore, we show that our framework can be extended to incorporate actuation costs and the scheduling of multiple actuators at each time. A simulation example is provided, which shows that our proposed method outperforms a random selection approach and a greedy selection approach.
Submitted 20 May, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
Sensor Scheduling for Linear Systems: A Covariance Tracking Approach
Authors:
Dipankar Maity,
David Hartman,
John S. Baras
Abstract:
We consider the classical sensor scheduling problem for linear systems where only one sensor is activated at each time. We show that the sensor scheduling problem has a close relation to the sensor design problem and that the solution of a sensor scheduling problem can be extracted from an equivalent sensor design problem. We propose a convex relaxation of the sensor design problem and obtain a reference covariance trajectory by solving the relaxed problem. Afterwards, a covariance tracking algorithm is designed to obtain an approximate solution to the sensor scheduling problem using this reference covariance trajectory. While the sensor scheduling problem is NP-hard, the proposed framework circumvents this computational complexity by decomposing the problem into a convex sensor design problem and a covariance tracking problem. We provide theoretical justification and a sub-optimality bound for the proposed method using dynamic programming. The proposed method is validated over several experiments portraying the efficacy of the framework.
Submitted 17 October, 2021;
originally announced October 2021.
-
Multi-Agent Consensus Subject to Communication and Privacy Constraints
Authors:
Dipankar Maity,
Panagiotis Tsiotras
Abstract:
We consider a multi-agent consensus problem in the presence of adversarial agents. The adversaries are able to listen to the inter-agent communications and try to estimate the state of the agents. The agents have a limited bit-rate for communication and are required to quantize the transmitted signal in order to meet the bit-rate constraint of the communication channel. We propose a consensus protocol that is protected against the adversaries, i.e., the expected mean-square error of the adversary state estimate is lower bounded. In order to deal with the bit-rate constraint, we propose a dynamic quantization scheme that guarantees protected consensus.
Submitted 21 February, 2021;
originally announced February 2021.
-
Event-triggered Feedback Control for Signal Temporal Logic Tasks
Authors:
Lars Lindemann,
Dipankar Maity,
John S. Baras,
Dimos V. Dimarogonas
Abstract:
A framework for event-triggered control synthesis under signal temporal logic (STL) tasks is proposed. In our previous work, a continuous-time feedback control law was designed, using the prescribed performance control technique, to satisfy STL tasks. We replace this continuous-time feedback control law by an event-triggered controller. The event-triggering mechanism is based on a maximum triggering interval and on a norm bound on the difference between the value of the current state and the value of the state at the last triggering instant. Simulations of a multi-agent system quantitatively show the efficacy of using an event-triggered controller to reduce communication and computation efforts.
Submitted 25 November, 2020;
originally announced November 2020.
-
Delay-sensitive Joint Optimal Control and Resource Management in Multi-loop Networked Control Systems
Authors:
Mohammad H. Mamduhi,
Dipankar Maity,
Sandra Hirche,
John S. Baras,
Karl H. Johansson
Abstract:
In the operation of networked control systems, where multiple processes share a resource-limited and time-varying cost-sensitive network, communication delay is inevitable and primarily influenced by, first, the control systems deploying intermittent sensor sampling to reduce the communication cost by restricting non-urgent transmissions, and second, the network performing resource management to minimize excessive traffic and eventually data loss. In a heterogeneous scenario, where control systems may tolerate only specific levels of sensor-to-controller latency, delay sensitivities need to be considered in the design of control and network policies to achieve the desired performance guarantees. We propose a cross-layer optimal co-design of control, sampling and resource management policies for a networked control system (NCS) consisting of multiple stochastic linear time-invariant systems which close their sensor-to-controller loops over a shared network. Aligned with advanced communication technology, we assume that the network offers a range of latency-varying transmission services for given prices. Local samplers decide either to pay a higher cost to access a low-latency channel, or to delay sending a state sample at a reduced price. A resource manager residing in the network data-link layer arbitrates channel access and re-allocates resources if link capacities are exceeded. The performance of the local closed-loop systems is measured by a combination of linear-quadratic Gaussian cost and a suitable communication cost, and the overall objective is to minimize a defined social cost by all three policy makers. We derive optimal control, sampling and resource allocation policies under different cross-layer awareness models, including constant and time-varying parameters, and show that higher awareness generally leads to performance enhancement at the expense of higher computational complexity.
Submitted 15 July, 2020;
originally announced July 2020.
-
Bounded-Rational Pursuit-Evasion Games
Authors:
Yue Guan,
Dipankar Maity,
Christopher M. Kroninger,
Panagiotis Tsiotras
Abstract:
We present a framework that incorporates the idea of bounded rationality into dynamic stochastic pursuit-evasion games. The solution of a stochastic game is characterized, in general, by its (Nash) equilibria in feedback form. However, computing these Nash equilibrium strategies may require extensive computational resources. In this paper, the agents are modeled as bounded rational entities having limited computational resources. We illustrate the framework by applying it to a pursuit-evasion game between two vehicles in a stochastic wind field, where both the pursuer and the evader are bounded rational. We show how such a game may be analyzed by properly casting it as an iterative sequence of finite-state Markov Decision Processes (MDPs). Leveraging tools and algorithms from cognitive hierarchy theory ("level-$k$ thinking"), we compute the solution of the ensuing discrete game, while taking into consideration the rationality level of each agent. We also present an online algorithm for each agent to infer its opponent's rationality level.
Submitted 15 March, 2020;
originally announced March 2020.
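As a rough illustration of the "level-$k$ thinking" mentioned in the abstract above, the following sketch applies the idea to a one-shot zero-sum matrix game rather than to the sequence of MDPs used in the paper. It is a simplification under stated assumptions: the function name `level_k_policies`, the payoff matrix, and the choice of a uniformly random level-0 agent are all hypothetical and are not taken from the paper.

```python
def level_k_policies(payoff, max_level):
    """Level-k policies for a zero-sum matrix game (illustrative sketch).

    payoff[i][j] is the payoff to the row player (maximizer) when the row
    player picks i and the column player (minimizer) picks j.
    Level 0 plays uniformly at random (an assumption); a level-k agent
    best-responds to a level-(k-1) opponent.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row = [[1.0 / n_rows] * n_rows]   # row[k] is the level-k row policy
    col = [[1.0 / n_cols] * n_cols]   # col[k] is the level-k column policy
    for k in range(1, max_level + 1):
        prev_row, prev_col = row[k - 1], col[k - 1]
        # Expected payoff of each pure action against the level-(k-1) opponent.
        row_values = [sum(payoff[i][j] * prev_col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [sum(payoff[i][j] * prev_row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        best_i = max(range(n_rows), key=row_values.__getitem__)
        best_j = min(range(n_cols), key=col_values.__getitem__)
        row.append([1.0 if i == best_i else 0.0 for i in range(n_rows)])
        col.append([1.0 if j == best_j else 0.0 for j in range(n_cols)])
    return row, col


# Hypothetical 2x2 payoff matrix, used only for illustration.
example_payoff = [[4, 0], [1, 2]]
row_policies, col_policies = level_k_policies(example_payoff, max_level=2)
```

Each level only requires a best response to a fixed opponent policy, which is what makes the hierarchy cheaper than computing an equilibrium directly.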
-
Optimal Controller Synthesis and Dynamic Quantizer Switching for Linear-Quadratic-Gaussian Systems
Authors:
Dipankar Maity,
Panagiotis Tsiotras
Abstract:
In networked control systems, often the sensory signals are quantized before being transmitted to the controller. Consequently, performance is affected by the coarseness of this quantization process. Modern communication technologies allow users to obtain resolution-varying quantized measurements based on the prices paid. In this paper, we consider optimal controller synthesis of a Quantized-Feedback Linear-Quadratic-Gaussian (QF-LQG) system where the measurements are to be quantized before being transmitted to the controller. The system is presented with several choices of quantizers, along with the cost of operating each quantizer. The objective is to jointly select the quantizers and the controller that would maintain an optimal balance between control performance and quantization cost. Under certain assumptions, this problem can be decoupled into two optimization problems: one for optimal controller synthesis and the other for optimal quantizer selection. We show that, similarly to the classical LQG problem, the optimal controller synthesis subproblem is characterized by Riccati equations. On the other hand, the optimal quantizer selection policy is found by solving a certain Markov-Decision-Process (MDP).
Submitted 31 January, 2020;
originally announced January 2020.
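The abstract above states that the controller synthesis subproblem is characterized by Riccati equations, as in classical LQG. As a minimal sketch of what such a recursion looks like, the snippet below runs the standard finite-horizon discrete-time Riccati recursion for a scalar, fully observed linear-quadratic problem with no quantization. This is the textbook recursion, not the paper's QF-LQG equations; the function name `scalar_riccati` and all parameter names are assumptions made for illustration.

```python
def scalar_riccati(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for x[k+1] = a*x[k] + b*u[k] + noise.

    Stage cost is q*x^2 + r*u^2 and terminal cost is q_final*x^2.
    Returns (costs, gains) where costs[k] is the cost-to-go weight P_k
    and gains[k] is the feedback gain K_k in u[k] = -K_k * x[k].
    """
    p = q_final
    costs = [p]
    gains = []
    for _ in range(horizon):
        # Gain at the current step uses the next step's cost-to-go weight.
        gain = a * b * p / (r + b * b * p)
        # Riccati update: P_k = q + a^2 P_{k+1} - (a b P_{k+1})^2 / (r + b^2 P_{k+1}).
        p = q + a * a * p - a * b * p * gain
        gains.append(gain)
        costs.append(p)
    # The recursion ran backward in time, so reverse to index by time step.
    costs.reverse()
    gains.reverse()
    return costs, gains


# Hypothetical parameters, chosen only to exercise the recursion.
costs, gains = scalar_riccati(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)
```

Because the recursion depends only on the model and cost parameters, it can be run offline before any measurements arrive.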
-
C-DOC: Co-State Desensitized Optimal Control
Authors:
Venkata Ramana Makkapati,
Dipankar Maity,
Mehregan Dor,
Panagiotis Tsiotras
Abstract:
In this paper, co-states are used to develop a framework that desensitizes the optimal cost. A general formulation for an optimal control problem with fixed final time is considered. The proposed scheme involves elevating the parameters of interest into states, and further augmenting the co-state equations of the optimal control problem to the dynamical model. A running cost that penalizes the co-states of the targeted parameters is then added to the original cost function. The solution obtained by minimizing the augmented cost yields a control which reduces the dispersion of the original cost with respect to parametric variations. The relationship between co-states and the cost-to-go function, for any given control law, is established substantiating the approach. Numerical examples and Monte-Carlo simulations that demonstrate the proposed scheme are discussed.
Submitted 30 September, 2019;
originally announced October 2019.
-
Optimal Controller and Quantizer Selection for Partially Observable Linear-Quadratic-Gaussian Systems
Authors:
Dipankar Maity,
Panagiotis Tsiotras
Abstract:
In networked control systems, often the sensory signals are quantized before being transmitted to the controller. Consequently, performance is affected by the coarseness of this quantization process. Modern communication technologies allow users to obtain resolution-varying quantized measurements based on the prices paid. In this paper, we consider joint optimal controller synthesis and quantizer scheduling for a partially observed Quantized-Feedback Linear-Quadratic-Gaussian (QF-LQG) system, where the measurements are quantized before being sent to the controller. The system is presented with several choices of quantizers, along with the cost of using each quantizer. The objective is to jointly select the quantizers and synthesize the controller to strike an optimal balance between control performance and quantization cost. When the innovation signal is quantized instead of the measurement, the problem is decoupled into two optimization problems: one for optimal controller synthesis, and the other for optimal quantizer selection. The optimal controller is found by solving a Riccati equation and the optimal quantizer selection policy is found by solving a linear program (LP), both of which can be solved offline.
Submitted 7 November, 2021; v1 submitted 30 September, 2019;
originally announced September 2019.
-
Linear Quadratic Games with Costly Measurements
Authors:
Dipankar Maity,
Achilleas Anastasopoulos,
John S. Baras
Abstract:
In this work, we consider a stochastic linear quadratic two-player game. The state measurements are observed through a switched noiseless communication link. Each player incurs a finite cost every time the link is established to get measurements. Along with the usual control action, each player is equipped with a switching action to control the communication link. The measurements help to improve the estimate and hence reduce the quadratic cost, but at the same time the cost is increased due to switching. We study the subgame perfect equilibrium control and switching strategies for the players. We show that the problem can be solved in a two-step process by solving two dynamic programming problems. The first step solves a dynamic program for the control strategy, and the second step solves another dynamic program for the switching strategy.
Submitted 20 September, 2017;
originally announced September 2017.
-
Timed Automata Approach for Motion Planning Using Metric Interval Temporal Logic
Authors:
Yuchen Zhou,
Dipankar Maity,
John S. Baras
Abstract:
In this paper, we consider the robot motion (or task) planning problem under given time-bounded high-level specifications. We use metric interval temporal logic (MITL), a member of the temporal logic family, to represent the task specification. We then provide a constructive way to generate a timed automaton, along with methods to search for accepting runs on the automaton in order to find a feasible motion (or path) sequence for the robot to complete the task.
Submitted 28 March, 2016; v1 submitted 27 March, 2016;
originally announced March 2016.
-
Optimal Mission Planner with Timed Temporal Logic Constraints
Authors:
Yuchen Zhou,
Dipankar Maity,
John S. Baras
Abstract:
In this paper, we present an optimization-based method for path planning of a mobile robot subject to time-bounded temporal constraints in a dynamic environment. Temporal logic (TL) can address very complex task specifications such as safety, coverage, and motion sequencing. We use metric temporal logic (MTL) to encode the task specifications with timing constraints. We then translate the MTL formulae into mixed integer linear constraints and solve the associated optimization problem using a mixed integer linear program solver. This approach differs from automata-based methods, which generate a finite abstraction of the environment and dynamics and use an automata-theoretic approach to formally generate a path that satisfies the TL specification. We have applied our approach to several case studies in complex dynamical environments subject to timed temporal specifications.
Submitted 5 October, 2015;
originally announced October 2015.