Deceptive Path Planning: A Bayesian Game Approach

Authors: Violetta Rostobaya, James Berneburg, Yue Guan, Michael Dorothy, Daigo Shishika

Abstract: This paper investigates how an autonomous agent can transmit information through its motion in an adversarial setting. We consider scenarios where an agent must reach its goal while deceiving an intelligent observer about its destination. We model this interaction as a dynamic Bayesian game between a mobile Attacker with a privately known goal and a Defender who infers the Attacker's intent to allocate defensive resources effectively. We use Perfect Bayesian Nash Equilibrium (PBNE) as our solution concept and propose a computationally efficient approach to find it. In the resulting equilibrium, the Defender employs a simple Markovian strategy, while the Attacker strategically balances deception and goal efficiency by stochastically mixing shortest and non-shortest paths to manipulate the Defender's beliefs. Numerical experiments demonstrate the advantages of our PBNE-based strategies over existing methods based on one-sided optimization.

Submitted 16 June, 2025; originally announced June 2025.

Comments: 8 pages, 9 figures. This work has been submitted to the IEEE for possible publication

arXiv:2504.13288 [pdf, other]

Integrated Control and Active Perception in POMDPs for Temporal Logic Tasks and Information Acquisition

Authors: Chongyang Shi, Michael R. Dorothy, Jie Fu

Abstract: This paper studies the synthesis of a joint control and active perception policy for a stochastic system modeled as a partially observable Markov decision process (POMDP), subject to temporal logic specifications. The POMDP actions influence both system dynamics (control) and the emission function (perception). Beyond task completion, the planner seeks to maximize information gain about certain temporal events (the secret) through coordinated perception and control. To enable active information acquisition, we introduce minimizing the Shannon conditional entropy of the secret as a planning objective, alongside maximizing the probability of satisfying the temporal logic formula within a finite horizon. Using a variant of observable operators in hidden Markov models (HMMs) and POMDPs, we establish key properties of the conditional entropy gradient with respect to policy parameters. These properties facilitate efficient policy gradient computation. We validate our approach through graph-based examples, inspired by common security applications with UAV surveillance.

Submitted 17 April, 2025; originally announced April 2025.

arXiv:2409.16439 [pdf, other]

Active Perception with Initial-State Uncertainty: A Policy Gradient Method

Authors: Chongyang Shi, Shuo Han, Michael Dorothy, Jie Fu

Abstract: This paper studies the synthesis of an active perception policy that maximizes the information leakage of the initial state in a stochastic system modeled as a hidden Markov model (HMM). Specifically, the emission function of the HMM is controllable with a set of perception or sensor query actions. Given the goal is to infer the initial state from partial observations in the HMM, we use Shannon conditional entropy as the planning objective and develop a novel policy gradient method with convergence guarantees. By leveraging a variant of observable operators in HMMs, we prove several important properties of the gradient of the conditional entropy with respect to the policy parameters, which allow efficient computation of the policy gradient and stable and fast convergence. We demonstrate the effectiveness of our solution by applying it to an inference problem in a stochastic grid world environment.

Submitted 24 September, 2024; originally announced September 2024.

arXiv:2409.09302 [pdf, other]

Heterogeneous Roles against Assignment Based Policies in Two vs Two Target Defense Game

Authors: Goutam Das, Violetta Rostobaya, James Berneburg, Zachary I. Bell, Michael Dorothy, Daigo Shishika

Abstract: In this paper, we consider a target defense game in which the attacker team seeks to reach a high-value target while the defender team seeks to prevent that by capturing them away from the target. To address the curse of dimensionality, a popular approach to solving such a team-vs-team game is to decompose it into a set of one-vs-one games. Such an approximation assumes independence between teammates assigned to different one-vs-one games, ignoring the possibility of a richer set of cooperative behaviors, ultimately leading to suboptimality. In this paper, we provide teammate-aware strategies for the attacker team and show that they can outperform the assignment-based strategy if the defenders still employ an assignment-based strategy. More specifically, the attacker strategy involves heterogeneous roles where one attacker actively intercepts a defender to help its teammate reach the target. We provide sufficient conditions under which such a strategy benefits the attackers, and we validate the results using numerical simulations.

Submitted 14 September, 2024; originally announced September 2024.

Comments: 7 pages, 7 figures, 2024 CDC Final Submission

arXiv:2405.07465 [pdf, other]

Deception in Differential Games: Information Limiting Strategy to Induce Dilemma

Authors: Daigo Shishika, Alexander Von Moll, Dipankar Maity, Michael Dorothy

Abstract: Can deception exist in differential games? We provide a case study for a Turret-Attacker differential game, where two Attackers seek to score points by reaching a target region while a Turret tries to minimize the score by aligning itself with the Attackers before they reach the target. In contrast to the original problem solved with complete information, we assume that the Turret only has partial information about the maximum speed of the Attackers. We investigate whether there is any incentive for the Attackers to move slower than their maximum speed in order to "deceive" the Turret into taking suboptimal actions. We first describe the existence of a dilemma that the Turret may face. Then we derive a set of initial conditions from which the Attackers can force the Turret into a situation where it must take a guess.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2401.12848 [pdf, other]

Optimal Evasion from a Sensing-Limited Pursuer

Authors: Dipankar Maity, Alexander Von Moll, Daigo Shishika, Michael Dorothy

Abstract: This paper investigates a partial-information pursuit-evasion game in which the Pursuer has a limited-range sensor to detect the Evader. Given a fixed final time, we derive the optimal evasion strategy for the Evader to maximize its distance from the Pursuer at the end. Our analysis reveals that in certain parametric regimes, the optimal evasion strategy involves a 'risky' maneuver, where the Evader's trajectory comes extremely close to the Pursuer's sensing boundary before moving behind the Pursuer. Additionally, we explore a special case in which the Pursuer can choose the final time. In this scenario, we determine a (Nash) equilibrium pair for both the final time and the evasion strategy.

Submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted for presentation at, and publication in the proceedings of, the 2024 American Control Conference

arXiv:2311.03338 [pdf, other]

Defending a Static Target Point with a Slow Defender

Authors: Goutam Das, Michael Dorothy, Zachary I. Bell, Daigo Shishika

Abstract: This paper studies a target-defense game played between a slow defender and a fast attacker. The attacker wins the game if it reaches the target while avoiding the defender's capture disk. The defender wins the game by preventing the attacker from reaching the target, which includes reaching the target and containing it in the capture disk. Depending on the initial condition, the attacker must circumnavigate the defender's capture disk, resulting in a constrained trajectory. This condition produces three phases of the game, which we analyze to solve for the game of kind. We provide the barrier surface that divides the state space into attacker-win and defender-win regions, and present the corresponding strategies that guarantee a win in each region. Numerical experiments demonstrate the theoretical results as well as the efficacy of the proposed strategies.

Submitted 16 March, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

Comments: 8 pages, 12 figures, Accepted for Publication to IEEE ACC 2024

arXiv:2306.03877 [pdf, other]

The Eater and the Mover Game

Authors: Violetta Rostobaya, Yue Guan, James Berneburg, Michael Dorothy, Daigo Shishika

Abstract: This paper studies the idea of "deception by motion" through a two-player dynamic game played between a Mover who must retrieve resources at a goal location, and an Eater who can consume resources at two candidate goals. The Mover seeks to minimize the resource consumption at the true goal, and the Eater tries to maximize it. While the Mover knows the true goal, the Eater cannot differentiate between the two candidates. Unlike existing works on deceptive motion control that measure deceptiveness through the quality of inference made by a distant observer (an estimator), we incorporate their actions to directly measure the efficacy of deception through the outcome of the game. An equilibrium concept is then proposed without the notion of an estimator. We further identify a pair of equilibrium strategies and demonstrate that if the Eater optimizes for the worst-case scenario, hiding the intention (deception by ambiguity) is still effective, whereas trying to fake the true goal (deception by exaggeration) is not.

Submitted 6 June, 2023; originally announced June 2023.

Comments: Submitted to the IEEE Control Systems Letters (L-CSS), 2023

arXiv:2209.11664 [pdf, other]

A Constraint-Driven Approach to Line Flocking: The V Formation as an Energy-Saving Strategy

Authors: Logan E. Beaver, Christopher Kroninger, Michael Dorothy, Andreas A. Malikopoulos

Abstract: The study of robotic flocking has received significant attention in the past twenty years. In this article, we present a constraint-driven control algorithm that minimizes the energy consumption of individual agents and yields an emergent V formation. As the formation emerges from the decentralized interaction between agents, our approach is robust to the spontaneous addition or removal of agents. First, we present an analytical model for the trailing upwash behind a fixed-wing UAV, and we derive the optimal airspeed for trailing UAVs to maximize their travel endurance. Next, we prove that simply flying at the optimal airspeed will never lead to emergent flocking behavior, and we propose a new decentralized "anseroid" behavior that yields emergent V formations. We encode these behaviors in a constraint-driven control algorithm that minimizes the locomotive power of each UAV. Finally, we prove that UAVs initialized in an approximate V or echelon formation will converge under our proposed control law, and we demonstrate that this emergence occurs in real time in simulation and in physical experiments with a fleet of Crazyflie quadrotors.

Submitted 23 September, 2022; originally announced September 2022.

Comments: 12 pages, 7 figures

arXiv:2209.09318 [pdf, other]

Guarding a Non-Maneuverable Translating Line with an Attached Defender

Authors: Goutam Das, Michael Dorothy, Zachary I. Bell, Daigo Shishika

Abstract: In this paper we consider a target-guarding differential game where the defender must protect a linearly translating line segment by intercepting an attacker who tries to reach it. In contrast to common target-guarding problems, we assume that the defender is attached to the target and moves along with it. This assumption affects the defender's maximum speed in the inertial frame, which depends on the target's direction of motion. Zero-sum differential games of degree for both the attacker-win and defender-win scenarios are studied, where the payoff is defined to be the distance between the two agents at the time of game termination. We derive the equilibrium strategies and the Value function by leveraging the solution for the infinite-length target scenario. The zero-level set of this Value function provides the barrier surface that divides the state space into defender-win and attacker-win regions. We present simulation results to demonstrate the theoretical results.

Submitted 19 September, 2022; originally announced September 2022.

Comments: 8 pages, 8 figures. arXiv admin note: text overlap with arXiv:2207.04098

arXiv:2204.04176 [pdf, other]

Path Defense in Dynamic Defender-Attacker Blotto Games (dDAB) with Limited Information

Authors: Austin K. Chen, Bryce L. Ferguson, Daigo Shishika, Michael Dorothy, Jason R. Marden, George J. Pappas, Vijay Kumar

Abstract: We consider a path guarding problem in dynamic Defender-Attacker Blotto games (dDAB), where a team of robots must defend a path in a graph against adversarial agents. Multi-robot systems are particularly well suited to this application, as recent work has shown the effectiveness of these systems in related areas such as perimeter defense and surveillance. When designing a defender policy that guarantees the defense of a path, information about the adversary and the environment can be helpful and may reduce the number of resources required by the defender to achieve a sufficient level of security. In this work, we characterize the necessary and sufficient number of assets needed to guarantee the defense of a shortest path between two nodes in dDAB games when the defender can only detect assets within $k$ hops of a shortest path. By characterizing the relationship between sensing horizon and required resources, we show that increasing the sensing capability of the defender greatly reduces the number of defender assets needed to defend the path.

Submitted 25 May, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

arXiv:2112.09890 [pdf, other]

Dynamic Defender-Attacker Blotto Game

Authors: Daigo Shishika, Yue Guan, Michael Dorothy, Vijay Kumar

Abstract: This work studies a dynamic, adversarial resource allocation problem in environments modeled as graphs. A blue team of defender robots is deployed in the environment to protect the nodes from a red team of attacker robots. We formulate the engagement as a discrete-time dynamic game, where the robots can move at most one hop in each time step. The game terminates with the attacker's win if any location has more attacker robots than defender robots at any time. The goal is to identify dynamic resource allocation strategies, as well as the conditions that determine the winner: graph structure, available resources, and initial conditions. We analyze the problem using reachable sets and show how the outdegree of the underlying graph directly influences the difficulty of the defending task. Furthermore, we provide algorithms that identify sufficiency of the attacker's victory.

Submitted 18 December, 2021; originally announced December 2021.

Showing 1–12 of 12 results for author: Dorothy, M