-
Social herding in mean field games
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a mean field model of social behavior where there are an infinite number of players, each of whom observes a type privately that represents her preference, and publicly observes a mean field state of types and actions of the players in the society. The types (and equivalently preferences) of the players are dynamically evolving. Each player is fully rational and forward-looking and makes a decision in each round t to buy a product. She receives a higher utility if the product she bought is aligned with her current preference and if there is a higher fraction of people who bought that product (thus a game of strategic complementarity). We show that for certain parameters, when the weight of strategic complementarity is high, players eventually herd towards one of the actions with probability 1, that is, each player buys a product irrespective of her preference.
Submitted 6 March, 2023;
originally announced March 2023.
-
Mean field teams and games with correlated types
Authors:
Deepanshu Vasal
Abstract:
Mean field games have traditionally been defined~[1,2] as a model of large scale interaction of players where each player has a private type that is independent across the players. In this paper, we introduce a new model of mean field teams and games with \emph{correlated types} where there is a large population of homogeneous players sequentially making strategic decisions and each player is affected by other players through an aggregate population state. Each player has a private type that only she observes, and the types of any $N$ players are correlated through a kernel $Q$. All players commonly observe a correlated mean-field population state which represents the empirical distribution of any $N$ players' correlated joint types. We define the Mean-Field Team Optimal (MFTO) strategies as strategies of the players that maximize the total expected joint reward of the players. We also define a Mean-Field Equilibrium (MFE) in such games as a solution of the coupled Bellman dynamic programming backward equation and Fokker-Planck forward equation of the correlated mean field state, where a player's strategy in an MFE depends on both her private type and the current correlated mean field population state. We present sufficient conditions for the existence of such equilibria. We also present a backward recursive methodology, equivalent to the master equation, to compute all MFTO strategies and MFEs of the team and game respectively. Each step in this methodology consists of solving an optimization problem for the team problem and a fixed-point equation for the game. We provide sufficient conditions that guarantee the existence of a solution to this fixed-point equation for the game for each time $t$.
Submitted 20 October, 2022;
originally announced October 2022.
-
Master equation of discrete-time Stackelberg mean field games with multiple leaders
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a discrete-time Stackelberg mean field game with a finite number of leaders, a finite number of major followers and an infinite number of minor followers. The leaders and the followers each observe types privately that evolve as conditionally independent controlled Markov processes. The leaders are of the "Stackelberg" kind, which means they commit to a dynamic policy. We consider two types of followers: major and minor, each with a private type. All the followers best respond to the policies of the Stackelberg leaders and each other. Knowing that the followers would play a mean field game (with major players) based on their policy, each (Stackelberg) leader chooses a policy that maximizes her reward. We refer to the resulting outcome as a Stackelberg mean field equilibrium with multiple leaders (SMFE-ML). In this paper, we provide a master equation of this game that allows one to compute all SMFE-ML. We further extend this notion to the case when there are an infinite number of leaders.
Submitted 7 September, 2022;
originally announced September 2022.
-
Master Equation for Discrete-Time Stackelberg Mean Field Games with single leader
Authors:
Deepanshu Vasal,
Randall Berry
Abstract:
In this paper, we consider a discrete-time Stackelberg mean field game with a leader and an infinite number of followers. The leader and the followers each observe types privately that evolve as conditionally independent controlled Markov processes. The leader commits to a dynamic policy and the followers best respond to that policy and each other. Knowing that the followers would play a mean field game based on her policy, the leader chooses a policy that maximizes her reward. We refer to the resulting outcome as a Stackelberg mean field equilibrium (SMFE). In this paper, we provide a master equation of this game that allows one to compute all SMFE. Based on our framework, we consider two numerical examples. First, we consider an epidemic model where the followers get infected based on the mean field population. The leader chooses subsidies for a vaccine to maximize social welfare and minimize vaccination costs. In the second example, we consider a technology adoption game where the followers decide to adopt a technology or a product and the leader decides the cost of one product that maximizes her returns, which are proportional to the people adopting that technology.
Submitted 15 January, 2022;
originally announced January 2022.
-
A dynamic program for linear sequential coding for Gaussian MAC with noisy feedback
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a two-user multiple access channel with noisy feedback. There are two senders with independent messages who transmit symbols across an additive white Gaussian channel to a receiver, who in turn sends back a symbol which is received by the two senders through two independent noisy Gaussian channels. We consider the case when the feedback is active, i.e., the receiver actively encodes the feedback using a linear state process. We pose this as a problem of linear sequential coding at the senders and the receiver to minimize the terminal mean square probability of error at the receiver. This is an instance of decentralized control with no common information at the senders and the receiver. In this paper, we construct two linear controllers at the sender and the receiver. Due to linearity of the policies and the controllers, all the random variables involved are jointly Gaussian. Moreover, the corresponding covariance matrix at the receiver of the estimation process of the senders' messages is a deterministic process, which is a function of the parameters of the controllers and the strategies of the players, and is thus perfectly observed by the senders. Based on this observation, we use deterministic dynamic programming to find the optimal policies and the optimal linear controllers at both the senders and the receiver. The problem with passive feedback can be considered as a special case.
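The observation that the estimation covariance evolves deterministically under linear Gaussian schemes can be illustrated with a much simpler setting than the one in the abstract. The sketch below is not the paper's construction: it assumes a single scalar Gaussian message `theta`, a hypothetical sender that transmits `a_t * theta` with fixed gains, and a receiver that runs a scalar Kalman update. All names (`posterior_variances`, `run_filter`, `gains`) are illustrative.

```python
import random

def posterior_variances(gains, prior_var, noise_var):
    """Variance recursion computed without any observations."""
    variances = [prior_var]
    p = prior_var
    for a in gains:
        # Scalar Gaussian update for y = a * theta + noise.
        p = p * noise_var / (a * a * p + noise_var)
        variances.append(p)
    return variances

def run_filter(gains, prior_var, noise_var, seed):
    """Simulate one noisy run; return the estimate and the variance trace."""
    rng = random.Random(seed)
    theta = rng.gauss(0.0, prior_var ** 0.5)  # hidden Gaussian message
    est, p = 0.0, prior_var
    trace = [p]
    for a in gains:
        y = a * theta + rng.gauss(0.0, noise_var ** 0.5)
        k = a * p / (a * a * p + noise_var)   # Kalman gain
        est = est + k * (y - a * est)         # estimate depends on the noise
        p = (1.0 - k * a) * p                 # variance does not
        trace.append(p)
    return est, trace

gains = [1.0, 1.0, 1.0]
est_a, trace_a = run_filter(gains, 1.0, 1.0, seed=1)
est_b, trace_b = run_filter(gains, 1.0, 1.0, seed=2)
offline = posterior_variances(gains, 1.0, 1.0)
```

In this toy setting the two runs produce different estimates but identical variance traces, which match the offline recursion. That is the property that lets the variance serve as a state known to every party in a deterministic dynamic program.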
Submitted 17 December, 2021;
originally announced December 2021.
-
Linear Coding for AWGN channels with Noisy Output Feedback via Dynamic Programming
Authors:
Rajesh Mishra,
Deepanshu Vasal,
Hyeji Kim
Abstract:
The optimal coding scheme for communicating a Gaussian message over an Additive White Gaussian Noise (AWGN) channel with AWGN output feedback and a limited number of transmissions is unknown. Even if we restrict the scope of the coding scheme to linear schemes, deriving the optimal coding scheme is still a challenging task. The state-of-the-art linear scheme for channels with noisy feedback is by Chance and Love, where the coefficients of the linear scheme are numerically optimized based on unique observations [1]. In this paper, we introduce a new class of sequential linear schemes for this channel by introducing a novel linear state process at the transmitter, and derive the optimal sequential scheme within this class in closed form by formulating a novel Dynamic Program (DP). We empirically show that our scheme outperforms the state-of-the-art linear scheme in [1] for noisy feedback and coincides with the SK scheme for noiseless feedback. We also show that in communicating message bits as opposed to a Gaussian message, a learning-based approach further improves the reliability of sequential linear schemes. This problem is an instance of decentralized control without any common information and, to the best of our knowledge, the first such scenario where we can derive analytical solutions using a DP.
Submitted 23 May, 2022; v1 submitted 17 March, 2021;
originally announced March 2021.
-
Dynamic information design
Authors:
Deepanshu Vasal
Abstract:
We consider the problem of dynamic information design with one sender and one receiver where the sender observes a private state of the system and takes an action to send a signal based on its observation to a receiver. Based on this signal, the receiver takes an action that determines rewards for both the sender and the receiver and controls the state of the system. In this technical note, we show that this problem can be considered as a dynamic game of asymmetric information, and that its perfect Bayesian equilibrium (PBE) and Stackelberg equilibrium (SE) can be analyzed using the algorithms presented in [1], [2] by the same author (among others). We then extend this model to the case of one sender and multiple receivers and provide algorithms to compute a class of equilibria of this game.
Submitted 13 May, 2020;
originally announced May 2020.
-
Existence of structured perfect Bayesian equilibrium in dynamic games of asymmetric information
Authors:
Deepanshu Vasal
Abstract:
In~[1], the authors considered a general finite horizon model of a dynamic game of asymmetric information, where N players have types evolving as independent Markovian processes, and where each player observes its own type perfectly and the actions of all players. The authors present a sequential decomposition algorithm to find all structured perfect Bayesian equilibria of the game. The algorithm consists of solving a class of fixed-point equations for each $t, \pi_t$, for which the existence of a solution was left as an open question. In this paper, we prove existence of solutions to these fixed-point equations for compact metric spaces.
Submitted 29 May, 2020; v1 submitted 12 May, 2020;
originally announced May 2020.
-
Model-free Reinforcement Learning for Non-stationary Mean Field Games
Authors:
Rajesh K Mishra,
Deepanshu Vasal,
Sriram Vishwanath
Abstract:
In this paper, we consider a finite horizon, non-stationary mean field game (MFG) with a large population of homogeneous players sequentially making strategic decisions, where each player is affected by other players through an aggregate population state termed the mean field state. Each player has a private type that only it can observe, and a mean field population state representing the empirical distribution of other players' types, which is shared among all of them. Recently, the authors in [1] provided a sequential decomposition algorithm to compute mean field equilibrium (MFE) for such games, which allows equilibrium policies to be computed in linear time rather than exponential time, as before. In this paper, we extend it to the case when state transitions are not known, and propose a reinforcement learning algorithm based on Expected Sarsa with a policy gradient approach that learns the MFE policy by learning the dynamics of the game simultaneously. We illustrate our results using a cyber-physical security example.
Submitted 4 April, 2020;
originally announced April 2020.
-
Decentralized multi-agent reinforcement learning with shared actions
Authors:
Rajesh K Mishra,
Deepanshu Vasal,
Sriram Vishwanath
Abstract:
In this paper, we propose a novel model-free reinforcement learning algorithm to compute the optimal policies for a multi-agent system with $N$ cooperative agents where each agent privately observes its own type and publicly observes the other agents' actions. The goal is to maximize their collective reward. The problem belongs to the broad class of decentralized control problems with partial information. We use the common agent approach wherein a fictitious common agent picks the best policy based on a belief on the current states of the agents. These beliefs are updated individually for each agent from their current belief and action histories. Updating the belief state without knowledge of the system dynamics is a challenge. In this paper, we employ a particle filter known as the bootstrap filter, distributed across agents, to update the belief. We provide a model-free reinforcement learning (RL) method for this multi-agent partially observable Markov decision process using the particle filter and sampled trajectories to estimate the optimal policies for the agents. We showcase our results with the help of a smart grid application where the users strive to reduce the collective cost of power for all the agents in the grid. Finally, we compare the performance of the model-based and model-free implementations of the RL algorithm, establishing the effectiveness of the particle filter (PF) method.
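For readers unfamiliar with the bootstrap filter mentioned above, the following is a minimal generic sketch of one propagate, weight, and resample step. It is not the paper's distributed, model-free procedure: it assumes access to a hypothetical transition sampler `sample_next` and observation likelihood `likelihood`, and the two-state example (a hidden type that persists with probability 0.9 and is observed correctly with probability 0.8) is invented purely for illustration.

```python
import random

def bootstrap_filter_step(particles, sample_next, likelihood, observation, rng):
    """One bootstrap particle filter step: propagate, weight, resample."""
    # Propagate every particle through the (assumed) transition sampler.
    proposed = [sample_next(x, rng) for x in particles]
    # Weight each particle by how well it explains the new observation.
    weights = [likelihood(observation, x) for x in proposed]
    if sum(weights) == 0.0:
        return proposed  # degenerate case: keep the unweighted proposals
    # Resample with replacement in proportion to the weights.
    return rng.choices(proposed, weights=weights, k=len(proposed))

# Hypothetical two-state model, used only to exercise the step above.
def sample_next(x, rng):
    return x if rng.random() < 0.9 else 1 - x

def likelihood(obs, x):
    return 0.8 if obs == x else 0.2

rng = random.Random(0)
particles = [rng.randint(0, 1) for _ in range(2000)]
for obs in [1, 1, 1, 1, 1]:
    particles = bootstrap_filter_step(particles, sample_next, likelihood, obs, rng)

# Fraction of particles in state 1 approximates the belief P(type = 1 | observations).
belief_one = sum(particles) / len(particles)
```

The belief is represented only by the particle population, so no explicit Bayes update over the state space is needed; this is what makes the approach attractive when the dynamics can be sampled but not written down.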
Submitted 23 March, 2020;
originally announced March 2020.
-
Sequential decomposition of discrete memoryless channel with noisy feedback
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider a discrete memoryless point-to-point channel with noisy feedback, where there is a sender with a private message that she wants to communicate to a receiver by sequentially transmitting symbols over a noisy channel. After each transmission, she receives noisy feedback of the symbol received by the receiver. The goal is to design a transmission control strategy of the sender that minimizes the average probability of error. This is an instance of decentralized control of information where the two controllers, the sender and the receiver, have no common information. There exists no methodology in the literature that provides a notion of "state" and a dynamic program to find optimal policies for this problem. In this paper, we introduce a notion of state, based on which we provide a sequential decomposition methodology that finds optimum policies within the class of Markov strategies with respect to this state (which need not be globally optimum). This allows one to decompose the problem across time and reduce the dependence of the complexity on time from double exponential to linear.
Submitted 21 February, 2020;
originally announced February 2020.
-
Master equation of discrete time graphon mean field games and teams
Authors:
Deepanshu Vasal,
Rajesh K Mishra,
Sriram Vishwanath
Abstract:
In this paper, we present a sequential decomposition algorithm, equivalent to the master equation, to compute graphon mean field equilibria (GMFE) of graphon mean field games (GMFG) and graphon optimal Markovian policies (GOMPs) of graphon mean field teams (GMFTs). We consider a large population of players sequentially making strategic decisions where the actions of each player affect their neighbors, which is captured in a graph generated by a known graphon. Each player observes a private state and also common information in the form of a graphon mean-field population state, which represents the empirical networked distribution of other players' types. We consider non-stationary population state dynamics and present a novel backward recursive algorithm to compute both GMFE and GOMP that depend on both a player's private type and the current (dynamic) population state determined through the graphon. Each step in computing GMFE consists of solving a fixed-point equation, while computing GOMP involves solving an optimization problem. We provide conditions on model parameters for which there exists such a GMFE. Using this algorithm, we obtain the GMFE and GOMP for a specific security setup in cyber-physical systems for different graphons that capture the interactions between the nodes in the system.
Submitted 7 June, 2022; v1 submitted 15 January, 2020;
originally announced January 2020.
-
Markov perfect equilibria in non-stationary mean-field games
Authors:
Deepanshu Vasal
Abstract:
In this paper, we consider both finite and infinite horizon discounted dynamic mean-field games where there is a large population of homogeneous players sequentially making strategic decisions and each player is affected by other players through an aggregate population state. Each player has a private type that only she observes. Such games have been studied in the literature under the simplifying assumption that population state dynamics are stationary. In this paper, we consider non-stationary population state dynamics and present a novel backward recursive algorithm to compute Markov perfect equilibria (MPE) that depend on both a player's private type and the current (dynamic) population state. Using this algorithm, we study a security problem in a cyber-physical system where infected nodes impose a negative externality on the system, and each node makes a decision to get vaccinated. We numerically compute the MPE of the game.
Submitted 21 October, 2019; v1 submitted 10 May, 2019;
originally announced May 2019.
-
Incentive design for learning in user-recommendation systems with time-varying states
Authors:
Deepanshu Vasal,
Vijay Subramanian,
Achilleas Anastasopoulos
Abstract:
We consider the problem of how strategic users with asymmetric information can learn an underlying time-varying state in a user-recommendation system. Users, who observe private signals about the state, sequentially make a decision about buying a product whose value varies with time in an ergodic manner. We formulate the team problem as an instance of a decentralized stochastic control problem and characterize its optimal policies. With strategic users, we design incentives such that users reveal their true private signals, so that the gap between the strategic and team objectives is small and the overall expected incentive payments are also small.
Submitted 13 April, 2018;
originally announced April 2018.
-
Sequential decomposition of repeated games with asymmetric information and dependent states
Authors:
Deepanshu Vasal
Abstract:
We consider a finite horizon repeated game with $N$ selfish players who observe their types privately and take actions, which are publicly observed. Their actions and types jointly determine their instantaneous rewards. In each period, players jointly observe actions of each other with delay 1, and private observations of the state of the system, and get an instantaneous reward which is a function of the state and everyone's actions. The players' types are static and are potentially correlated among players.
An appropriate notion of equilibrium for such games is Perfect Bayesian Equilibrium (PBE) which consists of a strategy and a belief profile of the players which is coupled across time and as a result, the complexity of finding such equilibria grows double-exponentially in time. We present a sequential decomposition methodology to compute \emph{structured perfect Bayesian equilibria} (SPBE) of this game, introduced in~\cite{VaAn15arxiv}, where equilibrium policy of a player is a function of a common belief and a private state. This methodology computes SPBE in linear time. In general, the SPBE of the game problem exhibit \textit{signaling} behavior, i.e. players' actions reveal part of their private information that is payoff relevant to other players.
Submitted 16 May, 2019; v1 submitted 10 January, 2018;
originally announced January 2018.
-
Decentralized Bayesian learning in dynamic games: A framework for studying informational cascades
Authors:
Deepanshu Vasal,
Achilleas Anastasopoulos
Abstract:
We study the problem of Bayesian learning in a dynamical system involving strategic agents with asymmetric information. In a series of seminal papers in the literature, this problem has been investigated under a simplifying model where myopically selfish players appear sequentially and act once in the game, based on private noisy observations of the system state and public observation of past players' actions. It has been shown that there exist information cascades where users discard their private information and mimic the action of their predecessor. In this paper, we provide a framework for studying Bayesian learning dynamics in a more general setting than the one described above. In particular, our model incorporates cases where players are non-myopic and strategically participate for the whole duration of the game, and cases where an endogenous process selects which subset of players will act at each time instance. The proposed framework hinges on a sequential decomposition methodology for finding structured perfect Bayesian equilibria (PBE) of a general class of dynamic games with asymmetric information, where user-specific states evolve as conditionally independent Markov processes and users make independent noisy observations of their states. Using this methodology, we study a specific dynamic learning model where players make decisions about public investment based on their estimates of everyone's types. We characterize a set of informational cascades for this problem where learning stops for the team as a whole. We show that in such cascades, all players' estimates of other players' types freeze even though each individual player asymptotically learns its own true type.
Submitted 8 April, 2018; v1 submitted 22 July, 2016;
originally announced July 2016.
-
Signaling equilibria for dynamic LQG games with asymmetric information
Authors:
Deepanshu Vasal,
Achilleas Anastasopoulos
Abstract:
We consider a finite horizon dynamic game with two players who observe their types privately and take actions, which are publicly observed. Players' types evolve as independent, controlled linear Gaussian processes and players incur quadratic instantaneous costs. This forms a dynamic linear quadratic Gaussian (LQG) game with asymmetric information. We show that under certain conditions, players' strategies that are linear in their private types, together with Gaussian beliefs form a perfect Bayesian equilibrium (PBE) of the game. Furthermore, it is shown that this is a signaling equilibrium due to the fact that future beliefs on players' types are affected by the equilibrium strategies. We provide a backward-forward algorithm to find the PBE. Each step of the backward algorithm reduces to solving an algebraic matrix equation for every possible realization of the state estimate covariance matrix. The forward algorithm consists of Kalman filter recursions, where state estimate covariance matrices depend on equilibrium strategies.
Submitted 15 June, 2016;
originally announced June 2016.
-
A systematic process for evaluating structured perfect Bayesian equilibria in dynamic games with asymmetric information
Authors:
Deepanshu Vasal,
Abhinav Sinha,
Achilleas Anastasopoulos
Abstract:
We consider finite-horizon and infinite-horizon versions of a dynamic game with $N$ selfish players who observe their types privately and take actions that are publicly observed. Players' types evolve as conditionally independent Markov processes, conditioned on their current actions. Their actions and types jointly determine their instantaneous rewards. In dynamic games with asymmetric information, a widely used concept of equilibrium is perfect Bayesian equilibrium (PBE), which consists of a strategy and belief pair that simultaneously satisfy sequential rationality and belief consistency. In general, there does not exist a universal algorithm that decouples the interdependence of strategies and beliefs over time in calculating PBE. In this paper, for the finite-horizon game with independent types we develop a two-step backward-forward recursive algorithm that sequentially decomposes the problem (w.r.t. time) to obtain a subset of PBEs, which we refer to as structured perfect Bayesian equilibria (SPBE). In such equilibria, a player's strategy depends on its history only through a common public belief and its current private type. The backward recursive part of this algorithm defines an equilibrium generating function. Each period in the backward recursion involves solving a fixed-point equation on the space of probability simplexes for every possible belief on types. Using this function, equilibrium strategies and beliefs are generated through a forward recursion. We then extend this methodology to the infinite-horizon model, where we propose a time-invariant single-shot fixed-point equation, which in conjunction with a forward recursive step, generates the SPBE. Sufficient conditions for the existence of SPBE are provided. With our proposed method, we find equilibria that exhibit signaling behavior. This is illustrated with the help of a concrete public goods example.
Submitted 18 March, 2018; v1 submitted 25 August, 2015;
originally announced August 2015.