-
Generalizing Better Response Paths and Weakly Acyclic Games
Authors:
Bora Yongacoglu,
Gürdal Arslan,
Lacra Pavel,
Serdar Yüksel
Abstract:
Weakly acyclic games generalize potential games and are fundamental to the study of game theoretic control. In this paper, we present a generalization of weakly acyclic games, and we observe its importance in multi-agent learning when agents employ experimental strategy updates in periods where they fail to best respond. While weak acyclicity is defined in terms of path connectivity properties of a game's better response graph, our generalization is defined using a generalized better response graph. We provide sufficient conditions for this notion of generalized weak acyclicity in both two-player games and $n$-player games. To demonstrate that our generalization is not trivial, we provide examples of games admitting a pure Nash equilibrium that are not generalized weakly acyclic. The generalization presented in this work is closely related to the recent theory of satisficing paths, and the counterexamples presented here constitute the first negative results in that theory.
Submitted 26 March, 2024;
originally announced March 2024.
-
Paths to Equilibrium in Games
Authors:
Bora Yongacoglu,
Gürdal Arslan,
Lacra Pavel,
Serdar Yüksel
Abstract:
In multi-agent reinforcement learning (MARL) and game theory, agents repeatedly interact and revise their strategies as new data arrives, producing a sequence of strategy profiles. This paper studies sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning, where an agent who is best responding in one period does not switch its strategy in the next period. This constraint merely requires that optimizing agents do not switch strategies, but does not constrain the non-optimizing agents in any way, and thus allows for exploration. Sequences with this property are called satisficing paths, and arise naturally in many MARL algorithms. A fundamental question about strategic dynamics is the following: for a given game and initial strategy profile, is it always possible to construct a satisficing path that terminates at an equilibrium? The resolution of this question has implications for the capabilities or limitations of a class of MARL algorithms. We answer this question in the affirmative for normal-form games. Our analysis reveals the counterintuitive insight that reward-deteriorating strategic updates are key to driving play to equilibrium along a satisficing path.
Submitted 1 October, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Recursive Reasoning in Minimax Games: A Level $k$ Gradient Play Method
Authors:
Zichu Liu,
Lacra Pavel
Abstract:
Despite the success of generative adversarial networks (GANs) in generating visually appealing images, they are notoriously challenging to train. In order to stabilize the learning dynamics in minimax games, we propose a novel recursive reasoning algorithm, Level $k$ Gradient Play (Lv.$k$ GP). In contrast to many existing algorithms, our algorithm does not require sophisticated heuristics or curvature information. We show that as $k$ increases, Lv.$k$ GP converges asymptotically towards an accurate estimation of players' future strategy. Moreover, we justify that Lv.$\infty$ GP naturally generalizes a line of provably convergent game dynamics which rely on predictive updates. Furthermore, we establish its local convergence in nonconvex-nonconcave zero-sum games and its global convergence in bilinear and quadratic games. By combining Lv.$k$ GP with the Adam optimizer, our algorithm shows a clear advantage in terms of performance and computational overhead compared to other methods. Using a single Nvidia RTX3090 GPU and 30 times fewer parameters than BigGAN on CIFAR-10, we achieve an FID of 10.17 for unconditional image generation within 30 hours, allowing GAN training on common computational resources to reach state-of-the-art performance.
Submitted 28 October, 2022;
originally announced October 2022.
-
Second-Order Mirror Descent: Convergence in Games Beyond Averaging and Discounting
Authors:
Bolin Gao,
Lacra Pavel
Abstract:
In this paper, we propose a second-order extension of the continuous-time game-theoretic mirror descent (MD) dynamics, referred to as MD2, which provably converges to mere (but not necessarily strict) variationally stable states (VSS) without using common auxiliary techniques such as time-averaging or discounting. We show that MD2 enjoys no-regret as well as an exponential rate of convergence towards strong VSS upon a slight modification. MD2 can also be used to derive many novel continuous-time primal-space dynamics. We then use stochastic approximation techniques to provide a convergence guarantee of discrete-time MD2 with noisy observations towards interior mere VSS. Selected simulations are provided to illustrate our results.
Submitted 30 June, 2023; v1 submitted 18 November, 2021;
originally announced November 2021.
-
Continuous-Time Convergence Rates in Potential and Monotone Games
Authors:
Bolin Gao,
Lacra Pavel
Abstract:
In this paper, we provide exponential rates of convergence to the interior Nash equilibrium for continuous-time dual-space game dynamics such as mirror descent (MD) and actor-critic (AC). We perform our analysis in $N$-player continuous concave games that satisfy certain monotonicity assumptions while possibly also admitting potential functions. In the first part of this paper, we provide a novel relative characterization of monotone games and show that MD and its discounted version converge with $\mathcal{O}(e^{-\beta t})$ in relatively strongly and relatively hypo-monotone games, respectively. In the second part of this paper, we specialize our results to games that admit a relatively strongly concave potential and show AC converges with $\mathcal{O}(e^{-\beta t})$. These rates extend their known convergence conditions. Simulations are performed which empirically back up our results.
Submitted 2 February, 2022; v1 submitted 20 November, 2020;
originally announced November 2020.
-
Continuous-time Discounted Mirror-Descent Dynamics in Monotone Concave Games
Authors:
Bolin Gao,
Lacra Pavel
Abstract:
In this paper, we consider concave continuous-kernel games characterized by monotonicity properties and propose discounted mirror descent-type dynamics. We introduce two classes of dynamics whereby the associated mirror map is constructed based on a strongly convex or a Legendre regularizer. Depending on the properties of the regularizer, we show that these new dynamics can converge asymptotically in concave games with monotone (negative) pseudo-gradient. Furthermore, we show that when the regularizer enjoys strong convexity, the resulting dynamics can converge even in games with hypo-monotone (negative) pseudo-gradient, which corresponds to a shortage of monotonicity.
Submitted 7 December, 2019;
originally announced December 2019.
-
Distributed GNE seeking under partial-decision information over networks via a doubly-augmented operator splitting approach
Authors:
Lacra Pavel
Abstract:
We consider distributed computation of generalized Nash equilibrium (GNE) over networks, in games with shared coupling constraints. Existing methods require that each player has full access to opponents' decisions. In this paper, we assume that players have only partial-decision information, and can communicate with their neighbours over an arbitrary undirected graph. We recast the problem as that of finding a zero of a sum of monotone operators through primal-dual analysis. To distribute the problem, we doubly augment variables, so that each player has local decision estimates and local copies of Lagrangian multipliers. We introduce a single-layer algorithm, fully distributed with respect to both primal and dual variables. We show its convergence to a variational GNE with fixed step-sizes, by reformulating it as a forward-backward iteration for a pair of doubly-augmented monotone operators.
Submitted 13 August, 2018;
originally announced August 2018.
-
On Passivity, Reinforcement Learning and Higher-Order Learning in Multi-Agent Finite Games
Authors:
Bolin Gao,
Lacra Pavel
Abstract:
In this paper, we propose a passivity-based methodology for analysis and design of reinforcement learning in multi-agent finite games. Starting from a known exponentially-discounted reinforcement learning scheme, we show that convergence to a Nash distribution holds in the class of games characterized by the monotonicity property of their (negative) payoff. We further exploit passivity to propose a class of higher-order schemes that preserve convergence properties, can improve the speed of convergence and can even converge in cases where their first-order counterparts fail to converge. We demonstrate these properties through numerical simulations for several representative games.
Submitted 13 August, 2018;
originally announced August 2018.
-
From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning
Authors:
Mohammadhosein Hasanbeig,
Lacra Pavel
Abstract:
The main focus of this paper is on enhancement of two types of game-theoretic learning algorithms: log-linear learning and reinforcement learning. The standard analysis of log-linear learning needs a highly structured environment, i.e., strong assumptions about the game from an implementation perspective. In this paper, we introduce a variant of log-linear learning that provides asymptotic guarantees while relaxing the structural assumptions to include synchronous updates and limitations in information available to the players. On the other hand, model-free reinforcement learning is able to perform even under weaker assumptions on players' knowledge about the environment and other players' strategies. We propose a reinforcement algorithm that uses a double-aggregation scheme in order to deepen players' insight about the environment and a constant learning step-size, which achieves a higher convergence rate. Numerical experiments are conducted to verify each algorithm's robustness and performance.
Submitted 18 September, 2018; v1 submitted 6 February, 2018;
originally announced February 2018.
-
On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning
Authors:
Bolin Gao,
Lacra Pavel
Abstract:
In this paper, we utilize results from convex analysis and monotone operator theory to derive additional properties of the softmax function that have not yet been covered in the existing literature. In particular, we show that the softmax function is the monotone gradient map of the log-sum-exp function. By exploiting this connection, we show that the inverse temperature parameter determines the Lipschitz and co-coercivity properties of the softmax function. We then demonstrate the usefulness of these properties through an application in game-theoretic reinforcement learning.
Submitted 20 August, 2018; v1 submitted 3 April, 2017;
originally announced April 2017.
-
A Distributed Nash Equilibrium Seeking in Networked Graphical Games
Authors:
Farzad Salehisadaghiani,
Lacra Pavel
Abstract:
This paper considers a distributed gossip approach for finding a Nash equilibrium in networked games on graphs. In such games a player's cost function may be affected by the actions of any subset of players. An interference graph is employed to illustrate the partially-coupled cost functions and the asymmetric information requirements. For a given interference graph, network communication between players is considered to be limited. A generalized communication graph is designed so that players exchange only their required information. An algorithm is designed whereby players, with possibly partially-coupled cost functions, make decisions based on the estimates of other players' actions obtained from local neighbors. It is shown that this choice of communication graph guarantees that all players' information is exchanged after sufficiently many iterations. Using a set of standard assumptions on the cost functions, the interference and the communication graphs, almost sure convergence to a Nash equilibrium is proved for diminishing step sizes. Moreover, the case when the cost functions are not known by the players is investigated and a convergence proof is presented for diminishing step sizes. The effect of the second largest eigenvalue of the expected communication matrix on the convergence rate is quantified. The trade-off between parameters associated with the communication graph and the ones associated with the interference graph is illustrated. Numerical results are presented for a large-scale networked game.
Submitted 28 March, 2017;
originally announced March 2017.
-
Generalized Nash Equilibrium Problem by the Alternating Direction Method of Multipliers
Authors:
Farzad Salehisadaghiani,
Lacra Pavel
Abstract:
In this paper, the problem of finding a generalized Nash equilibrium (GNE) of a networked game is studied. Players are only able to choose their decisions from a feasible action set. The feasible set is considered to be a private linear equality constraint that is coupled through decisions of the other players. We consider that each player has his own private constraint, which need not be shared with the other players. This general case also embodies the one with shared constraints between players, and it can also be extended simply to the case with inequality constraints. Since the players do not have access to other players' actions, they need to exchange estimates of others' actions and a local copy of the Lagrangian multiplier with their neighbors over a connected communication graph. We develop a relatively fast algorithm by reformulating the conservative GNE problem within the framework of inexact-ADMM. The convergence of the algorithm is guaranteed under a few mild assumptions on cost functions. Finally, the algorithm is simulated for a wireless ad-hoc network.
Submitted 24 March, 2017;
originally announced March 2017.
-
A distributed primal-dual algorithm for computation of generalized Nash equilibria with shared affine coupling constraints via operator splitting methods
Authors:
Peng Yi,
Lacra Pavel
Abstract:
In this paper, we propose a distributed primal-dual algorithm for computation of a generalized Nash equilibrium (GNE) in noncooperative games over network systems. In the considered game, not only does each player's local objective function depend on other players' decisions, but the feasible decision sets of all the players are also coupled together with a globally shared affine inequality constraint. Adopting the variational GNE, that is, the solution of a variational inequality, as a refinement of GNE, we introduce a primal-dual algorithm that players can use to seek it in a distributed manner. Each player only needs to know its local objective function, local feasible set, and a local block of the affine constraint. Meanwhile, each player only needs to observe the decisions on which its local objective function explicitly depends through the interference graph and share information related to multipliers with its neighbors through a multiplier graph. Through a primal-dual analysis and an augmentation of variables, we reformulate the problem as finding the zeros of a sum of monotone operators. Our distributed primal-dual algorithm is based on forward-backward operator splitting methods. We prove its convergence to the variational GNE for fixed step-sizes under some mild assumptions. Then a distributed algorithm with inertia is also introduced and analyzed for variational GNE seeking. Finally, numerical simulations for network Cournot competition are given to illustrate the algorithm's efficiency and performance.
Submitted 15 March, 2017;
originally announced March 2017.
-
Nash Equilibrium Seeking with Non-doubly Stochastic Communication Weight Matrix
Authors:
Farzad Salehisadaghiani,
Lacra Pavel
Abstract:
A distributed Nash equilibrium seeking algorithm is presented for networked games. We assume that each player has incomplete information about the other players' actions. The players communicate over a strongly connected digraph to send/receive the estimates of the other players' actions to/from the other local players according to a gossip communication protocol. Due to asymmetric information exchange between the players, a non-doubly (row) stochastic weight matrix is defined. We show that, due to the non-doubly stochastic property, the total average of all players' estimates is not preserved for the next iteration, which results in having no exact convergence. We present an almost sure convergence proof of the algorithm to a Nash equilibrium of the game. Then, we extend the algorithm to graphical games in which all players' cost functions are only dependent on the local neighboring players over an interference digraph. We formulate an assumption on the communication digraph such that the players are able to update all the estimates of the players who interfere with their cost functions. It is shown that the communication digraph needs to be a superset of a transitive reduction of the interference digraph. Finally, we verify the efficacy of the algorithm via a simulation on a social media behavioral case.
Submitted 30 March, 2017; v1 submitted 21 December, 2016;
originally announced December 2016.
-
Distributed Nash Equilibrium Seeking via the Alternating Direction Method of Multipliers
Authors:
Farzad Salehisadaghiani,
Lacra Pavel
Abstract:
In this paper, the problem of finding a Nash equilibrium of a multi-player game is considered. The players are only aware of their own cost functions as well as the action space of all players. We develop a relatively fast algorithm within the framework of inexact-ADMM. It requires a communication graph for the information exchange between the players as well as a few mild assumptions on cost functions. The convergence proof of the algorithm to a Nash equilibrium of the game is then provided. Moreover, the convergence rate is investigated via simulations.
Submitted 1 December, 2016;
originally announced December 2016.
-
Distributed Nash Equilibrium Seeking By Gossip in Games on Graphs
Authors:
Farzad Salehisadaghiani,
Lacra Pavel
Abstract:
We consider a gossip approach for finding a Nash equilibrium in a distributed multi-player network game. We extend previous results on Nash equilibrium seeking to the case when the players' cost functions may be affected by the actions of any subset of players. An interference graph is employed to illustrate the partially-coupled cost functions and the asymmetric information requirements. For a given interference graph, we design a generalized communication graph so that players with possibly partially-coupled cost functions exchange only their required information and make decisions based on them. Using a set of standard assumptions on the cost functions, interference and communication graphs, we prove almost sure convergence to a Nash equilibrium for diminishing step sizes. We then quantify the effect of the second largest eigenvalue of the expected communication matrix on the convergence rate, and illustrate the trade-off between the parameters associated with the communication and the interference graphs. Finally, the efficacy of the proposed algorithm on a large-scale networked game is demonstrated via simulation.
Submitted 6 October, 2016;
originally announced October 2016.
-
Enabling Differentiated Services Using Generalized Power Control Model in Optical Networks
Authors:
Quanyan Zhu,
Lacra Pavel
Abstract:
This paper considers a generalized framework to study OSNR optimization-based end-to-end link level power control problems in optical networks. We combine favorable features of the game-theoretical approach and the central cost approach to allow different service groups within the network. We develop solution concepts for both cases of empty and nonempty feasible sets. In addition, we derive and prove the convergence of a distributed iterative algorithm for different classes of users. In the end, we use numerical examples to illustrate the novel framework.
Submitted 12 March, 2011;
originally announced March 2011.
-
An Optimization and Control Theoretic Approach to Noncooperative Game Design
Authors:
Tansu Alpcan,
Lacra Pavel,
Nem Stefanovic
Abstract:
This paper investigates design of noncooperative games from an optimization and control theoretic perspective. Pricing mechanisms are used as a design tool to ensure that the Nash equilibrium of a fairly general class of noncooperative games satisfies certain global objectives such as welfare maximization or achieving a certain level of quality-of-service (QoS). The class of games considered provides a theoretical basis for decentralized resource allocation and control problems including network congestion control, wireless uplink power control, and optical power control. The game design problem is analyzed under different knowledge assumptions (full versus limited information) and design objectives (QoS versus utility maximization) for separable and non-separable utility functions. The "price of anarchy" is shown not to be an inherent feature of full-information games that incorporate pricing mechanisms. Moreover, a simple linear pricing is shown to be sufficient for design of Nash equilibrium according to a chosen global objective for a fairly general class of games. Stability properties of the game and pricing dynamics are studied under the assumption of time-scale separation and in two separate time-scales. Thus, sufficient conditions are derived, which allow the designer to place the Nash equilibrium solution or to guide the system trajectory to a desired region or point. The obtained results are illustrated with a number of examples.
Submitted 1 July, 2010;
originally announced July 2010.