Showing 1–8 of 8 results for author: Mguni, D H

  1. arXiv:2312.11063  [pdf, ps, other]

    cs.GT cs.AI cs.DS cs.LG econ.TH

    A survey on algorithms for Nash equilibria in finite normal-form games

    Authors: Hanyu Li, Wenhan Huang, Zhijian Duan, David Henry Mguni, Kun Shao, Jun Wang, Xiaotie Deng

    Abstract: Nash equilibrium is one of the most influential solution concepts in game theory. With the development of computer science and artificial intelligence, there is an increasing demand on Nash equilibrium computation, especially for Internet economics and multi-agent learning. This paper reviews various algorithms computing the Nash equilibrium and its approximation solutions in finite normal-form ga…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: The published version is in Computer Science Review

  2. arXiv:2310.18127  [pdf, other]

    cs.LG cs.AI cs.CL

    Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

    Authors: Xue Yan, Yan Song, Xinyu Cui, Filippos Christianos, Haifeng Zhang, David Henry Mguni, Jun Wang

    Abstract: Large language models (LLMs) demonstrate their promise in tackling complicated practical challenges by combining action-based policies with chain of thought (CoT) reasoning. Having high-quality prompts on hand, however, is vital to the framework's effectiveness. Currently, these prompts are handcrafted utilising extensive human labor, resulting in CoT policies that frequently fail to generalise. H…

    Submitted 28 February, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  3. arXiv:2205.15434  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems

    Authors: Oliver Slumbers, David Henry Mguni, Stephen Marcus McAleer, Stefano B. Blumberg, Jun Wang, Yaodong Yang

    Abstract: In order for agents in multi-agent systems (MAS) to be safe, they need to take into account the risks posed by the actions of other agents. However, the dominant paradigm in game theory (GT) assumes that agents are not affected by risk from other agents and only strive to maximise their expected utility. For example, in hybrid human-AI driving systems, it is necessary to limit large deviations in…

    Submitted 2 March, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  4. arXiv:2112.02618  [pdf, other]

    cs.MA

    LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning

    Authors: David Henry Mguni, Taher Jafferjee, Jianhong Wang, Oliver Slumbers, Nicolas Perez-Nieves, Feifei Tong, Li Yang, Jiangcheng Zhu, Yaodong Yang, Jun Wang

    Abstract: Efficient exploration is important for reinforcement learners to achieve high rewards. In multi-agent systems, coordinated exploration and behaviour is critical for agents to jointly achieve optimal outcomes. In this paper, we introduce a new general framework for improving coordination and performance of multi-agent reinforcement learners (MARL). Our framework, named Learnable Intrinsic-Reward Ge…

    Submitted 16 March, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: text overlap with arXiv:2103.09159

  5. arXiv:2110.03604  [pdf, ps, other]

    cs.LG cs.AI cs.GT cs.MA

    Online Markov Decision Processes with Non-oblivious Strategic Adversary

    Authors: Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang

    Abstract: We study a novel setting in Online Markov Decision Processes (OMDPs) where the loss function is chosen by a non-oblivious strategic adversary who follows a no-external regret algorithm. In this setting, we first demonstrate that MDP-Expert, an existing algorithm that works well with oblivious adversaries can still apply and achieve a policy regret bound of…

    Submitted 27 January, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Accepted at Autonomous Agents and Multi-Agent Systems (2023)

    Report number: 15

  6. arXiv:2108.08612  [pdf, other]

    cs.LG cs.AI cs.MA

    Settling the Variance of Multi-Agent Policy Gradients

    Authors: Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

    Abstract: Policy gradient (PG) methods are popular reinforcement learning (RL) methods where a baseline is often applied to reduce the variance of gradient estimates. In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents. In this paper, we offer…

    Submitted 4 April, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

  7. arXiv:2103.07927  [pdf, other]

    cs.AI cs.GT cs.MA

    Modelling Behavioural Diversity for Learning in Open-Ended Games

    Authors: Nicolas Perez Nieves, Yaodong Yang, Oliver Slumbers, David Henry Mguni, Ying Wen, Jun Wang

    Abstract: Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet, there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce…

    Submitted 10 June, 2021; v1 submitted 14 March, 2021; originally announced March 2021.

    Comments: corresponds to <[email protected]>

  8. arXiv:2103.07780  [pdf, other]

    cs.AI cs.GT

    Online Double Oracle

    Authors: Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang

    Abstract: Solving strategic games with huge action space is a critical yet under-explored topic in economics, operations research and artificial intelligence. This paper proposes new learning algorithms for solving two-player zero-sum normal-form games where the number of pure strategies is prohibitively large. Specifically, we combine no-regret analysis from online learning with Double Oracle (DO) methods…

    Submitted 15 February, 2023; v1 submitted 13 March, 2021; originally announced March 2021.

    Comments: Accepted at Transactions on Machine Learning Research (TMLR)

    Journal ref: Transactions on Machine Learning Research 2022