-
An Anytime Algorithm for Optimal Coalition Structure Generation
Authors:
Talal Rahwan,
Sarvapali Dyanand Ramchurn,
Nicholas Robert Jennings,
Andrea Giovannucci
Abstract:
Coalition formation is a fundamental type of interaction that involves the creation of coherent groupings of distinct, autonomous agents in order to efficiently achieve their individual or collective goals. Forming effective coalitions is a major research challenge in the field of multi-agent systems. Central to this endeavour is the problem of determining which of the many possible coalitions to form in order to achieve some goal. This usually requires calculating a value for every possible coalition, known as the coalition value, which indicates how beneficial that coalition would be if it were formed. Once these values are calculated, the agents usually need to find a combination of coalitions, in which every agent belongs to exactly one coalition, and by which the overall outcome of the system is maximized. However, this coalition structure generation problem is extremely challenging due to the number of possible solutions that need to be examined, which grows exponentially with the number of agents involved. To date, therefore, many algorithms have been proposed to solve this problem using different techniques ranging from dynamic programming, to integer programming, to stochastic search, all of which suffer from major limitations relating to execution time, solution quality, and memory requirements.
With this in mind, we develop an anytime algorithm to solve the coalition structure generation problem. Specifically, the algorithm uses a novel representation of the search space, which partitions the space of possible solutions into sub-spaces such that it is possible to compute upper and lower bounds on the values of the best coalition structures in them. These bounds are then used to identify the sub-spaces that have no potential of containing the optimal solution so that they can be pruned. The algorithm then searches through the remaining sub-spaces very efficiently using a branch-and-bound technique to avoid examining all the solutions within the searched sub-space(s). In this setting, we prove that our algorithm enumerates all coalition structures efficiently by avoiding redundant and invalid solutions automatically. Moreover, in order to effectively test our algorithm we develop a new type of input distribution which allows us to generate more reliable benchmarks compared to the input distributions previously used in the field. Given this new distribution, we show that for 27 agents our algorithm is able to find solutions that are optimal in 0.175% of the time required by the fastest available algorithm in the literature. The algorithm is anytime, and if interrupted before it would have normally terminated, it can still provide a solution that is guaranteed to be within a bound from the optimal one. Moreover, the guarantees we provide on the quality of the solution are significantly better than those provided by the previous state-of-the-art algorithms designed for this purpose. For example, for the worst-case distribution given 25 agents, our algorithm is able to find a 90% efficient solution in around 10% of the time it takes to find the optimal solution.
Submitted 15 January, 2014;
originally announced January 2014.
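To make the coalition structure generation problem concrete, here is a minimal sketch of the classic exhaustive dynamic-programming baseline over agent subsets. It is not the anytime algorithm described in the abstract. The names `optimal_coalition_structure` and `toy_value` are hypothetical, coalitions are encoded as bitmasks, and the toy value function (which depends only on coalition size) is an assumption made for illustration.

```python
from functools import lru_cache


def optimal_coalition_structure(n, value):
    """Return (best total value, tuple of coalition bitmasks) for n agents.

    `value` maps a non-empty coalition bitmask to its coalition value.
    Illustrative baseline only: it examines every subset of every subset.
    """
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def best(mask):
        if mask == 0:
            return 0.0, ()
        # Fix the lowest-indexed remaining agent so that each partition
        # is enumerated exactly once.
        lowest = mask & -mask
        rest = mask ^ lowest
        best_val, best_cs = float("-inf"), ()
        sub = rest
        while True:
            coalition = sub | lowest
            tail_val, tail_cs = best(mask ^ coalition)
            total = value(coalition) + tail_val
            if total > best_val:
                best_val, best_cs = total, (coalition,) + tail_cs
            if sub == 0:
                break
            sub = (sub - 1) & rest  # next smaller subset of `rest`
        return best_val, best_cs

    return best(full)


def toy_value(coalition):
    """Hypothetical value function: pairs are the most efficient grouping."""
    size = bin(coalition).count("1")
    return {1: 1.0, 2: 3.0, 3: 3.0, 4: 4.0}[size]


if __name__ == "__main__":
    print(optimal_coalition_structure(4, toy_value))
```

With four agents and this toy value function, the best structure consists of two pairs. The exponential cost of this baseline is what motivates the pruning and anytime behaviour discussed in the abstract.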
-
Optimal Strategies for Simultaneous Vickrey Auctions with Perfect Substitutes
Authors:
Enrico H. Gerding,
Rajdeep Kumar Dash,
Andrew Byde,
Nicholas Robert Jennings
Abstract:
We derive optimal strategies for a bidding agent that participates in multiple, simultaneous second-price auctions with perfect substitutes. We prove that, if everyone else bids locally in a single auction, the global bidder should always place non-zero bids in all available auctions, provided there are no budget constraints. With a budget, however, the optimal strategy is to bid locally if this budget is equal to or less than the valuation. Furthermore, for a wide range of valuation distributions, we prove that the problem of finding the optimal bids reduces to two dimensions if all auctions are identical. Finally, we address markets with both sequential and simultaneous auctions, non-identical auctions, and the allocative efficiency of the market.
Submitted 14 January, 2014;
originally announced January 2014.
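As a rough illustration of the setting, the sketch below computes the realised utility of a global bidder for one fixed outcome of several simultaneous second-price auctions. The function name `global_bidder_utility` is hypothetical, and the rule that a bid must strictly exceed the highest rival bid to win is an assumption. Because the items are perfect substitutes, the bidder gains its valuation at most once but pays the second price in every auction it wins.

```python
def global_bidder_utility(valuation, bids, highest_rival_bids):
    """Realised utility for one outcome of simultaneous second-price auctions.

    bids[i] is the global bidder's bid in auction i and
    highest_rival_bids[i] is the highest competing bid there.
    Assumption: a bid wins only if it strictly exceeds the rival bid.
    """
    # In a second-price auction the winner pays the highest rival bid.
    payments = [
        rival
        for bid, rival in zip(bids, highest_rival_bids)
        if bid > rival
    ]
    if not payments:
        return 0.0
    # Perfect substitutes: extra items add cost but no extra value.
    return valuation - sum(payments)


if __name__ == "__main__":
    print(global_bidder_utility(10, [6, 6], [4, 5]))  # wins both auctions
    print(global_bidder_utility(10, [6, 0], [4, 5]))  # wins only the first
```

This evaluates a single realisation only. The optimal strategies in the abstract concern expected utility over the distribution of rival bids, which this sketch does not model.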
-
A Hierarchical Dynamic Programming Algorithm for Optimal Coalition Structure Generation
Authors:
Meritxell Vinyals,
Thomas Voice,
Sarvapali Ramchurn,
Nicholas R. Jennings
Abstract:
We present a new Dynamic Programming (DP) formulation of the Coalition Structure Generation (CSG) problem based on imposing a hierarchical organizational structure over the agents. We show the efficiency of this formulation by deriving DyPE, a new optimal DP algorithm which significantly outperforms current DP approaches in speed and memory usage. In the classic case, in which all coalitions are feasible, DyPE has half the memory requirements of other DP approaches. On graph-restricted CSG, in which feasibility is restricted by a (synergy) graph, DyPE has either the same or lower computational complexity depending on the underlying graph structure of the problem. Our empirical evaluation shows that DyPE outperforms the state-of-the-art DP approaches by several orders of magnitude in a large range of graph structures (e.g. for certain scale-free graphs DyPE reduces the memory requirements by a factor of $10^6$ and solves problems that previously needed hours in minutes).
Submitted 24 October, 2013;
originally announced October 2013.
-
Efficient State-Space Inference of Periodic Latent Force Models
Authors:
Steven Reece,
Stephen Roberts,
Siddhartha Ghosh,
Alex Rogers,
Nicholas Jennings
Abstract:
Latent force models (LFM) are principled approaches to incorporating solutions to differential equations within non-parametric inference methods. Unfortunately, the development and application of LFMs can be inhibited by their computational cost, especially when closed-form solutions for the LFM are unavailable, as is the case in many real world problems where these latent forces exhibit periodic behaviour. Given this, we develop a new sparse representation of LFMs which considerably improves their computational efficiency, as well as broadening their applicability, in a principled way, to domains with periodic or near periodic latent forces. Our approach uses a linear basis model to approximate one generative model for each periodic force. We assume that the latent forces are generated from Gaussian process priors and develop a linear basis model which fully expresses these priors. We apply our approach to model the thermal dynamics of domestic buildings and show that it is effective at predicting day-ahead temperatures within the homes. We also apply our approach within queueing theory in which quasi-periodic arrival rates are modelled as latent forces. In both cases, we demonstrate that our approach can be implemented efficiently using state-space methods which encode the linear dynamic systems via LFMs. Further, we show that state estimates obtained using periodic latent force models can reduce the root mean squared error to 17% of that from non-periodic models and 27% of the nearest rival approach which is the resonator model.
Submitted 29 May, 2014; v1 submitted 23 October, 2013;
originally announced October 2013.
-
Learning Periodic Human Behaviour Models from Sparse Data for Crowdsourcing Aid Delivery in Developing Countries
Authors:
James McInerney,
Alex Rogers,
Nicholas R. Jennings
Abstract:
In many developing countries, half the population lives in rural locations, where access to essentials such as school materials, mosquito nets, and medical supplies is restricted. We propose an alternative method of distribution (to standard road delivery) in which the existing mobility habits of a local population are leveraged to deliver aid, which raises two technical challenges in the areas of optimisation and learning. For optimisation, a standard Markov decision process applied to this problem is intractable, so we provide an exact formulation that takes advantage of the periodicities in human location behaviour. To learn such behaviour models from sparse data (i.e., cell tower observations), we develop a Bayesian model of human mobility. Using real cell tower data of the mobility behaviour of 50,000 individuals in Ivory Coast, we find that our model outperforms the state-of-the-art approaches in mobility prediction by at least 25% (in held-out data likelihood). Furthermore, when incorporating mobility prediction with our MDP approach, we find an 81.3% reduction in total delivery time versus routine planning that minimises just the number of participants in the solution path.
Submitted 26 September, 2013;
originally announced September 2013.
-
Regret-Based Multi-Agent Coordination with Uncertain Task Rewards
Authors:
Feng Wu,
Nicholas R. Jennings
Abstract:
Many multi-agent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state, which is randomly drawn from unknown distributions. The goal of solving this problem is to find a solution for all agents that minimizes the overall worst-case loss. This is a challenging problem for centralized algorithms because the search space grows exponentially with the number of agents, and it is nontrivial for existing standard DCOP algorithms. To address this, we propose a novel decentralized algorithm that incorporates Max-Sum with iterative constraint generation to solve the problem by passing messages among agents. By so doing, our approach scales well and can solve instances of the task allocation problem with hundreds of agents and tasks.
Submitted 8 September, 2013;
originally announced September 2013.
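The objective of minimising worst-case loss can be illustrated with a small centralised minimax-regret computation. This is only a sketch of the criterion, not the decentralised Max-Sum algorithm from the abstract. It assumes a finite, explicitly listed set of reward scenarios and candidate joint solutions, and the name `minimax_regret_choice` is hypothetical.

```python
def minimax_regret_choice(rewards):
    """Pick the solution whose worst-case regret is smallest.

    rewards[s][k] is the reward of candidate solution s under scenario k.
    Assumption: both solutions and scenarios are small finite sets.
    Returns (index of chosen solution, its worst-case regret).
    """
    n_scenarios = len(rewards[0])
    # Best achievable reward in each scenario, over all solutions.
    best_per_scenario = [
        max(row[k] for row in rewards) for k in range(n_scenarios)
    ]
    # Regret of a solution in a scenario is its shortfall from that best.
    worst_regret = [
        max(best_per_scenario[k] - row[k] for k in range(n_scenarios))
        for row in rewards
    ]
    best_index = min(range(len(rewards)), key=worst_regret.__getitem__)
    return best_index, worst_regret[best_index]


if __name__ == "__main__":
    table = [[10, 0], [6, 5], [4, 6]]
    print(minimax_regret_choice(table))
```

In this toy table the middle solution is never the best in any single scenario, yet it has the smallest worst-case regret. Enumerating joint solutions like this does not scale, which is the difficulty the abstract attributes to centralised approaches.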
-
Targeted Social Mobilisation in a Global Manhunt
Authors:
Alex Rutherford,
Manuel Cebrian,
Iyad Rahwan,
Sohan Dsouza,
James McInerney,
Victor Naroditskiy,
Matteo Venanzi,
Nicholas R. Jennings,
J. R. deLara,
Eero Wahlstedt,
Steven U. Miller
Abstract:
Social mobilization, the ability to mobilize large numbers of people via social networks to achieve highly distributed tasks, has received significant attention in recent times. This growing capability, facilitated by modern communication technology, is highly relevant to endeavors which require the search for individuals that possess rare information or skill, such as finding medical doctors during disasters, or searching for missing people. An open question remains as to whether, in time-critical situations, people are able to recruit in a targeted manner, or whether they resort to so-called blind search, recruiting as many acquaintances as possible via broadcast communication. To explore this question, we examine data from our recent success in the U.S. State Department's Tag Challenge, which required locating and photographing 5 target persons in 5 different cities in the United States and Europe in less than 12 hours, based only on a single mug-shot. We find that people are able to consistently route information in a targeted fashion even under increasing time pressure. We derive an analytical model for global mobilization and use it to quantify the extent to which people were targeting others during recruitment. Our model estimates that approximately 1 in 3 messages were targeted during the most time-sensitive period of the challenge. This is a novel observation at such short temporal scales, and calls for devising viral incentive schemes that provide distance- or time-sensitive rewards to approach the target geography more rapidly, with applications in multiple areas from emergency preparedness to political mobilization.
Submitted 6 April, 2014; v1 submitted 18 April, 2013;
originally announced April 2013.
-
Crowdsourcing Dilemma
Authors:
Victor Naroditskiy,
Nicholas R. Jennings,
Pascal Van Hentenryck,
Manuel Cebrian
Abstract:
Crowdsourcing offers unprecedented potential for solving tasks efficiently by tapping into the skills of large groups of people. A salient feature of crowdsourcing---its openness of entry---makes it vulnerable to malicious behavior. Such behavior took place in a number of recent popular crowdsourcing competitions. We provide game-theoretic analysis of a fundamental tradeoff between the potential for increased productivity and the possibility of being set back by malicious behavior. Our results show that in crowdsourcing competitions malicious behavior is the norm, not the anomaly---a result contrary to the conventional wisdom in the area. Counterintuitively, making the attacks more costly does not deter them but leads to a less desirable outcome. These findings have cautionary implications for the design of crowdsourcing competitions.
Submitted 22 February, 2014; v1 submitted 12 April, 2013;
originally announced April 2013.
-
Matching Games with Additive Externalities
Authors:
Simina Brânzei,
Tomasz P. Michalak,
Talal Rahwan,
Kate Larson,
Nicholas R. Jennings
Abstract:
Two-sided matchings are an important theoretical tool used to model markets and social interactions. In many real life problems the utility of an agent is influenced not only by their own choices, but also by the choices that other agents make. Such an influence is called an externality. Whereas fully expressive representations of externalities in matchings require exponential space, in this paper we propose a compact model of externalities, in which the influence of a match on each agent is computed additively. In this framework, we analyze many-to-many and one-to-one matchings under neutral, optimistic, and pessimistic behaviour, and provide both computational hardness results and polynomial-time algorithms for computing stable outcomes.
Submitted 16 July, 2012;
originally announced July 2012.
-
Knapsack based Optimal Policies for Budget-Limited Multi-Armed Bandits
Authors:
Long Tran-Thanh,
Archie Chapman,
Alex Rogers,
Nicholas R. Jennings
Abstract:
In budget-limited multi-armed bandit (MAB) problems, the learner's actions are costly and constrained by a fixed budget. Consequently, an optimal exploitation policy may not be to pull the optimal arm repeatedly, as is the case in other variants of MAB, but rather to pull the sequence of different arms that maximises the agent's total reward within the budget. This difference from existing MABs means that new approaches to maximising the total reward are required. Given this, we develop two pulling policies, namely: (i) KUBE; and (ii) fractional KUBE. Whereas the former provides up to 40% better performance in our experimental settings, the latter is computationally less expensive. We also prove logarithmic upper bounds for the regret of both policies, and show that these bounds are asymptotically optimal (i.e. they only differ from the best possible regret by a constant factor).
Submitted 9 April, 2012;
originally announced April 2012.
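The knapsack view of budget-limited exploitation can be sketched as follows, under the simplifying assumption that the mean reward of each arm is already known. This is a density-ordered greedy heuristic for the unbounded knapsack problem. It illustrates why the best plan may mix arms, and it is not a reproduction of KUBE or fractional KUBE. The name `density_greedy_pulls` is hypothetical.

```python
def density_greedy_pulls(mean_rewards, costs, budget):
    """Greedy pull plan ordered by reward-per-cost density.

    Assumptions: mean rewards are known, costs are positive, and each arm
    may be pulled any number of times. Returns (pulls per arm, total
    expected reward). The heuristic is not guaranteed to be optimal.
    """
    order = sorted(
        range(len(costs)),
        key=lambda i: mean_rewards[i] / costs[i],
        reverse=True,
    )
    pulls = [0] * len(costs)
    remaining = budget
    for i in order:
        # Pull the densest affordable arm as often as the budget allows.
        count = int(remaining // costs[i])
        pulls[i] = count
        remaining -= count * costs[i]
    total = sum(p * r for p, r in zip(pulls, mean_rewards))
    return pulls, total


if __name__ == "__main__":
    # The high-reward arm costs 4; the leftover budget goes to the cheap arm.
    print(density_greedy_pulls([10, 1], [4, 1], 5))
```

In the usage example, repeatedly pulling only the arm with the highest density would leave budget unspent, so the plan combines both arms.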
-
Automated Planning in Repeated Adversarial Games
Authors:
Enrique Munoz de Cote,
Archie C. Chapman,
Adam M. Sykulski,
Nicholas R. Jennings
Abstract:
Game theory's prescriptive power typically relies on full rationality and/or self-play interactions. In contrast, this work sets aside these fundamental premises and focuses instead on heterogeneous autonomous interactions between two or more agents. Specifically, we introduce a new and concise representation for repeated adversarial (constant-sum) games that highlights the necessary features that enable an automated planning agent to reason about how to score above the game's Nash equilibrium, when facing heterogeneous adversaries. To this end, we present TeamUP, a model-based RL algorithm designed for learning and planning such an abstraction. In essence, it is somewhat similar to R-max with a cleverly engineered reward shaping that treats exploration as an adversarial optimization problem. In practice, it attempts to find an ally with which to tacitly collude (in more than two-player games) and then collaborates on a joint plan of actions that can consistently score a high utility in adversarial repeated games. We use the inaugural Lemonade Stand Game Tournament to demonstrate the effectiveness of our approach, and find that TeamUP is the best performing agent, demoting the Tournament's actual winning strategy into second place. In our experimental analysis, we show that our strategy successfully and consistently builds collaborations with many different heterogeneous (and sometimes very sophisticated) adversaries.
Submitted 15 March, 2012;
originally announced March 2012.
-
Filtered Fictitious Play for Perturbed Observation Potential Games and Decentralised POMDPs
Authors:
Archie C. Chapman,
Simon A. Williamson,
Nicholas R. Jennings
Abstract:
Potential games and decentralised partially observable MDPs (Dec-POMDPs) are two commonly used models of multi-agent interaction, for static optimisation and sequential decision-making settings, respectively. In this paper we introduce filtered fictitious play for solving repeated potential games in which each player's observations of others' actions are perturbed by random noise, and use this algorithm to construct an online learning method for solving Dec-POMDPs. Specifically, we prove that noise in observations prevents standard fictitious play from converging to Nash equilibrium in potential games, which also makes fictitious play impractical for solving Dec-POMDPs. To combat this, we derive filtered fictitious play, and provide conditions under which it converges to a Nash equilibrium in potential games with noisy observations. We then use filtered fictitious play to construct a solver for Dec-POMDPs, and demonstrate our new algorithm's performance in a box pushing problem. Our results show that we consistently outperform the state-of-the-art Dec-POMDP solver by an average of 100% across the range of noise in the observation function.
Submitted 14 February, 2012;
originally announced February 2012.
-
Multi-Issue Negotiation with Deadlines
Authors:
S. S. Fatima,
N. R. Jennings,
M. J. Wooldridge
Abstract:
This paper studies bilateral multi-issue negotiation between self-interested autonomous agents. There are a number of different procedures that can be used for this process, the three main ones being the package deal procedure in which all the issues are bundled and discussed together, the simultaneous procedure in which the issues are discussed simultaneously but independently of each other, and the sequential procedure in which the issues are discussed one after another. Since each of them yields a different outcome, a key problem is to decide which one to use in which circumstances. Specifically, we consider this question for a model in which the agents have time constraints (in the form of both deadlines and discount factors) and information uncertainty (in that the agents do not know the opponent's utility function). For this model, we consider both independent and interdependent issues and determine equilibria for each case for each procedure. In so doing, we show that the package deal is in fact the optimal procedure for each party. We then go on to show that, although the package deal may be computationally more complex than the other two procedures, it generates Pareto optimal outcomes (unlike the other two), it has similar earliest and latest possible times of agreement to the simultaneous procedure (which is better than the sequential procedure), and that it (like the other two procedures) generates a unique outcome only under certain conditions (which we define).
Submitted 12 October, 2011;
originally announced October 2011.
-
Cooperative Information Sharing to Improve Distributed Learning in Multi-Agent Systems
Authors:
P. S. Dutta,
N. R. Jennings,
L. Moreau
Abstract:
Effective coordination of agents' actions in partially-observable domains is a major challenge of multi-agent systems research. To address this, many researchers have developed techniques that allow the agents to make decisions based on estimates of the states and actions of other agents that are typically learnt using some form of machine learning algorithm. Nevertheless, many of these approaches fail to provide an actual means by which the necessary information is made available so that the estimates can be learnt. To this end, we argue that cooperative communication of state information between agents is one such mechanism. However, in a dynamically changing environment, the accuracy and timeliness of this communicated information determine the fidelity of the learned estimates and the usefulness of the actions taken based on these. Given this, we propose a novel information-sharing protocol, post-task-completion sharing, for the distribution of state information. We then show, through a formal analysis, the improvement in the quality of estimates produced using our strategy over the widely used protocol of sharing information between nearest neighbours. Moreover, communication heuristics designed around our information-sharing principle are subjected to empirical evaluation along with other benchmark strategies (including Littman's Q-routing and Stone's TPOT-RL) in a simulated call-routing application. These studies, conducted across a range of environmental settings, show that, compared to the different benchmarks used, our strategy generates an improvement of up to 60% in the call connection rate; of more than 1000% in the ability to connect long-distance calls; and incurs as low as 0.25 of the message overhead.
Submitted 26 September, 2011;
originally announced September 2011.
-
Graph Coalition Structure Generation
Authors:
Thomas D. Voice,
Maria Polukarov,
Nicholas R. Jennings
Abstract:
We give the first analysis of the computational complexity of coalition structure generation over graphs. Given an undirected graph $G=(N,E)$ and a valuation function $v:2^N\rightarrow\mathbb{R}$ over the subsets of nodes, the problem is to find a partition of $N$ into connected subsets that maximises the sum of the components' values. This problem is generally NP-complete; in particular, it is hard for a defined class of valuation functions which are independent of disconnected members, that is, two nodes have no effect on each other's marginal contribution to their vertex separator. Nonetheless, for all such functions we provide bounds on the complexity of coalition structure generation over general and minor-free graphs. Our proof is constructive and yields algorithms for solving corresponding instances of the problem. Furthermore, we derive polynomial time bounds for acyclic, $K_{2,3}$ and $K_4$ minor-free graphs. However, as we show, the problem remains NP-complete for planar graphs, and hence, for any $K_k$ minor-free graphs where $k\geq 5$. Moreover, our hardness result holds for a particular subclass of valuation functions, termed edge sum, where the value of each subset of nodes is simply determined by the sum of given weights of the edges in the induced subgraph.
Submitted 8 February, 2011;
originally announced February 2011.
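The edge sum valuation and the connectivity requirement described in the abstract can be sketched in a few lines. The helper names `edge_sum_value` and `is_connected`, and the representation of the weighted graph as a dictionary keyed by node pairs, are assumptions made for illustration.

```python
def edge_sum_value(subset, edge_weights):
    """Sum the weights of edges whose endpoints both lie in `subset`.

    edge_weights maps an undirected edge (u, v) to its weight.
    """
    return sum(
        weight
        for (u, v), weight in edge_weights.items()
        if u in subset and v in subset
    )


def is_connected(subset, edge_weights):
    """Check that `subset` induces a connected subgraph (depth-first search)."""
    subset = set(subset)
    if not subset:
        return False
    start = next(iter(subset))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for u, v in edge_weights:
            if u == node:
                other = v
            elif v == node:
                other = u
            else:
                continue
            if other in subset and other not in seen:
                seen.add(other)
                stack.append(other)
    return seen == subset


if __name__ == "__main__":
    # Hypothetical path graph 0 - 1 - 2 - 3 with mixed-sign weights.
    weights = {(0, 1): 2.0, (1, 2): -1.0, (2, 3): 4.0}
    print(edge_sum_value({0, 1, 2}, weights), is_connected({0, 2}, weights))
```

A feasible coalition structure in this setting is a partition of the nodes in which every part passes `is_connected`, and its value is the sum of `edge_sum_value` over the parts.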