Showing 1–16 of 16 results for author: Tumer, K

  1. arXiv:2412.20361  [pdf, other]

    cs.MA cs.AI cs.RO

    Safe Multiagent Coordination via Entropic Exploration

    Authors: Ayhan Alp Aydeniz, Enrico Marchesini, Robert Loftin, Christopher Amato, Kagan Tumer

    Abstract: Many real-world multiagent learning problems involve safety concerns. In these setups, typical safe reinforcement learning algorithms constrain agents' behavior, limiting exploration -- a crucial component for discovering effective cooperative multiagent behaviors. Moreover, the multiagent literature typically models individual constraints for each agent and has yet to investigate the benefits of…

    Submitted 29 December, 2024; originally announced December 2024.

    Comments: 10 pages, 6 figures

  2. arXiv:1906.07315  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination

    Authors: Shauharda Khadka, Somdeb Majumdar, Santiago Miret, Stephen McAleer, Kagan Tumer

    Abstract: Many cooperative multiagent reinforcement learning environments provide agents with a sparse team-based reward, as well as a dense agent-specific reward that incentivizes learning basic skills. Training policies solely on the team-based reward is often difficult due to its sparsity. Furthermore, relying solely on the agent-specific reward is sub-optimal because it usually does not capture the team…

    Submitted 11 June, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119, 2020

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119, 2020

  3. arXiv:1905.00976  [pdf, other]

    cs.LG cs.AI stat.ML

    Collaborative Evolutionary Reinforcement Learning

    Authors: Shauharda Khadka, Somdeb Majumdar, Tarek Nassar, Zach Dwiel, Evren Tumer, Santiago Miret, Yinyin Liu, Kagan Tumer

    Abstract: Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle with achieving effective exploration and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches use a noisy version of their operating policy to explore - thereby limiting the range of exploration. In this pap…

    Submitted 6 May, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

    Comments: Added link to public Github repo. Minor editorial changes. Order of authors modified to reflect ICML submission

    Journal ref: Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019

  4. arXiv:1805.07917  [pdf, other]

    cs.LG cs.NE stat.ML

    Evolution-Guided Policy Gradient in Reinforcement Learning

    Authors: Shauharda Khadka, Kagan Tumer

    Abstract: Deep Reinforcement Learning (DRL) algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically suffer from three core difficulties: temporal credit assignment with sparse rewards, lack of effective exploration, and brittle convergence properties that are extremely sensitive to hyperparameters. Collectively, these challenges severely limit the…

    Submitted 27 October, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada

  5. Collective Intelligence, Data Routing and Braess' Paradox

    Authors: K. Tumer, D. H. Wolpert

    Abstract: We consider the problem of designing the utility functions of the utility-maximizing agents in a multi-agent system so that they work synergistically to maximize a global utility. The particular problem domain we explore is the control of network routing by placing agents on all the routers in the network. Conventional approaches to this task have the agents all use the Ideal S…

    Submitted 9 June, 2011; originally announced June 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 16, pages 359-387, 2002

  6. arXiv:math/0301268  [pdf, ps, other]

    math.OC cond-mat.stat-mech cs.MA nlin.AO

    Improving Search Algorithms by Using Intelligent Coordinates

    Authors: David Wolpert, Kagan Tumer, Esfandiar Bandari

    Abstract: We consider the problem of designing a set of computational agents so that as they all pursue their self-interests a global function G of the collective system is optimized. Three factors govern the quality of such design. The first relates to conventional exploration-exploitation search algorithms for finding the maxima of such a global function, e.g., simulated annealing. Game-theoretic algori…

    Submitted 23 January, 2003; originally announced January 2003.

  7. arXiv:cond-mat/0301459  [pdf, ps, other]

    cond-mat.dis-nn cond-mat.stat-mech cs.MA nlin.AO

    Collectives for the Optimal Combination of Imperfect Objects

    Authors: Kagan Tumer, David Wolpert

    Abstract: In this letter we summarize some recent theoretical work on the design of collectives, i.e., of systems containing many agents, each of which can be viewed as trying to maximize an associated private utility, where there is also a world utility rating the behavior of that overall system that the designer of the collective wishes to optimize. We then apply algorithms based on that work on a recen…

    Submitted 23 January, 2003; originally announced January 2003.

    Comments: 4 pages

  8. arXiv:cs/9912012  [pdf, ps, other]

    cs.DC cs.MA cs.NI nlin.AO

    Avoiding Braess' Paradox through Collective Intelligence

    Authors: Kagan Tumer, David H. Wolpert

    Abstract: In an Ideal Shortest Path Algorithm (ISPA), at each moment each router in a network sends all of its traffic down the path that will incur the lowest cost to that traffic. In the limit of an infinitesimally small amount of traffic for a particular router, its routing that traffic via an ISPA is optimal, as far as cost incurred by that traffic is concerned. We demonstrate though that in many case…

    Submitted 20 December, 1999; originally announced December 1999.

    Comments: 28 pages

    Report number: NASA-ARC-IC-99-124

    ACM Class: C.2.0; I.2.11

  9. arXiv:cs/9912011  [pdf, ps, other]

    cs.MA cs.NI nlin.AO

    Adaptivity in Agent-Based Routing for Data Networks

    Authors: David H. Wolpert, Sergey Kirshner, Chris J. Merz, Kagan Tumer

    Abstract: Adaptivity, both of the individual agents and of the interaction structure among the agents, seems indispensable for scaling up multi-agent systems (MAS's) in noisy environments. One important consideration in designing adaptive agents is choosing their action spaces to be as amenable as possible to machine learning techniques, especially to reinforcement learning (RL) techniques. One important…

    Submitted 20 December, 1999; originally announced December 1999.

    Report number: NASA-ARC-IC-99-122

    ACM Class: I.2.11; C.2.0

  10. arXiv:cs/9908014  [pdf, ps, other]

    cs.LG cond-mat cs.DC cs.MA nlin.AO

    An Introduction to Collective Intelligence

    Authors: David H. Wolpert, Kagan Tumer

    Abstract: This paper surveys the emerging science of how to design a "COllective INtelligence" (COIN). A COIN is a large multi-agent system where: (i) There is little to no centralized communication or control; and (ii) There is a provided world utility function that rates the possible histories of the full system. In particular, we are interested in COINs in which each agent runs a reinforcement…

    Submitted 17 August, 1999; originally announced August 1999.

    Comments: 88 pages, 10 figs, 297 refs

    Report number: NASA-ARC-IC-99-63

    ACM Class: I.2.6; I.2.11

  11. arXiv:cs/9908013  [pdf, ps, other]

    cs.LG cond-mat cs.AI cs.DC cs.MA nlin.AO

    Collective Intelligence for Control of Distributed Dynamical Systems

    Authors: David H. Wolpert, Kevin R. Wheeler, Kagan Tumer

    Abstract: We consider the El Farol bar problem, also known as the minority game (W. B. Arthur, "The American Economic Review", 84(2): 406–411 (1994), D. Challet and Y.C. Zhang, "Physica A", 256:514 (1998)). We view it as an instance of the general problem of how to configure the nodal elements of a distributed dynamical system so that they do not "work at cross purposes", in that their collective d…

    Submitted 17 August, 1999; originally announced August 1999.

    Comments: 8 pages

    Report number: NASA-ARC-IC-99-44

    ACM Class: I.2.6; I.2.11

  12. arXiv:cs/9905013  [pdf, ps, other]

    cs.LG cs.CV cs.NE

    Robust Combining of Disparate Classifiers through Order Statistics

    Authors: Kagan Tumer, Joydeep Ghosh

    Abstract: Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In the typical setting investigated till now, each classifier is trained on data taken or resampled from a common data set, or (almost) randomly selected subsets thereof, and thus experiences similar quality of training data. Howeve…

    Submitted 20 May, 1999; originally announced May 1999.

    Comments: 22 pages

    Report number: UT-CVIS-TR-99-001 (The University of Texas)

    ACM Class: I.5.1; G.3

  13. arXiv:cs/9905012  [pdf, ps, other]

    cs.NE cs.LG

    Linear and Order Statistics Combiners for Pattern Classification

    Authors: Kagan Tumer, Joydeep Ghosh

    Abstract: Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We fi…

    Submitted 20 May, 1999; originally announced May 1999.

    Comments: 31 pages

    ACM Class: I.5.1; I.2.6

    Journal ref: Combining Artificial Neural Networks, Ed. Amanda Sharkey, pp 127-162, Springer Verlag, 1999

  14. arXiv:cs/9905011  [pdf, ps, other]

    cs.NE cs.LG q-bio

    Ensembles of Radial Basis Function Networks for Spectroscopic Detection of Cervical Pre-Cancer

    Authors: Kagan Tumer, Nirmala Ramanujam, Joydeep Ghosh, Rebecca Richards-Kortum

    Abstract: The mortality related to cervical cancer can be substantially reduced through early detection and treatment. However, current detection techniques, such as Pap smear and colposcopy, fail to achieve a concurrently high sensitivity and specificity. In vivo fluorescence spectroscopy is a technique which quickly, non-invasively and quantitatively probes the biochemical and morphological changes th…

    Submitted 20 May, 1999; originally announced May 1999.

    Comments: 23 pages

    ACM Class: I.5.1; J.3

    Journal ref: IEEE Transactions on Biomedical Engineering, vol 45, no. 8, pp 953-962, 1998

  15. arXiv:cs/9905005  [pdf, ps, other]

    cs.MA cond-mat.stat-mech cs.DC cs.LG nlin.AO

    General Principles of Learning-Based Multi-Agent Systems

    Authors: David H. Wolpert, Kevin R. Wheeler, Kagan Tumer

    Abstract: We consider the problem of how to design large decentralized multi-agent systems (MAS's) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a reinforcement learning algorithm. This converts the problem into one of how to automatically set/update the reward functions for each of the agents so that the global goal is achieved. In particular we do not want the a…

    Submitted 10 May, 1999; originally announced May 1999.

    Comments: 7 pages, 6 figures

    ACM Class: I.2.6; I.2.11

    Journal ref: Proceedings of the Third International Conference on Autonomous Agents, Seattle, WA 1999

  16. arXiv:cs/9905004  [pdf, ps, other]

    cs.LG cond-mat.stat-mech cs.DC cs.NI nlin.AO

    Using Collective Intelligence to Route Internet Traffic

    Authors: David H. Wolpert, Kagan Tumer, Jeremy Frank

    Abstract: A COllective INtelligence (COIN) is a set of interacting reinforcement learning (RL) algorithms designed in an automated fashion so that their collective behavior optimizes a global utility function. We summarize the theory of COINs, then present experiments using that theory to design COINs to control internet traffic routing. These experiments indicate that COINs outperform all previously inve…

    Submitted 10 May, 1999; originally announced May 1999.

    Comments: 7 pages

    ACM Class: I.2.6; I.2.11

    Journal ref: Advances in Neural Information Processing Systems 11, eds M. Kearns, S. Solla, D. Cohn, MIT Press, 1999