Showing 1–50 of 55 results for author: Nowé, A

Searching in archive cs.
  1. arXiv:2509.22232  [pdf, ps, other]

    cs.LG cs.AI

    Fairness-Aware Reinforcement Learning (FAReL): A Framework for Transparent and Balanced Sequential Decision-Making

    Authors: Alexandra Cimpean, Nicole Orzan, Catholijn Jonker, Pieter Libin, Ann Nowé

    Abstract: Equity in real-world sequential decision problems can be enforced using fairness-aware methods. Therefore, we require algorithms that can make suitable and transparent trade-offs between performance and the desired fairness notions. As the desired performance-fairness trade-off is hard to specify a priori, we propose a framework where multiple trade-offs can be explored. Insights provided by the r…

    Submitted 26 September, 2025; originally announced September 2025.

  2. Artificial Delegates Resolve Fairness Issues in Perpetual Voting with Partial Turnout

    Authors: Apurva Shah, Axel Abels, Ann Nowé, Tom Lenaerts

    Abstract: Perpetual voting addresses fairness in sequential collective decision-making by evaluating representational equity over time. However, existing perpetual voting rules rely on full participation and complete approval information, assumptions that rarely hold in practice, where partial turnout is the norm. In this work, we study the integration of Artificial Delegates, preference-learning agents tra…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: The paper has been accepted at the ACM Collective Intelligence Conference (CI 2025), August 4 to 6, 2025, San Diego, CA, USA

  3. arXiv:2506.05887  [pdf, ps, other]

    cs.AI

    Explainability in Context: A Multilevel Framework Aligning AI Explanations with Stakeholder with LLMs

    Authors: Marilyn Bello, Rafael Bello, Maria-Matilde García, Ann Nowé, Iván Sevillano-García, Francisco Herrera

    Abstract: The growing application of artificial intelligence in sensitive domains has intensified the demand for systems that are not only accurate but also explainable and trustworthy. Although explainable AI (XAI) methods have proliferated, many do not consider the diverse audiences that interact with AI systems: from developers and domain experts to end-users and society. This paper addresses how trust i…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: 22 pages, 5 figures

  4. Explainable AI Based Diagnosis of Poisoning Attacks in Evolutionary Swarms

    Authors: Mehrdad Asadi, Roxana Rădulescu, Ann Nowé

    Abstract: Swarming systems, such as multi-drone networks, excel at cooperative tasks like monitoring, surveillance, or disaster assistance in critical environments, where autonomous agents make decentralized decisions in order to fulfill team-level objectives in a robust and efficient manner. Unfortunately, team-level coordinated strategies in the wild are vulnerable to data poisoning attacks, r…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: To appear in short form in Genetic and Evolutionary Computation Conference (GECCO '25 Companion), 2025

    Journal ref: GECCO'25 Companion: Genetic and Evolutionary Computation Conference Companion, July 14-18, 2025, Malaga, Spain

  5. arXiv:2410.21940  [pdf, other]

    cs.LG cs.AI

    Human-Readable Programs as Actors of Reinforcement Learning Agents Using Critic-Moderated Evolution

    Authors: Senne Deproost, Denis Steckelmacher, Ann Nowé

    Abstract: With Deep Reinforcement Learning (DRL) being increasingly considered for the control of real-world systems, the lack of transparency of the neural network at the core of RL becomes a concern. Programmatic Reinforcement Learning (PRL) is able to create representations of this black-box in the form of source code, not only increasing the explainability of the controller but also allowing for user…

    Submitted 29 October, 2024; originally announced October 2024.

    Comments: Accepted in BNAIC/BeNeLearn 2024 conference proceedings

  6. arXiv:2407.18812  [pdf, other]

    cs.LG cs.AI

    Online Planning in POMDPs with State-Requests

    Authors: Raphael Avalos, Eugenio Bargiacchi, Ann Nowé, Diederik M. Roijers, Frans A. Oliehoek

    Abstract: In key real-world problems, full state information is sometimes available but only at a high cost, like activating precise yet energy-intensive sensors or consulting humans, thereby compelling the agent to operate under partial observability. For this scenario, we propose AEMS-SR (Anytime Error Minimization Search with State Requests), a principled online planning algorithm tailored for POMDPs wit…

    Submitted 26 July, 2024; originally announced July 2024.

    Journal ref: Reinforcement Learning Journal, vol. 1, no. 1, 2024, pp. TBD

  7. arXiv:2407.16312  [pdf, other]

    cs.MA cs.AI cs.GT

    MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning

    Authors: Florian Felten, Umut Ucak, Hicham Azmani, Gao Peng, Willem Röpke, Hendrik Baier, Patrick Mannion, Diederik M. Roijers, Jordan K. Terry, El-Ghazali Talbi, Grégoire Danoy, Ann Nowé, Roxana Rădulescu

    Abstract: Many challenging tasks such as managing traffic systems, electricity grids, or supply chains involve complex decision-making processes that must balance multiple conflicting objectives and coordinate the actions of various independent decision-makers (DMs). One perspective for formalising and addressing such tasks is multi-objective multi-agent reinforcement learning (MOMARL). MOMARL broadens rein…

    Submitted 27 October, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  8. arXiv:2404.03596  [pdf, other]

    cs.LG cs.AI cs.MA

    Laser Learning Environment: A new environment for coordination-critical multi-agent tasks

    Authors: Yannick Molinghen, Raphaël Avalos, Mark Van Achter, Ann Nowé, Tom Lenaerts

    Abstract: We introduce the Laser Learning Environment (LLE), a collaborative multi-agent reinforcement learning environment in which coordination is central. In LLE, agents depend on each other to make progress (interdependence), must jointly take specific sequences of actions to succeed (perfect coordination), and accomplishing those joint actions does not yield any intermediate reward (zero-incentive dyna…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Pre-print, 21 pages

  9. arXiv:2403.08829  [pdf, other]

    cs.HC cs.LG cs.SI

    Mitigating Biases in Collective Decision-Making: Enhancing Performance in the Face of Fake News

    Authors: Axel Abels, Elias Fernandez Domingos, Ann Nowé, Tom Lenaerts

    Abstract: Individual and social biases undermine the effectiveness of human advisers by inducing judgment errors which can disadvantage protected groups. In this paper, we study the influence these biases can have in the pervasive problem of fake news by evaluating human participants' capacity to identify false headlines. By focusing on headlines involving sensitive characteristics, we gather a comprehensiv…

    Submitted 11 March, 2024; originally announced March 2024.

  10. arXiv:2402.13785  [pdf, other]

    cs.AI

    Composing Reinforcement Learning Policies, with Formal Guarantees

    Authors: Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo A. Pérez

    Abstract: We propose a novel framework to controller design in environments with a two-level structure: a known high-level graph ("map") in which each vertex is populated by a Markov decision process, called a "room". The framework "separates concerns" by using different design techniques for low- and high-level tasks. We apply reactive synthesis for high-level tasks: given a specification as a logical form…

    Submitted 10 March, 2025; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: AAMAS 2025, 8 pages main text, 19 pages Appendix (excluding references)

  11. arXiv:2402.07182  [pdf, other]

    cs.LG

    Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning

    Authors: Willem Röpke, Mathieu Reymond, Patrick Mannion, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu

    Abstract: An important challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies to attain optimal performance under different preferences. We introduce Iterated Pareto Referent Optimisation (IPRO), which decomposes finding the Pareto front into a sequence of constrained single-objective problems. This enables us to guarantee convergence while providing an upper bound on th…

    Submitted 6 February, 2025; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: Accepted at AAMAS 2025

  12. arXiv:2306.10134  [pdf, other]

    cs.LG cs.AI cs.MA

    Dynamic Size Message Scheduling for Multi-Agent Communication under Limited Bandwidth

    Authors: Qingshuang Sun, Denis Steckelmacher, Yuan Yao, Ann Nowé, Raphaël Avalos

    Abstract: Communication plays a vital role in multi-agent systems, fostering collaboration and coordination. However, in real-world scenarios where communication is bandwidth-limited, existing multi-agent reinforcement learning (MARL) algorithms often provide agents with a binary choice: either transmitting a fixed number of bytes or no information at all. This limitation hinders the ability to effectively…

    Submitted 16 June, 2023; originally announced June 2023.

  13. arXiv:2305.05560  [pdf, other]

    cs.AI

    Distributional Multi-Objective Decision Making

    Authors: Willem Röpke, Conor F. Hayes, Patrick Mannion, Enda Howley, Ann Nowé, Diederik M. Roijers

    Abstract: For effective decision support in scenarios with conflicting objectives, sets of potentially optimal solutions can be presented to the decision maker. We explore both what policies these sets should contain and how such sets can be computed efficiently. With this in mind, we take a distributional approach and introduce a novel dominance criterion relating return distributions of policies directly.…

    Submitted 18 July, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: Accepted at IJCAI 2023

  14. Expertise Trees Resolve Knowledge Limitations in Collective Decision-Making

    Authors: Axel Abels, Tom Lenaerts, Vito Trianni, Ann Nowé

    Abstract: Experts advising decision-makers are likely to display expertise which varies as a function of the problem instance. In practice, this may lead to sub-optimal or discriminatory decisions against minority cases. In this work we model such changes in depth and breadth of knowledge as a partitioning of the problem space into regions of differing expertise. We provide here new algorithms that explicit…

    Submitted 4 May, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: Proceedings of the 40th International Conference on Machine Learning (2023)

  15. arXiv:2304.08897  [pdf, other]

    eess.SY cs.AI cs.LG math.OC

    An adaptive safety layer with hard constraints for safe reinforcement learning in multi-energy management systems

    Authors: Glenn Ceusters, Muhammad Andy Putratama, Rüdiger Franke, Ann Nowé, Maarten Messagie

    Abstract: Safe reinforcement learning (RL) with hard constraint guarantees is a promising optimal control direction for multi-energy management systems. It only requires the environment-specific constraint functions themselves a priori and not a complete model. The project-specific upfront and ongoing engineering efforts are therefore still reduced, better representations of the underlying system dynamics can s…

    Submitted 6 November, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: post-print

  16. arXiv:2303.12558  [pdf, other]

    cs.LG cs.AI

    Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees

    Authors: Florent Delgrange, Ann Nowé, Guillermo A. Pérez

    Abstract: Although deep reinforcement learning (DRL) has many success stories, the large-scale deployment of policies learned through these advanced techniques in safety-critical scenarios is hindered by their lack of formal guarantees. Variational Markov Decision Processes (VAE-MDPs) are discrete latent space models that provide a reliable framework for distilling formally verifiable controllers from any R…

    Submitted 21 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: ICLR 2023, 10 pages main text, 14 pages appendix (excluding references)

  17. arXiv:2303.03284  [pdf, other]

    cs.LG cs.AI

    The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

    Authors: Raphael Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers

    Abstract: Partially Observable Markov Decision Processes (POMDPs) are used to model environments where the full state cannot be perceived by an agent. As such the agent needs to reason taking into account the past observations and actions. However, simply remembering the full history is generally intractable due to the exponential growth in the history space. Maintaining a probability distribution that mode…

    Submitted 26 October, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

  18. arXiv:2301.12822  [pdf, other]

    cs.LG cs.AI

    Evaluating COVID-19 vaccine allocation policies using Bayesian $m$-top exploration

    Authors: Alexandra Cimpean, Timothy Verstraeten, Lander Willem, Niel Hens, Ann Nowé, Pieter Libin

    Abstract: Individual-based epidemiological models support the study of fine-grained preventive measures, such as tailored vaccine allocation policies, in silico. As individual-based models are computationally intensive, it is pivotal to identify optimal strategies within a reasonable computational budget. Moreover, due to the high societal impact associated with the implementation of preventive strategies,…

    Submitted 26 February, 2025; v1 submitted 30 January, 2023; originally announced January 2023.

  19. arXiv:2301.12820  [pdf, other]

    cs.AI

    Transferring Multiple Policies to Hotstart Reinforcement Learning in an Air Compressor Management Problem

    Authors: Hélène Plisnier, Denis Steckelmacher, Jeroen Willems, Bruno Depraetere, Ann Nowé

    Abstract: Many instances of similar or almost-identical industrial machines or tools are often deployed at once, or in quick succession. For instance, a particular model of air compressor may be installed at hundreds of customers. Because these tools perform distinct but highly similar tasks, it is interesting to be able to quickly produce a high-quality controller for machine $N+1$ given the controllers al…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Preliminary version, experimental details still to be made more precise

  20. Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization

    Authors: Lucas N. Alegre, Ana L. C. Bazzan, Diederik M. Roijers, Ann Nowé, Bruno C. da Silva

    Abstract: Multi-objective reinforcement learning (MORL) algorithms tackle sequential decision problems where agents may have different preferences over (possibly conflicting) reward functions. Such algorithms often learn a set of policies (each optimized for a particular agent preference) that can later be used to solve problems with novel preferences. We introduce a novel algorithm that uses Generalized Po…

    Submitted 23 March, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: Accepted to AAMAS 2023

  21. arXiv:2301.05755  [pdf, other]

    cs.GT

    Bridging the Gap Between Single and Multi Objective Games

    Authors: Willem Röpke, Carla Groenland, Roxana Rădulescu, Ann Nowé, Diederik M. Roijers

    Abstract: A classic model to study strategic decision making in multi-agent systems is the normal-form game. This model can be generalised to allow for an infinite number of pure strategies leading to continuous games. Multi-objective normal-form games are another generalisation that model settings where players receive separate payoffs in more than one objective. We bridge the gap between the two models by…

    Submitted 1 March, 2023; v1 submitted 13 January, 2023; originally announced January 2023.

    Comments: Accepted to AAMAS 2023

  22. arXiv:2207.03830  [pdf, other]

    eess.SY cs.AI cs.LG math.OC

    Safe reinforcement learning for multi-energy management systems with known constraint functions

    Authors: Glenn Ceusters, Luis Ramirez Camargo, Rüdiger Franke, Ann Nowé, Maarten Messagie

    Abstract: Reinforcement learning (RL) is a promising optimal control technique for multi-energy management systems. It does not require a model a priori - reducing the upfront and ongoing project-specific engineering effort and is capable of learning better representations of the underlying system dynamics. However, vanilla RL does not provide constraint satisfaction guarantees - resulting in various potent…

    Submitted 1 September, 2022; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: 26 pages, 14 figures

  23. arXiv:2204.05036  [pdf, other]

    cs.LG cs.AI

    Pareto Conditioned Networks

    Authors: Mathieu Reymond, Eugenio Bargiacchi, Ann Nowé

    Abstract: In multi-objective optimization, learning all the policies that reach Pareto-efficient solutions is an expensive process. The set of optimal policies can grow exponentially with the number of objectives, and recovering all solutions requires an exhaustive exploration of the entire state space. We propose Pareto Conditioned Networks (PCN), a method that uses a single neural network to encompass all…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2022

  24. arXiv:2204.05027  [pdf, ps, other]

    cs.LG cs.AI q-bio.PE

    Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning

    Authors: Mathieu Reymond, Conor F. Hayes, Lander Willem, Roxana Rădulescu, Steven Abrams, Diederik M. Roijers, Enda Howley, Patrick Mannion, Niel Hens, Ann Nowé, Pieter Libin

    Abstract: Infectious disease outbreaks can have a disruptive impact on public health and societal processes. As decision making in the context of epidemic mitigation is hard, reinforcement learning provides a methodology to automatically learn prevention strategies in combination with complex epidemic models. Current research focuses on optimizing policies w.r.t. a single objective, such as the pathogen's a…

    Submitted 11 April, 2022; originally announced April 2022.

  25. arXiv:2112.12458  [pdf, other]

    cs.LG cs.AI

    Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning

    Authors: Raphaël Avalos, Mathieu Reymond, Ann Nowé, Diederik M. Roijers

    Abstract: Many recent successful off-policy multi-agent reinforcement learning (MARL) algorithms for cooperative partially observable environments focus on finding factorized value functions, leading to convoluted network structures. Building on the structure of independent Q-learners, our LAN algorithm takes a radically different approach, leveraging a dueling architecture to learn for each agent a decentr…

    Submitted 26 October, 2023; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: https://openreview.net/forum?id=adpKzWQunW

    Journal ref: Transactions on Machine Learning Research - October 2023

  26. arXiv:2112.09655  [pdf, other]

    cs.LG cs.AI

    Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Technical Report)

    Authors: Florent Delgrange, Ann Nowé, Guillermo A. Pérez

    Abstract: We consider the challenge of policy simplification and verification in the context of policies learned through reinforcement learning (RL) in continuous environments. In well-behaved settings, RL algorithms have convergence guarantees in the limit. While these guarantees are valuable, they are insufficient for safety-critical applications. Furthermore, they are lost when applying advanced techniqu…

    Submitted 14 June, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: AAAI 2022, technical report including supplementary material (10 pages main text, 14 pages appendix)

  27. arXiv:2112.06500  [pdf, other]

    cs.GT cs.MA

    On Nash Equilibria in Normal-Form Games With Vectorial Payoffs

    Authors: Willem Röpke, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu

    Abstract: We provide an in-depth study of Nash equilibria in multi-objective normal form games (MONFGs), i.e., normal form games with vectorial payoffs. Taking a utility-based approach, we assume that each player's utility can be modelled with a utility function that maps a vector to a scalar utility. In the case of a mixed strategy, it is meaningful to apply such a scalarisation both before calculating the…

    Submitted 16 July, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

  28. arXiv:2111.09191  [pdf, other]

    cs.GT cs.LG cs.MA

    Preference Communication in Multi-Objective Normal-Form Games

    Authors: Willem Röpke, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu

    Abstract: We consider preference communication in two-player multi-objective normal-form games. In such games, the payoffs resulting from joint actions are vector-valued. Taking a utility-based approach, we assume there exists a utility function for each player which maps vectors to scalar utilities and consider agents that aim to maximise the utility of expected payoff vectors. As agents typically do not k…

    Submitted 10 June, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

  29. arXiv:2106.13539  [pdf, other]

    cs.AI cs.LG

    Dealing with Expert Bias in Collective Decision-Making

    Authors: Axel Abels, Tom Lenaerts, Vito Trianni, Ann Nowé

    Abstract: Quite some real-world problems can be formulated as decision-making problems wherein one must repeatedly make an appropriate choice from a set of alternatives. Multiple expert judgements, whether human or artificial, can help in taking correct decisions, especially when exploration of alternative solutions is costly. As expert opinions might deviate, the problem of finding the right alternative ca…

    Submitted 29 August, 2022; v1 submitted 25 June, 2021; originally announced June 2021.

  30. Synthesising Reinforcement Learning Policies through Set-Valued Inductive Rule Learning

    Authors: Youri Coppens, Denis Steckelmacher, Catholijn M. Jonker, Ann Nowé

    Abstract: Today's advanced Reinforcement Learning algorithms produce black-box policies, that are often difficult to interpret and trust for a person. We introduce a policy distilling algorithm, building on the CN2 rule mining algorithm, that distills the policy into a rule-based decision system. At the core of our approach is the fact that an RL process does not just learn a policy, a mapping from states t…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: 17 pages, 4 figures. The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-73959-1_15

    Journal ref: Trustworthy AI - Integrating Learning, Optimization and Reasoning (2021), Lecture Notes in Computer Science, vol. 12641, pp. 163-179

  31. arXiv:2104.09785  [pdf, other]

    eess.SY cs.AI cs.LG math.OC

    Model-predictive control and reinforcement learning in multi-energy system case studies

    Authors: Glenn Ceusters, Román Cantú Rodríguez, Alberte Bouso García, Rüdiger Franke, Geert Deconinck, Lieve Helsen, Ann Nowé, Maarten Messagie, Luis Ramirez Camargo

    Abstract: Model-predictive-control (MPC) offers an optimal control technique to establish and ensure that the total operation cost of multi-energy systems remains at a minimum while fulfilling all system constraints. However, this method presumes an adequate model of the underlying system dynamics, which is prone to modelling errors and is not necessarily adaptive. This has an associated initial and ongoing…

    Submitted 9 September, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: 43 pages, 29 figures

  32. A Practical Guide to Multi-Objective Reinforcement Learning and Planning

    Authors: Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers

    Abstract: Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying pr…

    Submitted 17 March, 2021; originally announced March 2021.

    Journal ref: Auton Agent Multi-Agent Syst 36, 26 (2022)

  33. arXiv:2011.07290  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG

    Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games

    Authors: Roxana Rădulescu, Timothy Verstraeten, Yijie Zhang, Patrick Mannion, Diederik M. Roijers, Ann Nowé

    Abstract: Many real-world multi-agent interactions consider multiple distinct criteria, i.e. the payoffs are multi-objective in nature. However, the same multi-objective payoff vector may lead to different utilities for each participant. Therefore, it is essential for an agent to learn about the behaviour of other agents in the system. In this work, we present the first study of the effects of such opponent…

    Submitted 14 November, 2020; originally announced November 2020.

    Comments: Under review since 14 November 2020

  34. arXiv:2003.13676  [pdf, other]

    cs.LG cs.AI cs.MA

    Deep reinforcement learning for large-scale epidemic control

    Authors: Pieter Libin, Arno Moonens, Timothy Verstraeten, Fabian Perez-Sanjines, Niel Hens, Philippe Lemey, Ann Nowé

    Abstract: Epidemics of infectious diseases are an important threat to public health and global economies. Yet, the development of prevention strategies remains a challenging process, as epidemics are non-linear and complex processes. For this reason, we investigate a deep reinforcement learning approach to automatically learn prevention strategies in the context of pandemic influenza. Firstly, we construct…

    Submitted 30 March, 2020; originally announced March 2020.

  35. arXiv:2001.09502  [pdf, other]

    cs.LG stat.ML

    An interpretable semi-supervised classifier using two different strategies for amended self-labeling

    Authors: Isel Grau, Dipankar Sengupta, Maria M. Garcia Lorenzo, Ann Nowe

    Abstract: In the context of some machine learning applications, obtaining data instances is a relatively easy process but labeling them could become quite expensive or tedious. Such scenarios lead to datasets with few labeled instances and a larger number of unlabeled ones. Semi-supervised classification techniques combine labeled and unlabeled data during the learning phase in order to increase the classif…

    Submitted 20 July, 2020; v1 submitted 26 January, 2020; originally announced January 2020.

    Comments: Accepted at Special Session on Advances on Explainable Artificial Intelligence, IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2020), IEEE World Congress on Computational Intelligence (WCCI 2020)

  36. arXiv:2001.08177  [pdf, other]

    cs.GT cs.AI cs.LG cs.MA

    A utility-based analysis of equilibria in multi-objective normal form games

    Authors: Roxana Rădulescu, Patrick Mannion, Yijie Zhang, Diederik M. Roijers, Ann Nowé

    Abstract: In multi-objective multi-agent systems (MOMAS), agents explicitly consider the possible tradeoffs between conflicting objective functions. We argue that compromises between competing objectives in MOMAS should be analysed on the basis of the utility that these compromises have for the users of a system, where an agent's utility function maps their payoff vectors to scalar utility values. This util…

    Submitted 17 January, 2020; originally announced January 2020.

    Comments: Under review since 16 January 2020

  37. arXiv:2001.07527  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping

    Authors: Eugenio Bargiacchi, Timothy Verstraeten, Diederik M. Roijers, Ann Nowé

    Abstract: We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes. The algorithm allows for sample-efficient learning on large problems by exploiting a factorization to approximate the value function. Our approach only requires knowledge about the structure of the problem in the form of a dynamic decisio…

    Submitted 15 January, 2020; originally announced January 2020.

  38. arXiv:1911.10121  [pdf, other]

    cs.LG cs.AI eess.SY stat.ML

    Fleet Control using Coregionalized Gaussian Process Policy Iteration

    Authors: Timothy Verstraeten, Pieter JK Libin, Ann Nowé

    Abstract: In many settings, as for example wind farms, multiple machines are instantiated to perform the same task, which is called a fleet. The recent advances with respect to the Internet of Things allow control devices and/or machines to connect through cloud-based architectures in order to share information about their status and environment. Such an infrastructure allows seamless data sharing between f…

    Submitted 22 November, 2019; originally announced November 2019.

  39. arXiv:1911.10120  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Multi-Agent Thompson Sampling for Bandit Applications with Sparse Neighbourhood Structures

    Authors: Timothy Verstraeten, Eugenio Bargiacchi, Pieter JK Libin, Jan Helsen, Diederik M Roijers, Ann Nowé

    Abstract: Multi-agent coordination is prevalent in many real-world applications. However, such coordination is challenging due to its combinatorial nature. An important observation in this regard is that agents in the real world often only directly affect a limited set of neighbouring agents. Leveraging such loose couplings among agents is key to making coordination in multi-agent systems feasible. In this…

    Submitted 7 February, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

    Journal ref: Sci Rep 10, 6728 (2020)

  40. IPC-Net: 3D point-cloud segmentation using deep inter-point convolutional layers

    Authors: Felipe Gomez Marulanda, Pieter Libin, Timothy Verstraeten, Ann Nowé

    Abstract: Over the last decade, the demand for better segmentation and classification algorithms in 3D spaces has significantly grown due to the popularity of new 3D sensor technologies and advancements in the field of robotics. Point-clouds are one of the most popular representations to store a digital description of 3D shapes. However, point-clouds are stored in irregular and unordered structures, which l…

    Submitted 30 September, 2019; originally announced September 2019.

    Journal ref: 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)

  41. Multi-Objective Multi-Agent Decision Making: A Utility-based Analysis and Survey

    Authors: Roxana Rădulescu, Patrick Mannion, Diederik M. Roijers, Ann Nowé

    Abstract: The majority of multi-agent system (MAS) implementations aim to optimise agents' policies with respect to a single objective, despite the fact that many real-world problem domains are inherently multi-objective in nature. Multi-objective multi-agent systems (MOMAS) explicitly consider the possible trade-offs between conflicting objective functions. We argue that, in MOMAS, such compromises should…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: Under review since 15 May 2019

  42. arXiv:1907.07958  [pdf, other]

    cs.AI cs.RO

    Transfer Learning Across Simulated Robots With Different Sensors

    Authors: Hélène Plisnier, Denis Steckelmacher, Diederik Roijers, Ann Nowé

    Abstract: For a robot to learn a good policy, it often requires expensive equipment (such as sophisticated sensors) and a prepared training environment conducive to learning. However, it is seldom possible to perfectly equip robots for economic reasons, nor to guarantee ideal learning conditions, when deployed in real-life environments. A solution would be to prepare the robot in the lab environment, when a…

    Submitted 18 July, 2019; originally announced July 2019.

  43. arXiv:1903.04193  [pdf, other]

    cs.LG cs.AI

    Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics

    Authors: Denis Steckelmacher, Hélène Plisnier, Diederik M. Roijers, Ann Nowé

    Abstract: Value-based reinforcement-learning algorithms provide state-of-the-art results in model-free discrete-action settings, and tend to outperform actor-critic algorithms. We argue that actor-critic algorithms are limited by their need for an on-policy critic. We propose Bootstrapped Dual Policy Iteration (BDPI), a novel model-free reinforcement-learning algorithm for continuous states and discrete act…

    Submitted 12 June, 2019; v1 submitted 11 March, 2019; originally announced March 2019.

    Comments: Accepted at the European Conference on Machine Learning 2019 (ECML)

  44. arXiv:1902.02556  [pdf, other]

    cs.AI

    The Actor-Advisor: Policy Gradient With Off-Policy Advice

    Authors: Hélène Plisnier, Denis Steckelmacher, Diederik M. Roijers, Ann Nowé

    Abstract: Actor-critic algorithms learn an explicit policy (actor), and an accompanying value function (critic). The actor performs actions in the environment, while the critic evaluates the actor's current policy. However, despite their stability and promising convergence properties, current actor-critic algorithms do not outperform critic-only ones in practice. We believe that the fact that the critic lea…

    Submitted 7 February, 2019; originally announced February 2019.

  45. arXiv:1809.07803  [pdf, other]

    cs.LG cs.AI stat.ML

    Dynamic Weights in Multi-Objective Deep Reinforcement Learning

    Authors: Axel Abels, Diederik M. Roijers, Tom Lenaerts, Ann Nowé, Denis Steckelmacher

    Abstract: Many real-world decision problems are characterized by multiple conflicting objectives which must be balanced based on their relative importance. In the dynamic weights setting the relative importance changes over time and specialized algorithms that deal with such change, such as a tabular Reinforcement Learning (RL) algorithm by Natarajan and Tadepalli (2005), are required. However, this earlier…

    Submitted 13 May, 2019; v1 submitted 20 September, 2018; originally announced September 2018.

    ACM Class: I.2.6

  46. arXiv:1808.04096  [pdf, other]

    cs.LG cs.AI stat.ML

    Directed Policy Gradient for Safe Reinforcement Learning with Human Advice

    Authors: Hélène Plisnier, Denis Steckelmacher, Tim Brys, Diederik M. Roijers, Ann Nowé

    Abstract: Many currently deployed Reinforcement Learning agents work in an environment shared with humans, be they co-workers, users or clients. It is desirable that these agents adjust to people's preferences, learn faster thanks to their help, and act safely around them. We argue that most current approaches that learn from human feedback are unsafe: rewarding or punishing the agent a posteriori cannot im…

    Submitted 13 August, 2018; originally announced August 2018.

    Comments: Accepted at the European Workshop on Reinforcement Learning 2018 (EWRL14)

  47. arXiv:1802.07606  [pdf, other]

    cs.LG cs.AI stat.ML

    Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making

    Authors: Luisa M Zintgraf, Diederik M Roijers, Sjoerd Linders, Catholijn M Jonker, Ann Nowé

    Abstract: In multi-objective decision planning and learning, much attention is paid to producing optimal solution sets that contain an optimal policy for every possible user preference profile. We argue that the step that follows, i.e., determining which policy to execute by maximising the user's intrinsic utility function over this (possibly infinite) set, is under-studied. This paper aims to fill this gap.…

    Submitted 21 February, 2018; originally announced February 2018.

    Comments: AAMAS 2018, Source code at https://github.com/lmzintgraf/gp_pref_elicit

  48. arXiv:1711.06299  [pdf, ps, other

    cs.LG cs.AI q-bio.PE

    Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies

    Authors: Pieter Libin, Timothy Verstraeten, Diederik M. Roijers, Jelena Grujic, Kristof Theys, Philippe Lemey, Ann Nowé

    Abstract: Pandemic influenza has the epidemic potential to kill millions of people. While various preventive measures exist (i.a., vaccination and school closures), deciding on strategies that lead to their most effective and efficient use remains challenging. To this end, individual-based epidemiological models are essential to assist decision makers in determining the best strategy to curb epidemic spread…

    Submitted 15 June, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

  49. arXiv:1711.03817  [pdf, other]

    cs.AI

    Learning with Options that Terminate Off-Policy

    Authors: Anna Harutyunyan, Peter Vrancx, Pierre-Luc Bacon, Doina Precup, Ann Nowe

    Abstract: A temporally abstract action, or an option, is specified by a policy and a termination condition: the policy guides option behavior, and the termination condition roughly determines its length. Generally, learning with longer options (like learning with multi-step returns) is known to be more efficient. However, if the option set for the task is not ideal, and cannot express the primitive optimal…

    Submitted 2 December, 2017; v1 submitted 10 November, 2017; originally announced November 2017.

    Comments: AAAI 2018

  50. arXiv:1708.06551  [pdf, other]

    cs.AI cs.LG

    Reinforcement Learning in POMDPs with Memoryless Options and Option-Observation Initiation Sets

    Authors: Denis Steckelmacher, Diederik M. Roijers, Anna Harutyunyan, Peter Vrancx, Hélène Plisnier, Ann Nowé

    Abstract: Many real-world reinforcement learning problems have a hierarchical nature, and often exhibit some degree of partial observability. While hierarchy and partial observability are usually tackled separately (for instance by combining recurrent neural networks and options), we show that addressing both problems simultaneously is simpler and more efficient in many cases. More specifically, we make the…

    Submitted 12 September, 2017; v1 submitted 22 August, 2017; originally announced August 2017.