Showing 1–11 of 11 results for author: Wiering, M

Searching in archive cs.
  1. arXiv:2205.14410

    cs.LG cs.AI

    Multi-Source Transfer Learning for Deep Model-Based Reinforcement Learning

    Authors: Remo Sasso, Matthia Sabatelli, Marco A. Wiering

    Abstract: A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task. Transfer learning proposes to address this issue by re-using knowledge from previously learned tasks. However, determining which source task qualifies as the most appropriate for knowledge extraction, as well as the choice regarding which algorit…

    Submitted 27 April, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

    Comments: 24 pages, 7 figures, 8 tables. arXiv admin note: text overlap with arXiv:2108.06526

    MSC Class: 68T07 ACM Class: I.2.m

  2. arXiv:2109.04847

    cs.CL cs.LG

    Active learning for reducing labeling effort in text classification tasks

    Authors: Pieter Floris Jacobs, Gideon Maillette de Buy Wenniger, Marco Wiering, Lambert Schomaker

    Abstract: Labeling data can be an expensive task as it is usually performed manually by domain experts. This is cumbersome for deep learning, as it is dependent on large labeled datasets. Active learning (AL) is a paradigm that aims to reduce labeling effort by only using the data which the used model deems most informative. Little research has been done on AL in a text classification setting and next to no…

    Submitted 3 November, 2021; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted as a conference paper at the joint 33rd Benelux Conference on Artificial Intelligence and the 30th Belgian Dutch Conference on Machine Learning (BNAIC/BENELEARN 2021). This camera-ready version, submitted to BNAIC/BENELEARN, adds several improvements, including a more thorough discussion of related work and an extended discussion section. 28 pages including references and appendices

    ACM Class: I.2.7

  3. arXiv:2108.06526

    cs.LG cs.AI

    Fractional Transfer Learning for Deep Model-Based Reinforcement Learning

    Authors: Remo Sasso, Matthia Sabatelli, Marco A. Wiering

    Abstract: Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks. Recent progress in model-based RL allows agents to be much more data-efficient, as it enables them to learn behaviors of visual environments in imagination by leveraging an internal World Model of the environment. Improved sample efficiency can also be achieved by…

    Submitted 14 August, 2021; originally announced August 2021.

    Comments: 21 pages, 8 figures, 7 tables

    ACM Class: I.2.m

  4. Towards Real-World Deployment of Reinforcement Learning for Traffic Signal Control

    Authors: Arthur Müller, Vishal Rangras, Georg Schnittker, Michael Waldmann, Maxim Friesen, Tobias Ferfers, Lukas Schreckenberg, Florian Hufen, Jürgen Jasperneite, Marco Wiering

    Abstract: Sub-optimal control policies in intersection traffic signal controllers (TSC) contribute to congestion and lead to negative effects on human health and the environment. Reinforcement learning (RL) for traffic signal control is a promising approach to design better control policies and has attracted considerable research interest in recent years. However, most work done in this area used simplified…

    Submitted 11 January, 2022; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: Paper was accepted by ICMLA 2021 (20th IEEE International Conference on Machine Learning and Applications). Code available under https://github.com/RL-INA/LemgoRL

  5. arXiv:2011.00485

    q-bio.OT cs.AI cs.LG

    Comparing Machine Learning Algorithms with or without Feature Extraction for DNA Classification

    Authors: Xiangxie Zhang, Ben Beinke, Berlian Al Kindhi, Marco Wiering

    Abstract: The classification of DNA sequences is a key research area in bioinformatics as it enables researchers to conduct genomic analysis and detect possible diseases. In this paper, three state-of-the-art algorithms, namely Convolutional Neural Networks, Deep Neural Networks, and N-gram Probabilistic Models, are used for the task of DNA classification. Furthermore, we introduce a novel feature extractio…

    Submitted 1 November, 2020; originally announced November 2020.

    Comments: 17 pages

  6. arXiv:2010.15597

    cs.LG cs.AI

    Enhancing reinforcement learning by a finite reward response filter with a case study in intelligent structural control

    Authors: Hamid Radmard Rahmani, Carsten Koenke, Marco A. Wiering

    Abstract: In many reinforcement learning (RL) problems, it takes some time until a taken action by the agent reaches its maximum effect on the environment and consequently the agent receives the reward corresponding to that action by a delay called action-effect delay. Such delays reduce the performance of the learning algorithm and increase the computational costs, as the reinforcement learning agent value…

    Submitted 25 October, 2020; originally announced October 2020.

    Comments: 16 pages, 16 figures

  7. arXiv:2001.05270

    cs.LG cs.AI stat.ML

    Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO

    Authors: Mario S. Holubar, Marco A. Wiering

    Abstract: In this paper, a novel racing environment for OpenAI Gym is introduced. This environment operates with continuous action- and state-spaces and requires agents to learn to control the acceleration and steering of a car while navigating a randomly generated racetrack. Different versions of two actor-critic learning algorithms are tested on this environment: Sampled Policy Gradient (SPG) and Proximal…

    Submitted 15 January, 2020; originally announced January 2020.

    Comments: 12 pages, 9 figures. Code is available at https://github.com/mario-holubar/RacingRL

    ACM Class: I.2.6

  8. arXiv:1909.01779

    cs.LG cs.AI stat.ML

    Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms

    Authors: Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco A. Wiering

    Abstract: This paper makes one step forward towards characterizing a new family of model-free Deep Reinforcement Learning (DRL) algorithms. The aim of these algorithms is to jointly learn an approximation of the state-value function ($V$), alongside an approximation of the state-action value function ($Q$). Our analysis starts with a thorough study of the Deep Quality-Value Learning (DQV) algorithm…

    Submitted 14 October, 2019; v1 submitted 1 September, 2019; originally announced September 2019.

  9. arXiv:1810.00368

    stat.ML cs.LG

    Deep Quality-Value (DQV) Learning

    Authors: Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco A. Wiering

    Abstract: We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Learning. DQV uses temporal-difference learning to train a Value neural network and uses this network for training a second Quality-value network that learns to estimate state-action values. We first test DQV's update rules with Multilayer Perceptrons as function approximators on two classic RL problem…

    Submitted 10 October, 2018; v1 submitted 30 September, 2018; originally announced October 2018.

  10. arXiv:1809.05763

    cs.AI

    Sampled Policy Gradient for Learning to Play the Game Agar.io

    Authors: Anton Orell Wiehe, Nil Stolt Ansó, Madalina M. Drugan, Marco A. Wiering

    Abstract: In this paper, a new offline actor-critic learning algorithm is introduced: Sampled Policy Gradient (SPG). SPG samples in the action space to calculate an approximated policy gradient by using the critic to evaluate the samples. This sampling allows SPG to search the action-Q-value space more globally than deterministic policy gradient (DPG), enabling it to theoretically avoid more local optima. S…

    Submitted 15 September, 2018; originally announced September 2018.

  11. arXiv:1803.09093

    cs.LG cs.CV stat.ML

    Comparing Generative Adversarial Network Techniques for Image Creation and Modification

    Authors: Mathijs Pieters, Marco Wiering

    Abstract: Generative adversarial networks (GANs) have demonstrated to be successful at generating realistic real-world images. In this paper we compare various GAN techniques, both supervised and unsupervised. The effects on training stability of different objective functions are compared. We add an encoder to the network, making it possible to encode images to the latent space of the GAN. The generator, di…

    Submitted 24 March, 2018; originally announced March 2018.

    Comments: 20 pages, 23 figures