Showing 1–8 of 8 results for author: Hollenstein, J

  1. arXiv:2404.02728

    cs.RO cs.AI cs.LG

    Unsupervised Learning of Effective Actions in Robotics

    Authors: Marko Zaric, Jakob Hollenstein, Justus Piater, Erwan Renaudo

    Abstract: Learning actions that are relevant to decision-making and can be executed effectively is a key problem in autonomous robotics. Current state-of-the-art action representations in robotics lack proper effect-driven learning of the robot's actions. Although successful in solving manipulation tasks, deep learning methods also lack this ability, in addition to their high cost in terms of memory or trai…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted at The First Austrian Symposium on AI, Robotics, and Vision (AIROV24)

  2. Colored Noise in PPO: Improved Exploration and Performance through Correlated Action Sampling

    Authors: Jakob Hollenstein, Georg Martius, Justus Piater

    Abstract: Proximal Policy Optimization (PPO), a popular on-policy deep reinforcement learning method, employs a stochastic policy for exploration. In this paper, we propose a colored noise-based stochastic policy variant of PPO. Previous research highlighted the importance of temporal correlation in action noise for effective exploration in off-policy reinforcement learning. Building on this, we investigate…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Journal ref: (2024) Proceedings of the AAAI Conference on Artificial Intelligence, 38(11), 12466-12472

  3. arXiv:2311.03600

    cs.RO

    Scalable and Efficient Continual Learning from Demonstration via a Hypernetwork-generated Stable Dynamics Model

    Authors: Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, Justus Piater

    Abstract: Learning from demonstration (LfD) provides an efficient way to train robots. The learned motions should be convergent and stable, but to be truly effective in the real world, LfD-capable robots should also be able to remember multiple motion skills. Existing stable-LfD approaches lack the capability of multi-skill retention. Although recent work on continual-LfD has shown that hypernetwork-generat…

    Submitted 9 January, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: This paper is currently under peer review

  4. arXiv:2301.09954

    cs.RO cs.SE

    Differentiable Forward Kinematics for TensorFlow 2

    Authors: Lukas Mölschl, Jakob J. Hollenstein, Justus Piater

    Abstract: Robotic systems are often complex and depend on the integration of a large number of software components. One important component in robotic systems provides the calculation of forward kinematics, which is required by both motion-planning and perception related components. End-to-end learning systems based on deep learning require passing gradients across component boundaries. Typical software impl…

    Submitted 10 March, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

  5. arXiv:2206.03787

    cs.LG cs.AI

    Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

    Authors: Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater

    Abstract: Many Deep Reinforcement Learning (D-RL) algorithms rely on simple forms of exploration such as the additive action noise often used in continuous control domains. Typically, the scaling factor of this action noise is chosen as a hyper-parameter and is kept constant during training. In this paper, we focus on action noise in off-policy deep reinforcement learning for continuous control. We analyze…

    Submitted 5 June, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Published in Transactions on Machine Learning Research (11/2022) https://openreview.net/forum?id=NljBlZ6hmG

  6. arXiv:2202.06843

    cs.RO cs.LG

    Continual Learning from Demonstration of Robotics Skills

    Authors: Sayantan Auddy, Jakob Hollenstein, Matteo Saveriano, Antonio Rodríguez-Sánchez, Justus Piater

    Abstract: Methods for teaching motion skills to robots focus on training for a single skill at a time. Robots capable of learning from demonstration can considerably benefit from the added ability to learn new movement skills without forgetting what was learned in the past. To this end, we propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equa…

    Submitted 12 April, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: To appear in Robotics and Autonomous Systems

  7. arXiv:2010.15533

    cs.LG

    How do Offline Measures for Exploration in Reinforcement Learning behave?

    Authors: Jakob J. Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater

    Abstract: Sufficient exploration is paramount for the success of a reinforcement learning agent. Yet, exploration is rarely assessed in an algorithm-independent way. We compare the behavior of three data-based, offline exploration metrics described in the literature on intuitive simple distributions and highlight problems to be aware of when using them. We propose a fourth metric, uniform relative entropy, a…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: KBRL Workshop at IJCAI-PRICAI 2020, Yokohama, Japan

  8. arXiv:2010.12974

    cs.LG cs.RO

    Improving the Exploration of Deep Reinforcement Learning in Continuous Domains using Planning for Policy Search

    Authors: Jakob J. Hollenstein, Erwan Renaudo, Matteo Saveriano, Justus Piater

    Abstract: Local policy search is performed by most Deep Reinforcement Learning (D-RL) methods, which increases the risk of getting trapped in a local minimum. Furthermore, the availability of a simulation model is not fully exploited in D-RL even in simulation-based training, which potentially decreases efficiency. To better exploit simulation models in policy search, we propose to integrate a kinodynamic p…

    Submitted 24 October, 2020; originally announced October 2020.