Showing 1–50 of 60 results for author: Boedecker, J

Searching in archive cs.
  1. arXiv:2506.07639  [pdf, ps, other]

    cs.RO

    Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse

    Authors: Zhekai Duan, Yuan Zhang, Shikai Geng, Gaowen Liu, Joschka Boedecker, Chris Xiaoxuan Lu

    Abstract: Embodied Chain-of-Thought (ECoT) reasoning enhances vision-language-action (VLA) models by improving performance and interpretability through intermediate reasoning steps. However, its sequential autoregressive token generation introduces significant inference latency, limiting real-time deployment. We propose Fast ECoT, an inference-time acceleration method that exploits the structured and repeti…

    Submitted 9 June, 2025; originally announced June 2025.

  2. arXiv:2505.03296  [pdf, other]

    cs.RO cs.AI cs.LG

    The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning

    Authors: Jan Ole von Hartz, Adrian Röfer, Joschka Boedecker, Abhinav Valada

    Abstract: We present Mixture of Discrete-time Gaussian Processes (MiDiGap), a novel approach for flexible policy representation and imitation learning in robot manipulation. MiDiGap enables learning from as few as five demonstrations using only camera observations and generalizes across a wide range of challenging tasks. It excels at long-horizon behaviors such as making coffee, highly constrained motions s…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: Submitted for publication to IEEE Transactions on Robotics

  3. arXiv:2504.01431  [pdf, ps, other]

    math.OC cs.CE cs.LG

    Multi-convex Programming for Discrete Latent Factor Models Prototyping

    Authors: Hao Zhu, Shengchao Yan, Jasper Hoffmann, Joschka Boedecker

    Abstract: Discrete latent factor models (DLFMs) are widely used in various domains such as machine learning, economics, neuroscience, and psychology. Currently, fitting a DLFM to a dataset relies on a customized solver for each individual model, which requires considerable effort to implement and is limited to the targeted specific instance of DLFMs. In this paper, we propose a generic framework based on CVXPY,…

    Submitted 26 June, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

    MSC Class: 90C25 (Primary); 90C59; 90C90

  4. arXiv:2502.02133  [pdf, other]

    eess.SY cs.AI cs.LG

    Synthesis of Model Predictive Control and Reinforcement Learning: Survey and Classification

    Authors: Rudolf Reiter, Jasper Hoffmann, Dirk Reinhardt, Florian Messerer, Katrin Baumgärtner, Shamburaj Sawant, Joschka Boedecker, Moritz Diehl, Sebastien Gros

    Abstract: The fields of MPC and RL consider two successful control techniques for Markov decision processes. Both approaches are derived from similar fundamental principles, and both are widely used in practical applications, including robotics, process control, energy systems, and autonomous driving. Despite their similarities, MPC and RL follow distinct paradigms that emerged from diverse communities and…

    Submitted 4 February, 2025; originally announced February 2025.

  5. arXiv:2501.18945  [pdf, ps, other]

    cs.CE cs.LG math.OC q-bio.NC

    Solving Inverse Problem for Multi-armed Bandits via Convex Optimization

    Authors: Hao Zhu, Joschka Boedecker

    Abstract: We consider the inverse problem of multi-armed bandits (IMAB) that are widely used in neuroscience and psychology research for behavior modelling. We first show that the IMAB problem is not convex in general, but can be relaxed to a convex problem via variable transformation. Based on this result, we propose a two-step sequential heuristic for (approximately) solving the IMAB problem. We discuss a…

    Submitted 26 June, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

  6. arXiv:2501.15957  [pdf, ps, other]

    cs.LG cs.CE math.OC q-bio.NC

    Inverse Reinforcement Learning via Convex Optimization

    Authors: Hao Zhu, Yuan Zhang, Joschka Boedecker

    Abstract: We consider the inverse reinforcement learning (IRL) problem, where an unknown reward function of some Markov decision process is estimated based on observed expert demonstrations. In most existing approaches, IRL is formulated and solved as a nonconvex optimization problem, posing challenges in scenarios where robustness and reproducibility are critical. We discuss a convex formulation of the IRL…

    Submitted 26 June, 2025; v1 submitted 27 January, 2025; originally announced January 2025.

  7. arXiv:2501.02330  [pdf, ps, other]

    cs.LG cs.AI

    SR-Reward: Taking The Path More Traveled

    Authors: Seyed Mahdi B. Azad, Zahra Padar, Gabriel Kalweit, Joschka Boedecker

    Abstract: In this paper, we propose a novel method for learning reward functions directly from offline demonstrations. Unlike traditional inverse reinforcement learning (IRL), our approach decouples the reward function from the learner's policy, eliminating the adversarial interaction typically required between the two. This results in a more stable and efficient training process. Our reward function, calle…

    Submitted 12 June, 2025; v1 submitted 4 January, 2025; originally announced January 2025.

  8. arXiv:2411.10175  [pdf, other]

    cs.LG cs.AI cs.CV

    The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning

    Authors: Moritz Schneider, Robert Krug, Narunas Vaskevicius, Luigi Palmieri, Joschka Boedecker

    Abstract: Visual Reinforcement Learning (RL) methods often require extensive amounts of data. As opposed to model-free RL, model-based RL (MBRL) offers a potential solution with efficient data utilization through planning. Additionally, RL lacks generalization capabilities for real-world tasks. Prior work has shown that incorporating pre-trained visual representations (PVRs) enhances sample efficiency and g…

    Submitted 15 January, 2025; v1 submitted 15 November, 2024; originally announced November 2024.

    Comments: Published at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024). Project page: https://schneimo.com/pvr4mbrl/

  9. arXiv:2409.19716  [pdf, other]

    cs.LG cs.AI eess.SY

    Constrained Reinforcement Learning for Safe Heat Pump Control

    Authors: Baohe Zhang, Lilli Frison, Thomas Brox, Joschka Bödecker

    Abstract: Constrained Reinforcement Learning (RL) has emerged as a significant research area within RL, where integrating constraints with rewards is crucial for enhancing safety and performance across diverse control tasks. In the context of heating systems in buildings, optimizing energy efficiency while maintaining the residents' thermal comfort can be intuitively formulated as a constrained opti…

    Submitted 29 September, 2024; originally announced September 2024.

  10. arXiv:2409.16298  [pdf, other]

    q-bio.BM cs.LG

    BetterBodies: Reinforcement Learning guided Diffusion for Antibody Sequence Design

    Authors: Yannick Vogt, Mehdi Naouar, Maria Kalweit, Christoph Cornelius Miething, Justus Duyster, Joschka Boedecker, Gabriel Kalweit

    Abstract: Antibodies offer great potential for the treatment of various diseases. However, the discovery of therapeutic antibodies through traditional wet lab methods is expensive and time-consuming. The use of generative models in designing antibodies therefore holds great promise, as it can reduce the time and resources required. Recently, the class of diffusion models has gained considerable traction for…

    Submitted 9 September, 2024; originally announced September 2024.

  11. arXiv:2409.01245  [pdf, other]

    cs.LG cs.AI cs.RO

    Revisiting Safe Exploration in Safe Reinforcement learning

    Authors: David Eckel, Baohe Zhang, Joschka Bödecker

    Abstract: Safe reinforcement learning (SafeRL) extends standard reinforcement learning with the idea of safety, where safety is typically defined through the constraint of the expected cost return of a trajectory being below a set limit. However, this metric fails to distinguish how costs accrue, treating infrequent severe cost events as equal to frequent mild ones, which can lead to riskier behaviors and r…

    Submitted 2 September, 2024; originally announced September 2024.

  12. arXiv:2407.16463  [pdf, other]

    physics.ao-ph cs.LG

    Advances in Land Surface Model-based Forecasting: A comparative study of LSTM, Gradient Boosting, and Feedforward Neural Network Models as prognostic state emulators

    Authors: Marieke Wesselkamp, Matthew Chantry, Ewan Pinnington, Margarita Choulga, Souhail Boussetta, Maria Kalweit, Joschka Boedecker, Carsten F. Dormann, Florian Pappenberger, Gianpaolo Balsamo

    Abstract: The weather prediction most useful to the public is that near the surface. The processes that are most relevant for near-surface weather prediction are also those that are most interactive and exhibit positive feedback or have a key role in energy partitioning. Land surface models (LSMs) consider these processes together with surface heterogeneity and forecast water, carbon and energy fluxes, and coupled wi…

    Submitted 23 July, 2024; originally announced July 2024.

  13. arXiv:2407.13432  [pdf, other]

    cs.RO cs.LG

    The Art of Imitation: Learning Long-Horizon Manipulation Tasks from Few Demonstrations

    Authors: Jan Ole von Hartz, Tim Welschehold, Abhinav Valada, Joschka Boedecker

    Abstract: Task Parametrized Gaussian Mixture Models (TP-GMM) are a sample-efficient method for learning object-centric robot manipulation tasks. However, there are several open challenges to applying TP-GMMs in the wild. In this work, we tackle three crucial challenges synergistically. First, end-effector velocities are non-Euclidean and thus hard to model using standard GMMs. We thus propose to factorize t…

    Submitted 23 October, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  14. arXiv:2407.11107  [pdf, other]

    cs.RO cs.LG

    Latent Linear Quadratic Regulator for Robotic Control Tasks

    Authors: Yuan Zhang, Shaohui Yang, Toshiyuki Ohtsuka, Colin Jones, Joschka Boedecker

    Abstract: Model predictive control (MPC) plays an increasingly crucial role in various robotic control tasks, but its high computational requirements are concerning, especially for nonlinear dynamical models. This paper presents a latent linear quadratic regulator (LaLQR) that maps the state space into a latent space, on which the dynamical model is linear and the cos…

    Submitted 11 February, 2025; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted at RSS 2024 workshop on Koopman Operators in Robotics

  15. arXiv:2405.02598  [pdf, other]

    cs.LG

    UDUC: An Uncertainty-driven Approach for Learning-based Robust Control

    Authors: Yuan Zhang, Jasper Hoffmann, Joschka Boedecker

    Abstract: Learning-based techniques have become popular in both model predictive control (MPC) and reinforcement learning (RL). Probabilistic ensemble (PE) models offer a promising approach for modelling system dynamics, showcasing the ability to capture uncertainty and scalability in high-dimensional control scenarios. However, PE models are susceptible to mode collapse, resulting in non-robust control whe…

    Submitted 4 May, 2024; originally announced May 2024.

  16. arXiv:2404.18863  [pdf, other]

    cs.RO math.OC

    PlanNetX: Learning an Efficient Neural Network Planner from MPC for Longitudinal Control

    Authors: Jasper Hoffmann, Diego Fernandez, Julien Brosseit, Julian Bernhard, Klemens Esterle, Moritz Werling, Michael Karg, Joschka Boedecker

    Abstract: Model predictive control (MPC) is a powerful, optimization-based approach for controlling dynamical systems. However, the computational complexity of online optimization can be problematic on embedded devices, especially when fixed control frequencies must be guaranteed. Thus, previous work proposed to reduce the computational burden using imitation learning (IL) approximating the MPC policy by…

    Submitted 22 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 6th Annual Learning for Dynamics & Control Conference (L4DC 2024)

  17. arXiv:2404.06124  [pdf, other]

    cs.CV cs.AI cs.RO

    Hierarchical Insights: Exploiting Structural Similarities for Reliable 3D Semantic Segmentation

    Authors: Mariella Dreissig, Simon Ruehle, Florian Piewak, Joschka Boedecker

    Abstract: Safety-critical applications such as autonomous driving require robust 3D environment perception algorithms capable of handling diverse and ambiguous surroundings. The predictive performance of classification models is heavily influenced by the dataset and the prior knowledge provided by the annotated labels. While labels guide the learning process, they often fail to capture the inherent relation…

    Submitted 31 July, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  18. arXiv:2403.14508  [pdf, other]

    cs.LG cs.AI eess.SY

    Constrained Reinforcement Learning with Smoothed Log Barrier Function

    Authors: Baohe Zhang, Yuan Zhang, Lilli Frison, Thomas Brox, Joschka Bödecker

    Abstract: Reinforcement Learning (RL) has been widely applied to many control tasks and has substantially improved performance over conventional control methods in many domains where the reward function is well defined. However, for many real-world problems, it is often more convenient to formulate optimization problems in terms of rewards and constraints simultaneously. Optimizing such constrained…

    Submitted 21 March, 2024; originally announced March 2024.

  19. arXiv:2401.05341  [pdf, other]

    q-bio.BM cs.LG

    Stable Online and Offline Reinforcement Learning for Antibody CDRH3 Design

    Authors: Yannick Vogt, Mehdi Naouar, Maria Kalweit, Christoph Cornelius Miething, Justus Duyster, Roland Mertelsmann, Gabriel Kalweit, Joschka Boedecker

    Abstract: The field of antibody-based therapeutics has grown significantly in recent years, with targeted antibodies emerging as a potentially effective approach to personalized therapies. Such therapies could be particularly beneficial for complex, highly individual diseases such as cancer. However, progress in this field is often constrained by the extensive search space of amino acid sequences that form…

    Submitted 29 November, 2023; originally announced January 2024.

  20. arXiv:2312.00671  [pdf, other]

    cs.CV cs.LG

    CellMixer: Annotation-free Semantic Cell Segmentation of Heterogeneous Cell Populations

    Authors: Mehdi Naouar, Gabriel Kalweit, Anusha Klett, Yannick Vogt, Paula Silvestrini, Diana Laura Infante Ramirez, Roland Mertelsmann, Joschka Boedecker, Maria Kalweit

    Abstract: In recent years, several unsupervised cell segmentation methods have been presented, trying to omit the requirement of laborious pixel-level annotations for the training of a cell segmentation model. Most, if not all, of these methods handle the instance segmentation task by focusing on the detection of different cell instances, ignoring their type. While such models prove adequate for certain tasks,…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Medical Imaging Meets NeurIPS 2023

  21. arXiv:2311.13870  [pdf, other]

    cs.LG q-bio.NC

    Multi-intention Inverse Q-learning for Interpretable Behavior Representation

    Authors: Hao Zhu, Brice De La Crompe, Gabriel Kalweit, Artur Schneider, Maria Kalweit, Ilka Diester, Joschka Boedecker

    Abstract: In advancing the understanding of natural decision-making processes, inverse reinforcement learning (IRL) methods have proven instrumental in reconstructing animals' intentions underlying complex behaviors. Given the recent development of a continuous-time multi-intention IRL framework, there has been persistent inquiry into inferring discrete time-varying rewards with IRL. To address this challen…

    Submitted 10 September, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

  22. arXiv:2310.07029  [pdf, other]

    q-bio.NC cs.LG eess.SP

    Brain Age Revisited: Investigating the State vs. Trait Hypotheses of EEG-derived Brain-Age Dynamics with Deep Learning

    Authors: Lukas AW Gemein, Robin T Schirrmeister, Joschka Boedecker, Tonio Ball

    Abstract: The brain's biological age has been considered a promising candidate for a neurologically significant biomarker. However, recent results based on longitudinal magnetic resonance imaging data have raised questions about its interpretation. A central question is whether an increased biological age of the brain is indicative of brain pathology and if changes in brain age correlate with diagnosed path…

    Submitted 22 September, 2023; originally announced October 2023.

  23. arXiv:2308.02248  [pdf, other]

    cs.CV

    On the Calibration of Uncertainty Estimation in LiDAR-based Semantic Segmentation

    Authors: Mariella Dreissig, Florian Piewak, Joschka Boedecker

    Abstract: The confidence calibration of deep learning-based perception models plays a crucial role in their reliability. Especially in the context of autonomous driving, downstream tasks like prediction and planning depend on accurate confidence estimates. In point-wise multiclass classification tasks like semantic segmentation, the model has to deal with heavy class imbalances. Due to their underrepresentati…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: accepted at IEEE ITSC 2023

  24. arXiv:2307.09206  [pdf, other]

    cs.RO cs.LG

    Context-Conditional Navigation with a Learning-Based Terrain- and Robot-Aware Dynamics Model

    Authors: Suresh Guttikonda, Jan Achterhold, Haolong Li, Joschka Boedecker, Joerg Stueckler

    Abstract: In autonomous navigation settings, several quantities can be subject to variations. Terrain properties such as friction coefficients may vary over time depending on the location of the robot. Also, the dynamics of the robot may change due to, e.g., different payloads, changing the system's mass, or wear and tear, changing actuator gains or joint friction. An autonomous agent should thus be able to…

    Submitted 5 October, 2024; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: ©2023 IEEE. Accepted for publication in European Conference on Mobile Robots (ECMR), 2023. Version including corrections (see p. 8)

  25. arXiv:2306.16316  [pdf, other]

    cs.RO

    Learning Continuous Control with Geometric Regularity from Robot Intrinsic Symmetry

    Authors: Shengchao Yan, Baohe Zhang, Yuan Zhang, Joschka Boedecker, Wolfram Burgard

    Abstract: Geometric regularity, which leverages data symmetry, has been successfully incorporated into deep learning architectures such as CNNs, RNNs, GNNs, and Transformers. While this concept has been widely applied in robotics to address the curse of dimensionality when learning from high-dimensional data, the inherent reflectional and rotational symmetry of robot structures has not been adequately explo…

    Submitted 18 March, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: accepted by ICRA 2024

  26. arXiv:2305.04718  [pdf, other]

    cs.RO cs.AI cs.CV

    The Treachery of Images: Bayesian Scene Keypoints for Deep Policy Learning in Robotic Manipulation

    Authors: Jan Ole von Hartz, Eugenio Chisari, Tim Welschehold, Wolfram Burgard, Joschka Boedecker, Abhinav Valada

    Abstract: In policy learning for robotic manipulation, sample efficiency is of paramount importance. Thus, learning and extracting more compact representations from camera observations is a promising avenue. However, current methods often assume full observability of the scene and struggle with scale invariance. In many tasks and settings, this assumption does not hold as objects in the scene are often occl…

    Submitted 20 September, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 6931-6938, Nov. 2023

  27. arXiv:2304.06312  [pdf, other]

    cs.RO cs.CV

    Survey on LiDAR Perception in Adverse Weather Conditions

    Authors: Mariella Dreissig, Dominik Scheuble, Florian Piewak, Joschka Boedecker

    Abstract: Autonomous vehicles rely on a variety of sensors to gather information about their surroundings. The vehicle's behavior is planned based on the environment perception, making its reliability crucial for safety reasons. The active LiDAR sensor is able to create an accurate 3D representation of a scene, making it a valuable addition for environment perception for autonomous vehicles. Due to light sca…

    Submitted 6 June, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: published at IEEE IV 2023

  28. arXiv:2304.01782  [pdf, other]

    cs.LG math.OC

    Imitation Learning from Nonlinear MPC via the Exact Q-Loss and its Gauss-Newton Approximation

    Authors: Andrea Ghezzi, Jasper Hoffman, Jonathan Frey, Joschka Boedecker, Moritz Diehl

    Abstract: This work presents a novel loss function for learning nonlinear Model Predictive Control policies via Imitation Learning. Standard approaches to Imitation Learning neglect information about the expert and generally adopt a loss function based on the distance between expert and learned controls. In this work, we present a loss based on the Q-function directly embedding the performance objectives an…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: Submitted to Conference on Decision and Control (CDC) 2023. The paper contains 6 pages

  29. arXiv:2303.16533  [pdf, other]

    cs.CV

    Robust Tumor Detection from Coarse Annotations via Multi-Magnification Ensembles

    Authors: Mehdi Naouar, Gabriel Kalweit, Ignacio Mastroleo, Philipp Poxleitner, Marc Metzger, Joschka Boedecker, Maria Kalweit

    Abstract: Cancer detection and classification from gigapixel whole slide images of stained tissue specimens has recently experienced enormous progress in computational histopathology. The limitation of available pixel-wise annotated scans shifted the focus from tumor localization to global slide-level classification on the basis of (weakly-supervised) multiple-instance learning despite the clinical importan…

    Submitted 29 March, 2023; originally announced March 2023.

  30. arXiv:2301.13313  [pdf, other]

    cs.LG cs.AI cs.RO

    Incorporating Recurrent Reinforcement Learning into Model Predictive Control for Adaptive Control in Autonomous Driving

    Authors: Yuan Zhang, Joschka Boedecker, Chuxuan Li, Guyue Zhou

    Abstract: Model Predictive Control (MPC) is attracting tremendous attention in the autonomous driving task as a powerful control technique. The success of an MPC controller strongly depends on an accurate internal dynamics model. However, the static parameters, usually learned by system identification, often fail to adapt to both internal and external perturbations in real-world scenarios. In this paper, we…

    Submitted 27 April, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  31. arXiv:2212.02941  [pdf, other]

    cs.RO cs.LG math.OC

    Safe Imitation Learning of Nonlinear Model Predictive Control for Flexible Robots

    Authors: Shamil Mamedov, Rudolf Reiter, Seyed Mahdi Basiri Azad, Ruan Viljoen, Joschka Boedecker, Moritz Diehl, Jan Swevers

    Abstract: Flexible robots may overcome some of the industry's major challenges, such as enabling intrinsically safe human-robot collaboration and achieving a higher payload-to-mass ratio. However, controlling flexible robots is complicated due to their complex dynamics, which include oscillatory behavior and a high-dimensional state space. Nonlinear model predictive control (NMPC) offers an effective means…

    Submitted 14 August, 2024; v1 submitted 6 December, 2022; originally announced December 2022.

    Comments: Accepted to IROS 2024

  32. arXiv:2212.01607  [pdf, other]

    cs.RO eess.SY

    A Hierarchical Approach for Strategic Motion Planning in Autonomous Racing

    Authors: Rudolf Reiter, Jasper Hoffmann, Joschka Boedecker, Moritz Diehl

    Abstract: We present an approach for safe trajectory planning, where a strategic task related to autonomous racing is learned sample-efficiently within a simulation environment. A high-level policy, represented as a neural network, outputs a reward specification that is used within the cost function of a parametric nonlinear model predictive controller (NMPC). By including constraints and vehicle kinematics…

    Submitted 3 December, 2022; originally announced December 2022.

  33. arXiv:2210.06811  [pdf, ps, other]

    cs.CV

    On the calibration of underrepresented classes in LiDAR-based semantic segmentation

    Authors: Mariella Dreissig, Florian Piewak, Joschka Boedecker

    Abstract: The calibration of deep learning-based perception models plays a crucial role in their reliability. Our work focuses on a class-wise evaluation of several models' confidence performance for LiDAR-based semantic segmentation with the aim of providing insights into the calibration of underrepresented classes. Those classes often include VRUs and are thus of particular interest for safety reasons. Wi…

    Submitted 13 October, 2022; originally announced October 2022.

  34. arXiv:2209.08959  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Latent Plans for Task-Agnostic Offline Reinforcement Learning

    Authors: Erick Rosete-Beas, Oier Mees, Gabriel Kalweit, Joschka Boedecker, Wolfram Burgard

    Abstract: Everyday long-horizon tasks comprising a sequence of multiple implicit subtasks still pose a major challenge in offline robot control. While a number of prior methods aimed to address this setting with variants of imitation and offline reinforcement learning, the learned behavior is typically narrow and often struggles to reach configurable long-horizon goals. As both paradigms have compl…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: CoRL 2022. Project website: http://tacorl.cs.uni-freiburg.de/

  35. arXiv:2207.02016  [pdf, other]

    cs.LG cs.AI eess.SY

    Robust Reinforcement Learning in Continuous Control Tasks with Uncertainty Set Regularization

    Authors: Yuan Zhang, Jianhong Wang, Joschka Boedecker

    Abstract: Reinforcement learning (RL) is recognized as lacking generalization and robustness under environmental perturbations, which excessively restricts its application for real-world robotics. Prior work claimed that adding regularization to the value function is equivalent to learning a robust policy with uncertain transitions. Although the regularization-robustness transformation is appealing for its…

    Submitted 5 December, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: Accepted at CoRL 2023

  36. arXiv:2203.10949  [pdf, other]

    cs.RO cs.LG

    Optimizing Trajectories for Highway Driving with Offline Reinforcement Learning

    Authors: Branka Mirchevska, Moritz Werling, Joschka Boedecker

    Abstract: Implementing an autonomous vehicle that is able to output feasible, smooth and efficient trajectories is a long-standing challenge. Several approaches have been considered, roughly falling under two categories: rule-based and learning-based approaches. The rule-based approaches, while guaranteeing safety and feasibility, fall short when it comes to long-term planning and generalization. The learni…

    Submitted 21 March, 2022; originally announced March 2022.

  37. arXiv:2203.00352  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Affordance Learning from Play for Sample-Efficient Policy Learning

    Authors: Jessica Borja-Diaz, Oier Mees, Gabriel Kalweit, Lukas Hermann, Joschka Boedecker, Wolfram Burgard

    Abstract: Robots operating in human-centered environments should have the ability to understand how objects function: what can be done with each object, where this interaction may occur, and how the object is used to achieve a goal. To this end, we propose a novel approach that extracts a self-supervised visual affordance model from human teleoperated play data and leverages it to enable efficient policy le…

    Submitted 1 March, 2022; originally announced March 2022.

    Comments: Accepted at the 2022 IEEE International Conference on Robotics and Automation (ICRA). Videos at http://vapo.cs.uni-freiburg.de/

  38. arXiv:2111.12673  [pdf, other]

    cs.LG cs.AI cs.RO

    Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

    Authors: Nicolai Dorka, Tim Welschehold, Joschka Boedecker, Wolfram Burgard

    Abstract: Accurate value estimates are important for off-policy reinforcement learning. Algorithms based on temporal difference learning typically are prone to an over- or underestimation bias building up over time. In this paper, we propose a general method called Adaptively Calibrated Critics (ACC) that uses the most recent high variance but unbiased on-policy rollouts to alleviate the bias of the low var…

    Submitted 21 October, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Submitted to RA-L

  39. arXiv:2110.03316  [pdf, other]

    cs.RO

    Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation

    Authors: Eugenio Chisari, Tim Welschehold, Joschka Boedecker, Wolfram Burgard, Abhinav Valada

    Abstract: Learning to solve complex manipulation tasks from visual observations is a dominant challenge for real-world robot learning. Although deep reinforcement learning algorithms have recently demonstrated impressive results in this context, they still require an impractical amount of time-consuming trial-and-error iterations. In this work, we consider the promising alternative paradigm of interactive l…

    Submitted 19 January, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Accepted for publication in RA-L. Video, code and models available at http://ceiling.cs.uni-freiburg.de/

  40. arXiv:2106.04306  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Residual Feedback Learning for Contact-Rich Manipulation Tasks with Uncertainty

    Authors: Alireza Ranjbar, Ngo Anh Vien, Hanna Ziesche, Joschka Boedecker, Gerhard Neumann

    Abstract: While classic control theory offers state-of-the-art solutions in many problem scenarios, it is often desired to improve beyond the structure of such solutions and surpass their limitations. To this end, residual policy learning (RPL) offers a formulation to improve existing controllers with reinforcement learning (RL) by learning an additive "residual" to the output of a given controller. However…

    Submitted 6 August, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

  41. arXiv:2012.03234  [pdf, other]

    cs.LG cs.RO

    Amortized Q-learning with Model-based Action Proposals for Autonomous Driving on Highways

    Authors: Branka Mirchevska, Maria Hügle, Gabriel Kalweit, Moritz Werling, Joschka Boedecker

    Abstract: Well-established optimization-based methods can guarantee an optimal trajectory for a short optimization horizon, typically no longer than a few seconds. As a result, choosing the optimal trajectory for this short horizon may still result in a sub-optimal long-term solution. At the same time, the resulting short-term trajectories allow for effective, comfortable and provably safe maneuvers in a dy…

    Submitted 6 December, 2020; originally announced December 2020.

  42. arXiv:2010.11278  [pdf, other]

    cs.LG cs.RO

    Deep Surrogate Q-Learning for Autonomous Driving

    Authors: Maria Kalweit, Gabriel Kalweit, Moritz Werling, Joschka Boedecker

    Abstract: Challenging problems of deep reinforcement learning systems with regard to the application on real systems are their adaptivity to changing environments and their efficiency w.r.t. computational resources and data. In the application of learning lane-change behavior for autonomous driving, agents have to deal with a varying number of surrounding vehicles. Furthermore, the number of required transi…

    Submitted 17 February, 2022; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted at ICRA 2022

  43. A Dynamic Deep Neural Network For Multimodal Clinical Data Analysis

    Authors: Maria Hügle, Gabriel Kalweit, Thomas Huegle, Joschka Boedecker

    Abstract: Clinical data from electronic medical records, registries or trials provide a large source of information to apply machine learning methods in order to foster precision medicine, e.g. by finding new disease phenotypes or performing individual disease prediction. However, to take full advantage of deep learning methods on clinical data, architectures are necessary that 1) are robust with respect to…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: Accepted at the AAAI 2020 International Workshop on Health Intelligence

  44. arXiv:2008.01712  [pdf, other]

    cs.LG cs.RO stat.ML

    Deep Inverse Q-learning with Constraints

    Authors: Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker

    Abstract: Popular Maximum Entropy Inverse Reinforcement Learning approaches require the computation of expected state visitation frequencies for the optimal policy under an estimate of the reward function. This usually requires intermediate value estimation in the inner loop of the algorithm, slowing down convergence considerably. In this work, we introduce a novel class of algorithms that only needs to sol…

    Submitted 4 August, 2020; originally announced August 2020.

  45. arXiv:2003.09398  [pdf, other]

    cs.LG cs.RO stat.ML

    Deep Constrained Q-learning

    Authors: Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker

    Abstract: In many real world applications, reinforcement learning agents have to optimize multiple objectives while following certain rules or satisfying a list of constraints. Classical methods based on reward shaping, i.e. a weighted combination of different objectives in the reward signal, or Lagrangian methods, including constraints in the loss function, have no guarantees that the agent satisfies the c…

    Submitted 14 September, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

  46. arXiv:2002.05115  [pdf, other]

    eess.IV cs.LG eess.SP stat.ML

    Machine-Learning-Based Diagnostics of EEG Pathology

    Authors: Lukas Alexander Wilhelm Gemein, Robin Tibor Schirrmeister, Patryk Chrabąszcz, Daniel Wilson, Joschka Boedecker, Andreas Schulze-Bonhage, Frank Hutter, Tonio Ball

    Abstract: Machine learning (ML) methods have the potential to automate clinical EEG analysis. They can be categorized into feature-based (with handcrafted features), and end-to-end approaches (with learned features). Previous studies on EEG pathology decoding have typically analyzed a limited number of features, decoders, or both. For a I) more elaborate feature-based EEG analysis, and II) in-depth comparis…

    Submitted 11 February, 2020; originally announced February 2020.

    Journal ref: NeuroImage, Volume 220, 15 October 2020, 117021

  47. arXiv:1909.13582  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Dynamic Interaction-Aware Scene Understanding for Reinforcement Learning in Autonomous Driving

    Authors: Maria Huegle, Gabriel Kalweit, Moritz Werling, Joschka Boedecker

    Abstract: The common pipeline in autonomous driving systems is highly modular and includes a perception component which extracts lists of surrounding objects and passes these lists to a high-level decision component. In this case, leveraging the benefits of deep reinforcement learning for high-level decision making requires special architectures to deal with multiple variable-length sequences of different o…

    Submitted 30 September, 2019; originally announced September 2019.

  48. arXiv:1909.13518  [pdf, other]

    cs.LG cs.AI stat.ML

    Composite Q-learning: Multi-scale Q-function Decomposition and Separable Optimization

    Authors: Gabriel Kalweit, Maria Huegle, Joschka Boedecker

    Abstract: In the past few years, off-policy reinforcement learning methods have shown promising results in their application for robot control. Deep Q-learning, however, still suffers from poor data-efficiency and is susceptible to stochasticity in the environment or reward functions, which is limiting with regard to real-world applications. We alleviate these problems by proposing two novel off-policy Tempo…

    Submitted 14 August, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

  49. Dynamic Input for Deep Reinforcement Learning in Autonomous Driving

    Authors: Maria Hügle, Gabriel Kalweit, Branka Mirchevska, Moritz Werling, Joschka Boedecker

    Abstract: In many real-world decision making problems, reaching an optimal decision requires taking into account a variable number of objects around the agent. Autonomous driving is a domain in which this is especially relevant, since the number of cars surrounding the agent varies considerably over time and affects the optimal action to be taken. Classical methods that process object lists can deal with th…

    Submitted 25 July, 2019; originally announced July 2019.

    Comments: Accepted at IROS 2019

  50. arXiv:1906.12189  [pdf, other]

    eess.SY cs.AI cs.LG

    Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning

    Authors: Torsten Koller, Felix Berkenkamp, Matteo Turchetta, Joschka Boedecker, Andreas Krause

    Abstract: Reinforcement learning has been successfully used to solve difficult tasks in complex unknown environments. However, these methods typically do not provide any safety guarantees during the learning process. This is particularly problematic, since reinforcement learning agents actively explore their environment. This prevents their use in safety-critical, real-world applications. In this paper, we p…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: 14 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1803.08287