-
Multi-Objective Reinforcement Learning for Adaptive Personalized Autonomous Driving
Authors:
Hendrik Surmann,
Jorge de Heuvel,
Maren Bennewitz
Abstract:
Human drivers exhibit individual preferences regarding driving style. Adapting autonomous vehicles to these preferences is essential for user trust and satisfaction. However, existing end-to-end driving approaches often rely on predefined driving styles or require continuous user feedback for adaptation, limiting their ability to support dynamic, context-dependent preferences. We propose a novel approach using multi-objective reinforcement learning (MORL) with preference-driven optimization for end-to-end autonomous driving that enables runtime adaptation to driving style preferences. Preferences are encoded as continuous weight vectors to modulate behavior along interpretable style objectives – including efficiency, comfort, speed, and aggressiveness – without requiring policy retraining. Our single-policy agent integrates vision-based perception in complex mixed-traffic scenarios and is evaluated in diverse urban environments using the CARLA simulator. Experimental results demonstrate that the agent dynamically adapts its driving behavior according to changing preferences while maintaining performance in terms of collision avoidance and route completion.
Submitted 8 May, 2025;
originally announced May 2025.
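To illustrate how a continuous preference weight vector can modulate several style objectives, the sketch below combines per-objective rewards into a single scalar. It assumes plain linear scalarization with normalized, non-negative weights; the abstract does not state how the weights enter the optimization, and the function name `scalarize` is hypothetical.

```python
from typing import Sequence


def scalarize(rewards: Sequence[float], preference: Sequence[float]) -> float:
    """Weighted sum of per-objective rewards under a normalized preference vector.

    Assumption: linear scalarization. This is an illustration only and is not
    taken from the paper.
    """
    if len(rewards) != len(preference):
        raise ValueError("rewards and preference must have the same length")
    if any(w < 0 for w in preference):
        raise ValueError("preference weights must be non-negative")
    total = sum(preference)
    if total <= 0:
        raise ValueError("at least one preference weight must be positive")
    # Normalize so that only the ratio between the weights matters.
    return sum(r * w / total for r, w in zip(rewards, preference))


if __name__ == "__main__":
    # Hypothetical per-step rewards for two objectives, e.g. comfort and speed.
    step_rewards = [2.0, 4.0]
    print(scalarize(step_rewards, [1.0, 3.0]))  # preference leaning toward the second objective
    print(scalarize(step_rewards, [3.0, 1.0]))  # preference leaning toward the first objective
```

Changing the weight vector at runtime changes the scalar reward signal without touching the per-objective rewards, which is the property a preference-conditioned single policy relies on.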
-
Auditory Localization and Assessment of Consequential Robot Sounds: A Multi-Method Study in Virtual Reality
Authors:
Marlene Wessels,
Jorge de Heuvel,
Leon Müller,
Anna Luisa Maier,
Maren Bennewitz,
Johannes Kraus
Abstract:
Mobile robots increasingly operate alongside humans but are often out of sight, so that humans need to rely on the sounds of the robots to recognize their presence. For successful human-robot interaction (HRI), it is therefore crucial to understand how humans perceive robots by their consequential sounds, i.e., operating noise. Prior research suggests that the sound of a quadruped Go1 is more detectable than that of a wheeled Turtlebot. This study builds on this and examines the human ability to localize consequential sounds of three robots (quadruped Go1, wheeled Turtlebot 2i, wheeled HSR) in Virtual Reality. In a within-subjects design, we assessed participants' localization performance for the robots with and without an acoustic vehicle alerting system (AVAS) for two velocities (0.3, 0.8 m/s) and two trajectories (head-on, radial). In each trial, participants were presented with the sound of a moving robot for 3 s and were tasked to point at its final position (localization task). Localization errors were measured as the absolute angular difference between the participants' estimated and the actual robot position. Results showed that the robot type significantly influenced the localization accuracy and precision, with the sound of the wheeled HSR (especially without AVAS) performing worst under all experimental conditions. Surprisingly, participants rated the HSR sound as more positive, less annoying, and more trustworthy than the Turtlebot and Go1 sound. This reveals a tension between subjective evaluation and objective auditory localization performance. Our findings highlight consequential robot sounds as a critical factor for designing intuitive and effective HRI, with implications for human-centered robot design and social navigation.
Submitted 1 April, 2025;
originally announced April 2025.
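The localization error described above, the absolute angular difference between the estimated and the actual robot position, can be sketched as follows. This assumes both positions are 2D points in the horizontal plane with the listener at the origin; the function name `angular_error_deg` and the coordinate convention are illustrative assumptions, not details from the study.

```python
import math
from typing import Tuple

Point2D = Tuple[float, float]


def angular_error_deg(estimated_xy: Point2D, actual_xy: Point2D) -> float:
    """Absolute angle in degrees between two directions seen from the origin."""
    est = math.atan2(estimated_xy[1], estimated_xy[0])
    act = math.atan2(actual_xy[1], actual_xy[0])
    diff = math.degrees(est - act)
    # Wrap the signed difference into [-180, 180) so that directions on either
    # side of the +/-180 degree boundary are not reported as almost 360 apart.
    diff = (diff + 180.0) % 360.0 - 180.0
    return abs(diff)


if __name__ == "__main__":
    # Participant points straight along +x, robot is actually along +y.
    print(angular_error_deg((1.0, 0.0), (0.0, 1.0)))
```

The wrap-around step matters for sources behind the listener, where a raw subtraction of angles would otherwise overstate the error.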
-
Immersive Explainability: Visualizing Robot Navigation Decisions through XAI Semantic Scene Projections in Virtual Reality
Authors:
Jorge de Heuvel,
Sebastian Müller,
Marlene Wessels,
Aftab Akhtar,
Christian Bauckhage,
Maren Bennewitz
Abstract:
End-to-end robot policies achieve high performance through neural networks trained via reinforcement learning (RL). Yet, their black box nature and abstract reasoning pose challenges for human-robot interaction (HRI), because humans may experience difficulty in understanding and predicting the robot's navigation decisions, hindering trust development. We present a virtual reality (VR) interface that visualizes explainable AI (XAI) outputs and the robot's lidar perception to support intuitive interpretation of RL-based navigation behavior. By visually highlighting objects based on their attribution scores, the interface grounds abstract policy explanations in the scene context. This XAI visualization bridges the gap between obscure numerical XAI attribution scores and a human-centric semantic level of explanation. A within-subjects study with 24 participants evaluated the effectiveness of our interface for four visualization conditions combining XAI and lidar. Participants ranked scene objects across navigation scenarios based on their importance to the robot, followed by a questionnaire assessing subjective understanding and predictability. Results show that semantic projection of attributions significantly enhances non-expert users' objective understanding and subjective awareness of robot behavior. In addition, lidar visualization further improves perceived predictability, underscoring the value of integrating XAI and sensor visualization for transparent, trustworthy HRI.
Submitted 1 April, 2025;
originally announced April 2025.
-
The Impact of VR and 2D Interfaces on Human Feedback in Preference-Based Robot Learning
Authors:
Jorge de Heuvel,
Daniel Marta,
Simon Holk,
Iolanda Leite,
Maren Bennewitz
Abstract:
Aligning robot navigation with human preferences is essential for ensuring comfortable and predictable robot movement in shared spaces, facilitating seamless human-robot coexistence. While preference-based learning methods, such as reinforcement learning from human feedback (RLHF), enable this alignment, the choice of the preference collection interface may influence the process. Traditional 2D interfaces provide structured views but lack spatial depth, whereas immersive VR offers richer perception, potentially affecting preference articulation. This study systematically examines how the interface modality impacts human preference collection and navigation policy alignment. We introduce a novel dataset of 2,325 human preference queries collected through both VR and 2D interfaces, revealing significant differences in user experience, preference consistency, and policy outcomes. Our findings highlight the trade-offs between immersion, perception, and preference reliability, emphasizing the importance of interface selection in preference-based robot learning. The dataset will be publicly released to support future research.
Submitted 11 March, 2025;
originally announced March 2025.
-
Compact Multi-Object Placement Using Adjacency-Aware Reinforcement Learning
Authors:
Benedikt Kreis,
Nils Dengler,
Jorge de Heuvel,
Rohit Menon,
Hamsa Perur,
Maren Bennewitz
Abstract:
Close and precise placement of irregularly shaped objects requires a skilled robotic system. The manipulation of objects that have sensitive top surfaces and a fixed set of neighbors is particularly challenging. To avoid damaging the surface, the robot has to grasp them from the side, and during placement, it has to maintain the spatial relations with adjacent objects, while considering the physical gripper extent. In this work, we propose a framework to learn an agent based on reinforcement learning that generates end-effector motions for placing objects as closely as possible to one another. During the placement, our agent considers the spatial constraints with neighbors defined in a given layout of the objects while avoiding collisions. Our approach learns to place compact object assemblies without the need for predefined spacing between objects, as required by traditional methods. We thoroughly evaluated our approach using a two-finger gripper mounted on a robotic arm with six degrees of freedom. The results demonstrate that our agent significantly outperforms two baseline approaches in object assembly compactness, thereby reducing the space required to position the objects while adhering to specified spatial constraints.
Submitted 11 October, 2024; v1 submitted 16 April, 2024;
originally announced April 2024.
-
Sound Matters: Auditory Detectability of Mobile Robots
Authors:
Subham Agrawal,
Marlene Wessels,
Jorge de Heuvel,
Johannes Kraus,
Maren Bennewitz
Abstract:
Mobile robots are increasingly being used in noisy environments for social purposes, e.g., to provide support in healthcare or public spaces. Since these robots also operate beyond human sight, the question arises as to how different robot types, ambient noise, or cognitive engagement impact the detection of the robots by their sound. To address this research gap, we conducted a user study measuring auditory detection distances for a wheeled robot (Turtlebot 2i) and a quadruped robot (Unitree Go 1), which emit different consequential sounds when moving. Additionally, we manipulated background noise levels and participants' engagement in a secondary task during the study. Our results showed that the quadruped robot sound was detected significantly better (i.e., at a larger distance) than the wheeled one, which demonstrates that the movement mechanism has a meaningful impact on the auditory detectability. The detectability for both robots diminished significantly as background noise increased. However, even in high background noise, participants detected the quadruped robot at a significantly larger distance. The engagement in a secondary task had hardly any impact. In essence, these findings highlight the critical role of distinguishing auditory characteristics of different robots to improve the smooth human-centered navigation of mobile robots in noisy environments.
Submitted 25 June, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Demonstration-Enhanced Adaptive Multi-Objective Robot Navigation
Authors:
Jorge de Heuvel,
Tharun Sethuraman,
Maren Bennewitz
Abstract:
Preference-aligned robot navigation in human environments is typically achieved through learning-based approaches, utilizing user feedback or demonstrations for personalization. However, personal preferences are subject to change and might even be context-dependent. Yet traditional reinforcement learning (RL) approaches with static reward functions often fall short in adapting to varying user preferences, inevitably reflecting demonstrations once training is completed. This paper introduces a structured framework that combines demonstration-based learning with multi-objective reinforcement learning (MORL). To ensure real-world applicability, our approach allows for dynamic adaptation of the robot navigation policy to changing user preferences without retraining. It fluently modulates the amount of demonstration data reflection and other preference-related objectives. Through rigorous evaluations, including a baseline comparison and sim-to-real transfer on two robots, we demonstrate our framework's capability to adapt to user preferences accurately while achieving high navigational performance in terms of collision avoidance and goal pursuance.
Submitted 10 March, 2025; v1 submitted 7 April, 2024;
originally announced April 2024.
-
EnQuery: Ensemble Policies for Diverse Query-Generation in Preference Alignment of Robot Navigation
Authors:
Jorge de Heuvel,
Florian Seiler,
Maren Bennewitz
Abstract:
To align mobile robot navigation policies with user preferences through reinforcement learning from human feedback (RLHF), reliable and behavior-diverse user queries are required. However, deterministic policies fail to generate a variety of navigation trajectory suggestions for a given navigation task. In this paper, we introduce EnQuery, a query generation approach using an ensemble of policies that achieve behavioral diversity through a regularization term. For a given navigation task, EnQuery produces multiple navigation trajectory suggestions, thereby optimizing the efficiency of preference data collection with fewer queries. Our methodology demonstrates superior performance in aligning navigation policies with user preferences in low-query regimes, offering enhanced policy convergence from sparse preference queries. The evaluation is complemented with a novel explainability representation, capturing full scene navigation behavior of the mobile robot in a single plot. Our code is available online at https://github.com/hrl-bonn/EnQuery.
Submitted 11 June, 2024; v1 submitted 7 April, 2024;
originally announced April 2024.
-
RHINO-VR Experience: Teaching Mobile Robotics Concepts in an Interactive Museum Exhibit
Authors:
Erik Schlachhoff,
Nils Dengler,
Leif Van Holland,
Patrick Stotko,
Jorge de Heuvel,
Reinhard Klein,
Maren Bennewitz
Abstract:
In 1997, the very first tour guide robot RHINO was deployed in a museum in Germany. With the ability to navigate autonomously through the environment, the robot gave tours to over 2,000 visitors. Today, RHINO itself has become an exhibit and is no longer operational. In this paper, we present RHINO-VR, an interactive museum exhibit using virtual reality (VR) that allows museum visitors to experience the historical robot RHINO in operation in a virtual museum. RHINO-VR, unlike static exhibits, enables users to familiarize themselves with basic mobile robotics concepts without the fear of damaging the exhibit. In the virtual environment, the user is able to interact with RHINO in VR by pointing to a location to which the robot should navigate and observing the corresponding actions of the robot. To include other visitors who cannot use the VR, we provide an external observation view to make RHINO visible to them. We evaluated our system by measuring the frame rate of the VR simulation, comparing the generated virtual 3D models with the originals, and conducting a user study. The user study showed that RHINO-VR improved the visitors' understanding of the robot's functionality and that they would recommend experiencing the VR exhibit to others.
Submitted 10 June, 2024; v1 submitted 22 March, 2024;
originally announced March 2024.
-
Spatiotemporal Attention Enhances Lidar-Based Robot Navigation in Dynamic Environments
Authors:
Jorge de Heuvel,
Xiangyu Zeng,
Weixian Shi,
Tharun Sethuraman,
Maren Bennewitz
Abstract:
Foresighted robot navigation in dynamic indoor environments with cost-efficient hardware necessitates the use of a lightweight yet dependable controller. Therefore, inferring the scene dynamics from sensor readings without explicit object tracking is a pivotal aspect of foresighted navigation among pedestrians. In this paper, we introduce a spatiotemporal attention pipeline for enhanced navigation based on 2D lidar sensor readings. This pipeline is complemented by a novel lidar-state representation that emphasizes dynamic obstacles over static ones. Subsequently, the attention mechanism enables selective scene perception across both space and time, resulting in improved overall navigation performance within dynamic scenarios. We thoroughly evaluated the approach in different scenarios and simulators, finding excellent generalization to unseen environments. The results demonstrate outstanding performance compared to state-of-the-art methods, thereby enabling the seamless deployment of the learned controller on a real robot.
Submitted 28 February, 2024; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Subgoal-Driven Navigation in Dynamic Environments Using Attention-Based Deep Reinforcement Learning
Authors:
Jorge de Heuvel,
Weixian Shi,
Xiangyu Zeng,
Maren Bennewitz
Abstract:
Collision-free, goal-directed navigation in environments containing unknown static and dynamic obstacles is still a great challenge, especially when manual tuning of navigation policies or costly motion prediction needs to be avoided. In this paper, we therefore propose a subgoal-driven hierarchical navigation architecture that is trained with deep reinforcement learning and decouples obstacle avoidance and motor control. In particular, we separate the navigation task into the prediction of the next subgoal position for avoiding collisions while moving toward the final target position, and the prediction of the robot's velocity controls. By relying on 2D lidar, our method learns to avoid obstacles while still achieving goal-directed behavior as well as to generate low-level velocity control commands to reach the subgoals. In our architecture, we apply the attention mechanism on the robot's 2D lidar readings and compute the importance of lidar scan segments for avoiding collisions. As we show in simulated and real-world experiments with a Turtlebot robot, our proposed method leads to smooth and safe trajectories among humans and significantly outperforms a state-of-the-art approach in terms of success rate. A supplemental video describing our approach is available online.
Submitted 2 March, 2023;
originally announced March 2023.
-
Reactive Correction of Object Placement Errors for Robotic Arrangement Tasks
Authors:
Benedikt Kreis,
Rohit Menon,
Bharath Kumar Adinarayan,
Jorge de Heuvel,
Maren Bennewitz
Abstract:
When arranging objects with robotic arms, the quality of the end result strongly depends on the achievable placement accuracy. However, even the most advanced robotic systems are prone to positioning errors that can occur at different steps of the manipulation process. Ignoring such errors can lead to the partial or complete failure of the arrangement. In this paper, we present a novel approach to autonomously detect and correct misplaced objects by pushing them with a robotic arm. We thoroughly tested our approach both in simulation and on real hardware using a Robotiq two-finger gripper mounted on a UR5 robotic arm. In our evaluation, we demonstrate the successful compensation for different errors injected during the manipulation of regularly shaped objects. Consequently, we achieve a highly reliable object placement accuracy in the millimeter range.
Submitted 12 May, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
-
Learning Depth Vision-Based Personalized Robot Navigation From Dynamic Demonstrations in Virtual Reality
Authors:
Jorge de Heuvel,
Nathan Corral,
Benedikt Kreis,
Jacobus Conradi,
Anne Driemel,
Maren Bennewitz
Abstract:
For the best human-robot interaction experience, the robot's navigation policy should take into account personal preferences of the user. In this paper, we present a learning framework complemented by a perception pipeline to train a depth vision-based, personalized navigation controller from user demonstrations. Our virtual reality interface enables the demonstration of robot navigation trajectories under motion of the user for dynamic interaction scenarios. The novel perception pipeline employs a variational autoencoder in combination with a motion predictor. It compresses the perceived depth images to a latent state representation to enable efficient reasoning of the learning agent about the robot's dynamic environment. In a detailed analysis and ablation study, we evaluate different configurations of the perception pipeline. To further quantify the navigation controller's quality of personalization, we develop and apply a novel metric to measure preference reflection based on the Fréchet Distance. We discuss the robot's navigation performance in various virtual scenes and demonstrate the first personalized robot navigation controller that solely relies on depth images. A supplemental video highlighting our approach is available online.
Submitted 31 July, 2023; v1 submitted 4 October, 2022;
originally announced October 2022.
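The preference-reflection metric mentioned above is based on the Fréchet distance, but the abstract gives no further detail. As background only, the sketch below computes the standard discrete Fréchet distance between two 2D trajectories by dynamic programming; it is not the paper's metric, and the name `discrete_frechet` and the example trajectories are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def discrete_frechet(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Discrete Fréchet distance between two polylines given as point lists."""
    if not p or not q:
        raise ValueError("both trajectories must contain at least one point")
    n, m = len(p), len(q)
    # ca[i][j] is the coupling distance between the prefixes p[:i+1] and q[:j+1].
    ca: List[List[float]] = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both; keep the cheapest option.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


if __name__ == "__main__":
    demonstrated = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    executed = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(discrete_frechet(demonstrated, executed))
```

Unlike a pointwise average distance, this measure respects the ordering of points along each trajectory, which is why it is a common choice for comparing a demonstrated path with an executed one.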
-
Handling Sparse Rewards in Reinforcement Learning Using Model Predictive Control
Authors:
Murad Dawood,
Nils Dengler,
Jorge de Heuvel,
Maren Bennewitz
Abstract:
Reinforcement learning (RL) has recently achieved great success in various domains. Yet, the design of the reward function requires detailed domain expertise and tedious fine-tuning to ensure that agents are able to learn the desired behaviour. Using a sparse reward conveniently mitigates these challenges. However, the sparse reward represents a challenge on its own, often resulting in unsuccessful training of the agent. In this paper, we therefore address the sparse reward problem in RL. Our goal is to find an effective alternative to reward shaping, without using costly human demonstrations, that would also be applicable to a wide range of domains. Hence, we propose to use model predictive control (MPC) as an experience source for training RL agents in sparse reward environments. Without the need for reward shaping, we successfully apply our approach in the field of mobile robot navigation both in simulation and real-world experiments with a Kobuki Turtlebot 2. We furthermore demonstrate great improvement over pure RL algorithms in terms of success rate as well as number of collisions and timeouts. Our experiments show that MPC as an experience source improves the agent's learning process for a given task in the case of sparse rewards.
Submitted 3 March, 2023; v1 submitted 4 October, 2022;
originally announced October 2022.
-
Learning Personalized Human-Aware Robot Navigation Using Virtual Reality Demonstrations from a User Study
Authors:
Jorge de Heuvel,
Nathan Corral,
Lilli Bruckschen,
Maren Bennewitz
Abstract:
For the most comfortable, human-aware robot navigation, subjective user preferences need to be taken into account. This paper presents a novel reinforcement learning framework to train a personalized navigation controller along with an intuitive virtual reality demonstration interface. The conducted user study provides evidence that our personalized approach significantly outperforms classical approaches with more comfortable human-robot experiences. We achieve these results using only a few demonstration trajectories from non-expert users, who predominantly appreciate the intuitive demonstration setup. As we show in the experiments, the learned controller generalizes well to states not covered in the demonstration data, while still reflecting user preferences during navigation. Finally, we transfer the navigation controller without loss in performance to a real robot.
Submitted 4 October, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.