-
HOIGaze: Gaze Estimation During Hand-Object Interactions in Extended Reality Exploiting Eye-Hand-Head Coordination
Authors:
Zhiming Hu,
Daniel Haeufle,
Syn Schmitt,
Andreas Bulling
Abstract:
We present HOIGaze - a novel learning-based approach for gaze estimation during hand-object interactions (HOI) in extended reality (XR). HOIGaze addresses the challenging HOI setting by building on one key insight: The eye, hand, and head movements are closely coordinated during HOIs and this coordination can be exploited to identify samples that are most useful for gaze estimator training - as such, effectively denoising the training data. This denoising approach is in stark contrast to previous gaze estimation methods that treated all training samples as equal. Specifically, we propose: 1) a novel hierarchical framework that first recognises the hand currently visually attended to and then estimates gaze direction based on the attended hand; 2) a new gaze estimator that uses cross-modal Transformers to fuse head and hand-object features extracted using a convolutional neural network and a spatio-temporal graph convolutional network; and 3) a novel eye-head coordination loss that upgrades training samples belonging to the coordinated eye-head movements. We evaluate HOIGaze on the HOT3D and Aria digital twin (ADT) datasets and show that it significantly outperforms state-of-the-art methods, achieving an average improvement of 15.6% on HOT3D and 6.0% on ADT in mean angular error. To demonstrate the potential of our method, we further report significant performance improvements for the sample downstream task of eye-based activity recognition on ADT. Taken together, our results underline the significant information content available in eye-hand-head coordination and, as such, open up an exciting new direction for learning-based gaze estimation.
Submitted 28 April, 2025;
originally announced April 2025.
-
HaHeAE: Learning Generalisable Joint Representations of Human Hand and Head Movements in Extended Reality
Authors:
Zhiming Hu,
Guanhua Zhang,
Zheming Yin,
Daniel Haeufle,
Syn Schmitt,
Andreas Bulling
Abstract:
Human hand and head movements are the most pervasive input modalities in extended reality (XR) and are significant for a wide range of applications. However, prior works on hand and head modelling in XR only explored a single modality or focused on specific applications. We present HaHeAE - a novel self-supervised method for learning generalisable joint representations of hand and head movements in XR. At the core of our method is an autoencoder (AE) that uses a graph convolutional network-based semantic encoder and a diffusion-based stochastic encoder to learn the joint semantic and stochastic representations of hand-head movements. It also features a diffusion-based decoder to reconstruct the original signals. Through extensive evaluations on three public XR datasets, we show that our method 1) significantly outperforms commonly used self-supervised methods by up to 74.0% in terms of reconstruction quality and is generalisable across users, activities, and XR environments, 2) enables new applications, including interpretable hand-head cluster identification and variable hand-head movement generation, and 3) can serve as an effective feature extractor for downstream tasks. Together, these results demonstrate the effectiveness of our method and underline the potential of self-supervised methods for jointly modelling hand-head behaviours in extended reality.
Submitted 16 May, 2025; v1 submitted 21 October, 2024;
originally announced October 2024.
-
HOIMotion: Forecasting Human Motion During Human-Object Interactions Using Egocentric 3D Object Bounding Boxes
Authors:
Zhiming Hu,
Zheming Yin,
Daniel Haeufle,
Syn Schmitt,
Andreas Bulling
Abstract:
We present HOIMotion - a novel approach for human motion forecasting during human-object interactions that integrates information about past body poses and egocentric 3D object bounding boxes. Human motion forecasting is important in many augmented reality applications but most existing methods have only used past body poses to predict future motion. HOIMotion first uses an encoder-residual graph convolutional network (GCN) and multi-layer perceptrons to extract features from body poses and egocentric 3D object bounding boxes, respectively. Our method then fuses pose and object features into a novel pose-object graph and uses a residual-decoder GCN to forecast future body motion. We extensively evaluate our method on the Aria digital twin (ADT) and MoGaze datasets and show that HOIMotion consistently outperforms state-of-the-art methods by a large margin of up to 8.7% on ADT and 7.2% on MoGaze in terms of mean per joint position error. Complementing these evaluations, we report a human study (N=20) that shows that the improvements achieved by our method result in forecasted poses being perceived as both more precise and more realistic than those of existing methods. Taken together, these results reveal the significant information content available in egocentric 3D object bounding boxes for human motion forecasting and the effectiveness of our method in exploiting this information.
Submitted 2 July, 2024;
originally announced July 2024.
-
GazeMotion: Gaze-guided Human Motion Forecasting
Authors:
Zhiming Hu,
Syn Schmitt,
Daniel Haeufle,
Andreas Bulling
Abstract:
We present GazeMotion, a novel method for human motion forecasting that combines information on past human poses with human eye gaze. Inspired by evidence from behavioural sciences showing that human eye and body movements are closely coordinated, GazeMotion first predicts future eye gaze from past gaze, then fuses predicted future gaze and past poses into a gaze-pose graph, and finally uses a residual graph convolutional network to forecast body motion. We extensively evaluate our method on the MoGaze, ADT, and GIMO benchmark datasets and show that it outperforms state-of-the-art methods by up to 7.4% in mean per joint position error. Using head direction as a proxy for gaze, our method still achieves an average improvement of 5.5%. We finally report an online user study showing that our method also outperforms prior methods in terms of perceived realism. These results show the significant information content available in eye gaze for human motion forecasting as well as the effectiveness of our method in exploiting this information.
Submitted 11 July, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Generating Realistic Arm Movements in Reinforcement Learning: A Quantitative Comparison of Reward Terms and Task Requirements
Authors:
Jhon P. F. Charaja,
Isabell Wochner,
Pierre Schumacher,
Winfried Ilg,
Martin Giese,
Christophe Maufroy,
Andreas Bulling,
Syn Schmitt,
Georg Martius,
Daniel F. B. Haeufle
Abstract:
The mimicking of human-like arm movement characteristics involves the consideration of three factors during control policy synthesis: (a) chosen task requirements, (b) inclusion of noise during movement execution and (c) chosen optimality principles. Previous studies showed that when considering these factors (a-c) individually, it is possible to synthesize arm movements that either kinematically match the experimental data or reproduce the stereotypical triphasic muscle activation pattern. However, to date, no quantitative comparison has been made of how realistic the arm movement generated by each factor is, or of whether a partial or total combination of all factors results in arm movements with human-like kinematic characteristics and a triphasic muscle pattern. To investigate this, we used reinforcement learning to learn a control policy for a musculoskeletal arm model, aiming to discern which combination of factors (a-c) results in realistic arm movements according to four frequently reported stereotypical characteristics. Our findings indicate that incorporating velocity and acceleration requirements into the reaching task, employing reward terms that encourage minimization of mechanical work, hand jerk, and control effort, along with the inclusion of noise during movement, leads to the emergence of realistic human arm movements in reinforcement learning. We expect that the gained insights will help in the future to better predict desired arm movements and corrective forces in wearable assistive devices.
Submitted 27 November, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Learning to Control Emulated Muscles in Real Robots: Towards Exploiting Bio-Inspired Actuator Morphology
Authors:
Pierre Schumacher,
Lorenz Krause,
Jan Schneider,
Dieter Büchler,
Georg Martius,
Daniel Haeufle
Abstract:
Recent studies have demonstrated the immense potential of exploiting muscle actuator morphology for natural and robust movement, albeit in simulation. Validation on real robotic hardware is still missing. In this study, we emulate muscle actuator properties on hardware in real time, taking advantage of modern and affordable electric motors. We demonstrate that our setup can emulate a simplified muscle model on a real robot while being controlled by a learned policy. We improve upon an existing muscle model by deriving a damping rule that ensures that the model is not only performant and stable but also tuneable for the real hardware. Our policies are trained by reinforcement learning entirely in simulation, where we show that previously reported benefits of muscles extend to the case of quadruped locomotion and hopping: the learned policies are more robust and exhibit more regular gaits. Finally, we confirm that the learned policies can be executed on real hardware and show that sim-to-real transfer with real-time emulated muscles on a quadruped robot is possible. These results show that artificial muscles can be highly beneficial actuators for future generations of robust legged robots.
Submitted 7 February, 2024;
originally announced February 2024.
-
Identifying Policy Gradient Subspaces
Authors:
Jan Schneider,
Pierre Schumacher,
Simon Guist,
Le Chen,
Daniel Häufle,
Bernhard Schölkopf,
Dieter Büchler
Abstract:
Policy gradient methods hold great potential for solving complex continuous control tasks. Still, their training efficiency can be improved by exploiting structure within the optimization problem. Recent work indicates that supervised learning can be accelerated by leveraging the fact that gradients lie in a low-dimensional and slowly-changing subspace. In this paper, we conduct a thorough evaluation of this phenomenon for two popular deep policy gradient methods on various simulated benchmark tasks. Our results demonstrate the existence of such gradient subspaces despite the continuously changing data distribution inherent to reinforcement learning. These findings reveal promising directions for future work on more efficient reinforcement learning, e.g., through improving parameter-space exploration or enabling second-order optimization.
Submitted 18 March, 2024; v1 submitted 12 January, 2024;
originally announced January 2024.
-
Investigating the Impact of Action Representations in Policy Gradient Algorithms
Authors:
Jan Schneider,
Pierre Schumacher,
Daniel Häufle,
Bernhard Schölkopf,
Dieter Büchler
Abstract:
Reinforcement learning (RL) is a versatile framework for learning to solve complex real-world tasks. However, influences on the learning performance of RL algorithms are often poorly understood in practice. We discuss different analysis techniques and assess their effectiveness for investigating the impact of action representations in RL. Our experiments demonstrate that the action representation can significantly influence the learning performance on popular RL benchmark tasks. The analysis results indicate that some of the performance differences can be attributed to changes in the complexity of the optimization landscape. Finally, we discuss open challenges of analysis techniques for RL algorithms.
Submitted 13 September, 2023;
originally announced September 2023.
-
Natural and Robust Walking using Reinforcement Learning without Demonstrations in High-Dimensional Musculoskeletal Models
Authors:
Pierre Schumacher,
Thomas Geijtenbeek,
Vittorio Caggiano,
Vikash Kumar,
Syn Schmitt,
Georg Martius,
Daniel F. B. Haeufle
Abstract:
Humans excel at robust bipedal walking in complex natural environments. In each step, they adequately tune the interaction of biomechanical muscle dynamics and neuronal signals to be robust against uncertainties in ground conditions. However, it is still not fully understood how the nervous system resolves the musculoskeletal redundancy to solve the multi-objective control problem considering stability, robustness, and energy efficiency. In computer simulations, energy minimization has been shown to be a successful optimization target, reproducing natural walking with trajectory optimization or reflex-based control methods. However, these methods focus on particular motions at a time and the resulting controllers are limited when compensating for perturbations. In robotics, reinforcement learning (RL) methods recently achieved highly stable (and efficient) locomotion on quadruped systems, but the generation of human-like walking with bipedal biomechanical models has required extensive use of expert data sets. This strong reliance on demonstrations often results in brittle policies and limits the application to new behaviors, especially considering the potential variety of movements for high-dimensional musculoskeletal models in 3D. Achieving natural locomotion with RL without sacrificing its incredible robustness might pave the way for a novel approach to studying human walking in complex natural environments. Videos: https://sites.google.com/view/naturalwalkingrl
Submitted 7 September, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
'Virtual pivot point' in human walking: always experimentally observed but simulations suggest it may not be necessary for stability
Authors:
L. Schreff,
D. F. B. Haeufle,
A. Badri-Spröwitz,
J. Vielemeyer,
R. Müller
Abstract:
The intersection of ground reaction forces near a point above the center of mass has been observed in computer simulation models and human walking experiments. Observed so ubiquitously, the intersection point (IP) is commonly assumed to provide postural stability for bipedal walking. In this study, we challenge this assumption by questioning if walking without an IP is possible. Deriving gaits with a neuromuscular reflex model through multi-stage optimization, we found stable walking patterns that show no signs of the IP-typical intersection of ground reaction forces. The non-IP gaits found are stable and successfully rejected step-down perturbations, which indicates that an IP is not necessary for locomotion robustness or postural stability. A collision-based analysis shows that non-IP gaits feature center of mass (CoM) dynamics with vectors of the CoM velocity and ground reaction force increasingly opposing each other, indicating an increased mechanical cost of transport. Although our computer simulation results have yet to be confirmed through experimental studies, they already indicate that the role of the IP in postural stability should be further investigated. Moreover, our observations on the CoM dynamics and gait efficiency suggest that the IP may have an alternative or additional function that should be considered.
Submitted 8 May, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Slack-based tunable damping leads to a trade-off between robustness and efficiency in legged locomotion
Authors:
An Mo,
Fabio Izzi,
Emre Cemal Gönen,
Daniel Haeufle,
Alexander Badri-Spröwitz
Abstract:
Animals run robustly in diverse terrain. This locomotion robustness is puzzling because axon conduction velocity is limited to a few tens of meters per second. If reflex loops deliver sensory information with significant delays, one would expect a destabilizing effect on sensorimotor control. Hence, an alternative explanation describes a hierarchical structure of low-level adaptive mechanics and high-level sensorimotor control to help mitigate the effects of transmission delays. Motivated by the concept of an adaptive mechanism triggering an immediate response, we developed a tunable physical damper system. Our mechanism combines a tendon with adjustable slackness connected to a physical damper. The slack damper allows adjustment of damping force, onset timing, effective stroke, and energy dissipation. We characterize the slack damper mechanism mounted to a legged robot controlled in open-loop mode. The robot performs vertical and planar hopping over varying terrains and perturbations. During forward hopping, slack-based damping speeds up perturbation recovery (by up to 170%) at a higher energetic cost (27%). The tunable slack mechanism auto-engages the damper during perturbations, leading to perturbation-triggered damping that improves robustness at minimum energetic cost. With the results from the slack damper mechanism, we propose a new functional interpretation of animals' redundant muscle tendons as tunable dampers.
Submitted 1 December, 2022;
originally announced December 2022.
-
Learning with Muscles: Benefits for Data-Efficiency and Robustness in Anthropomorphic Tasks
Authors:
Isabell Wochner,
Pierre Schumacher,
Georg Martius,
Dieter Büchler,
Syn Schmitt,
Daniel F. B. Haeufle
Abstract:
Humans are able to outperform robots in terms of robustness, versatility, and learning of new tasks in a wide variety of movements. We hypothesize that highly nonlinear muscle dynamics play a large role in providing inherent stability, which is favorable to learning. While recent advances have been made in applying modern learning techniques to muscle-actuated systems both in simulation as well as in robotics, so far, no detailed analysis has been performed to show the benefits of muscles when learning from scratch. Our study closes this gap and showcases the potential of muscle actuators for core robotics challenges in terms of data-efficiency, hyperparameter sensitivity, and robustness.
Submitted 16 January, 2023; v1 submitted 8 July, 2022;
originally announced July 2022.
-
DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
Authors:
Pierre Schumacher,
Daniel Häufle,
Dieter Büchler,
Syn Schmitt,
Georg Martius
Abstract:
Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast number of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by the finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.
Submitted 27 April, 2023; v1 submitted 30 May, 2022;
originally announced June 2022.
-
Effective Viscous Damping Enables Morphological Computation in Legged Locomotion
Authors:
An Mo,
Fabio Izzi,
Daniel F. B. Haeufle,
Alexander Badri-Spröwitz
Abstract:
Muscle models and animal observations suggest that physical damping is beneficial for stabilization. Still, only a few implementations of mechanical damping exist in compliant robotic legged locomotion. It remains unclear how physical damping can be exploited for locomotion tasks, while its advantages as sensor-free, adaptive force- and negative work-producing actuators are promising. In a simplified numerical leg model, we studied the energy dissipation from viscous and Coulomb damping during vertical drops with ground-level perturbations. A parallel spring-damper is engaged between touch-down and mid-stance, and its damper auto-disengages during mid-stance and takeoff. Our simulations indicate that an adjustable and viscous damper is desired. In hardware, we explored effective viscous damping and adjustability and quantified the dissipated energy. We tested two mechanical, leg-mounted damping mechanisms: a commercial hydraulic damper and a custom-made pneumatic damper. The pneumatic damper exploits a rolling diaphragm with an adjustable orifice, minimizing Coulomb damping effects while permitting adjustable resistance. Experimental results show that the leg-mounted, hydraulic damper exhibits the most effective viscous damping. Adjusting the orifice setting did not result in substantial changes of dissipated energy per drop, unlike adjusting damping parameters in the numerical model. Consequently, we also emphasize the importance of characterizing physical dampers during real legged impacts to evaluate their effectiveness for compliant legged locomotion.
Submitted 6 June, 2020; v1 submitted 12 May, 2020;
originally announced May 2020.
-
Evaluating Morphological Computation in Muscle and DC-motor Driven Models of Human Hopping
Authors:
Keyan Ghazi-Zahedi,
Daniel F. B. Haeufle,
Guido Montufar,
Syn Schmitt,
Nihat Ay
Abstract:
In the context of embodied artificial intelligence, morphological computation refers to processes which are conducted by the body (and environment) that otherwise would have to be performed by the brain. Exploiting environmental and morphological properties is an important feature of embodied systems. The main reason is that it allows the controller complexity to be reduced significantly. An important aspect of morphological computation is that it cannot be assigned to an embodied system per se, but that it is, as we show, behavior- and state-dependent. In this work, we evaluate two different measures of morphological computation that can be applied in robotic systems and in computer simulations of biological movement. As an example, these measures were evaluated on muscle and DC-motor driven hopping models. We show that a state-dependent analysis of the hopping behaviors provides additional insights that cannot be gained from the averaged measures alone. This work includes algorithms and computer code for the measures.
Submitted 11 December, 2015; v1 submitted 1 December, 2015;
originally announced December 2015.