-
Automating 3D Dataset Generation with Neural Radiance Fields
Authors:
P. Schulz,
T. Hempel,
A. Al-Hamadi
Abstract:
3D detection is a critical task to understand spatial characteristics of the environment and is used in a variety of applications including robotics, augmented reality, and image retrieval. Training performant detection models requires diverse, precisely annotated, and large-scale datasets that involve complex and expensive creation processes. Hence, there are only a few public 3D datasets, which are additionally limited in their range of classes. In this work, we propose a pipeline for automatic generation of 3D datasets for arbitrary objects. By utilizing the universal 3D representation and rendering capabilities of Radiance Fields, our pipeline generates high-quality 3D models for arbitrary objects. These 3D models serve as input for a synthetic dataset generator. Our pipeline is fast, easy to use, and has a high degree of automation. Our experiments demonstrate that 3D pose estimation networks trained with our generated datasets achieve strong performance in typical application scenarios.
Submitted 20 March, 2025;
originally announced March 2025.
-
A simple reconstruction method to infer nonreciprocal interactions and local driving in complex systems
Authors:
Tim Hempel,
Sarah A. M. Loos
Abstract:
Data-based inference of directed interactions in complex dynamical systems is a problem common to many disciplines of science. In this work, we study networks of spatially separate dynamical entities, which could represent physical systems that interact with each other by reciprocal or nonreciprocal, instantaneous or time-delayed interactions. We present a simple approach that combines Markov state models with directed information-theoretical measures for causal inference that can accurately infer the underlying interactions from noisy time series of the dynamical system states alone. Remarkably, this is possible despite the built-in simplification of a Markov assumption and the choice of a very coarse discretization at the level of probability estimation. Our test systems are an Ising chain with nonreciprocal coupling imposed by local driving of a single spin, and a system of delay-coupled linear stochastic processes. Stepping away from physical systems, the approach infers cause-effect relationships, or more generally, the direction of mutual or one-way influence. The presented method is agnostic to the number of interacting entities and details of the dynamics, so that it is widely applicable to problems in various fields.
Submitted 14 March, 2024;
originally announced March 2024.
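As a rough illustration of the Markov state model ingredient mentioned in this abstract, the sketch below estimates a row-stochastic transition matrix from a coarsely discretized time series by counting transitions at a fixed lag. This is a generic count-based estimate, not the authors' implementation; the function name `transition_matrix` and its parameters are hypothetical, and the directed information-theoretic measures described in the abstract are not shown.

```python
def transition_matrix(traj, n_states, lag=1):
    """Estimate a row-stochastic transition matrix from a discrete trajectory.

    traj     : list of integer state indices in [0, n_states)
    n_states : number of discrete states (a coarse discretization is allowed)
    lag      : lag time in steps (hypothetical parameter name)
    """
    # Count observed transitions i -> j separated by `lag` steps.
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(traj[:-lag], traj[lag:]):
        counts[i][j] += 1

    # Normalize each row; rows without observations stay all-zero.
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Two-state toy trajectory, e.g. a single spin recorded as 0 (down) or 1 (up).
T = transition_matrix([0, 0, 1, 0, 1, 1, 0], n_states=2)
```

Under the Markov assumption, each row of `T` approximates the conditional probability of the next state given the current one.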
-
NITEC: Versatile Hand-Annotated Eye Contact Dataset for Ego-Vision Interaction
Authors:
Thorsten Hempel,
Magnus Jung,
Ahmed A. Abdelrahman,
Ayoub Al-Hamadi
Abstract:
Eye contact is a crucial non-verbal interaction modality and plays an important role in our everyday social life. While humans are very sensitive to eye contact, the capabilities of machines to capture a person's gaze are still mediocre. We tackle this challenge and present NITEC, a hand-annotated eye contact dataset for ego-vision interaction. NITEC exceeds existing datasets for ego-vision eye contact in size and variety of demographics, social contexts, and lighting conditions, making it a valuable resource for advancing ego-vision-based eye contact research. Our extensive evaluations on NITEC demonstrate strong cross-dataset performance, emphasizing its effectiveness and adaptability in various scenarios, which allows seamless use in the fields of computer vision, human-computer interaction, and social robotics. We make our NITEC dataset publicly available to foster reproducibility and further exploration in the field of ego-vision interaction: https://github.com/thohemp/nitec
Submitted 8 November, 2023;
originally announced November 2023.
-
Towards Robust and Unconstrained Full Range of Rotation Head Pose Estimation
Authors:
Thorsten Hempel,
Ahmed A. Abdelrahman,
Ayoub Al-Hamadi
Abstract:
Estimating the head pose of a person is a crucial problem for numerous applications, yet it is still mainly addressed as a subtask of frontal pose prediction. We present a novel method for unconstrained end-to-end head pose estimation to tackle the challenging task of full range of orientation head pose prediction. We address the issue of ambiguous rotation labels by introducing the rotation matrix formalism for our ground truth data and propose a continuous 6D rotation matrix representation for efficient and robust direct regression. This allows the model to efficiently learn full rotation appearance and to overcome the limitations of the current state-of-the-art. Together with new accumulated training data that provides full head pose rotation data and a geodesic loss approach for stable learning, we design an advanced model that is able to predict an extended range of head orientations. An extensive evaluation on public datasets demonstrates that our method significantly outperforms other state-of-the-art methods in an efficient and robust manner, while its advanced prediction range allows the expansion of the application area. We open-source our training and testing code along with our trained models: https://github.com/thohemp/6DRepNet360.
Submitted 14 September, 2023;
originally announced September 2023.
-
Automated Deception Detection from Videos: Using End-to-End Learning Based High-Level Features and Classification Approaches
Authors:
Laslo Dinges,
Marc-André Fiedler,
Ayoub Al-Hamadi,
Thorsten Hempel,
Ahmed Abdelrahman,
Joachim Weimann,
Dmitri Bershadskyy
Abstract:
Deception detection is an interdisciplinary field attracting researchers from psychology, criminology, computer science, and economics. We propose a multimodal approach combining deep learning and discriminative models for automated deception detection. Using video modalities, we employ convolutional end-to-end learning to analyze gaze, head pose, and facial expressions, achieving promising results compared to state-of-the-art methods. Due to limited training data, we also utilize discriminative models for deception detection. Although sequence-to-class approaches are explored, discriminative models outperform them due to data scarcity. Our approach is evaluated on five datasets, including a new Rolling-Dice Experiment motivated by economic factors. Results indicate that facial expressions outperform gaze and head pose, and combining modalities with feature selection enhances detection performance. Differences in expressed features across datasets emphasize the importance of scenario-specific training data and the influence of context on deceptive behavior. Cross-dataset experiments reinforce these findings. Despite the challenges posed by low-stake datasets, including the Rolling-Dice Experiment, deception detection performance exceeds chance levels. Our proposed multimodal approach and comprehensive evaluation shed light on the potential of automating deception detection from video modalities, opening avenues for future research.
Submitted 13 July, 2023;
originally announced July 2023.
-
Sentiment-based Engagement Strategies for intuitive Human-Robot Interaction
Authors:
Thorsten Hempel,
Laslo Dinges,
Ayoub Al-Hamadi
Abstract:
Emotion expressions serve as important communicative signals and are crucial cues in intuitive interactions between humans. Hence, it is essential to include these fundamentals in robotic behavior strategies when interacting with humans to promote mutual understanding and to reduce misjudgements. We tackle this challenge by detecting and using the emotional state and attention for a sentiment analysis of potential human interaction partners to select well-adjusted engagement strategies. This way, we pave the way for more intuitive human-robot interactions, as the robot's action conforms to the person's mood and expectation. We propose four different engagement strategies with implicit and explicit communication techniques that we implement on a mobile robot platform for initial experiments.
Submitted 10 January, 2023;
originally announced January 2023.
-
Semantic-Aware Environment Perception for Mobile Human-Robot Interaction
Authors:
Thorsten Hempel,
Marc-André Fiedler,
Aly Khalifa,
Ayoub Al-Hamadi,
Laslo Dinges
Abstract:
Current technological advances open up new opportunities for bringing human-machine interaction to a new level of human-centered cooperation. In this context, a key issue is the semantic understanding of the environment, which enables mobile robots to engage in more complex interactions and facilitates communication with humans. Prerequisites are the vision-based registration of semantic objects and humans, where the latter are further analyzed for potential interaction partners. Despite significant research achievements, the reliable and fast registration of semantic information remains a challenging task for mobile robots in real-world scenarios. In this paper, we present a vision-based system for mobile assistive robots to enable a semantic-aware environment perception without additional a-priori knowledge. We deploy our system on a mobile humanoid robot that enables us to test our methods in real-world applications.
Submitted 7 November, 2022;
originally announced November 2022.
-
Markov Field Models: scaling molecular kinetics approaches to large molecular machines
Authors:
Tim Hempel,
Simon Olsson,
Frank Noé
Abstract:
With recent advances in structural biology, including experimental techniques and deep learning-enabled high-precision structure predictions, molecular dynamics methods that scale up to large biomolecular systems are required. Current state-of-the-art approaches in molecular dynamics modeling focus on encoding global configurations of molecular systems as distinct states. This paradigm commands us to map out all possible structures and sample transitions between them, a task that becomes impossible for large-scale systems such as biomolecular complexes. To arrive at scalable molecular models, we suggest moving away from global state descriptions to a set of coupled models that each describe the dynamics of local domains or sites of the molecular system. We describe limitations in the current state-of-the-art global-state Markovian modeling approaches and then introduce Markov Field Models as an umbrella term that includes models from various scientific communities, including Independent Markov Decomposition, Ising and Potts Models, and (Dynamic) Graphical Models, and evaluate their use for computational molecular biology. Finally, we give a few examples of early adoptions of these ideas for modeling molecular kinetics and thermodynamics.
Submitted 23 June, 2022;
originally announced June 2022.
-
An Online Semantic Mapping System for Extending and Enhancing Visual SLAM
Authors:
Thorsten Hempel,
Ayoub Al-Hamadi
Abstract:
We present a real-time semantic mapping approach for mobile vision systems with a 2D to 3D object detection pipeline and rapid data association for generated landmarks. Besides the semantic map enrichment, the associated detections are further introduced as semantic constraints into a simultaneous localization and mapping (SLAM) system for pose correction purposes. This way, we are able to generate additional meaningful information that enables higher-level tasks, while simultaneously leveraging the view-invariance of object detections to improve the accuracy and the robustness of the odometry estimation. We propose tracklets of locally associated object observations to handle ambiguous and false predictions and an uncertainty-based greedy association scheme for an accelerated processing time. Our system reaches real-time capabilities with an average iteration duration of 65 ms and is able to improve the pose estimation of a state-of-the-art SLAM system by up to 68% on a public dataset. Additionally, we implemented our approach as a modular ROS package that makes it straightforward to integrate into arbitrary graph-based SLAM methods.
Submitted 8 March, 2022;
originally announced March 2022.
-
L2CS-Net: Fine-Grained Gaze Estimation in Unconstrained Environments
Authors:
Ahmed A. Abdelrahman,
Thorsten Hempel,
Aly Khalifa,
Ayoub Al-Hamadi
Abstract:
Human gaze is a crucial cue used in various applications such as human-robot interaction and virtual reality. Recently, convolutional neural network (CNN) approaches have made notable progress in predicting gaze direction. However, estimating gaze in-the-wild is still a challenging problem due to the uniqueness of eye appearance, lighting conditions, and the diversity of head pose and gaze directions. In this paper, we propose a robust CNN-based model for predicting gaze in unconstrained settings. We propose to regress each gaze angle separately to improve the per-angle prediction accuracy, which will enhance the overall gaze performance. In addition, we use two identical losses, one for each angle, to improve network learning and increase its generalization. We evaluate our model with two popular datasets collected with unconstrained settings. Our proposed model achieves state-of-the-art accuracy of 3.92° and 10.41° on MPIIGaze and Gaze360 datasets, respectively. We make our code open source at https://github.com/Ahmednull/L2CS-Net.
Submitted 7 March, 2022;
originally announced March 2022.
-
6D Rotation Representation For Unconstrained Head Pose Estimation
Authors:
Thorsten Hempel,
Ahmed A. Abdelrahman,
Ayoub Al-Hamadi
Abstract:
In this paper, we present a method for unconstrained end-to-end head pose estimation. We address the problem of ambiguous rotation labels by introducing the rotation matrix formalism for our ground truth data and propose a continuous 6D rotation matrix representation for efficient and robust direct regression. This way, our method can learn the full rotation appearance, in contrast to previous approaches that restrict the pose prediction to a narrow angle range for satisfactory results. In addition, we propose a geodesic distance-based loss to penalize our network with respect to the SO(3) manifold geometry. Experiments on the public AFLW2000 and BIWI datasets demonstrate that our proposed method significantly outperforms other state-of-the-art methods by up to 20%. We open-source our training and testing code along with our pre-trained models: https://github.com/thohemp/6DRepNet.
Submitted 7 November, 2022; v1 submitted 25 February, 2022;
originally announced February 2022.
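To make the two ideas named in this abstract more concrete, the sketch below shows one common way a 6D rotation representation can be mapped to a rotation matrix (Gram-Schmidt orthogonalization of two 3-vectors) and how a geodesic distance on SO(3) can be computed from two rotation matrices. It is a minimal pure-Python illustration under those assumptions, not the code from the linked repository; the function names `rotation_from_6d` and `geodesic_distance` are hypothetical.

```python
import math


def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(a1, a2):
    """Build a 3x3 rotation matrix from two 3-vectors (the 6 regressed values).

    Gram-Schmidt: normalize a1, remove its component from a2, normalize,
    and complete the right-handed frame with a cross product.
    The resulting vectors b1, b2, b3 are the matrix columns.
    """
    b1 = _normalize(a1)
    proj = _dot(b1, a2)
    b2 = _normalize([y - proj * x for x, y in zip(b1, a2)])
    b3 = _cross(b1, b2)
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


def geodesic_distance(r1, r2):
    """Rotation angle (radians) between two rotation matrices on SO(3)."""
    # trace(r1 @ r2^T) equals the element-wise (Frobenius) inner product.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp for safety
    return math.acos(cos_angle)


# Unnormalized inputs still yield valid rotations.
identity = rotation_from_6d([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
quarter_turn_z = rotation_from_6d([0.0, 2.0, 0.0], [-3.0, 0.0, 0.0])
angle = geodesic_distance(identity, quarter_turn_z)
```

Because any two non-parallel vectors map to a valid rotation, a network can regress the six values directly without an explicit orthogonality constraint; the angle returned by `geodesic_distance` is the kind of quantity a geodesic loss would penalize.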
-
Deeptime: a Python library for machine learning dynamical models from time series data
Authors:
Moritz Hoffmann,
Martin Scherer,
Tim Hempel,
Andreas Mardt,
Brian de Silva,
Brooke E. Husic,
Stefan Klus,
Hao Wu,
Nathan Kutz,
Steven L. Brunton,
Frank Noé
Abstract:
Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use but also for easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software.
Submitted 11 December, 2021; v1 submitted 28 October, 2021;
originally announced October 2021.