-
Feeling Machines: Ethics, Culture, and the Rise of Emotional AI
Authors:
Vivek Chavan,
Arsen Cenaj,
Shuyuan Shen,
Ariane Bar,
Srishti Binwani,
Tommaso Del Becaro,
Marius Funk,
Lynn Greschner,
Roberto Hung,
Stina Klein,
Romina Kleiner,
Stefanie Krause,
Sylwia Olbrych,
Vishvapalsinhji Parmar,
Jaleh Sarafraz,
Daria Soroko,
Daksitha Withanage Don,
Chang Zhou,
Hoang Thuy Duong Vu,
Parastoo Semnani,
Daniel Weinhardt,
Elisabeth Andre,
Jörg Krüger,
Xavier Fresquet
Abstract:
This paper explores the growing presence of emotionally responsive artificial intelligence through a critical and interdisciplinary lens. Bringing together the voices of early-career researchers from multiple fields, it examines how AI systems that simulate or interpret human emotions are reshaping our interactions in areas such as education, healthcare, mental health, caregiving, and digital life. The analysis is structured around four central themes: the ethical implications of emotional AI, the cultural dynamics of human-machine interaction, the risks and opportunities for vulnerable populations, and the emerging regulatory, design, and technical considerations. The authors highlight the potential of affective AI to support mental well-being, enhance learning, and reduce loneliness, as well as the risks of emotional manipulation, over-reliance, misrepresentation, and cultural bias. Key challenges include simulating empathy without genuine understanding, encoding dominant sociocultural norms into AI systems, and insufficient safeguards for individuals in sensitive or high-risk contexts. Special attention is given to children, elderly users, and individuals with mental health challenges, who may interact with AI in emotionally significant ways. However, the cognitive or legal protections necessary to navigate such engagements safely are still lacking. The report concludes with ten recommendations, including the need for transparency, certification frameworks, region-specific fine-tuning, human oversight, and longitudinal research. A curated supplementary section provides practical tools, models, and datasets to support further work in this domain.
Submitted 14 June, 2025;
originally announced June 2025.
-
Physical Annotation for Automated Optical Inspection: A Concept for In-Situ, Pointer-Based Trainingdata Generation
Authors:
Oliver Krumpek,
Oliver Heimann,
Jörg Krüger
Abstract:
This paper introduces a novel physical annotation system designed to generate training data for automated optical inspection. The system uses pointer-based in-situ interaction to transfer the valuable expertise of trained inspection personnel directly into a machine learning (ML) training pipeline. Unlike conventional screen-based annotation methods, our system captures physical trajectories and contours directly on the object, providing a more intuitive and efficient way to label data. The core technology uses calibrated, tracked pointers to accurately record user input and transform these spatial interactions into standardised annotation formats that are compatible with open-source annotation software. Additionally, a simple projector-based interface projects visual guidance onto the object to assist users during the annotation process, ensuring greater accuracy and consistency. The proposed concept bridges the gap between human expertise and automated data generation, enabling non-IT experts to contribute to the ML training pipeline and preventing the loss of valuable training samples. Preliminary evaluation results confirm the feasibility of capturing detailed annotation trajectories and demonstrate that integration with CVAT streamlines the workflow for subsequent ML tasks. This paper details the system architecture, calibration procedures and interface design, and discusses its potential contribution to future ML data generation for automated optical inspection.
Submitted 5 June, 2025;
originally announced June 2025.
-
Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
Authors:
Paul Koch,
Jörg Krüger,
Ankit Chowdhury,
Oliver Heimann
Abstract:
Generalized metric depth understanding is critical for precise vision-guided robotics, which current state-of-the-art (SOTA) vision-encoders do not support. To address this, we propose Vanishing Depth, a self-supervised training approach that extends pretrained RGB encoders to incorporate and align metric depth into their feature embeddings. Based on our novel positional depth encoding, we enable stable depth density and depth distribution invariant feature extraction. We achieve performance improvements and SOTA results across a spectrum of relevant RGBD downstream tasks - without the necessity of finetuning the encoder. Most notably, we achieve 56.05 mIoU on SUN-RGBD segmentation, 88.3 RMSE on Void's depth completion, and 83.8 Top 1 accuracy on NYUv2 scene classification. In 6D-object pose estimation, we outperform our predecessors of DinoV2, EVA-02, and Omnivore and achieve SOTA results for non-finetuned encoders in several related RGBD downstream tasks.
Submitted 25 March, 2025;
originally announced March 2025.
-
MVIP -- A Dataset and Methods for Application Oriented Multi-View and Multi-Modal Industrial Part Recognition
Authors:
Paul Koch,
Marian Schlüter,
Jörg Krüger
Abstract:
We present MVIP, a novel dataset for multi-modal and multi-view application-oriented industrial part recognition. Here we are the first to combine a calibrated RGBD multi-view dataset with additional object context such as physical properties, natural language, and super-classes. The current portfolio of available datasets offers a wide range of representations to design and benchmark related methods. In contrast to existing classification challenges, industrial recognition applications offer controlled multi-modal environments but at the same time have different problems than traditional 2D/3D classification challenges. Frequently, industrial applications must deal with a small amount or increased number of training data, visually similar parts, and varying object sizes, while requiring a robust near 100% top 5 accuracy under cost and time constraints. Current methods tackle such challenges individually, but direct adoption of these methods within industrial applications is complex and requires further research. Our main goal with MVIP is to study and push transferability of various state-of-the-art methods within related downstream tasks towards an efficient deployment of industrial classifiers. Additionally, we intend to push with MVIP research regarding several modality fusion topics, (automated) synthetic data generation, and complex data sampling -- combined in a single application-oriented benchmark.
Submitted 21 February, 2025;
originally announced February 2025.
-
CTARR: A fast and robust method for identifying anatomical regions on CT images via atlas registration
Authors:
Thomas Buddenkotte,
Roland Opfer,
Julia Krüger,
Alessa Hering,
Mireia Crispin-Ortuzar
Abstract:
Medical image analysis tasks often focus on regions or structures located in a particular location within the patient's body. Often large parts of the image may not be of interest for the image analysis task. When using deep learning-based approaches, this unnecessarily increases the computational burden during inference and raises the chance of errors. In this paper, we introduce CTARR, a novel generic method for CT Anatomical Region Recognition. The method serves as a pre-processing step for any deep learning-based CT image analysis pipeline by automatically identifying the pre-defined anatomical region that is relevant for the follow-up task and removing the rest. It can be used in (i) image segmentation to prevent false positives in anatomically implausible regions and speed up inference, (ii) image classification to produce image crops that are consistent in their anatomical context, and (iii) image registration by serving as a fast pre-registration step. Our proposed method is based on atlas registration and provides a fast and robust way to crop any anatomical region encoded as one or multiple bounding box(es) from any unlabeled CT scan of the brain, chest, abdomen and/or pelvis. We demonstrate the utility and robustness of the proposed method in the context of medical image segmentation by evaluating it on six datasets of public segmentation challenges. The foreground voxels in the regions of interest are preserved in the vast majority of cases and tasks (97.45-100%) while taking only a fraction of a second to compute (0.1-0.21s) on a deep learning workstation and greatly reducing the segmentation runtime (2.0-12.7x). Our code is available at https://github.com/ThomasBudd/ctarr.
Submitted 3 October, 2024;
originally announced October 2024.
-
Space-LLaVA: a Vision-Language Model Adapted to Extraterrestrial Applications
Authors:
Matthew Foutter,
Daniele Gammelli,
Justin Kruger,
Ethan Foss,
Praneet Bhoj,
Tommaso Guffanti,
Simone D'Amico,
Marco Pavone
Abstract:
Foundation Models (FMs), e.g., large language models, possess attributes of intelligence which offer promise to endow a robot with the contextual understanding necessary to navigate complex, unstructured tasks in the wild. We see three core challenges in the future of space robotics that motivate building an FM for the space robotics community: 1) Scalability of ground-in-the-loop operations; 2) Generalizing prior knowledge to novel environments; and 3) Multi-modality in tasks and sensor data. As a first step towards a space foundation model, we programmatically augment three extraterrestrial databases with fine-grained language annotations inspired by the sensory reasoning necessary to, e.g., identify a site of scientific interest on Mars, building a synthetic dataset of visual-question-answer and visual instruction-following tuples. We fine-tune a pre-trained LLaVA 13B checkpoint on our augmented dataset to adapt a Vision-Language Model (VLM) to the visual semantic features in an extraterrestrial environment, demonstrating FMs as a tool for specialization and enhancing a VLM's zero-shot performance on unseen task types in comparison to state-of-the-art VLMs. Ablation studies show that fine-tuning the language backbone and vision-language adapter in concert is key to facilitating adaptation, while a small percentage, e.g., 20%, of the pre-training data can be used to safeguard against catastrophic forgetting.
Submitted 18 January, 2025; v1 submitted 12 August, 2024;
originally announced August 2024.
-
GPT-4's One-Dimensional Mapping of Morality: How the Accuracy of Country-Estimates Depends on Moral Domain
Authors:
Pontus Strimling,
Joel Krueger,
Simon Karlsson
Abstract:
Prior research demonstrates that OpenAI's GPT models can predict variations in moral opinions between countries but that the accuracy tends to be substantially higher among high-income countries compared to low-income ones. This study aims to replicate previous findings and advance the research by examining how accuracy varies with different types of moral questions. Using responses from the World Values Survey and the European Values Study, covering 18 moral issues across 63 countries, we calculated country-level mean scores for each moral issue and compared them with GPT-4's predictions. Confirming previous findings, our results show that GPT-4 has greater predictive success in high-income than in low-income countries. However, our factor analysis reveals that GPT-4 bases its predictions primarily on a single dimension, presumably reflecting countries' degree of conservatism/liberalism. Conversely, the real-world moral landscape appears to be two-dimensional, differentiating between personal-sexual and violent-dishonest issues. When moral issues are categorized based on their moral domain, GPT-4's predictions are found to be remarkably accurate in the personal-sexual domain, across both high-income (r = .77) and low-income (r = .58) countries. Yet the predictive accuracy significantly drops in the violent-dishonest domain for both high-income (r = .30) and low-income (r = -.16) countries, indicating that GPT-4's one-dimensional world-view does not fully capture the complexity of the moral landscape. In sum, this study underscores the importance of not only considering country-specific characteristics to understand GPT-4's moral understanding, but also the characteristics of the moral issues at hand.
Submitted 5 June, 2024;
originally announced July 2024.
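The country-level comparison described in the GPT-4 entry above can be illustrated with a minimal sketch, assuming the reported r values are Pearson correlations between survey means and model predictions. The function name `pearson_r` and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical country-level mean scores for one moral issue (made-up values).
survey_means = [2.1, 3.4, 5.0, 6.2]
# Hypothetical model predictions for the same four countries (made-up values).
model_predictions = [2.5, 3.0, 5.5, 6.0]

print(round(pearson_r(survey_means, model_predictions), 2))
```

Computing one such coefficient per moral domain and income group would give a table of the same shape as the r values quoted in the abstract.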
-
Leveraging the Mahalanobis Distance to enhance Unsupervised Brain MRI Anomaly Detection
Authors:
Finn Behrendt,
Debayan Bhattacharya,
Robin Mieling,
Lennart Maack,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
Unsupervised Anomaly Detection (UAD) methods rely on healthy data distributions to identify anomalies as outliers. In brain MRI, a common approach is reconstruction-based UAD, where generative models reconstruct healthy brain MRIs, and anomalies are detected as deviations between input and reconstruction. However, this method is sensitive to imperfect reconstructions, leading to false positives that impede the segmentation. To address this limitation, we construct multiple reconstructions with probabilistic diffusion models. We then analyze the resulting distribution of these reconstructions using the Mahalanobis distance to identify anomalies as outliers. By leveraging information about normal variations and covariance of individual pixels within this distribution, we effectively refine anomaly scoring, leading to improved segmentation. Our experimental results demonstrate substantial performance improvements across various data sets. Specifically, compared to relying solely on single reconstructions, our approach achieves relative improvements of 15.9%, 35.4%, 48.0%, and 4.7% in terms of AUPRC for the BRATS21, ATLAS, MSLUB and WMH data sets, respectively.
Submitted 17 July, 2024;
originally announced July 2024.
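The scoring idea in the Mahalanobis-distance entry above can be sketched in a simplified form. This is not the authors' implementation: it assumes images are flattened to lists of intensities, treats each pixel independently (a diagonal covariance, ignoring covariance between pixels), and uses the hypothetical names `mahalanobis_scores` and `eps`.

```python
import math
from statistics import mean, pvariance


def mahalanobis_scores(reconstructions, image, eps=1e-6):
    """Per-pixel anomaly score |x - mean| / std over several reconstructions.

    reconstructions: list of flattened images (lists of floats), one per sample
    image: the flattened input image, same length as each reconstruction
    eps: small constant (an assumption) that avoids division by zero
    """
    scores = []
    for idx, pixel in enumerate(image):
        # Intensities that the different reconstructions produced for this pixel.
        samples = [rec[idx] for rec in reconstructions]
        mu = mean(samples)
        var = pvariance(samples, mu)
        # In one dimension the Mahalanobis distance reduces to a z-score.
        scores.append(abs(pixel - mu) / math.sqrt(var + eps))
    return scores


# Two hypothetical reconstructions of a two-pixel image (made-up values).
recs = [[1.0, 5.0], [3.0, 5.0]]
print(mahalanobis_scores(recs, [2.0, 8.0]))
```

A pixel on which the reconstructions disagree is penalized less for the same absolute deviation than a pixel on which they all agree, which is the intuition behind refining the anomaly score with the reconstruction distribution.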
-
On the Application of Egocentric Computer Vision to Industrial Scenarios
Authors:
Vivek Chavan,
Oliver Heimann,
Jörg Krüger
Abstract:
Egocentric vision aims to capture and analyse the world from the first-person perspective. We explore the possibilities for egocentric wearable devices to improve and enhance industrial use cases w.r.t. data collection, annotation, labelling and downstream applications. This would contribute to easier data collection and allow users to provide additional context. We envision that this approach could serve as a supplement to the traditional industrial Machine Vision workflow. Code, Dataset and related resources will be available at: https://github.com/Vivek9Chavan/EgoVis24
Submitted 11 June, 2024;
originally announced June 2024.
-
Starling Formation-Flying Optical Experiment: Initial Operations and Flight Results
Authors:
Justin Kruger,
Soon S. Hwang,
Simone D'Amico
Abstract:
This paper presents initial flight results for distributed optical angles-only navigation of a swarm of small spacecraft, conducted during the Starling Formation-Flying Optical Experiment (StarFOX). StarFOX is a core payload of the NASA Starling mission, which consists of four CubeSats launched in 2023. Prior angles-only flight demonstrations have only featured one observer and target and have relied upon a-priori target orbit knowledge for initialization, translational maneuvers to resolve target range, and external absolute orbit updates to maintain convergence. StarFOX overcomes these limitations by applying the angles-only Absolute and Relative Trajectory Measurement System (ARTMS), which integrates three novel algorithms. Image Processing detects and tracks multiple targets in images from each satellite's on-board camera. Batch Orbit Determination computes initial swarm orbit estimates from bearing angle batches. Sequential Orbit Determination leverages an unscented Kalman filter to refine swarm state estimates over time. Multi-observer measurements shared over an intersatellite link are seamlessly fused to enable absolute and relative orbit determination. StarFOX flight data presents the first demonstrations of autonomous angles-only navigation for a satellite swarm, including multi-target and multi-observer relative navigation; autonomous initialization of navigation for unknown targets; and simultaneous absolute and relative orbit determination. Relative positioning uncertainties of 1.3% of target range ($1\sigma$) are achieved for a single observer under challenging measurement conditions, reduced to 0.6% ($1\sigma$) with multiple observers. Results demonstrate promising performance with regard to ongoing StarFOX campaigns and the application of angles-only navigation to future distributed missions.
Submitted 10 June, 2024;
originally announced June 2024.
-
Diffusion Models with Ensembled Structure-Based Anomaly Scoring for Unsupervised Anomaly Detection
Authors:
Finn Behrendt,
Debayan Bhattacharya,
Lennart Maack,
Julia Krüger,
Roland Opfer,
Robin Mieling,
Alexander Schlaefer
Abstract:
Supervised deep learning techniques show promise in medical image analysis. However, they require comprehensive annotated data sets, which poses challenges, particularly for rare diseases. Consequently, unsupervised anomaly detection (UAD) emerges as a viable alternative for pathology segmentation, as only healthy data is required for training. However, recent UAD anomaly scoring functions often focus on intensity only and neglect structural differences, which impedes the segmentation performance. This work investigates the potential of Structural Similarity (SSIM) to bridge this gap. SSIM captures both intensity and structural disparities and can be advantageous over the classical $\ell_1$ error. However, we show that there is more than one optimal kernel size for the SSIM calculation for different pathologies. Therefore, we investigate an adaptive ensembling strategy for various kernel sizes to offer a more pathology-agnostic scoring mechanism. We demonstrate that this ensembling strategy can enhance the performance of diffusion models (DMs) and mitigate the sensitivity to different kernel sizes across varying pathologies, highlighting its promise for brain MRI anomaly detection.
Submitted 21 March, 2024;
originally announced March 2024.
-
ASAP-Repair: API-Specific Automated Program Repair Based on API Usage Graphs
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Modern software development relies on the reuse of code via Application Programming Interfaces (APIs). Such reuse relieves developers from learning and developing established algorithms and data structures anew, enabling them to focus on their problem at hand. However, there is also the risk of misusing an API due to a lack of understanding or proper documentation. While many techniques target API misuse detection, only limited efforts have been put into automatically repairing API misuses. In this paper, we present our advances on our technique API-Specific Automated Program Repair (ASAP-Repair). ASAP-Repair is intended to fix API misuses based on API Usage Graphs (AUGs) by leveraging API usage templates of state-of-the-art API misuse detectors. We demonstrate that ASAP-Repair is in principle applicable on an established API misuse dataset. Moreover, we discuss next steps and challenges to evolve ASAP-Repair towards a full-fledged Automatic Program Repair (APR) technique.
Submitted 12 February, 2024;
originally announced February 2024.
-
Guided Reconstruction with Conditioned Diffusion Models for Unsupervised Anomaly Detection in Brain MRIs
Authors:
Finn Behrendt,
Debayan Bhattacharya,
Robin Mieling,
Lennart Maack,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
The application of supervised models to clinical screening tasks is challenging due to the need for annotated data for each considered pathology. Unsupervised Anomaly Detection (UAD) is an alternative approach that aims to identify any anomaly as an outlier from a healthy training distribution. A prevalent strategy for UAD in brain MRI involves using generative models to learn the reconstruction of healthy brain anatomy for a given input image. As these models should fail to reconstruct unhealthy structures, the reconstruction errors indicate anomalies. However, a significant challenge is to balance the accurate reconstruction of healthy anatomy and the undesired replication of abnormal structures. While diffusion models have shown promising results with detailed and accurate reconstructions, they face challenges in preserving intensity characteristics, resulting in false positives. We propose conditioning the denoising process of diffusion models with additional information derived from a latent representation of the input image. We demonstrate that this conditioning allows for accurate and local adaptation to the general input intensity distribution while avoiding the replication of unhealthy structures. We compare the novel approach to different state-of-the-art methods and for different data sets. Our results show substantial improvements in the segmentation performance, with the Dice score improved by 11.9%, 20.0%, and 44.6%, for the BraTS, ATLAS and MSLUB data sets, respectively, while maintaining competitive performance on the WMH data set. Furthermore, our results indicate effective domain adaptation across different MRI acquisitions and simulated contrasts, an important attribute for general anomaly detection methods. The code for our work is available at https://github.com/FinnBehrendt/Conditioned-Diffusion-Models-UAD
Submitted 23 January, 2025; v1 submitted 7 December, 2023;
originally announced December 2023.
-
Patched Diffusion Models for Unsupervised Anomaly Detection in Brain MRI
Authors:
Finn Behrendt,
Debayan Bhattacharya,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
The use of supervised deep learning techniques to detect pathologies in brain MRI scans can be challenging due to the diversity of brain anatomy and the need for annotated data sets. An alternative approach is to use unsupervised anomaly detection, which only requires sample-level labels of healthy brains to create a reference representation. This reference representation can then be compared to unhealthy brain anatomy in a pixel-wise manner to identify abnormalities. To accomplish this, generative models are needed to create anatomically consistent MRI scans of healthy brains. While recent diffusion models have shown promise in this task, accurately generating the complex structure of the human brain remains a challenge. In this paper, we propose a method that reformulates the generation task of diffusion models as a patch-based estimation of healthy brain anatomy, using spatial context to guide and improve reconstruction. We evaluate our approach on data of tumors and multiple sclerosis lesions and demonstrate a relative improvement of 25.1% compared to existing baselines.
Submitted 7 March, 2023;
originally announced March 2023.
-
Never Skip Leg Day Again: Training the Lower Body with Vertical Jumps in a Virtual Reality Exergame
Authors:
Sebastian Cmentowski,
Sukran Karaosmanoglu,
Lennart Nacke,
Frank Steinicke,
Jens Krüger
Abstract:
Virtual Reality (VR) exergames can increase engagement in and motivation for physical activities. Most VR exergames focus on the upper body because many VR setups only track the users' heads and hands. To become a serious alternative to existing exercise programs, VR exergames must provide a balanced workout and train the lower limbs, too. To address this issue, we built a VR exergame focused on vertical jump training to explore full-body exercise applications. To create a safe and effective training, nine domain experts participated in our prototype design. Our mixed-methods study confirms that the jump-centered exercises provided a worthy challenge and positive player experience, indicating long-term retention. Based on our findings, we present five design implications to guide future work: avoid an unintended forward drift, consider technical constraints, address safety concerns in full-body VR exergames, incorporate rhythmic elements with fluent movement patterns, and adapt difficulty to players' fitness progression status.
Submitted 6 March, 2024; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Data-Efficient Vision Transformers for Multi-Label Disease Classification on Chest Radiographs
Authors:
Finn Behrendt,
Debayan Bhattacharya,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
Radiographs are a versatile diagnostic tool for the detection and assessment of pathologies, for treatment planning or for navigation and localization purposes in clinical interventions. However, their interpretation and assessment by radiologists can be tedious and error-prone. Thus, a wide variety of deep learning methods have been proposed to support radiologists in interpreting radiographs. Mostly, these approaches rely on convolutional neural networks (CNN) to extract features from images. Especially for the multi-label classification of pathologies on chest radiographs (Chest X-Rays, CXR), CNNs have proven to be well suited. In contrast, Vision Transformers (ViTs) have not been applied to this task despite their high classification performance on generic images and interpretable local saliency maps which could add value to clinical interventions. ViTs do not rely on convolutions but on patch-based self-attention and, in contrast to CNNs, no prior knowledge of local connectivity is present. While this leads to increased capacity, ViTs typically require an excessive amount of training data, which represents a hurdle in the medical domain as high costs are associated with collecting large medical data sets. In this work, we systematically compare the classification performance of ViTs and CNNs for different data set sizes and evaluate more data-efficient ViT variants (DeiT). Our results show that while the performance between ViTs and CNNs is on par with a small benefit for ViTs, DeiTs outperform the former if a reasonably large data set is available for training.
Submitted 17 August, 2022;
originally announced August 2022.
-
Automated Change Rule Inference for Distance-Based API Misuse Detection
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Developers build on Application Programming Interfaces (APIs) to reuse existing functionalities of code libraries. Despite the benefits of reusing established libraries (e.g., time savings, high quality), developers may diverge from the API's intended usage; potentially causing bugs or, more specifically, API misuses. Recent research focuses on developing techniques to automatically detect API misuses, but many suffer from a high false-positive rate. In this article, we improve on this situation by proposing ChaRLI (Change RuLe Inference), a technique for automatically inferring change rules from developers' fixes of API misuses based on API Usage Graphs (AUGs). By subsequently applying graph-distance algorithms, we use change rules to discriminate API misuses from correct usages. This allows developers to reuse others' fixes of an API misuse at other code locations in the same or another project. We evaluated the ability of change rules to detect API misuses based on three datasets and found that the best mean relative precision (i.e., for testable usages) ranges from 77.1 % to 96.1 % while the mean recall ranges from 0.007 % to 17.7 % for individual change rules. These results underpin that ChaRLI and our misuse detection are helpful complements to existing API misuse detectors.
Submitted 14 July, 2022;
originally announced July 2022.
-
Unsupervised Anomaly Detection in 3D Brain MRI using Deep Learning with impured training data
Authors:
Finn Behrendt,
Marcel Bengs,
Frederik Rogge,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
The detection of lesions in magnetic resonance imaging (MRI)-scans of human brains remains challenging, time-consuming and error-prone. Recently, unsupervised anomaly detection (UAD) methods have shown promising results for this task. These methods rely on training data sets that solely contain healthy samples. Compared to supervised approaches, this significantly reduces the need for an extensive amount of labeled training data. However, data labelling remains error-prone. We study how unhealthy samples within the training data affect anomaly detection performance for brain MRI-scans. For our evaluations, we consider three publicly available data sets and use autoencoders (AE) as a well-established baseline method for UAD. We systematically evaluate the effect of impured training data by injecting different quantities of unhealthy samples to our training set of healthy samples from T1-weighted MRI-scans. We evaluate a method to identify falsely labeled samples directly during training based on the reconstruction error of the AE. Our results show that training with impured data decreases the UAD performance notably even with few falsely labeled samples. By performing outlier removal directly during training based on the reconstruction-loss, we demonstrate that falsely labeled data can be detected and removed to mitigate the effect of falsely labeled data. Overall, we highlight the importance of clean data sets for UAD in brain MRI and demonstrate an approach for detecting falsely labeled data directly during training.
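The idea of removing suspicious samples during training based on their reconstruction loss can be sketched as follows. This is a minimal illustration only: the abstract does not state the exact rejection rule, so the median/MAD threshold, the function name `filter_outliers`, and the cutoff `k` are assumptions made here, not the authors' method.

```python
from statistics import median

def filter_outliers(losses, k=3.0):
    """Return indices of batch samples kept for the update step.

    losses: per-sample reconstruction losses of the autoencoder (hypothetical input).
    k: assumed cutoff, in units of the median absolute deviation (MAD).
    Samples whose loss lies more than k MADs above the batch median are dropped,
    on the assumption that unhealthy scans reconstruct worse than healthy ones.
    """
    med = median(losses)
    mad = median(abs(loss - med) for loss in losses)
    if mad == 0:
        # All losses are (almost) identical: nothing stands out, keep everything.
        return list(range(len(losses)))
    return [i for i, loss in enumerate(losses) if (loss - med) / mad <= k]

# Four typical losses and one sample that reconstructs far worse than the rest.
batch_losses = [0.10, 0.12, 0.11, 0.09, 0.95]
kept = filter_outliers(batch_losses)
```

In a training loop, only the samples indexed by `kept` would contribute to the loss that is backpropagated for that batch.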
Submitted 12 April, 2022;
originally announced April 2022.
-
A Laboratory Experiment on Using Different Financial-Incentivization Schemes in Software-Engineering Experimentation
Authors:
Dmitri Bershadskyy,
Jacob Krüger,
Gül Çalıklı,
Siegmar Otto,
Sarah Zabel,
Jannik Greif,
Robert Heyer
Abstract:
In software-engineering research, many empirical studies are conducted with open-source or industry developers. However, in contrast to other research communities like economics or psychology, only a few experiments use financial incentives (i.e., paying money) as a strategy to motivate participants' behavior and reward their performance. The most recent version of the SIGSOFT Empirical Standards mentions payouts only for increasing participation in surveys, but not for mimicking real-world motivations and behavior in experiments. Within this article, we report a controlled experiment in which we tackled this gap by studying how different financial incentivization schemes impact developers. For this purpose, we first conducted a survey on financial incentives used in the real world, based on which we designed three incentivization schemes: (1) a performance-dependent scheme that employees prefer, (2) a scheme that is performance-independent, and (3) a scheme that mimics open-source development. Then, using a between-subject experimental design, we explored how these three schemes impact participants' performance. Our findings indicate that the different schemes can impact participants' performance in software-engineering experiments. Due to the small sample sizes, our results are not statistically significant, but we can still observe clear tendencies. Our contributions help understand the impact of financial incentives on participants in experiments as well as real-world scenarios, guiding researchers in designing experiments and organizations in compensating developers.
Submitted 16 September, 2024; v1 submitted 22 February, 2022;
originally announced February 2022.
-
Towards High-Payload Admittance Control for Manual Guidance with Environmental Contact
Authors:
Kevin Haninger,
Marcel Radke,
Axel Vick,
Jörg Krüger
Abstract:
Force control enables hands-on teaching and physical collaboration, with the potential to improve ergonomics and flexibility of automation. Established methods for the design of compliance, impedance control, and collision response can achieve free-space stability and acceptable peak contact force on lightweight, lower-payload robots. Scaling collaboration to higher payloads can allow new applications, but introduces challenges due to the more significant payload dynamics and the use of higher-payload industrial robots.
To achieve high-payload manual guidance with contact, this paper proposes and validates new mechatronic design methods: standard admittance control is extended with damping feedback, compliant structures are integrated to the environment, and a contact response method which allows continuous admittance control is proposed. These methods are compared with respect to free-space stability, contact stability, and peak contact force. The resulting methods are then applied to realize two contact-rich tasks on a 16 kg payload (peg in hole and slot assembly) and free-space co-manipulation of a 50 kg payload.
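The standard admittance control that the abstract takes as its starting point can be sketched as a virtual mass-damper driven by the measured external force, M·dv/dt + D·v = F_ext. The snippet below is a one-axis illustration with explicit Euler integration; the function name, parameter values, and time step are assumptions chosen for the example, and the paper's damping-feedback extension and contact response method are not reproduced here.

```python
def admittance_step(velocity, f_ext, mass, damping, dt):
    """Advance the commanded velocity by one control cycle.

    Implements the virtual dynamics  mass * dv/dt + damping * v = f_ext
    with explicit Euler integration (an assumed discretization).
    """
    acceleration = (f_ext - damping * velocity) / mass
    return velocity + acceleration * dt

# Hypothetical parameters: 10 kg virtual mass, 50 Ns/m virtual damping, 1 kHz loop.
MASS, DAMPING, DT = 10.0, 50.0, 0.001

v = 0.0
for _ in range(5000):  # 5 s of a constant 20 N push by the operator
    v = admittance_step(v, 20.0, MASS, DAMPING, DT)
# v approaches the steady-state velocity f_ext / damping = 0.4 m/s
```

A lower virtual damping makes the robot feel lighter to guide but leaves less margin when the tool touches a stiff environment, which is the trade-off the compared methods address.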
Submitted 2 February, 2022;
originally announced February 2022.
-
Unsupervised Anomaly Detection in 3D Brain MRI using Deep Learning with Multi-Task Brain Age Prediction
Authors:
Marcel Bengs,
Finn Behrendt,
Max-Heinrich Laves,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
Lesion detection in brain Magnetic Resonance Images (MRIs) remains a challenging task. MRIs are typically read and interpreted by domain experts, which is a tedious and time-consuming process. Recently, unsupervised anomaly detection (UAD) in brain MRI with deep learning has shown promising results to provide a quick, initial assessment. So far, these methods only rely on the visual appearance of healthy brain anatomy for anomaly detection. Another biomarker for abnormal brain development is the deviation between the brain age and the chronological age, which is unexplored in combination with UAD. We propose deep learning for UAD in 3D brain MRI considering additional age information. We analyze the value of age information during training, as an additional anomaly score, and systematically study several architecture concepts. Based on our analysis, we propose a novel deep learning approach for UAD with multi-task age prediction. We use clinical T1-weighted MRIs of 1735 healthy subjects and the publicly available BraTS 2019 data set for our study. Our novel approach significantly improves UAD performance with an AUC of 92.60% compared to an AUC of 84.37% using previous approaches without age information.
Submitted 31 January, 2022;
originally announced January 2022.
-
Contact Information Flow and Design of Compliance
Authors:
Kevin Haninger,
Marcel Radke,
Richard Hartisch,
Jörg Krüger
Abstract:
Identifying changes in contact during contact-rich manipulation can detect task state or errors, enabling improved robustness and autonomy. The ability to detect contact is affected by the mechatronic design of the robot, especially its physical compliance. Established methods can design physical compliance for many aspects of contact performance (e.g. peak contact force, motion/force control bandwidth), but are based on time-invariant dynamic models. A change in contact mode is a discrete change in coupled robot-environment dynamics, not easily considered in existing design methods. Towards designing robots which can robustly detect changes in contact mode online, this paper investigates how mechatronic design can improve contact estimation, with a focus on the impact of the location and degree of compliance. A design metric of information gain is proposed which measures how much position/force measurements reduce uncertainty in the contact mode estimate. This information gain is developed for fully- and partially-observed systems, as partial observability can arise from joint flexibility in the robot or environmental inertia. Hardware experiments with various compliant setups validate that information gain predicts the speed and certainty with which contact is detected in (i) monitoring of contact-rich assembly and (ii) collision detection.
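One way to read "information gain" in this setting is as the reduction in entropy of a belief over discrete contact modes after a measurement is taken into account. The sketch below illustrates that reading only; the two-mode setup, the likelihood values, and all function names are assumptions for illustration, and the paper's metric for fully and partially observed systems may be defined differently.

```python
import math

def entropy(belief):
    """Shannon entropy in bits of a discrete belief over contact modes."""
    return -sum(p * math.log2(p) for p in belief if p > 0)

def posterior(prior, likelihoods):
    """Bayes update: likelihoods[i] is p(measurement | contact mode i)."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

def information_gain(prior, likelihoods):
    """Entropy reduction (bits) in the contact-mode belief from one measurement."""
    return entropy(prior) - entropy(posterior(prior, likelihoods))

prior = [0.5, 0.5]  # hypothetical modes: "free space" and "in contact"

# A measurement that only one mode can explain resolves the mode completely.
gain_informative = information_gain(prior, [1.0, 0.0])

# A measurement that both modes explain equally well carries no information.
gain_uninformative = information_gain(prior, [0.3, 0.3])
```

Under this reading, a mechatronic design that makes the force or position readings of different contact modes easier to tell apart yields likelihoods that differ more strongly, and therefore a larger gain per measurement.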
Submitted 16 March, 2022; v1 submitted 24 October, 2021;
originally announced October 2021.
-
3-Dimensional Deep Learning with Spatial Erasing for Unsupervised Anomaly Segmentation in Brain MRI
Authors:
Marcel Bengs,
Finn Behrendt,
Julia Krüger,
Roland Opfer,
Alexander Schlaefer
Abstract:
Purpose. Brain Magnetic Resonance Images (MRIs) are essential for the diagnosis of neurological diseases. Recently, deep learning methods for unsupervised anomaly detection (UAD) have been proposed for the analysis of brain MRI. These methods rely on healthy brain MRIs and eliminate the requirement of pixel-wise annotated data compared to supervised deep learning. While a wide range of methods for UAD have been proposed, these methods are mostly 2D and only learn from MRI slices, disregarding that brain lesions are inherently 3D and the spatial context of MRI volumes remains unexploited.
Methods. We investigate whether the increased spatial context of MRI volumes, combined with spatial erasing, leads to improved unsupervised anomaly segmentation performance compared to learning from slices. We evaluate and compare 2D variational autoencoders (VAEs) with their 3D counterparts, propose 3D input erasing, and systematically study the impact of the data set size on the performance.
Results. Using two publicly available segmentation data sets for evaluation, 3D VAE outperform their 2D counterpart, highlighting the advantage of volumetric context. Also, our 3D erasing methods allow for further performance improvements. Our best performing 3D VAE with input erasing leads to an average DICE score of 31.40% compared to 25.76% for the 2D VAE.
Conclusions. We propose 3D deep learning methods for UAD in brain MRI combined with 3D erasing and demonstrate that 3D methods clearly outperform their 2D counterpart for anomaly segmentation. Also, our spatial erasing method allows for further performance improvements and reduces the requirement for large data sets.
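For readers unfamiliar with the DICE score reported above, it measures the overlap between a predicted and a reference segmentation as 2|A ∩ B| / (|A| + |B|). The following sketch computes it on flattened binary masks; the function name and the toy masks are illustrative, and the convention of returning 1.0 when both masks are empty is an assumption, not something stated in the abstract.

```python
def dice_score(pred, target):
    """Dice coefficient of two binary masks given as flat sequences of 0/1."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Toy example: one of the two predicted voxels overlaps the two reference voxels.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
score = dice_score(predicted, reference)  # 2 * 1 / (2 + 2)
```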
Submitted 14 September, 2021;
originally announced September 2021.
-
An Experimental Analysis of Graph-Distance Algorithms for Comparing API Usages
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Modern software development heavily relies on the reuse of functionalities through Application Programming Interfaces (APIs). However, client developers can have issues identifying the correct usage of a certain API, causing misuses accompanied by software crashes or usability bugs. Therefore, researchers have aimed at identifying API misuses automatically by comparing client code usages to correct API usages. Some techniques rely on certain API-specific graph-based data structures to improve the abstract representation of API usages. Such techniques need to compare graphs, for instance, by computing distance metrics based on the minimal graph edit distance or the largest common subgraphs, whose computations are known to be NP-hard problems. Fortunately, there exist many abstractions for simplifying graph distance computation. However, their applicability for comparing graph representations of API usages has not been analyzed. In this paper, we provide a comparison of different distance algorithms of API-usage graphs regarding correctness and runtime. Particularly, correctness relates to the algorithms' ability to identify similar correct API usages, but also to discriminate similar correct and false usages as well as non-similar usages. For this purpose, we systematically identified a set of eight graph-based distance algorithms and applied them on two datasets of real-world API usages and misuses. Interestingly, our results suggest that existing distance algorithms are not reliable for comparing API usage graphs. To improve on this situation, we identified and discuss the algorithms' issues, based on which we formulate hypotheses to initiate research on overcoming them.
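To make the notion of a cheap abstraction for graph distance concrete, the sketch below compares two API usage graphs by the Jaccard distance of their labeled edge sets, which avoids the NP-hard exact graph edit distance. It is an illustration only and is not claimed to be one of the eight algorithms evaluated in the paper; the edge encoding and the example usages are hypothetical.

```python
def edge_jaccard_distance(edges_a, edges_b):
    """Jaccard distance between two graphs given as sets of labeled edges.

    Each edge is a (source_label, target_label, edge_type) tuple.
    Returns 0.0 for identical edge sets and 1.0 for disjoint ones.
    """
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - len(a & b) / len(union)

# Hypothetical usage graphs for iterating over a Java list.
correct_usage = {
    ("List.iterator", "Iterator.hasNext", "order"),
    ("Iterator.hasNext", "Iterator.next", "control"),
    ("List.iterator", "Iterator.next", "order"),
}
# Misuse: next() is called without the guarding hasNext() check.
misuse = {
    ("List.iterator", "Iterator.next", "order"),
}

distance = edge_jaccard_distance(correct_usage, misuse)  # 1 - 1/3
```

Because such a measure ignores graph structure beyond individual edges, it can rate a misuse and a correct usage as close whenever they share most calls, which is the kind of reliability issue the abstract reports for existing distance algorithms.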
Submitted 27 August, 2021;
originally announced August 2021.
-
Effects of Task Type and Wall Appearance on Collision Behavior in Virtual Environments
Authors:
Sebastian Cmentowski,
Jens Krüger
Abstract:
Driven by the games community, virtual reality setups have lately evolved into affordable and consumer-ready mobile headsets. However, despite these promising improvements, it remains challenging to convey immersive and engaging VR games as players are usually limited to experience the virtual world by vision and hearing only. One prominent example of such open challenges is the disparity between the real surroundings and the virtual environment. As virtual obstacles usually do not have a physical counterpart, players might walk through walls enclosing the level. Thus, past research mainly focussed on multisensory collision feedback to deter players from ignoring obstacles. However, the underlying causative reasons for such unwanted behavior have mostly remained unclear.
Our work investigates how task types and wall appearances influence the players' incentives to walk through virtual walls. Therefore, we conducted a user study, confronting the participants with different task motivations and walls of varying opacity and realism. Our evaluation reveals that players generally adhere to realistic behavior, as long as the experience feels interesting and diverse. Furthermore, we found that opaque walls excel in deterring subjects from cutting short, whereas different degrees of realism had no significant influence on walking trajectories. Finally, we use collected player feedback to discuss individual reasons for the observed behavior.
Submitted 6 March, 2024; v1 submitted 18 July, 2021;
originally announced July 2021.
-
"I Packed My Bag and in It I Put...": A Taxonomy of Inventory Systems for Virtual Reality Games
Authors:
Sebastian Cmentowski,
Andrey Krekhov,
Jens Krüger
Abstract:
On a journey, a backpack is a perfect place to store and organize the necessary provisions and tools. Similarly, carrying and managing items is a central part of most digital games, providing significant prospects for the player experience. Even though VR games are gradually becoming more mature, most of them still avoid this essential feature. Some of the reasons for this deficit are the additional requirements and challenges that VR imposes on developers to achieve a compelling user experience. We structure the ample design space of VR inventories by analyzing popular VR games and developing a structural taxonomy. We combine our insights with feedback from game developers to identify the essential building blocks and design choices. Finally, we propose meaningful design implications and demonstrate the practical use of our work in action.
Submitted 6 March, 2024; v1 submitted 18 July, 2021;
originally announced July 2021.
-
HALF: Holistic Auto Machine Learning for FPGAs
Authors:
Jonas Ney,
Dominik Loroch,
Vladimir Rybalkin,
Nico Weber,
Jens Krüger,
Norbert Wehn
Abstract:
Deep Neural Networks (DNNs) are capable of solving complex problems in domains related to embedded systems, such as image and natural language processing. To efficiently implement DNNs on a specific FPGA platform for a given cost criterion, e.g. energy efficiency, an enormous number of design parameters has to be considered, from the topology down to the final hardware implementation. Interdependencies between the different design layers have to be taken into account and explored efficiently, making it nearly impossible to find optimized solutions manually. An automatic, holistic design approach can improve the quality of DNN implementations on FPGAs significantly. To this end, we present a cross-layer design space exploration methodology. It comprises optimizations starting from a hardware-aware topology search for DNNs down to the final optimized implementation for a given FPGA platform. The methodology is implemented in our Holistic Auto machine Learning for FPGAs (HALF) framework, which combines an evolutionary search algorithm, various optimization steps and a library of parametrizable hardware DNN modules. HALF automates both the exploration process and the implementation of optimized solutions on a target FPGA platform for various applications. We demonstrate the performance of HALF on a medical use case for arrhythmia detection for three different design goals, i.e. low-energy, low-power and high-throughput respectively. Our FPGA implementation outperforms a TensorRT optimized model on an Nvidia Jetson platform in both throughput and energy consumption.
Submitted 20 October, 2021; v1 submitted 28 June, 2021;
originally announced June 2021.
-
AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Many developers and organizations implement apps for Android, the most widely used operating system for mobile devices. Common problems developers face are the various hardware devices, customized Android variants, and frequent updates, forcing them to implement workarounds for the different versions and variants of Android APIs used in practice. In this paper, we contribute the Android Compatibility checkS dataSet (AndroidCompass) that comprises changes to compatibility checks developers use to enforce workarounds for specific Android versions in their apps. We extracted 80,324 changes to compatibility checks from 1,394 apps by analyzing the version histories of 2,399 projects from the F-Droid catalog. With AndroidCompass, we aim to provide data on when and how developers introduced or evolved workarounds to handle Android incompatibilities. We hope that AndroidCompass fosters research to deal with version incompatibilities, address potential design flaws, identify security concerns, and help derive solutions for other developers, among others, helping researchers to develop and evaluate novel techniques, and supporting Android app as well as operating-system developers in engineering their software.
Submitted 17 March, 2021;
originally announced March 2021.
-
Towards Sneaking as a Playful Input Modality for Virtual Environments
Authors:
Sebastian Cmentowski,
Andrey Krekhov,
André Zenner,
Daniel Kucharski,
Jens Krüger
Abstract:
Using virtual reality setups, users can fade out of their surroundings and dive fully into a thrilling and appealing virtual environment. The success of such immersive experiences depends heavily on natural and engaging interactions with the virtual world. As developers tend to focus on intuitive hand controls, other aspects of the broad range of full-body capabilities are easily left vacant. One repeatedly overlooked input modality is the user's gait. Even though users may walk physically to explore the environment, it usually does not matter how they move. However, gait-based interactions, using the variety of information contained in human gait, could offer interesting benefits for immersive experiences. For instance, stealth VR-games could profit from this additional range of interaction fidelity in the form of a sneaking-based input modality. In our work, we explore the potential of sneaking as a playful input modality for virtual environments. Therefore, we discuss possible sneaking-based gameplay mechanisms and develop three technical approaches, including precise foot-tracking and two abstraction levels. Our evaluation reveals the potential of sneaking-based interactions in IVEs, offering unique challenges and thrilling gameplay. For these interactions, precise tracking of individual footsteps is unnecessary, as a more abstract approach focusing on the players' intention offers the same experience while providing better comprehensible feedback. Based on these findings, we discuss the broader potential and individual strengths of our gait-centered interactions.
Submitted 6 March, 2024; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Deadeye: A Novel Preattentive Visualization Technique Based on Dichoptic Presentation
Authors:
Andrey Krekhov,
Jens Krueger
Abstract:
Preattentive visual features such as hue or flickering can effectively draw attention to an object of interest -- for instance, an important feature in a scientific visualization. These features appear to pop out and can be recognized by our visual system, independently from the number of distractors. Most cues do not take advantage of the fact that most humans have two eyes. In cases where binocular vision is applied, it is almost exclusively used to convey depth by exposing stereo pairs. We present Deadeye, a novel preattentive visualization technique based on presenting different stimuli to each eye. The target object is rendered for one eye only and is instantly detected by our visual system. In contrast to existing cues, Deadeye does not modify any visual properties of the target and, thus, is particularly suited for visualization applications. Our evaluation confirms that Deadeye is indeed perceived preattentively. We also explore a conjunction search based on our technique and show that, in contrast to 3D depth, the task cannot be processed in parallel.
Submitted 18 January, 2021;
originally announced January 2021.
-
Streaming VR Games to the Broad Audience: A Comparison of the First-Person and Third-Person Perspectives
Authors:
Katharina Emmerich,
Andrey Krekhov,
Sebastian Cmentowski,
Jens Krueger
Abstract:
The spectatorship experience for virtual reality (VR) games differs strongly from its non-VR precursor. When watching non-VR games on platforms such as Twitch, spectators just see what the player sees, as the physical interaction is mostly unimportant for the overall impression. In VR, the immersive full-body interaction is a crucial part of the player experience. Hence, content creators, such as streamers, often rely on green screens or similar solutions to offer a mixed-reality third-person view to disclose their full-body actions. Our work compares the most popular realizations of the first-person and the third-person perspective in an online survey (N=217) with three different VR games. Contrary to the current trend to stream in third-person, our key result is that most viewers prefer the first-person version, which they attribute mostly to the better focus on in-game actions and higher involvement. Based on the study insights, we provide design recommendations for both perspectives.
Submitted 12 January, 2021;
originally announced January 2021.
-
Towards Learning Controllable Representations of Physical Systems
Authors:
Kevin Haninger,
Raul Vicente Garcia,
Joerg Krueger
Abstract:
Learned representations of dynamical systems reduce dimensionality, potentially supporting downstream reinforcement learning (RL). However, no established methods predict a representation's suitability for control and evaluation is largely done via downstream RL performance, slowing representation design. Towards a principled evaluation of representations for control, we consider the relationship between the true state and the corresponding representations, proposing that ideally each representation corresponds to a unique true state. This motivates two metrics: temporal smoothness and high mutual information between true state/representation. These metrics are related to established representation objectives, and studied on Lagrangian systems where true state, information requirements, and statistical properties of the state can be formalized for a broad class of systems. These metrics are shown to predict reinforcement learning performance in a simulated peg-in-hole task when comparing variants of autoencoder-based representations.
Submitted 24 November, 2020; v1 submitted 16 November, 2020;
originally announced November 2020.
-
Playing With Friends -- The Importance of Social Play During the COVID-19 Pandemic
Authors:
Sebastian Cmentowski,
Jens Krüger
Abstract:
In early 2020, the virus SARS-CoV-2 evolved into a new pandemic, forcing governments worldwide to establish social distancing measures. Consequently, people had to switch to online media, such as social networks or videotelephony, to keep in touch with friends and family. In this context, online games, combining entertainment with social interactions, also experienced notable growth. In our work, we focused on the potential of games as a replacement for social contacts in the COVID-19 crisis. Our online survey results indicate that the value of games for social needs depends on individual gaming habits. Participants playing mostly multiplayer games increased their playtime and mentioned social play as a key motivator. In contrast, non-players were not motivated to add games as communication channels. We deduce that such crises mainly catalyze existing gaming habits.
Submitted 6 March, 2024; v1 submitted 31 October, 2020;
originally announced November 2020.
-
Silhouette Games: An Interactive One-Way Mirror Approach to Watching Players in VR
Authors:
Andrey Krekhov,
Daniel Preuß,
Sebastian Cmentowski,
Jens Krüger
Abstract:
Watching others play is a key ingredient of digital games and an important aspect of games user research. However, spectatorship is not very popular in virtual reality, as such games strongly rely on one's feelings of presence. In other words, the head-mounted display creates a barrier between the player and the audience. We contribute an alternative watching approach consisting of two major components: a dynamic view frustum that renders the game scene from the current spectator position and a one-way mirror in front of the screen. This mirror, together with our silhouetting algorithm, allows seeing the player's reflection at the correct position in the virtual world. An exploratory survey emphasizes the overall positive experience of the viewers in our setup. In particular, the participants enjoyed their ability to explore the virtual surroundings via physical repositioning and to observe the blended player during object manipulations. Apart from requesting a larger screen, the participants expressed a strong need to interact with the player. Consequently, we suggest utilizing our technology as a foundation for novel playful experiences with the overarching goal of transforming the passive spectator into a collocated player.
Submitted 6 August, 2020;
originally announced August 2020.
-
Multiple Sclerosis Lesion Activity Segmentation with Attention-Guided Two-Path CNNs
Authors:
Nils Gessert,
Julia Krüger,
Roland Opfer,
Ann-Christin Ostwaldt,
Praveena Manogaran,
Hagen H. Kitzler,
Sven Schippling,
Alexander Schlaefer
Abstract:
Multiple sclerosis is an inflammatory autoimmune demyelinating disease that is characterized by lesions in the central nervous system. Typically, magnetic resonance imaging (MRI) is used for tracking disease progression. Automatic image processing methods can be used to segment lesions and derive quantitative lesion parameters. So far, methods have focused on lesion segmentation for individual MRI scans. However, for monitoring disease progression, lesion activity in terms of new and enlarging lesions between two time points is a crucial biomarker. For this problem, several classic methods have been proposed, e.g., using difference volumes. Despite their success for single-volume lesion segmentation, deep learning approaches are still rare for lesion activity segmentation. In this work, convolutional neural networks (CNNs) are studied for lesion activity segmentation from two time points. For this task, CNNs are designed and evaluated that combine the information from the two time points in different ways. In particular, two-path architectures with attention-guided interactions are proposed that enable effective information exchange between the two time points' processing paths. It is demonstrated that deep learning-based methods outperform classic approaches, and it is shown that attention-guided interactions significantly improve performance. Furthermore, the attention modules produce plausible attention maps that have a masking effect that suppresses old, irrelevant lesions. A lesion-wise false positive rate of 26.4% is achieved at a true positive rate of 74.2%, which is not significantly different from the interrater performance.
Submitted 5 August, 2020;
originally announced August 2020.
-
Uncertainty Quantification in Deep Residual Neural Networks
Authors:
Lukasz Wandzik,
Raul Vicente Garcia,
Jörg Krüger
Abstract:
Uncertainty quantification is an important and challenging problem in deep learning. Previous methods rely on dropout layers, which are not present in modern deep architectures, or on batch normalization, which is sensitive to batch sizes. In this work, we address the problem of uncertainty quantification in deep residual networks by using a regularization technique called stochastic depth. We show that training residual networks using stochastic depth can be interpreted as a variational approximation to the intractable posterior over the weights in Bayesian neural networks. We demonstrate that by sampling from a distribution of residual networks with varying depth and shared weights, meaningful uncertainty estimates can be obtained. Moreover, compared to the original formulation of residual networks, our method produces well-calibrated softmax probabilities with only minor changes to the network's structure. We evaluate our approach on popular computer vision datasets and measure the quality of uncertainty estimates. We also test the robustness to domain shift and show that our method is able to express higher predictive uncertainty on out-of-distribution samples. Finally, we demonstrate how the proposed approach could be used to obtain uncertainty estimates in facial verification applications.
Submitted 9 July, 2020;
originally announced July 2020.
-
A Discrete Probabilistic Approach to Dense Flow Visualization
Authors:
Daniel Preuß,
Tino Weinkauf,
Jens Krüger
Abstract:
Dense flow visualization is a popular visualization paradigm. Traditionally, the various models and methods in this area use a continuous formulation, resting upon the solid foundation of functional analysis. In this work, we examine a discrete formulation of dense flow visualization. From probability theory, we derive a similarity matrix that measures the similarity between different points in the flow domain, leading to the discovery of a whole new class of visualization models. Using this matrix, we propose a novel visualization approach consisting of the computation of spectral embeddings, i.e., characteristic domain maps, defined by particle mixture probabilities. These embeddings are scalar fields that give insight into the mixing processes of the flow on different scales. The approach of spectral embeddings is already well studied in image segmentation, and we see that spectral embeddings are connected to Fourier expansions and frequencies. We showcase the utility of our method using different 2D and 3D flows.
Submitted 3 July, 2020;
originally announced July 2020.
-
4D Deep Learning for Multiple Sclerosis Lesion Activity Segmentation
Authors:
Nils Gessert,
Marcel Bengs,
Julia Krüger,
Roland Opfer,
Ann-Christin Ostwaldt,
Praveena Manogaran,
Sven Schippling,
Alexander Schlaefer
Abstract:
Multiple sclerosis lesion activity segmentation is the task of detecting new and enlarging lesions that appeared between a baseline and a follow-up brain MRI scan. While deep learning methods for single-scan lesion segmentation are common, deep learning approaches for lesion activity have only been proposed recently. Here, a two-path architecture processes two 3D MRI volumes from two time points. In this work, we investigate whether extending this problem to full 4D deep learning using a history of MRI volumes and thus an extended baseline can improve performance. For this purpose, we design a recurrent multi-encoder-decoder architecture for processing 4D data. We find that adding more temporal information is beneficial and our proposed architecture outperforms previous approaches with a lesion-wise true positive rate of 0.84 at a lesion-wise false positive rate of 0.19.
Submitted 29 May, 2020; v1 submitted 20 April, 2020;
originally announced April 2020.
-
Toward a Taxonomy of Inventory Systems for Virtual Reality Games
Authors:
Sebastian Cmentowski,
Andrey Krekhov,
Ann-Marie Müller,
Jens Krüger
Abstract:
Virtual reality (VR) games are gradually becoming more elaborate and feature-rich, but fail to reach the complexity of traditional digital games. One common feature that is used to extend and organize complex gameplay is the in-game inventory, which allows players to obtain and carry new tools and items throughout their journey. However, VR imposes additional requirements and challenges that impede the implementation of this important feature and prevent games from unleashing their full potential. Our current work focuses on the design space of inventories in VR games. We introduce this sparsely researched topic by constructing a first taxonomy of the underlying design considerations and building blocks. Furthermore, we present three different inventories that were designed using our taxonomy and evaluate them in an early qualitative study. The results underline the importance of our research and reveal promising insights that show the huge potential for VR games.
Submitted 6 March, 2024; v1 submitted 9 August, 2019;
originally announced August 2019.
-
Outstanding: A Multi-Perspective Travel Approach for Virtual Reality Games
Authors:
Sebastian Cmentowski,
Andrey Krekhov,
Jens Krüger
Abstract:
In virtual reality games, players dive into fictional environments and can experience a compelling and immersive world. State-of-the-art VR systems allow for natural and intuitive navigation through physical walking. However, the tracking space is still limited, and viable alternatives are required to reach further virtual destinations. Our work focuses on the exploration of vast open worlds, an area where existing local navigation approaches such as the arc-based teleport are not ideally suited and world-in-miniature techniques potentially reduce presence. We present a novel alternative for open environments: our idea is to equip players with the ability to switch from a first-person to a third-person bird's-eye perspective on demand. From above, players can command their avatar and initiate travel over large distances. Our evaluation reveals a significant increase in spatial orientation while avoiding cybersickness and preserving presence, enjoyment, and competence. We summarize our findings in a set of comprehensive design guidelines to help developers integrate our technique.
Submitted 6 March, 2024; v1 submitted 1 August, 2019;
originally announced August 2019.
-
Beyond Human: Animals as an Escape from Stereotype Avatars in Virtual Reality Games
Authors:
Andrey Krekhov,
Sebastian Cmentowski,
Katharina Emmerich,
Jens Krüger
Abstract:
Virtual reality setups are particularly suited to create a tight bond between players and their avatars up to a degree where we start perceiving the virtual representation as our own body. We hypothesize that such an illusion of virtual body ownership (IVBO) has a particularly high, yet overlooked potential for nonhumanoid avatars. To validate our claim, we use the example of three very different creatures (a scorpion, a rhino, and a bird) to explore possible avatar controls and game mechanics based on specific animal abilities. A quantitative evaluation underpins the high game enjoyment arising from embodying such nonhuman morphologies, including additional body parts and obtaining respective superhuman skills, which allows us to derive a set of novel design implications. Furthermore, the experiment reveals a correlation between IVBO and game enjoyment, which is a further indication that nonhumanoid creatures offer a meaningful design space for VR games worth further investigation.
Submitted 17 July, 2019;
originally announced July 2019.
-
The Illusion of Animal Body Ownership and Its Potential for Virtual Reality Games
Authors:
Andrey Krekhov,
Sebastian Cmentowski,
Jens Krüger
Abstract:
Virtual reality offers the unique possibility to experience a virtual representation as our own body. In contrast to previous research that predominantly studied this phenomenon for humanoid avatars, our work focuses on virtual animals. In this paper, we discuss different body tracking approaches to control creatures such as spiders or bats and the respective virtual body ownership effects. Our empirical results demonstrate that virtual body ownership is also applicable for nonhumanoids and can even outperform human-like avatars in certain cases. An additional survey confirms the general interest of people in creating such experiences and allows us to initiate a broad discussion regarding the applicability of animal embodiment for educational and entertainment purposes.
Submitted 11 July, 2019;
originally announced July 2019.
-
Integrating Visualization Literacy into Computer Graphics Education Using the Example of Dear Data
Authors:
Andrey Krekhov,
Michael Michalski,
Jens Krüger
Abstract:
The amount of visual communication we are facing is rapidly increasing, and skills to process, understand, and generate visual representations are in high demand. Especially students focusing on computer graphics and visualization can benefit from a more diverse education on visual literacy, as they often have to work on graphical representations for broad masses after their graduation. Our proposed teaching approach incorporates basic design thinking principles into traditional visualization and graphics education. Our course was inspired by the book Dear Data that was the subject of a lively discussion at the closing capstone of IEEE VIS 2017. The paper outlines our 12-week teaching experiment and summarizes the results extracted from accompanying questionnaires and interviews. In particular, we provide insights into the creation process and pain points of visualization novices, discuss the observed interplay between visualization tasks and design thinking, and finally draw design implications for visual literacy education in general.
Submitted 10 July, 2019;
originally announced July 2019.
-
Deadeye Visualization Revisited: Investigation of Preattentiveness and Applicability in Virtual Environments
Authors:
Andrey Krekhov,
Sebastian Cmentowski,
Andre Waschk,
Jens Krüger
Abstract:
Visualizations rely on highlighting to attract and guide our attention. To make an object of interest stand out independently from a number of distractors, the underlying visual cue, e.g., color, has to be preattentive. In our prior work, we introduced Deadeye as an instantly recognizable highlighting technique that works by rendering the target object for one eye only. In contrast to prior approaches, Deadeye excels by not modifying any visual properties of the target. However, in the case of 2D visualizations, the method requires an additional setup to allow dichoptic presentation, which is a considerable drawback. As a follow-up to requests from the community, this paper explores Deadeye as a highlighting technique for 3D visualizations, because such stereoscopic scenarios support dichoptic presentation out of the box. Deadeye suppresses binocular disparities for the target object, so we cannot assume the applicability of our technique as a given fact. With this motivation, the paper presents quantitative evaluations of Deadeye in VR, including configurations with multiple heterogeneous distractors as an important robustness challenge. After confirming the preserved preattentiveness (all average accuracies above 90%) under such real-world conditions, we explore VR volume rendering as an example application scenario for Deadeye. We depict a possible workflow for integrating our technique, conduct an exploratory survey to demonstrate benefits and limitations, and finally provide related design implications.
Submitted 10 July, 2019;
originally announced July 2019.
-
Fast Updates on Read-Optimized Databases Using Multi-Core CPUs
Authors:
Jens Krueger,
Changkyu Kim,
Martin Grund,
Nadathur Satish,
David Schwalb,
Jatin Chhugani,
Hasso Plattner,
Pradeep Dubey,
Alexander Zeier
Abstract:
Read-optimized columnar databases use differential updates to handle writes by maintaining a separate write-optimized delta partition, which is periodically merged with the read-optimized and compressed main partition. This merge process introduces significant overheads and unacceptable downtimes in update-intensive systems that aspire to combine transactional and analytical workloads into one system. In the first part of the paper, we report data analyses of 12 SAP Business Suite customer systems. In the second half, we present an optimized merge process reducing the merge overhead of current systems by a factor of 30. Our linear-time merge algorithm exploits the underlying high compute and bandwidth resources of modern multi-core CPUs with architecture-aware optimizations and efficient parallelization. This enables compressed in-memory column stores to handle the transactional update rate required by enterprise applications, while keeping properties of read-optimized databases for analytic-style queries.
Submitted 30 September, 2011;
originally announced September 2011.