-
Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking
Authors:
Xin Wei,
Huakun Liu,
Yutaro Hirao,
Monica Perusquia-Hernandez,
Katsutoshi Masai,
Hideaki Uchiyama,
Kiyoshi Kiyokawa
Abstract:
Refractive errors are among the most common visual impairments globally, yet their diagnosis often relies on active user participation and clinical oversight. This study explores a passive method for estimating refractive power using two eye movement recording techniques: electrooculography (EOG) and video-based eye tracking. Using a publicly available dataset recorded under varying diopter conditions, we trained Long Short-Term Memory (LSTM) models to classify refractive power from unimodal (EOG or eye tracking) and multimodal configurations. We assess performance in both subject-dependent and subject-independent settings to evaluate model personalization and generalizability across individuals. Results show that the multimodal model consistently outperforms unimodal models, achieving the highest average accuracy in both settings: 96.207% in the subject-dependent scenario and 8.882% in the subject-independent scenario. However, generalization remains limited, with classification accuracy only marginally above chance in the subject-independent evaluations. Statistical comparisons in the subject-dependent setting confirmed that the multimodal model significantly outperformed the EOG and eye-tracking models. However, no statistically significant differences were found in the subject-independent setting. Our findings demonstrate both the potential and current limitations of eye movement data-based refractive error estimation, contributing to the development of continuous, non-invasive screening methods using EOG signals and eye-tracking data.
Submitted 24 May, 2025;
originally announced May 2025.
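The two evaluation settings named in the abstract above can be illustrated with a minimal sketch. It assumes each recording is a dictionary with a `subject` key; the field names, the toy data, and the 25% test fraction are hypothetical and are not taken from the paper or its dataset.

```python
from collections import defaultdict


def subject_independent_folds(samples):
    """Yield (held_out_subject, train, test), holding out one subject per fold."""
    subjects = sorted({s["subject"] for s in samples})
    for held_out in subjects:
        train = [s for s in samples if s["subject"] != held_out]
        test = [s for s in samples if s["subject"] == held_out]
        yield held_out, train, test


def subject_dependent_split(samples, test_fraction=0.25):
    """Split every subject's own recordings into a train part and a test part."""
    by_subject = defaultdict(list)
    for s in samples:
        by_subject[s["subject"]].append(s)
    train, test = [], []
    for recordings in by_subject.values():
        # Keep at least one recording per subject for testing.
        n_test = max(1, int(len(recordings) * test_fraction))
        train.extend(recordings[:-n_test])
        test.extend(recordings[-n_test:])
    return train, test


# Toy data (hypothetical): 3 subjects with 4 recordings each.
samples = [
    {"subject": f"S{i}", "trial": t, "diopter": t % 2}
    for i in range(3)
    for t in range(4)
]

folds = list(subject_independent_folds(samples))
dep_train, dep_test = subject_dependent_split(samples)
```

In the subject-independent folds, no test subject ever appears in the training data, which is why that setting measures generalization to unseen individuals; in the subject-dependent split, every subject contributes to both parts.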
-
UMotion: Uncertainty-driven Human Motion Estimation from Inertial and Ultra-wideband Units
Authors:
Huakun Liu,
Hiroki Ota,
Xin Wei,
Yutaro Hirao,
Monica Perusquia-Hernandez,
Hideaki Uchiyama,
Kiyoshi Kiyokawa
Abstract:
Sparse wearable inertial measurement units (IMUs) have gained popularity for estimating 3D human motion. However, challenges such as pose ambiguity, data drift, and limited adaptability to diverse bodies persist. To address these issues, we propose UMotion, an uncertainty-driven, online fusing-all state estimation framework for 3D human shape and pose estimation, supported by six integrated, body-worn ultra-wideband (UWB) distance sensors with IMUs. UWB sensors measure inter-node distances to infer spatial relationships, aiding in resolving pose ambiguities and body shape variations when combined with anthropometric data. Unfortunately, IMUs are prone to drift, and UWB sensors are affected by body occlusions. Consequently, we develop a tightly coupled Unscented Kalman Filter (UKF) framework that fuses uncertainties from sensor data and estimated human motion based on individual body shape. The UKF iteratively refines IMU and UWB measurements by aligning them with uncertain human motion constraints in real-time, producing optimal estimates for each. Experiments on both synthetic and real-world datasets demonstrate the effectiveness of UMotion in stabilizing sensor data and the improvement over state of the art in pose accuracy.
Submitted 14 May, 2025;
originally announced May 2025.
-
MagicCraft: Natural Language-Driven Generation of Dynamic and Interactive 3D Objects for Commercial Metaverse Platforms
Authors:
Ryutaro Kurai,
Takefumi Hiraki,
Yuichi Hiroi,
Yutaro Hirao,
Monica Perusquía-Hernández,
Hideaki Uchiyama,
Kiyoshi Kiyokawa
Abstract:
Metaverse platforms are rapidly evolving to provide immersive spaces for user interaction and content creation. However, the generation of dynamic and interactive 3D objects remains challenging due to the need for advanced 3D modeling and programming skills. To address this challenge, we present MagicCraft, a system that generates functional 3D objects from natural language prompts for metaverse platforms. MagicCraft uses generative AI models to manage the entire content creation pipeline: converting user text descriptions into images, transforming images into 3D models, predicting object behavior, and assigning necessary attributes and scripts. It also provides an interactive interface for users to refine generated objects by adjusting features such as orientation, scale, seating positions, and grip points.
Implemented on Cluster, a commercial metaverse platform, MagicCraft was evaluated by 7 expert CG designers and 51 general users. Results show that MagicCraft significantly reduces the time and skill required to create 3D objects. Users with no prior experience in 3D modeling or programming successfully created complex, interactive objects and deployed them in the metaverse. Expert feedback highlighted the system's potential to improve content creation workflows and support rapid prototyping. By integrating AI-generated content into metaverse platforms, MagicCraft makes 3D content creation more accessible.
Submitted 30 April, 2025;
originally announced April 2025.
-
MagicItem: Dynamic Behavior Design of Virtual Objects with Large Language Models in a Consumer Metaverse Platform
Authors:
Ryutaro Kurai,
Takefumi Hiraki,
Yuichi Hiroi,
Yutaro Hirao,
Monica Perusquia-Hernandez,
Hideaki Uchiyama,
Kiyoshi Kiyokawa
Abstract:
To create rich experiences in virtual reality (VR) environments, it is essential to define the behavior of virtual objects through programming. However, programming in 3D spaces requires a wide range of background knowledge and programming skills. Although Large Language Models (LLMs) have provided programming support, they are still primarily aimed at programmers. In metaverse platforms, where many users inhabit VR spaces, most users are unfamiliar with programming, making it difficult for them to modify the behavior of objects in the VR environment easily. Existing LLM-based script generation methods for VR spaces require multiple lengthy iterations to implement the desired behaviors and are difficult to integrate into the operation of metaverse platforms. To address this issue, we propose a tool that generates behaviors for objects in VR spaces from natural language within Cluster, a metaverse platform with a large user base. By integrating LLMs with the Cluster Script provided by this platform, we enable users with limited programming experience to define object behaviors within the platform freely. We have also integrated our tool into a commercial metaverse platform and are conducting online experiments with 63 general users of the platform. The experiments show that even users with no programming background can successfully generate behaviors for objects in VR spaces, resulting in a highly satisfying system. Our research contributes to democratizing VR content creation by enabling non-programmers to design dynamic behaviors for virtual objects in metaverse platforms.
Submitted 19 June, 2024;
originally announced June 2024.
-
ShareYourReality: Investigating Haptic Feedback and Agency in Virtual Avatar Co-embodiment
Authors:
Karthikeya Puttur Venkatraj,
Wo Meijer,
Monica Perusquía-Hernández,
Gijs Huisman,
Abdallah El Ali
Abstract:
Virtual co-embodiment enables two users to share a single avatar in Virtual Reality (VR). During such experiences, the illusion of shared motion control can break during joint-action activities, highlighting the need for position-aware feedback mechanisms. Drawing on the perceptual crossing paradigm, we explore how haptics can enable non-verbal coordination between co-embodied participants. In a within-subjects study (20 participant pairs), we examined the effects of vibrotactile haptic feedback (None, Present) and avatar control distribution (25-75%, 50-50%, 75-25%) across two VR reaching tasks (Targeted, Free-choice) on participants' Sense of Agency (SoA), co-presence, body ownership, and motion synchrony. We found (a) lower SoA in the free-choice task with haptics than without, (b) higher SoA during the shared targeted task, (c) significantly higher co-presence and body ownership in the free-choice task, and (d) that players' hand motions synchronized more in the targeted task. We provide cautionary considerations when including haptic feedback mechanisms for avatar co-embodiment experiences.
Submitted 13 March, 2024;
originally announced March 2024.
-
Ensemble Learning to Assess Dynamics of Affective Experience Ratings and Physiological Change
Authors:
Felix Dollack,
Kiyoshi Kiyokawa,
Huakun Liu,
Monica Perusquia-Hernandez,
Chirag Raman,
Hideaki Uchiyama,
Xin Wei
Abstract:
The congruence between affective experiences and physiological changes has been a debated topic for centuries. Recent technological advances in measurement and data analysis provide hope to solve this epic challenge. Open science and open data practices, together with data analysis challenges open to the academic community, are also promising tools for solving this problem. In this entry to the Emotion Physiology and Experience Collaboration (EPiC) challenge, we propose a data analysis solution that combines theoretical assumptions with data-driven methodologies. We used feature engineering and ensemble selection. Each predictor was trained on subsets of the training data that would maximize the information available for training. Late fusion was used with an averaging step. We chose to average considering a "wisdom of crowds" strategy. This strategy yielded an overall RMSE of 1.19 in the test set. Future work should carefully explore if our assumptions are correct and the potential of weighted fusion.
Submitted 26 December, 2023;
originally announced December 2023.
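The averaging step of late fusion and the RMSE metric mentioned in the abstract above can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' pipeline: the function names and the toy numbers are hypothetical, and each inner list stands for one predictor's output on the same samples.

```python
import math


def late_fusion_average(predictions):
    """Unweighted late fusion: average each sample's prediction across models."""
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]


def rmse(y_true, y_pred):
    """Root-mean-square error between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy ratings (hypothetical): two predictors, three samples.
model_outputs = [
    [5.0, 6.0, 4.0],
    [7.0, 6.0, 2.0],
]
targets = [6.0, 5.0, 3.0]

fused = late_fusion_average(model_outputs)
error = rmse(targets, fused)
```

A weighted fusion, which the abstract names as future work, would replace the plain mean with a weighted sum whose weights add up to one.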
-
PhysioCHI: Towards Best Practices for Integrating Physiological Signals in HCI
Authors:
Francesco Chiossi,
Ekaterina R. Stepanova,
Benjamin Tag,
Monica Perusquia-Hernandez,
Alexandra Kitson,
Arindam Dey,
Sven Mayer,
Abdallah El Ali
Abstract:
Recently, we saw a trend toward using physiological signals in interactive systems. These signals, offering deep insights into users' internal states and health, herald a new era for HCI. However, as this is an interdisciplinary approach, many challenges arise for HCI researchers, such as merging diverse disciplines, from understanding physiological functions to design expertise. Also, isolated research endeavors limit the scope and reach of findings. This workshop aims to bridge these gaps, fostering cross-disciplinary discussions on usability, open science, and ethics tied to physiological data in HCI. In this workshop, we will discuss best practices for embedding physiological signals in interactive systems. Through collective efforts, we seek to craft a guiding document for best practices in physiological HCI research, ensuring that it remains grounded in shared principles and methodologies as the field advances.
Submitted 11 December, 2023; v1 submitted 7 December, 2023;
originally announced December 2023.
-
Facial movement synergies and Action Unit detection from distal wearable Electromyography and Computer Vision
Authors:
Monica Perusquia-Hernandez,
Felix Dollack,
Chun Kwang Tan,
Shushi Namba,
Saho Ayabe-Kanamura,
Kenji Suzuki
Abstract:
Distal facial Electromyography (EMG) can be used to detect smiles and frowns with reasonable accuracy. It capitalizes on volume conduction to detect relevant muscle activity, even when the electrodes are not placed directly on the source muscle. The main advantage of this method is to prevent occlusion and obstruction of the facial expression production, whilst allowing EMG measurements. However, measuring EMG distally entails that the exact source of the facial movement is unknown. We propose a novel method to estimate specific Facial Action Units (AUs) from distal facial EMG and Computer Vision (CV). This method is based on Independent Component Analysis (ICA), Non-Negative Matrix Factorization (NNMF), and sorting of the resulting components to determine which is the most likely to correspond to each CV-labeled AU. Performance on the detection of AU6 (Orbicularis Oculi) and AU12 (Zygomaticus Major) was estimated by calculating the agreement with human coders. The results of our proposed algorithm showed an accuracy of 81% and a Cohen's Kappa of 0.49 for AU6, and an accuracy of 82% and a Cohen's Kappa of 0.53 for AU12. This demonstrates the potential of distal EMG to detect individual facial movements. Using this multimodal method, several AU synergies were identified. We quantified the co-occurrence and timing of AU6 and AU12 in posed and spontaneous smiles using the human-coded labels, and for comparison, using the continuous CV-labels. The co-occurrence analysis was also performed on the EMG-based labels to uncover the relationship between muscle synergies and the kinematics of visible facial movement.
Submitted 20 August, 2020;
originally announced August 2020.
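The abstract above reports both accuracy and Cohen's Kappa for agreement with human coders. The sketch below shows how the two differ on a toy label sequence: Kappa discounts the agreement expected by chance from each rater's label frequencies. The label lists are hypothetical and are not data from the paper.

```python
def accuracy(rater_a, rater_b):
    """Fraction of items on which the two label sequences agree."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1,
    e.g. when both raters use a single identical label throughout.
    """
    n = len(rater_a)
    p_o = accuracy(rater_a, rater_b)
    labels = set(rater_a) | set(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    p_e = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)


# Toy frame-level labels (hypothetical): 1 = AU active, 0 = AU inactive.
human = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
algorithm = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]

acc = accuracy(human, algorithm)
kappa = cohens_kappa(human, algorithm)
```

On these toy labels the raters agree on 8 of 10 frames, yet Kappa is noticeably lower than the accuracy because a large share of that agreement is expected by chance.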
-
Robot mirroring: A framework for self-tracking feedback through empathy with an artificial agent representing the self
Authors:
Monica Perusquía-Hernández,
David Antonio Gómez Jáuregui,
Marisabel Cuberos-Balda,
Diego Paez-Granados
Abstract:
Current technologies have enabled us to track and quantify our physical state and behavior. Self-tracking aims to achieve increased awareness to decrease undesired behaviors and lead to a healthier lifestyle. However, inappropriately communicated self-tracking results might cause the opposite effect. In this work, we propose subtle self-tracking feedback by mirroring the self's state into an artificial agent. By eliciting empathy towards the artificial agent and fostering helping behaviors, users would help themselves as well. Finally, we reflected on the implications of this design framework, and the methodology to design and implement it. A series of interviews with expert designers pointed to the importance of having multidisciplinary teams working in parallel. Moreover, an agile methodology with a sprint zero for the initial design, and shifted user research, design, and implementation sprints, was proposed. Similar systems with data flow and hardware dependencies would also benefit from the proposed agile design process.
Submitted 20 March, 2019;
originally announced March 2019.