-
The Robotability Score: Enabling Harmonious Robot Navigation on Urban Streets
Authors:
Matt Franchi,
Maria Teresa Parreira,
Fanjun Bu,
Wendy Ju
Abstract:
This paper introduces the Robotability Score ($R$), a novel metric that quantifies the suitability of urban environments for autonomous robot navigation. Through expert interviews and surveys, we identify and weigh key features contributing to $R$ for wheeled robots on urban streets. Our findings reveal that pedestrian density, crowd dynamics and pedestrian flow are the most critical factors, collectively accounting for 28% of the total score. Computing robotability across New York City yields significant variation; the area of highest $R$ is 3.0 times more "robotable" than the area of lowest $R$. Deployments of a physical robot in high- and low-robotability areas show the adequacy of the score in anticipating the ease of robot navigation. This new framework for evaluating urban landscapes aims to reduce uncertainty in robot deployment while respecting established mobility patterns and urban planning principles, contributing to the discourse on harmonious human-robot environments.
Submitted 15 April, 2025;
originally announced April 2025.
-
Co-Designing with Algorithms: Unpacking the Complex Role of GenAI in Interactive System Design Education
Authors:
Hauke Sandhaus,
Quiquan Gu,
Maria Teresa Parreira,
Wendy Ju
Abstract:
Generative Artificial Intelligence (GenAI) is transforming Human-Computer Interaction (HCI) education and technology design, yet its impact remains poorly understood. This study explores how graduate students in an applied HCI course used GenAI tools during interactive device design. Despite no encouragement, all groups integrated GenAI into their workflows. Through 12 post-class group interviews, we identified how GenAI co-design behaviors present both benefits, such as enhanced creativity and faster design iterations, and risks, including shallow learning and reflection. Benefits were most evident during the execution phases, while the discovery and reflection phases showed limited gains. A taxonomy of usage patterns revealed that students' outcomes depended more on how they used GenAI than the specific tasks performed. These findings highlight the need for HCI education to adapt to GenAI's role and offer recommendations for curricula to better prepare future designers for effective creative co-design.
Submitted 24 April, 2025; v1 submitted 17 October, 2024;
originally announced October 2024.
-
ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions
Authors:
Micol Spitale,
Maria Teresa Parreira,
Maia Stiber,
Minja Axelsson,
Neval Kara,
Garima Kankariya,
Chien-Ming Huang,
Malte Jung,
Wendy Ju,
Hatice Gunes
Abstract:
Despite the recent advancements in robotics and machine learning (ML), the deployment of autonomous robots in our everyday lives is still an open challenge. This is due to multiple reasons among which are their frequent mistakes, such as interrupting people or having delayed responses, as well as their limited ability to understand human speech, i.e., failure in tasks like transcribing speech to text. These mistakes may disrupt interactions and negatively influence human perception of these robots. To address this problem, robots need to have the ability to detect human-robot interaction (HRI) failures. The ERR@HRI 2024 challenge tackles this by offering a benchmark multimodal dataset of robot failures during human-robot interactions (HRI), encouraging researchers to develop and benchmark multimodal machine learning models to detect these failures. We created a dataset featuring multimodal non-verbal interaction data, including facial, speech, and pose features from video clips of interactions with a robotic coach, annotated with labels indicating the presence or absence of robot mistakes, user awkwardness, and interaction ruptures, allowing for the training and evaluation of predictive models. Challenge participants have been invited to submit their multimodal ML models for detection of robot errors and to be evaluated against various performance metrics such as accuracy, precision, recall, F1 score, with and without a margin of error reflecting the time-sensitivity of these metrics. The results of this challenge will help the research field in better understanding the robot failures in human-robot interactions and designing autonomous robots that can mitigate their own errors after successfully detecting them.
Submitted 8 July, 2024;
originally announced July 2024.
-
Student Reflections on Self-Initiated GenAI Use in HCI Education
Authors:
Hauke Sandhaus,
Maria Teresa Parreira,
Wendy Ju
Abstract:
This study explores students' self-initiated use of Generative Artificial Intelligence (GenAI) tools in an interactive systems design class. Through 12 group interviews, students revealed the dual nature of GenAI in (1) stimulating creativity and (2) speeding up design iterations, alongside concerns over its potential to cause shallow learning and reliance. GenAI's benefits were pronounced in the execution phase of design, aiding rapid prototyping and ideation, while its use in initial insight generation posed risks to depth and reflective practice. This reflection highlights the complex role of GenAI in Human-Computer Interaction education, emphasizing the need for balanced integration to leverage its advantages without compromising fundamental learning outcomes.
Submitted 2 May, 2024;
originally announced May 2024.
-
A Study on Domain Generalization for Failure Detection through Human Reactions in HRI
Authors:
Maria Teresa Parreira,
Sukruth Gowdru Lingaraju,
Adolfo Ramirez-Aristizabal,
Manaswi Saha,
Michael Kuniavsky,
Wendy Ju
Abstract:
Machine learning models are commonly tested in-distribution (same dataset); performance almost always drops in out-of-distribution settings. For HRI research, the goal is often to develop generalized models. This makes domain generalization - retaining performance in different settings - a critical issue. In this study, we present a concise analysis of domain generalization in failure detection models trained on human facial expressions. Using two distinct datasets of humans reacting to videos in which errors occur, one from a controlled lab setting and another collected online, we trained deep learning models on each dataset. When testing these models on the alternate dataset, we observed a significant performance drop. We reflect on the causes of the observed model behavior and offer recommendations. This work emphasizes the need for HRI research focusing on improving model robustness and real-life applicability.
Submitted 10 March, 2024;
originally announced March 2024.
-
What Predicts Interpersonal Affect? Preliminary Analyses from Retrospective Evaluations
Authors:
Maria Teresa Parreira,
Michael J. Sack,
Malte Jung
Abstract:
While the field of affective computing has contributed to greatly improving the seamlessness of human-robot interactions, the focus has primarily been on the emotional processing of the self, rather than the perception of the other. To address this gap, in a user study with 30 participant dyads, we collected the users' retrospective ratings of the interpersonal perception of the other interactant, after a short interaction. We made use of CORAE, a novel web-based open-source tool for COntinuous Retrospective Affect Evaluation. In this work, we analyze how these interpersonal ratings correlate with different aspects of the interaction, namely personality traits, participation balance, and sentiment analysis. Notably, we discovered that conversational imbalance has a significant effect on the retrospective ratings, among other findings. By employing these analyses and methodologies, we lay the groundwork for enhanced human-robot interactions, wherein affect is understood as a highly dynamic and context-dependent outcome of interaction history.
Submitted 15 November, 2023;
originally announced November 2023.
-
A Systematic Review on Reproducibility in Child-Robot Interaction
Authors:
Micol Spitale,
Rebecca Stower,
Elmira Yadollahi,
Maria Teresa Parreira,
Nida Itrat Abbasi,
Iolanda Leite,
Hatice Gunes
Abstract:
Research reproducibility - i.e., rerunning analyses on original data to replicate the results - is paramount for guaranteeing scientific validity. However, reproducibility is often very challenging, especially in research fields where multi-disciplinary teams are involved, such as child-robot interaction (CRI). This paper presents a systematic review of the last three years (2020-2022) of research in CRI under the lens of reproducibility, by analysing the field for transparency in reporting. Across a total of 325 studies, we found deficiencies in reporting demographics (e.g. age of participants), study design and implementation (e.g. length of interactions), and open data (e.g. maintaining an active code repository). From this analysis, we distill a set of guidelines and provide a checklist to systematically report CRI studies to help and guide research to improve reproducibility in CRI and beyond.
Submitted 4 September, 2023;
originally announced September 2023.
-
"How Did They Come Across?" Lessons Learned from Continuous Affective Ratings
Authors:
Maria Teresa Parreira,
Michael J. Sack,
Hifza Javed,
Nawid Jamali,
Malte Jung
Abstract:
Social distance, or perception of the other, is recognized as a dynamic dimension of an interaction, but has yet to be widely explored or understood. Through CORAE, a novel web-based open-source tool for COntinuous Retrospective Affect Evaluation, we collected retrospective ratings of interpersonal perceptions between 12 participant dyads. In this work, we explore how different aspects of these interactions are reflected in the ratings collected, through a discourse analysis of individual and social behavior of the interactants. We found that different events observed in the ratings can be mapped to complex interaction phenomena, shedding light on relevant interaction features that may play a role in interpersonal understanding and grounding. This paves the way for better, more seamless human-robot interactions, where affect is interpreted as highly dynamic and contingent on interaction history.
Submitted 7 July, 2023;
originally announced July 2023.
-
What Could a Social Mediator Robot Do? Lessons from Real-World Mediation Scenarios
Authors:
Thomas H. Weisswange,
Hifza Javed,
Manuel Dietrich,
Tuan Vu Pham,
Maria Teresa Parreira,
Michael Sack,
Nawid Jamali
Abstract:
The use of social robots as instruments for social mediation has been gaining traction in the field of Human-Robot Interaction (HRI). So far, the design of such robots and their behaviors is often driven by technological platforms and experimental setups in controlled laboratory environments. To address complex social relationships in the real world, it is crucial to consider the actual needs and consequences of the situations found therein. This includes understanding when a mediator is necessary, what specific role such a robot could play, and how it moderates human social dynamics. In this paper, we discuss six relevant roles for robotic mediators that we identified by investigating a collection of videos showing realistic group situations. We further discuss mediation behaviors and target measures to evaluate the success of such interventions. We hope that our findings can inspire future research on robot-assisted social mediation by highlighting a wider set of mediation applications than those found in prior studies. Specifically, we aim to inform the categorization and selection of interaction scenarios that reflect real situations, where a mediation robot can have a positive and meaningful impact on group dynamics.
Submitted 29 June, 2023;
originally announced June 2023.
-
CORAE: A Tool for Intuitive and Continuous Retrospective Evaluation of Interactions
Authors:
Michael J. Sack,
Maria Teresa Parreira,
Jenny Fu,
Asher Lipman,
Hifza Javed,
Nawid Jamali,
Malte Jung
Abstract:
This paper introduces CORAE, a novel web-based open-source tool for COntinuous Retrospective Affect Evaluation, designed to capture continuous affect data about interpersonal perceptions in dyadic interactions. Grounded in behavioral ecology perspectives of emotion, this approach replaces valence as the relevant rating dimension with approach and withdrawal, reflecting the degree to which behavior is perceived as increasing or decreasing social distance. We conducted a study to experimentally validate the efficacy of our platform with 24 participants. The tool's effectiveness was tested in the context of dyadic negotiation, revealing insights about how interpersonal dynamics evolve over time. We find that the continuous affect rating method is consistent with individuals' perception of the overall interaction. This paper contributes to the growing body of research on affective computing and offers a valuable tool for researchers interested in investigating the temporal dynamics of affect and emotion in social interactions.
Submitted 28 June, 2023;
originally announced June 2023.
-
The Bystander Affect Detection (BAD) Dataset for Failure Detection in HRI
Authors:
Alexandra Bremers,
Maria Teresa Parreira,
Xuanyu Fang,
Natalie Friedman,
Adolfo Ramirez-Aristizabal,
Alexandria Pabst,
Mirjana Spasojevic,
Michael Kuniavsky,
Wendy Ju
Abstract:
For a robot to repair its own error, it must first know it has made a mistake. One way that people detect errors is from the implicit reactions of bystanders -- their confusion, smirks, or giggles clue us in that something unexpected occurred. To enable robots to detect and act on bystander responses to task failures, we developed a novel method to elicit bystander responses to human and robot errors. Using 46 different stimulus videos featuring a variety of human and machine task failures, we collected a total of 2452 webcam videos of human reactions from 54 participants. To test the viability of the collected data, we used the bystander reaction dataset as input to a deep-learning model, BADNet, to predict failure occurrence. We tested different data labeling methods and learned how they affect model performance, achieving precisions above 90%. We discuss strategies to model bystander reactions and predict failure and how this approach can be used in real-world robotic deployments to detect errors and improve robot performance. As part of this work, we also contribute the "Bystander Affect Detection" (BAD) dataset of bystander reactions, supporting the development of better prediction models.
Submitted 8 March, 2023;
originally announced March 2023.
-
Using Social Cues to Recognize Task Failures for HRI: Overview, State-of-the-Art, and Future Directions
Authors:
Alexandra Bremers,
Alexandria Pabst,
Maria Teresa Parreira,
Wendy Ju
Abstract:
Robots that carry out tasks and interact in complex environments will inevitably commit errors. Error detection is thus an essential ability for robots to master to work efficiently and productively. People can leverage social feedback to get an indication of whether an action was successful or not. With advances in computing and artificial intelligence (AI), it is increasingly possible for robots to achieve a similar capability of collecting social feedback. In this work, we take this one step further and propose a framework for how social cues can be used as feedback signals to recognize task failures for human-robot interaction (HRI). Our proposed framework sets out a research agenda based on insights from the literature on behavioral science, human-robot interaction, and machine learning to focus on three areas: 1) social cues as feedback (from behavioral science), 2) recognizing task failures in robots (from HRI), and 3) approaches for autonomous detection of HRI task failures based on social cues (from machine learning). We propose a taxonomy of error detection based on self-awareness and social feedback. Finally, we provide recommendations for HRI researchers and practitioners interested in developing robots that detect task errors using human social cues. This article is intended for interdisciplinary HRI researchers and practitioners, where the third theme of our analysis provides more technical details aiming toward the practical implementation of these systems.
Submitted 29 May, 2024; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Robot Duck Debugging: Can Attentive Listening Improve Problem Solving?
Authors:
Maria Teresa Parreira,
Sarah Gillet,
Iolanda Leite
Abstract:
While thinking aloud has been reported to positively affect problem-solving, the effects of the presence of an embodied entity (e.g., a social robot) to whom words can be directed remain mostly unexplored. In this work, we investigated the role of a robot in a "rubber duck debugging" setting, by analyzing how a robot's listening behaviors could support a thinking-aloud problem-solving session. Participants completed two different tasks while speaking their thoughts aloud to either a robot or an inanimate object (a giant rubber duck). We implemented and tested two types of listener behavior in the robot: a rule-based heuristic and a deep-learning-based model. In a between-subject user study with 101 participants, we evaluated how the presence of a robot affected users' engagement in thinking aloud, behavior during the task, and self-reported user experience. In addition, we explored the impact of the two robot listening behaviors on those measures. In contrast to prior work, our results indicate that neither the rule-based heuristic nor the deep learning robot conditions improved performance or perception of the task, compared to an inanimate object. We discuss potential explanations and shed light on the feasibility of designing social robots as assistive tools in thinking-aloud problem-solving tasks.
Submitted 18 January, 2023; v1 submitted 16 January, 2023;
originally announced January 2023.