People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior: Insights from Cognitive Science for Explainable AI
Authors:
Balint Gyevnar,
Stephanie Droop,
Tadeg Quillien,
Shay B. Cohen,
Neil R. Bramley,
Christopher G. Lucas,
Stefano V. Albrecht
Abstract:
It is often argued that effective human-centered explainable artificial intelligence (XAI) should resemble human reasoning. However, empirical investigations of how concepts from cognitive science can aid the design of XAI are lacking. Based on insights from cognitive science, we propose a framework of explanatory modes to analyze how people frame explanations, whether mechanistic, teleological, o…
▽ More
It is often argued that effective human-centered explainable artificial intelligence (XAI) should resemble human reasoning. However, empirical investigations of how concepts from cognitive science can aid the design of XAI are lacking. Based on insights from cognitive science, we propose a framework of explanatory modes to analyze how people frame explanations, whether mechanistic, teleological, or counterfactual. Using the complex safety-critical domain of autonomous driving, we conduct an experiment consisting of two studies on (i) how people explain the behavior of a vehicle in 14 unique scenarios (N1=54) and (ii) how they perceive these explanations (N2=382), curating the novel Human Explanations for Autonomous Driving Decisions (HEADD) dataset. Our main finding is that participants deem teleological explanations significantly better quality than counterfactual ones, with perceived teleology being the best predictor of perceived quality. Based on our results, we argue that explanatory modes are an important axis of analysis when designing and evaluating XAI and highlight the need for a principled and empirically grounded understanding of the cognitive mechanisms of explanation. The HEADD dataset and our code are available at: https://datashare.ed.ac.uk/handle/10283/8930.
△ Less
Submitted 3 February, 2025; v1 submitted 11 March, 2024;
originally announced March 2024.
Selective imitation on the basis of reward function similarity
Authors:
Max Taylor-Davies,
Stephanie Droop,
Christopher G. Lucas
Abstract:
Imitation is a key component of human social behavior, and is widely used by both children and adults as a way to navigate uncertain or unfamiliar situations. But in an environment populated by multiple heterogeneous agents pursuing different goals or objectives, indiscriminate imitation is unlikely to be an effective strategy -- the imitator must instead determine who is most useful to copy. Ther…
▽ More
Imitation is a key component of human social behavior, and is widely used by both children and adults as a way to navigate uncertain or unfamiliar situations. But in an environment populated by multiple heterogeneous agents pursuing different goals or objectives, indiscriminate imitation is unlikely to be an effective strategy -- the imitator must instead determine who is most useful to copy. There are likely many factors that play into these judgements, depending on context and availability of information. Here we investigate the hypothesis that these decisions involve inferences about other agents' reward functions. We suggest that people preferentially imitate the behavior of others they deem to have similar reward functions to their own. We further argue that these inferences can be made on the basis of very sparse or indirect data, by leveraging an inductive bias toward positing the existence of different \textit{groups} or \textit{types} of people with similar reward functions, allowing learners to select imitation targets without direct evidence of alignment.
△ Less
Submitted 12 May, 2023;
originally announced May 2023.