-
Advancing Face-to-Face Emotion Communication: A Multimodal Dataset (AFFEC)
Authors:
Meisam J. Sekiavandi,
Laurits Dixen,
Jostein Fimland,
Sree Keerthi Desu,
Antonia-Bianca Zserai,
Ye Sul Lee,
Maria Barrett,
Paolo Burelli
Abstract:
Emotion recognition has the potential to play a pivotal role in enhancing human-computer interaction by enabling systems to accurately interpret and respond to human affect. Yet, capturing emotions in face-to-face contexts remains challenging due to subtle nonverbal cues, variations in personal traits, and the real-time dynamics of genuine interactions. Existing emotion recognition datasets often…
▽ More
Emotion recognition has the potential to play a pivotal role in enhancing human-computer interaction by enabling systems to accurately interpret and respond to human affect. Yet, capturing emotions in face-to-face contexts remains challenging due to subtle nonverbal cues, variations in personal traits, and the real-time dynamics of genuine interactions. Existing emotion recognition datasets often rely on limited modalities or controlled conditions, thereby missing the richness and variability found in real-world scenarios.
In this work, we introduce Advancing Face-to-Face Emotion Communication (AFFEC), a multimodal dataset designed to address these gaps. AFFEC encompasses 84 simulated emotional dialogues across six distinct emotions, recorded from 73 participants over more than 5,000 trials and annotated with more than 20,000 labels. It integrates electroencephalography (EEG), eye-tracking, galvanic skin response (GSR), facial videos, and Big Five personality assessments. Crucially, AFFEC explicitly distinguishes between felt emotions (the participant's internal affect) and perceived emotions (the observer's interpretation of the stimulus).
Baseline analyses spanning unimodal features and straightforward multimodal fusion demonstrate that even minimal processing yields classification performance significantly above chance, especially for arousal. Incorporating personality traits further improves predictions of felt emotions, highlighting the importance of individual differences. By bridging controlled experimentation with more realistic face-to-face stimuli, AFFEC offers a unique resource for researchers aiming to develop context-sensitive, adaptive, and personalized emotion recognition models.
△ Less
Submitted 1 May, 2025; v1 submitted 26 April, 2025;
originally announced April 2025.
-
Exploring the Temporal Dynamics of Facial Mimicry in Emotion Processing Using Action Units
Authors:
Meisam Jamshidi Seikavandi,
Jostein Fimland,
Maria Jung Barrett,
Paolo Burelli
Abstract:
Facial mimicry - the automatic, unconscious imitation of others' expressions - is vital for emotional understanding. This study investigates how mimicry differs across emotions using Face Action Units from videos and participants' responses. Dynamic Time Warping quantified the temporal alignment between participants' and stimuli's facial expressions, revealing significant emotional variations. Pos…
▽ More
Facial mimicry - the automatic, unconscious imitation of others' expressions - is vital for emotional understanding. This study investigates how mimicry differs across emotions using Face Action Units from videos and participants' responses. Dynamic Time Warping quantified the temporal alignment between participants' and stimuli's facial expressions, revealing significant emotional variations. Post-hoc tests indicated greater mimicry for 'Fear' than 'Happy' and reduced mimicry for 'Anger' compared to 'Fear'. The mimicry correlations with personality traits like Extraversion and Agreeableness were significant, showcasing subtle yet meaningful connections. These findings suggest specific emotions evoke stronger mimicry, with personality traits playing a secondary role in emotional alignment. Notably, our results highlight how personality-linked mimicry mechanisms extend beyond interpersonal communication to affective computing applications, such as remote human-human interactions and human-virtual-agent scenarios. Insights from temporal facial mimicry - e.g., designing digital agents that adaptively mirror user expressions - enable developers to create empathetic, personalized systems, enhancing emotional resonance and user engagement.
△ Less
Submitted 21 March, 2025;
originally announced March 2025.
-
Modelling Emotions in Face-to-Face Setting: The Interplay of Eye-Tracking, Personality, and Temporal Dynamics
Authors:
Meisam Jamshidi Seikavandi,
Jostein Fimland,
Maria Barrett,
Paolo Burelli
Abstract:
Accurate emotion recognition is pivotal for nuanced and engaging human-computer interactions, yet remains difficult to achieve, especially in dynamic, conversation-like settings. In this study, we showcase how integrating eye-tracking data, temporal dynamics, and personality traits can substantially enhance the detection of both perceived and felt emotions. Seventy-three participants viewed short,…
▽ More
Accurate emotion recognition is pivotal for nuanced and engaging human-computer interactions, yet remains difficult to achieve, especially in dynamic, conversation-like settings. In this study, we showcase how integrating eye-tracking data, temporal dynamics, and personality traits can substantially enhance the detection of both perceived and felt emotions. Seventy-three participants viewed short, speech-containing videos from the CREMA-D dataset, while being recorded for eye-tracking signals (pupil size, fixation patterns), Big Five personality assessments, and self-reported emotional states. Our neural network models combined these diverse inputs including stimulus emotion labels for contextual cues and yielded marked performance gains compared to the state-of-the-art. Specifically, perceived valence predictions reached a macro F1-score of 0.76, and models incorporating personality traits and stimulus information demonstrated significant improvements in felt emotion accuracy. These results highlight the benefit of unifying physiological, individual and contextual factors to address the subjectivity and complexity of emotional expression. Beyond validating the role of user-specific data in capturing subtle internal states, our findings inform the design of future affective computing and human-agent systems, paving the way for more adaptive and cross-individual emotional intelligence in real-world interactions.
△ Less
Submitted 18 March, 2025;
originally announced March 2025.
-
Modeling Face Emotion Perception from Naturalistic Face Viewing: Insights from Fixational Events and Gaze Strategies
Authors:
Meisam J. Seikavandi,
Maria J. Barrett,
Paolo Burelli
Abstract:
Face Emotion Recognition (FER) is essential for social interactions and understanding others' mental states. Utilizing eye tracking to investigate FER has yielded insights into cognitive processes. In this study, we utilized an instructionless paradigm to collect eye movement data from 21 participants, examining two FER processes: free viewing and grounded FER. We analyzed fixational, pupillary, a…
▽ More
Face Emotion Recognition (FER) is essential for social interactions and understanding others' mental states. Utilizing eye tracking to investigate FER has yielded insights into cognitive processes. In this study, we utilized an instructionless paradigm to collect eye movement data from 21 participants, examining two FER processes: free viewing and grounded FER. We analyzed fixational, pupillary, and microsaccadic events from eye movements, establishing their correlation with emotion perception and performance in the grounded task. By identifying regions of interest on the face, we explored the impact of eye-gaze strategies on face processing, their connection to emotions, and performance in emotion perception. During free viewing, participants displayed specific attention patterns for various emotions. In grounded tasks, where emotions were interpreted based on words, we assessed performance and contextual understanding. Notably, gaze patterns during free viewing predicted success in grounded FER tasks, underscoring the significance of initial gaze behavior. We also employed features from pre-trained deep-learning models for face recognition to enhance the scalability and comparability of attention analysis during free viewing across different datasets and populations. This method facilitated the prediction and modeling of individual emotion perception performance from minimal observations. Our findings advance the understanding of the link between eye movements and emotion perception, with implications for psychology, human-computer interaction, and affective computing, and pave the way for developing precise emotion recognition systems.
△ Less
Submitted 20 March, 2025;
originally announced March 2025.
-
Benchmark Early and Red Team Often: A Framework for Assessing and Managing Dual-Use Hazards of AI Foundation Models
Authors:
Anthony M. Barrett,
Krystal Jackson,
Evan R. Murphy,
Nada Madkour,
Jessica Newman
Abstract:
A concern about cutting-edge or "frontier" AI foundation models is that an adversary may use the models for preparing chemical, biological, radiological, nuclear, (CBRN), cyber, or other attacks. At least two methods can identify foundation models with potential dual-use capability; each has advantages and disadvantages: A. Open benchmarks (based on openly available questions and answers), which a…
▽ More
A concern about cutting-edge or "frontier" AI foundation models is that an adversary may use the models for preparing chemical, biological, radiological, nuclear, (CBRN), cyber, or other attacks. At least two methods can identify foundation models with potential dual-use capability; each has advantages and disadvantages: A. Open benchmarks (based on openly available questions and answers), which are low-cost but accuracy-limited by the need to omit security-sensitive details; and B. Closed red team evaluations (based on private evaluation by CBRN and cyber experts), which are higher-cost but can achieve higher accuracy by incorporating sensitive details. We propose a research and risk-management approach using a combination of methods including both open benchmarks and closed red team evaluations, in a way that leverages advantages of both methods. We recommend that one or more groups of researchers with sufficient resources and access to a range of near-frontier and frontier foundation models run a set of foundation models through dual-use capability evaluation benchmarks and red team evaluations, then analyze the resulting sets of models' scores on benchmark and red team evaluations to see how correlated those are. If, as we expect, there is substantial correlation between the dual-use potential benchmark scores and the red team evaluation scores, then implications include the following: The open benchmarks should be used frequently during foundation model development as a quick, low-cost measure of a model's dual-use potential; and if a particular model gets a high score on the dual-use potential benchmark, then more in-depth red team assessments of that model's dual-use capability should be performed. We also discuss limitations and mitigations for our approach, e.g., if model developers try to game benchmarks by including a version of benchmark test data in a model's training data.
△ Less
Submitted 15 May, 2024;
originally announced May 2024.
-
Can Humans Identify Domains?
Authors:
Maria Barrett,
Max Müller-Eberstein,
Elisa Bassignana,
Amalie Brogaard Pauli,
Mike Zhang,
Rob van der Goot
Abstract:
Textual domain is a crucial property within the Natural Language Processing (NLP) community due to its effects on downstream model performance. The concept itself is, however, loosely defined and, in practice, refers to any non-typological property, such as genre, topic, medium or style of a document. We investigate the core notion of domains via human proficiency in identifying related intrinsic…
▽ More
Textual domain is a crucial property within the Natural Language Processing (NLP) community due to its effects on downstream model performance. The concept itself is, however, loosely defined and, in practice, refers to any non-typological property, such as genre, topic, medium or style of a document. We investigate the core notion of domains via human proficiency in identifying related intrinsic textual properties, specifically the concepts of genre (communicative purpose) and topic (subject matter). We publish our annotations in *TGeGUM*: A collection of 9.1k sentences from the GUM dataset (Zeldes, 2017) with single sentence and larger context (i.e., prose) annotations for one of 11 genres (source type), and its topic/subtopic as per the Dewey Decimal library classification system (Dewey, 1979), consisting of 10/100 hierarchical topics of increased granularity. Each instance is annotated by three annotators, for a total of 32.7k annotations, allowing us to examine the level of human disagreement and the relative difficulty of each annotation task. With a Fleiss' kappa of at most 0.53 on the sentence level and 0.66 at the prose level, it is evident that despite the ubiquity of domains in NLP, there is little human consensus on how to define them. By training classifiers to perform the same task, we find that this uncertainty also extends to NLP models.
△ Less
Submitted 2 April, 2024;
originally announced April 2024.
-
Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks
Authors:
Anthony M. Barrett,
Dan Hendrycks,
Jessica Newman,
Brandie Nonnecke
Abstract:
Artificial intelligence (AI) systems can provide many beneficial capabilities but also risks of adverse events. Some AI systems could present risks of events with very high or catastrophic consequences at societal scale. The US National Institute of Standards and Technology (NIST) has been developing the NIST Artificial Intelligence Risk Management Framework (AI RMF) as voluntary guidance on AI ri…
▽ More
Artificial intelligence (AI) systems can provide many beneficial capabilities but also risks of adverse events. Some AI systems could present risks of events with very high or catastrophic consequences at societal scale. The US National Institute of Standards and Technology (NIST) has been developing the NIST Artificial Intelligence Risk Management Framework (AI RMF) as voluntary guidance on AI risk assessment and management for AI developers and others. For addressing risks of events with catastrophic consequences, NIST indicated a need to translate from high level principles to actionable risk management guidance.
In this document, we provide detailed actionable-guidance recommendations focused on identifying and managing risks of events with very high or catastrophic consequences, intended as a risk management practices resource for NIST for AI RMF version 1.0 (released in January 2023), or for AI RMF users, or for other AI risk management guidance and standards as appropriate. We also provide our methodology for our recommendations.
We provide actionable-guidance recommendations for AI RMF 1.0 on: identifying risks from potential unintended uses and misuses of AI systems; including catastrophic-risk factors within the scope of risk assessments and impact assessments; identifying and mitigating human rights harms; and reporting information on AI risk factors including catastrophic-risk factors.
In addition, we provide recommendations on additional issues for a roadmap for later versions of the AI RMF or supplementary publications. These include: providing an AI RMF Profile with supplementary guidance for cutting-edge increasingly multi-purpose or general-purpose AI.
We aim for this work to be a concrete risk-management practices contribution, and to stimulate constructive dialogue on how to address catastrophic risks and associated issues in AI standards.
△ Less
Submitted 23 February, 2023; v1 submitted 17 June, 2022;
originally announced June 2022.
-
The Copenhagen Corpus of Eye Tracking Recordings from Natural Reading of Danish Texts
Authors:
Nora Hollenstein,
Maria Barrett,
Marina Björnsdóttir
Abstract:
Eye movement recordings from reading are one of the richest signals of human language processing. Corpora of eye movements during reading of contextualized running text is a way of making such records available for natural language processing purposes. Such corpora already exist in some languages. We present CopCo, the Copenhagen Corpus of eye tracking recordings from natural reading of Danish tex…
▽ More
Eye movement recordings from reading are one of the richest signals of human language processing. Corpora of eye movements during reading of contextualized running text is a way of making such records available for natural language processing purposes. Such corpora already exist in some languages. We present CopCo, the Copenhagen Corpus of eye tracking recordings from natural reading of Danish texts. It is the first eye tracking corpus of its kind for the Danish language. CopCo includes 1,832 sentences with 34,897 tokens of Danish text extracted from a collection of speech manuscripts. This first release of the corpus contains eye tracking data from 22 participants. It will be extended continuously with more participants and texts from other genres. We assess the data quality of the recorded eye movements and find that the extracted features are in line with related research. The dataset available here: https://osf.io/ud8s5/.
△ Less
Submitted 28 April, 2022;
originally announced April 2022.
-
Decoding EEG Brain Activity for Multi-Modal Natural Language Processing
Authors:
Nora Hollenstein,
Cedric Renggli,
Benjamin Glaus,
Maria Barrett,
Marius Troendle,
Nicolas Langer,
Ce Zhang
Abstract:
Until recently, human behavioral data from reading has mainly been of interest to researchers to understand human cognition. However, these human language processing signals can also be beneficial in machine learning-based natural language processing tasks. Using EEG brain activity to this purpose is largely unexplored as of yet. In this paper, we present the first large-scale study of systematica…
▽ More
Until recently, human behavioral data from reading has mainly been of interest to researchers to understand human cognition. However, these human language processing signals can also be beneficial in machine learning-based natural language processing tasks. Using EEG brain activity to this purpose is largely unexplored as of yet. In this paper, we present the first large-scale study of systematically analyzing the potential of EEG brain activity data for improving natural language processing tasks, with a special focus on which features of the signal are most beneficial. We present a multi-modal machine learning architecture that learns jointly from textual input as well as from EEG features. We find that filtering the EEG signals into frequency bands is more beneficial than using the broadband signal. Moreover, for a range of word embedding types, EEG data improves binary and ternary sentiment classification and outperforms multiple baselines. For more complex tasks such as relation detection, further research is needed. Finally, EEG data shows to be particularly promising when limited training data is available.
△ Less
Submitted 13 July, 2021; v1 submitted 17 February, 2021;
originally announced February 2021.
-
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias
Authors:
Ana Valeria Gonzalez,
Maria Barrett,
Rasmus Hvingelby,
Kellie Webster,
Anders Søgaard
Abstract:
The one-sided focus on English in previous studies of gender bias in NLP misses out on opportunities in other languages: English challenge datasets such as GAP and WinoGender highlight model preferences that are "hallucinatory", e.g., disambiguating gender-ambiguous occurrences of 'doctor' as male doctors. We show that for languages with type B reflexivization, e.g., Swedish and Russian, we can co…
▽ More
The one-sided focus on English in previous studies of gender bias in NLP misses out on opportunities in other languages: English challenge datasets such as GAP and WinoGender highlight model preferences that are "hallucinatory", e.g., disambiguating gender-ambiguous occurrences of 'doctor' as male doctors. We show that for languages with type B reflexivization, e.g., Swedish and Russian, we can construct multi-task challenge datasets for detecting gender bias that lead to unambiguously wrong model predictions: In these languages, the direct translation of 'the doctor removed his mask' is not ambiguous between a coreferential reading and a disjoint reading. Instead, the coreferential reading requires a non-gendered pronoun, and the gendered, possessive pronouns are anti-reflexive. We present a multilingual, multi-task challenge dataset, which spans four languages and four NLP tasks and focuses only on this phenomenon. We find evidence for gender bias across all task-language combinations and correlate model bias with national labor market statistics.
△ Less
Submitted 28 September, 2020; v1 submitted 24 September, 2020;
originally announced September 2020.
-
Human brain activity for machine attention
Authors:
Lukas Muttenthaler,
Nora Hollenstein,
Maria Barrett
Abstract:
Cognitively inspired NLP leverages human-derived data to teach machines about language processing mechanisms. Recently, neural networks have been augmented with behavioral data to solve a range of NLP tasks spanning syntax and semantics. We are the first to exploit neuroscientific data, namely electroencephalography (EEG), to inform a neural attention model about language processing of the human b…
▽ More
Cognitively inspired NLP leverages human-derived data to teach machines about language processing mechanisms. Recently, neural networks have been augmented with behavioral data to solve a range of NLP tasks spanning syntax and semantics. We are the first to exploit neuroscientific data, namely electroencephalography (EEG), to inform a neural attention model about language processing of the human brain. The challenge in working with EEG data is that features are exceptionally rich and need extensive pre-processing to isolate signals specific to text processing. We devise a method for finding such EEG features to supervise machine attention through combining theoretically motivated cropping with random forest tree splits. After this dimensionality reduction, the pre-processed EEG features are capable of distinguishing two reading tasks retrieved from a publicly available EEG corpus. We apply these features to regularise attention on relation classification and show that EEG is more informative than strong baselines. This improvement depends on both the cognitive load of the task and the EEG frequency domain. Hence, informing neural attention models with EEG signals is beneficial but requires further investigation to understand which dimensions are the most useful across NLP tasks.
△ Less
Submitted 2 October, 2020; v1 submitted 9 June, 2020;
originally announced June 2020.
-
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
Authors:
Mostafa Abdou,
Vinit Ravishankar,
Maria Barrett,
Yonatan Belinkov,
Desmond Elliott,
Anders Søgaard
Abstract:
Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight…
▽ More
Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. Overall, humans are correct more often than out-of-the-box models, and the models are sometimes right for the wrong reasons. Finally, we show that fine-tuning on a large, task-specific dataset can offer a solution to these issues.
△ Less
Submitted 7 May, 2020; v1 submitted 4 May, 2020;
originally announced May 2020.
-
A fully 3D multi-path convolutional neural network with feature fusion and feature weighting for automatic lesion identification in brain MRI images
Authors:
Yunzhe Xue,
Meiyan Xie,
Fadi G. Farhat,
Olga Boukrina,
A. M. Barrett,
Jeffrey R. Binder,
Usman W. Roshan,
William W. Graves
Abstract:
We propose a fully 3D multi-path convolutional network to predict stroke lesions from 3D brain MRI images. Our multi-path model has independent encoders for different modalities containing residual convolutional blocks, weighted multi-path feature fusion from different modalities, and weighted fusion modules to combine encoder and decoder features. Compared to existing 3D CNNs like DeepMedic, 3D U…
▽ More
We propose a fully 3D multi-path convolutional network to predict stroke lesions from 3D brain MRI images. Our multi-path model has independent encoders for different modalities containing residual convolutional blocks, weighted multi-path feature fusion from different modalities, and weighted fusion modules to combine encoder and decoder features. Compared to existing 3D CNNs like DeepMedic, 3D U-Net, and AnatomyNet, our networks achieves the highest statistically significant cross-validation accuracy of 60.5% on the large ATLAS benchmark of 220 patients. We also test our model on multi-modal images from the Kessler Foundation and Medical College Wisconsin and achieve a statistically significant cross-validation accuracy of 65%, significantly outperforming the multi-modal 3D U-Net and DeepMedic. Overall our model offers a principled, extensible multi-path approach that outperforms multi-channel alternatives and achieves high Dice accuracies on existing benchmarks.
△ Less
Submitted 16 November, 2019; v1 submitted 17 July, 2019;
originally announced July 2019.
-
A multi-path 2.5 dimensional convolutional neural network system for segmenting stroke lesions in brain MRI images
Authors:
Yunzhe Xue,
Fadi G. Farhat,
Olga Boukrina,
A . M. Barrett,
Jeffrey R. Binder,
Usman W. Roshan,
William W. Graves
Abstract:
Automatic identification of brain lesions from magnetic resonance imaging (MRI) scans of stroke survivors would be a useful aid in patient diagnosis and treatment planning. We propose a multi-modal multi-path convolutional neural network system for automating stroke lesion segmentation. Our system has nine end-to-end UNets that take as input 2-dimensional (2D) slices and examines all three planes…
▽ More
Automatic identification of brain lesions from magnetic resonance imaging (MRI) scans of stroke survivors would be a useful aid in patient diagnosis and treatment planning. We propose a multi-modal multi-path convolutional neural network system for automating stroke lesion segmentation. Our system has nine end-to-end UNets that take as input 2-dimensional (2D) slices and examines all three planes with three different normalizations. Outputs from these nine total paths are concatenated into a 3D volume that is then passed to a 3D convolutional neural network to output a final lesion mask. We trained and tested our method on datasets from three sources: Medical College of Wisconsin (MCW), Kessler Foundation (KF), and the publicly available Anatomical Tracings of Lesions After Stroke (ATLAS) dataset. Cross-study validation results (with independent training and validation datasets) were obtained to compare with previous methods based on naive Bayes, random forests, and three recently published convolutional neural networks. Model performance was quantified in terms of the Dice coefficient. Training on the KF and MCW images and testing on the ATLAS images yielded a mean Dice coefficient of 0.54. This was reliably better than the next best previous model, UNet, at 0.47. Reversing the train and test datasets yields a mean Dice of 0.47 on KF and MCW images, whereas the next best UNet reaches 0.45. With all three datasets combined, the current system compared to previous methods also attained a reliably higher cross-validation accuracy. It also achieved high Dice values for many smaller lesions that existing methods have difficulty identifying. Overall, our system is a clear improvement over previous methods for automating stroke lesion segmentation, bringing us an important step closer to the inter-rater accuracy level of human experts.
△ Less
Submitted 26 May, 2019;
originally announced May 2019.
-
Advancing NLP with Cognitive Language Processing Signals
Authors:
Nora Hollenstein,
Maria Barrett,
Marius Troendle,
Francesco Bigiolli,
Nicolas Langer,
Ce Zhang
Abstract:
When we read, our brain processes language and generates cognitive processing data such as gaze patterns and brain activity. These signals can be recorded while reading. Cognitive language processing data such as eye-tracking features have shown improvements on single NLP tasks. We analyze whether using such human features can show consistent improvement across tasks and data sources. We present a…
▽ More
When we read, our brain processes language and generates cognitive processing data such as gaze patterns and brain activity. These signals can be recorded while reading. Cognitive language processing data such as eye-tracking features have shown improvements on single NLP tasks. We analyze whether using such human features can show consistent improvement across tasks and data sources. We present an extensive investigation of the benefits and limitations of using cognitive processing data for NLP. Specifically, we use gaze and EEG features to augment models of named entity recognition, relation classification, and sentiment analysis. These methods significantly outperform the baselines and show the potential and current limitations of employing human language processing data for NLP.
△ Less
Submitted 4 April, 2019;
originally announced April 2019.
-
A Model of Pathways to Artificial Superintelligence Catastrophe for Risk and Decision Analysis
Authors:
Anthony M. Barrett,
Seth D. Baum
Abstract:
An artificial superintelligence (ASI) is artificial intelligence that is significantly more intelligent than humans in all respects. While ASI does not currently exist, some scholars propose that it could be created sometime in the future, and furthermore that its creation could cause a severe global catastrophe, possibly even resulting in human extinction. Given the high stakes, it is important t…
▽ More
An artificial superintelligence (ASI) is artificial intelligence that is significantly more intelligent than humans in all respects. While ASI does not currently exist, some scholars propose that it could be created sometime in the future, and furthermore that its creation could cause a severe global catastrophe, possibly even resulting in human extinction. Given the high stakes, it is important to analyze ASI risk and factor the risk into decisions related to ASI research and development. This paper presents a graphical model of major pathways to ASI catastrophe, focusing on ASI created via recursive self-improvement. The model uses the established risk and decision analysis modeling paradigms of fault trees and influence diagrams in order to depict combinations of events and conditions that could lead to AI catastrophe, as well as intervention options that could decrease risks. The events and conditions include select aspects of the ASI itself as well as the human process of ASI research, development, and management. Model structure is derived from published literature on ASI risk. The model offers a foundation for rigorous quantitative evaluation and decision making on the long-term risk of ASI catastrophe.
△ Less
Submitted 25 July, 2016;
originally announced July 2016.