-
A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers
Authors:
Stephen McAleese,
Mark Keane
Abstract:
Counterfactual explanations can be used to interpret and debug text classifiers by producing minimally altered text inputs that change a classifier's output. In this work, we evaluate five methods for generating counterfactual explanations for a BERT text classifier on two datasets using three evaluation metrics. The results of our experiments suggest that established white-box substitution-based methods are effective at generating valid counterfactuals that change the classifier's output. In contrast, newer methods based on large language models (LLMs) excel at producing natural and linguistically plausible text counterfactuals but often fail to generate valid counterfactuals that alter the classifier's output. Based on these results, we recommend developing new counterfactual explanation methods that combine the strengths of established gradient-based approaches and newer LLM-based techniques to generate high-quality, valid, and plausible text counterfactual explanations.
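The distinction the abstract draws between a plausible counterfactual and a valid one can be illustrated with a minimal sketch. Everything here is assumed for illustration: `toy_classifier` is a hypothetical keyword rule standing in for the BERT classifier, and `is_valid_counterfactual` and `changed_token_count` are illustrative helpers, not the paper's evaluation metrics.

```python
from typing import Callable


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a trained text classifier (keyword rule)."""
    negative_words = {"bad", "boring", "awful"}
    tokens = text.lower().split()
    return "negative" if any(t in negative_words for t in tokens) else "positive"


def is_valid_counterfactual(
    classify: Callable[[str], str], original: str, candidate: str
) -> bool:
    """A candidate is valid only if it changes the classifier's output."""
    return classify(original) != classify(candidate)


def changed_token_count(original: str, candidate: str) -> int:
    """Rough edit size: position-wise token differences plus any length gap."""
    a, b = original.split(), candidate.split()
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


original = "the film was boring"
fluent_but_invalid = "the movie was boring"   # reads naturally, label unchanged
minimal_and_valid = "the film was gripping"   # one substitution, label flips

for candidate in (fluent_but_invalid, minimal_and_valid):
    print(
        candidate,
        is_valid_counterfactual(toy_classifier, original, candidate),
        changed_token_count(original, candidate),
    )
```

Both candidates differ from the original by a single token, yet only the second changes the predicted label, which is the failure mode the abstract attributes to some LLM-based methods.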
Submitted 4 November, 2024;
originally announced November 2024.
-
Even-Ifs From If-Onlys: Are the Best Semi-Factual Explanations Found Using Counterfactuals As Guides?
Authors:
Saugat Aryal,
Mark T. Keane
Abstract:
Recently, counterfactuals using "if-only" explanations have become very popular in eXplainable AI (XAI), as they describe which changes to feature-inputs of a black-box AI system result in changes to a (usually negative) decision-outcome. Even more recently, semi-factuals using "even-if" explanations have gained more attention. They elucidate the feature-input changes that do not change the decision-outcome of the AI system, with a potential to suggest more beneficial recourses. Some semi-factual methods use counterfactuals to the query-instance to guide semi-factual production (so-called counterfactual-guided methods), whereas others do not (so-called counterfactual-free methods). In this work, we perform comprehensive tests of 8 semi-factual methods on 7 datasets using 5 key metrics, to determine whether counterfactual guidance is necessary to find the best semi-factuals. The results of these tests suggest not, but rather that computing other aspects of the decision space leads to better semi-factual XAI.
Submitted 25 April, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
Advancing Post Hoc Case Based Explanation with Feature Highlighting
Authors:
Eoin Kenny,
Eoin Delaney,
Mark Keane
Abstract:
Explainable AI (XAI) has been proposed as a valuable tool to assist in downstream tasks involving human and AI collaboration. Perhaps the most psychologically valid XAI techniques are case-based approaches which display 'whole' exemplars to explain the predictions of black-box AI systems. However, for such post hoc XAI methods dealing with images, there has been no attempt to improve their scope by using multiple clear feature 'parts' of the images to explain the predictions while linking back to relevant cases in the training data, thus allowing for more comprehensive explanations that are faithful to the underlying model. Here, we address this gap by proposing two general algorithms (latent and super-pixel based) which can isolate multiple clear feature parts in a test image, and then connect them to the explanatory cases found in the training data, before testing their effectiveness in a carefully designed user study. Results demonstrate that the proposed approach appropriately calibrates a user's feelings of 'correctness' for ambiguous classifications in real-world data on the ImageNet dataset, an effect which does not happen when just showing the explanation without feature highlighting.
Submitted 6 November, 2023;
originally announced November 2023.
-
Industrial Memories: Exploring the Findings of Government Inquiries with Neural Word Embedding and Machine Learning
Authors:
Susan Leavy,
Emilie Pine,
Mark T Keane
Abstract:
We present a text mining system to support the exploration of large volumes of text detailing the findings of government inquiries. Despite their historical significance and potential societal impact, key findings of inquiries are often hidden within lengthy documents and remain inaccessible to the general public. We transform the findings of the Irish government's inquiry into industrial schools and, through the use of word embedding, text classification and visualisation, present an interactive web-based platform that enables the exploration of the text to uncover new historical insights.
Submitted 2 August, 2023;
originally announced August 2023.
-
Explaining Groups of Instances Counterfactually for XAI: A Use Case, Algorithm and User Study for Group-Counterfactuals
Authors:
Greta Warren,
Mark T. Keane,
Christophe Gueret,
Eoin Delaney
Abstract:
Counterfactual explanations are an increasingly popular form of post hoc explanation due to their (i) applicability across problem domains, (ii) proposed legal compliance (e.g., with GDPR), and (iii) reliance on the contrastive nature of human explanation. Although counterfactual explanations are normally used to explain individual predictive-instances, we explore a novel use case in which groups of similar instances are explained in a collective fashion using "group counterfactuals" (e.g., to highlight a repeating pattern of illness in a group of patients). These group counterfactuals meet a human preference for coherent, broad explanations covering multiple events/instances. A novel, group-counterfactual algorithm is proposed to generate high-coverage explanations that are faithful to the to-be-explained model. This explanation strategy is also evaluated in a large, controlled user study (N=207), using objective (i.e., accuracy) and subjective (i.e., confidence, explanation satisfaction, and trust) psychological measures. The results show that group counterfactuals elicit modest but definite improvements in people's understanding of an AI system. The implications of these findings for counterfactual methods and for XAI are discussed.
Submitted 16 March, 2023;
originally announced March 2023.
-
Even if Explanations: Prior Work, Desiderata & Benchmarks for Semi-Factual XAI
Authors:
Saugat Aryal,
Mark T Keane
Abstract:
Recently, eXplainable AI (XAI) research has focused on counterfactual explanations as post-hoc justifications for AI-system decisions (e.g., a customer refused a loan might be told: If you asked for a loan with a shorter term, it would have been approved). Counterfactuals explain what changes to the input-features of an AI system change the output-decision. However, there is a sub-type of counterfactual, the semi-factual, that has received less attention in AI (though the Cognitive Sciences have studied them extensively). This paper surveys these literatures to summarise historical and recent breakthroughs in this area. It defines key desiderata for semi-factual XAI and reports benchmark tests of historical algorithms (along with a novel, naive method) to provide a solid basis for future algorithmic developments.
Submitted 8 May, 2023; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Explaining Classifications to Non Experts: An XAI User Study of Post Hoc Explanations for a Classifier When People Lack Expertise
Authors:
Courtney Ford,
Mark T Keane
Abstract:
Very few eXplainable AI (XAI) studies consider how users' understanding of explanations might change depending on whether they know more or less about the to-be-explained domain (i.e., whether they differ in their expertise). Yet, expertise is a critical facet of most high-stakes, human decision making (e.g., understanding how a trainee doctor differs from an experienced consultant). Accordingly, this paper reports a novel user study (N=96) on how people's expertise in a domain affects their understanding of post-hoc explanations by example for a deep-learning, black-box classifier. The results show that people's understanding of explanations for correct and incorrect classifications changes dramatically, on several dimensions (e.g., response times, perceptions of correctness and helpfulness), when the image-based domain considered is familiar (i.e., MNIST) as opposed to unfamiliar (i.e., Kannada MNIST). The wider implications of these new findings for XAI strategies are discussed.
Submitted 19 December, 2022;
originally announced December 2022.
-
Counterfactual Explanations for Misclassified Images: How Human and Machine Explanations Differ
Authors:
Eoin Delaney,
Arjun Pakrashi,
Derek Greene,
Mark T. Keane
Abstract:
Counterfactual explanations have emerged as a popular solution for the eXplainable AI (XAI) problem of elucidating the predictions of black-box deep-learning systems due to their psychological validity, flexibility across problem domains and proposed legal compliance. While over 100 counterfactual methods exist, claiming to generate plausible explanations akin to those preferred by people, few have actually been tested on users ($\sim7\%$). So, the psychological validity of these counterfactual algorithms for effective XAI for image data is not established. This issue is addressed here using a novel methodology that (i) gathers ground truth human-generated counterfactual explanations for misclassified images, in two user studies and, then, (ii) compares these human-generated ground-truth explanations to computationally-generated explanations for the same misclassifications. Results indicate that humans do not "minimally edit" images when generating counterfactual explanations. Instead, they make larger, "meaningful" edits that better approximate prototypes in the counterfactual class.
Submitted 16 December, 2022;
originally announced December 2022.
-
Features of Explainability: How users understand counterfactual and causal explanations for categorical and continuous features in XAI
Authors:
Greta Warren,
Mark T Keane,
Ruth M J Byrne
Abstract:
Counterfactual explanations are increasingly used to address interpretability, recourse, and bias in AI decisions. However, we do not know how well counterfactual explanations help users to understand a system's decisions, since no large-scale user studies have compared their efficacy to other sorts of explanations such as causal explanations (which have a longer track record of use in rule-based and decision tree models). It is also unknown whether counterfactual explanations are equally effective for categorical as for continuous features, although current methods assume they are. Hence, in a controlled user study with 127 volunteer participants, we tested the effects of counterfactual and causal explanations on the objective accuracy of users' predictions of the decisions made by a simple AI system, and participants' subjective judgments of satisfaction and trust in the explanations. We discovered a dissociation between objective and subjective measures: counterfactual explanations elicit higher accuracy of predictions than no-explanation control descriptions but no higher accuracy than causal explanations, yet counterfactual explanations elicit greater satisfaction and trust than causal explanations. We also found that users understand explanations referring to categorical features more readily than those referring to continuous features. We discuss the implications of these findings for current and future counterfactual methods in XAI.
Submitted 21 April, 2022;
originally announced April 2022.
-
Solving the Class Imbalance Problem Using a Counterfactual Method for Data Augmentation
Authors:
Mohammed Temraz,
Mark T. Keane
Abstract:
Learning from class-imbalanced datasets poses challenges for many machine learning algorithms. Many real-world domains are, by definition, class imbalanced by virtue of having a majority class that naturally has many more instances than its minority class (e.g., genuine bank transactions occur much more often than fraudulent ones). Many methods have been proposed to solve the class imbalance problem, among the most popular being oversampling techniques (such as SMOTE). These methods generate synthetic instances in the minority class, to balance the dataset, performing data augmentations that improve the performance of predictive machine learning (ML) models. In this paper we advance a novel data augmentation method (adapted from eXplainable AI), that generates synthetic, counterfactual instances in the minority class. Unlike other oversampling techniques, this method adaptively combines existing instances from the dataset, using actual feature-values rather than interpolating values between instances. Several experiments using four different classifiers and 25 datasets are reported, which show that this Counterfactual Augmentation method (CFA) generates useful synthetic data points in the minority class. The experiments also show that CFA is competitive with many other oversampling methods, many of which are variants of SMOTE. The basis for CFA's performance is discussed, along with the conditions under which it is likely to perform better or worse in future tests.
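The contrast between interpolating values and reusing actual feature-values can be made concrete with a small sketch. This is not the CFA algorithm from the paper: `smote_style_interpolate` shows the standard SMOTE-style interpolation step, while `recombine_actual_values` is a hypothetical helper that only illustrates building a synthetic instance from values already observed in the dataset.

```python
import random


def smote_style_interpolate(x, neighbour, rng):
    """SMOTE-style step: pick a random point on the segment from x to neighbour."""
    lam = rng.random()  # interpolation weight in [0, 1)
    return [a + lam * (b - a) for a, b in zip(x, neighbour)]


def recombine_actual_values(x, donor, donor_features):
    """Hypothetical helper: copy the chosen features from donor, keep the rest of x.

    Every value in the result was observed in the dataset; none is interpolated.
    """
    return [donor[i] if i in donor_features else v for i, v in enumerate(x)]


# Toy minority-class instances (assumed data, for illustration only).
minority = [[1.0, 10.0], [3.0, 30.0]]
rng = random.Random(0)

interpolated = smote_style_interpolate(minority[0], minority[1], rng)
recombined = recombine_actual_values(minority[0], minority[1], {1})

print("interpolated:", interpolated)  # values generally not present in the data
print("recombined:  ", recombined)    # mixes observed values from both instances
```

The interpolated instance generally contains feature values that no real instance has, whereas the recombined instance is assembled entirely from observed values, which is the property the abstract emphasises.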
Submitted 5 November, 2021;
originally announced November 2021.
-
Voice-assisted Image Labelling for Endoscopic Ultrasound Classification using Neural Networks
Authors:
Ester Bonmati,
Yipeng Hu,
Alexander Grimwood,
Gavin J. Johnson,
George Goodchild,
Margaret G. Keane,
Kurinchi Gurusamy,
Brian Davidson,
Matthew J. Clarkson,
Stephen P. Pereira,
Dean C. Barratt
Abstract:
Ultrasound imaging is a commonly used technology for visualising patient anatomy in real-time during diagnostic and therapeutic procedures. High operator dependency and low reproducibility make ultrasound imaging and interpretation challenging, with a steep learning curve. Automatic image classification using deep learning has the potential to overcome some of these challenges by supporting ultrasound training in novices, as well as aiding ultrasound image interpretation in patients with complex pathology for more experienced practitioners. However, the use of deep learning methods requires a large amount of data in order to provide accurate results. Labelling large ultrasound datasets is a challenging task because labels are retrospectively assigned to 2D images without the 3D spatial context available in vivo or that would be inferred while visually tracking structures between frames during the procedure. In this work, we propose a multi-modal convolutional neural network (CNN) architecture that labels endoscopic ultrasound (EUS) images from raw verbal comments provided by a clinician during the procedure. We use a CNN composed of two branches, one for voice data and another for image data, which are joined to predict image labels from the spoken names of anatomical landmarks. The network was trained using recorded verbal comments from expert operators. Our results show a prediction accuracy of 76% at image level on a dataset with 5 different labels. We conclude that the addition of spoken commentaries can increase the performance of ultrasound image classification, and eliminate the burden of manually labelling large EUS datasets necessary for deep learning applications.
Submitted 12 October, 2021;
originally announced October 2021.
-
Uncertainty Estimation and Out-of-Distribution Detection for Counterfactual Explanations: Pitfalls and Solutions
Authors:
Eoin Delaney,
Derek Greene,
Mark T. Keane
Abstract:
Whilst an abundance of techniques have recently been proposed to generate counterfactual explanations for the predictions of opaque black-box systems, markedly less attention has been paid to exploring the uncertainty of these generated explanations. This becomes a critical issue in high-stakes scenarios, where uncertain and misleading explanations could have dire consequences (e.g., medical diagnosis and treatment planning). Moreover, it is often difficult to determine if the generated explanations are well grounded in the training data and sensitive to distributional shifts. This paper proposes several practical solutions that can be leveraged to solve these problems by establishing novel connections with other research works in explainability (e.g., trust scores) and uncertainty estimation (e.g., Monte Carlo Dropout). Two experiments demonstrate the utility of our proposed solutions.
Submitted 20 July, 2021;
originally announced July 2021.
-
Twin Systems for DeepCBR: A Menagerie of Deep Learning and Case-Based Reasoning Pairings for Explanation and Data Augmentation
Authors:
Mark T Keane,
Eoin M Kenny,
Mohammed Temraz,
Derek Greene,
Barry Smyth
Abstract:
Recently, it has been proposed that fruitful synergies may exist between Deep Learning (DL) and Case Based Reasoning (CBR); that there are insights to be gained by applying CBR ideas to problems in DL (what could be called DeepCBR). In this paper, we report on a program of research that applies CBR solutions to the problem of Explainable AI (XAI) in DL. We describe a series of twin-systems pairings of opaque DL models with transparent CBR models that allow the latter to explain the former using factual, counterfactual and semi-factual explanation strategies. This twinning shows that functional abstractions of DL (e.g., feature weights, feature importance and decision boundaries) can be used to drive these explanatory solutions. We also raise the prospect that this research applies to the problem of Data Augmentation in DL, underscoring the fecundity of these DeepCBR ideas.
Submitted 13 June, 2021; v1 submitted 29 April, 2021;
originally announced April 2021.
-
Handling Climate Change Using Counterfactuals: Using Counterfactuals in Data Augmentation to Predict Crop Growth in an Uncertain Climate Future
Authors:
Mohammed Temraz,
Eoin Kenny,
Elodie Ruelle,
Laurence Shalloo,
Barry Smyth,
Mark T Keane
Abstract:
Climate change poses a major challenge to humanity, especially in its impact on agriculture, a challenge that a responsible AI should meet. In this paper, we examine a CBR system (PBI-CBR) designed to aid sustainable dairy farming by supporting grassland management, through accurate crop growth prediction. As climate changes, PBI-CBR's historical cases become less useful in predicting future grass growth. Hence, we extend PBI-CBR using data augmentation, to specifically handle disruptive climate events, using a counterfactual method (from XAI). Study 1 shows that historical, extreme climate-events (climate outlier cases) tend to be used by PBI-CBR to predict grass growth during climate-disrupted periods. Study 2 shows that synthetic outliers, generated as counterfactuals on an outlier-boundary, improve the predictive accuracy of PBI-CBR during the drought of 2018. This study also shows that an instance-based counterfactual method does better than a benchmark, constraint-guided method.
Submitted 8 April, 2021;
originally announced April 2021.
-
If Only We Had Better Counterfactual Explanations: Five Key Deficits to Rectify in the Evaluation of Counterfactual XAI Techniques
Authors:
Mark T Keane,
Eoin M Kenny,
Eoin Delaney,
Barry Smyth
Abstract:
In recent years, there has been an explosion of AI research on counterfactual explanations as a solution to the problem of eXplainable AI (XAI). These explanations seem to offer technical, psychological and legal benefits over other explanation techniques. We survey 100 distinct counterfactual explanation methods reported in the literature. This survey addresses the extent to which these methods have been adequately evaluated, both psychologically and computationally, and quantifies the shortfalls occurring. For instance, only 21% of these methods have been user tested. Five key deficits in the evaluation of these methods are detailed and a roadmap, with standardised benchmark evaluations, is proposed to resolve the issues arising; issues that currently effectively block scientific progress in this field.
Submitted 26 February, 2021;
originally announced March 2021.
-
A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse Counterfactual Explanations
Authors:
Barry Smyth,
Mark T Keane
Abstract:
Counterfactual explanations provide a potentially significant solution to the Explainable AI (XAI) problem, but good, native counterfactuals have been shown to rarely occur in most datasets. Hence, the most popular methods generate synthetic counterfactuals using blind perturbation. However, such methods have several shortcomings: the resulting counterfactuals (i) may not be valid data-points (they often use features that do not naturally occur), (ii) may lack the sparsity of good counterfactuals (if they modify too many features), and (iii) may lack diversity (if the generated counterfactuals are minimal variants of one another). We describe a method designed to overcome these problems, one that adapts native counterfactuals in the original dataset, to generate sparse, diverse synthetic counterfactuals from naturally occurring features. A series of experiments are reported that systematically explore parametric variations of this novel method on common datasets to establish the conditions for optimal performance.
Submitted 22 January, 2021;
originally announced January 2021.
-
Predicting Illness for a Sustainable Dairy Agriculture: Predicting and Explaining the Onset of Mastitis in Dairy Cows
Authors:
Cathal Ryan,
Christophe Guéret,
Donagh Berry,
Medb Corcoran,
Mark T. Keane,
Brian Mac Namee
Abstract:
Mastitis is a billion-dollar health problem for the modern dairy industry, with implications for antibiotic resistance. The use of AI techniques to identify the early onset of this disease thus has significant implications for the sustainability of this agricultural sector. Current approaches to treating mastitis involve antibiotics and this practice is coming under ever-increasing scrutiny. Using machine learning models to identify cows at risk of developing mastitis and applying targeted treatment regimes to only those animals promotes a more sustainable approach. Incorrect predictions from such models, however, can lead to monetary losses, unnecessary use of antibiotics, and even the premature death of animals, so it is important to generate compelling explanations for predictions to build trust with users and to better support their decision making. In this paper we demonstrate a system developed to predict mastitis infections in cows and provide explanations of these predictions using counterfactuals. We demonstrate the system and describe the engagement with farmers undertaken to build it.
Submitted 7 January, 2021; v1 submitted 6 January, 2021;
originally announced January 2021.
-
Instance-based Counterfactual Explanations for Time Series Classification
Authors:
Eoin Delaney,
Derek Greene,
Mark T. Keane
Abstract:
In recent years, there has been a rapidly expanding focus on explaining the predictions made by black-box AI systems that handle image and tabular data. However, considerably less attention has been paid to explaining the predictions of opaque AI systems handling time series data. In this paper, we advance a novel model-agnostic, case-based technique -- Native Guide -- that generates counterfactual explanations for time series classifiers. Given a query time series, $T_{q}$, for which a black-box classification system predicts class, $c$, a counterfactual time series explanation shows how $T_{q}$ could change, such that the system predicts an alternative class, $c'$. The proposed instance-based technique adapts existing counterfactual instances in the case-base by highlighting and modifying discriminative areas of the time series that underlie the classification. Quantitative and qualitative results from two comparative experiments indicate that Native Guide generates plausible, proximal, sparse and diverse explanations that are better than those produced by key benchmark counterfactual methods.
Submitted 24 June, 2021; v1 submitted 28 September, 2020;
originally announced September 2020.
-
On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning
Authors:
Eoin M. Kenny,
Mark T. Keane
Abstract:
There is a growing concern that the recent progress made in AI, especially regarding the predictive competence of deep learning models, will be undermined by a failure to properly explain their operation and outputs. In response to this disquiet counterfactual explanations have become massively popular in eXplainable AI (XAI) due to their proposed computational psychological, and legal benefits. In contrast however, semifactuals, which are a similar way humans commonly explain their reasoning, have surprisingly received no attention. Most counterfactual methods address tabular rather than image data, partly due to the nondiscrete nature of the latter making good counterfactuals difficult to define. Additionally generating plausible looking explanations which lie on the data manifold is another issue which hampers progress. This paper advances a novel method for generating plausible counterfactuals (and semifactuals) for black box CNN classifiers doing computer vision. The present method, called PlausIble Exceptionality-based Contrastive Explanations (PIECE), modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class (hence concretely defining a counterfactual). Two controlled experiments compare this method to others in the literature, showing that PIECE not only generates the most plausible counterfactuals on several measures, but also the best semifactuals.
△ Less
Submitted 10 September, 2020;
originally announced September 2020.
-
Play MNIST For Me! User Studies on the Effects of Post-Hoc, Example-Based Explanations & Error Rates on Debugging a Deep Learning, Black-Box Classifier
Authors:
Courtney Ford,
Eoin M. Kenny,
Mark T. Keane
Abstract:
This paper reports two experiments (N=349) on the impact of post hoc explanations by example and error rates on people's perceptions of a black-box classifier. Both experiments show that when people are given case-based explanations, from an implemented ANN-CBR twin system, they perceive misclassifications to be more correct. They also show that as error rates increase above 4%, people trust the classifier less and view it as being less correct, less reasonable and less trustworthy. The implications of these results for XAI are discussed.
Submitted 10 September, 2020;
originally announced September 2020.
-
Good Counterfactuals and Where to Find Them: A Case-Based Technique for Generating Counterfactuals for Explainable AI (XAI)
Authors:
Mark T. Keane,
Barry Smyth
Abstract:
Recently, a groundswell of research has identified the use of counterfactual explanations as a potentially significant solution to the Explainable AI (XAI) problem. It is argued that (a) technically, these counterfactual cases can be generated by permuting problem-features until a class change is found, (b) psychologically, they are much more causally informative than factual explanations, (c) legally, they are GDPR-compliant. However, there are issues around the finding of good counterfactuals using current techniques (e.g. sparsity and plausibility). We show that many commonly-used datasets appear to have few good counterfactuals for explanation purposes. So, we propose a new case-based approach for generating counterfactuals using novel ideas about the counterfactual potential and explanatory coverage of a case-base. The new technique reuses patterns of good counterfactuals, present in a case-base, to generate analogous counterfactuals that can explain new problems and their solutions. Several experiments show how this technique can improve the counterfactual potential and explanatory coverage of case-bases that were previously found wanting.
Submitted 26 May, 2020;
originally announced May 2020.
-
The Twin-System Approach as One Generic Solution for XAI: An Overview of ANN-CBR Twins for Explaining Deep Learning
Authors:
Mark T. Keane,
Eoin M. Kenny
Abstract:
The notion of twin systems is proposed to address the eXplainable AI (XAI) problem, where an uninterpretable black-box system is mapped to a white-box 'twin' that is more interpretable. In this short paper, we overview very recent work that advances a generic solution to the XAI problem, the so-called twin-system approach. The most popular twinning in the literature is that between an Artificial Neural Network (ANN) as a black box and a Case-Based Reasoning (CBR) system as a white box, where the latter acts as an interpretable proxy for the former. We outline how recent work reviving this idea has applied it to deep learning methods. Furthermore, we detail the many fruitful directions in which this work may be taken, such as determining the most (i) accurate feature-weighting methods to be used, (ii) appropriate deployments for explanatory cases, (iii) useful cases of explanatory value to users.
Submitted 20 May, 2019;
originally announced May 2019.
-
The Unexpected Unexpected and the Expected Unexpected: How People's Conception of the Unexpected is Not That Unexpected
Authors:
Molly S Quinn,
Kathleen Campbell,
Mark T Keane
Abstract:
The answers people give when asked to 'think of the unexpected' for everyday event scenarios appear to be more expected than unexpected. There are expected unexpected outcomes that closely adhere to the given information in a scenario, based on familiar disruptions and common plan-failures. There are also unexpected unexpected outcomes that are more inventive, that depart from given information, adding new concepts/actions. However, people seem to tend to conceive of the unexpected as the former more than the latter. Study 1 tests these proposals by analysing the object-concepts people mention in their reports of the unexpected and the agreement between their answers. Study 2 shows that object-choices are weakly influenced by recency, the order of sentences in the scenario. The implications of these results for ideas in philosophy, psychology and computing are discussed.
Submitted 17 May, 2019;
originally announced May 2019.
-
How Case Based Reasoning Explained Neural Networks: An XAI Survey of Post-Hoc Explanation-by-Example in ANN-CBR Twins
Authors:
Mark T Keane,
Eoin M Kenny
Abstract:
This paper surveys an approach to the XAI problem, using post-hoc explanation by example, that hinges on twinning Artificial Neural Networks (ANNs) with Case-Based Reasoning (CBR) systems, so-called ANN-CBR twins. A systematic survey of 1100+ papers was carried out to identify the fragmented literature on this topic and to trace its influence through to more recent work involving Deep Neural Networks (DNNs). The paper argues that this twin-system approach, especially using ANN-CBR twins, presents one possible coherent, generic solution to the XAI problem (and, indeed, XCBR problem). The paper concludes by road-mapping some future directions for this XAI solution involving (i) further tests of feature-weighting techniques, (ii) explorations of how explanatory cases might best be deployed (e.g., in counterfactuals, near-miss cases, a fortiori cases), and (iii) the raising of the unwelcome, and much ignored, issue of human user evaluation.
Submitted 17 May, 2019;
originally announced May 2019.
-
Plotting Markson's 'Mistress'
Authors:
Kelleher Conor,
Mark T. Keane
Abstract:
The post-modern novel 'Wittgenstein's Mistress' by David Markson (1988) presents the reader with a very challenging non-linear narrative that itself appears to be one of the novel's themes. We present a distant reading of this work designed to complement a close reading of it by David Foster Wallace (1990). Using a combination of text analysis, entity recognition and networks, we plot repetitive structures in the novel's narrative, relating them to its critical analysis.
Submitted 17 May, 2019;
originally announced May 2019.
-
Beef Cattle Instance Segmentation Using Fully Convolutional Neural Network
Authors:
Aram Ter-Sarkisov,
Robert Ross,
John Kelleher,
Bernadette Earley,
Michael Keane
Abstract:
We present an instance segmentation algorithm trained and applied to a CCTV recording of beef cattle during a winter finishing period. A fully convolutional network was transformed into an instance segmentation network that learns to label each instance of an animal separately. We introduce a conceptually simple framework that the network uses to output a single prediction for every animal. These results are a contribution towards behaviour analysis in winter finishing beef cattle for early detection of animal welfare-related problems.
Submitted 20 September, 2018; v1 submitted 5 July, 2018;
originally announced July 2018.
-
On Supporting Digital Journalism: Case Studies in Co-Designing Journalistic Tools
Authors:
Georgiana Ifrim,
Derek Greene,
Mark T. Keane,
Claudia Orellana-Rodriguez,
Bichen Shi,
Gevorg Poghosyan
Abstract:
Since 2013 researchers at University College Dublin in the Insight Centre for Data Analytics have been involved in a significant research programme in digital journalism, specifically targeting tools and social media guidelines to support the work of journalists. Most of this programme was undertaken in collaboration with The Irish Times. This collaboration involved identifying key problems currently faced by digital journalists, developing tools as solutions to these problems, and then iteratively co-designing these tools with feedback from journalists. This paper reports on our experiences and learnings from this research programme, with a view to informing similar efforts in the future.
Submitted 14 October, 2017;
originally announced October 2017.
-
Helping News Editors Write Better Headlines: A Recommender to Improve the Keyword Contents & Shareability of News Headlines
Authors:
Terrence Szymanski,
Claudia Orellana-Rodriguez,
Mark T. Keane
Abstract:
We present a software tool that employs state-of-the-art natural language processing (NLP) and machine learning techniques to help newspaper editors compose effective headlines for online publication. The system identifies the most salient keywords in a news article and ranks them based on both their overall popularity and their direct relevance to the article. The system also uses a supervised regression model to identify headlines that are likely to be widely shared on social media. The user interface is designed to simplify and speed the editor's decision process on the composition of the headline. As such, the tool provides an efficient way to combine the benefits of automated predictors of engagement and search-engine optimization (SEO) with human judgments of overall headline quality.
Submitted 26 May, 2017;
originally announced May 2017.
-
A Computational Theory of Subjective Probability
Authors:
Phil Maguire,
Philippe Moser,
Rebecca Maguire,
Mark Keane
Abstract:
In this article we demonstrate how algorithmic probability theory is applied to situations that involve uncertainty. When people are unsure of their model of reality, then the outcome they observe will cause them to update their beliefs. We argue that classical probability cannot be applied in such cases, and that subjective probability must instead be used. In Experiment 1 we show that, when judging the probability of lottery number sequences, people apply subjective rather than classical probability. In Experiment 2 we examine the conjunction fallacy and demonstrate that the materials used by Tversky and Kahneman (1983) involve model uncertainty. We then provide a formal mathematical proof that, for every uncertain model, there exists a conjunction of outcomes which is more subjectively probable than either of its constituents in isolation.
Submitted 8 May, 2014;
originally announced May 2014.
-
It's distributions all the way down!: Second order changes in statistical distributions also occur
Authors:
M. T. Keane,
A. Gerow
Abstract:
The textual, big-data literature misses Bentley, O'Brien, & Brock's (Bentley et al.'s) message on distributions; it largely examines the first-order effects of how a single, signature distribution can predict population behaviour, neglecting second-order effects involving distributional shifts, either between signature distributions or within a given signature distribution. Indeed, Bentley et al. themselves under-emphasise the potential richness of the latter, within-distribution effects.
Submitted 27 February, 2014;
originally announced February 2014.
-
Cognitive residues of similarity
Authors:
Stephanie O'Toole,
Mark T. Keane
Abstract:
What are the cognitive after-effects of making a similarity judgement? What, cognitively, is left behind and what effect might these residues have on subsequent processing? In this paper, we probe for such after-effects using a visual search task, performed after a task in which pictures of real-world objects were compared. So, target objects were first presented in a comparison task (e.g., rate the similarity of this object to another), thus, presumably, modifying some of their features, before asking people to visually search for the same object in complex scenes (with distractors and camouflaged backgrounds). As visual search is known to be influenced by the features of target objects, any after-effects of the comparison task should be revealed in subsequent visual searches. Results showed that when people previously rated an object as being high on a scale (e.g., colour similarity or general similarity), visual search was inhibited (slower RTs and more saccades in eye-tracking) relative to an object being rated as low on the same scale. There was also some evidence that different comparison tasks (e.g., compare on colour or compare on general similarity) have differential effects on visual search.
Submitted 9 August, 2013;
originally announced August 2013.
-
Surprise: You've got some explaining to do
Authors:
Meadhbh Foster,
Mark T. Keane
Abstract:
Why are some events more surprising than others? We propose that events that are more difficult to explain are those that are more surprising. The two experiments reported here test the impact of different event outcomes (Outcome-Type) and task demands (Task) on ratings of surprise for simple story scenarios. For the Outcome-Type variable, participants saw outcomes that were either known or less-known surprising outcomes for each scenario. For the Task variable, participants either answered comprehension questions or provided an explanation of the outcome. Outcome-Type reliably affected surprise judgments; known outcomes were rated as less surprising than less-known outcomes. Task also reliably affected surprise judgments; when people provided an explanation it lowered surprise judgments relative to simply answering comprehension questions. Both experiments thus provide evidence on this less-explored explanation aspect of surprise, specifically showing that ease of explanation is a key factor in determining the level of surprise experienced.
Submitted 9 August, 2013;
originally announced August 2013.
-
Innovation networks
Authors:
Petra Ahrweiler,
Mark T. Keane
Abstract:
This paper advances a framework for modeling the component interactions between cognitive and social aspects of scientific creativity and technological innovation. Specifically, it aims to characterize Innovation Networks; those networks that involve the interplay of people, ideas and organizations to create new, technologically feasible, commercially-realizable products, processes and organizational structures. The tri-partite framework captures networks of ideas (Concept Level), people (Individual Level) and social structures (Social-Organizational Level) and the interactions between these levels. At the concept level, new ideas are the nodes that are created and linked, kept open for further investigation or closed if solved by actors at the individual or organizational levels. At the individual level, the nodes are actors linked by shared worldviews (based on shared professional, educational, experiential backgrounds) who are the builders of the concept level. At the social-organizational level, the nodes are organizations linked by common efforts on a given project (e.g., a company-university collaboration) that by virtue of their intellectual property or rules of governance constrain the actions of individuals (at the Individual Level) or ideas (at the Concept Level). After describing this framework and its implications we paint a number of scenarios to flesh out how it can be applied.
Submitted 9 August, 2013;
originally announced August 2013.
-
Deconstructing analogy
Authors:
Mark Keane
Abstract:
Analogy has been shown to be important in many key cognitive abilities, including learning, problem solving, creativity and language change. For cognitive models of analogy, the fundamental computational question is how its inherent complexity (its NP-hardness) is solved by the human cognitive system. Indeed, different models of analogical processing can be categorized by the simplification strategies they adopt to make this computational problem more tractable. In this paper, I deconstruct several of these models in terms of the simplification-strategies they use; a deconstruction that provides some interesting perspectives on the relative differences between them. Later, I consider whether any of these computational simplifications reflect the actual strategies used by people and sketch a new cognitive model that tries to present a closer fit to the psychological evidence.
Submitted 9 August, 2013;
originally announced August 2013.
-
Identifying Metaphoric Antonyms in a Corpus Analysis of Finance Articles
Authors:
Aaron Gerow,
Mark Keane
Abstract:
Using a corpus of 17,000+ financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP and DOWN verbs used to describe movements of indices, stocks and shares. In Study 1 participants identified antonyms of these verbs in a free-response task and a matching task from which the most commonly identified antonyms were compiled. In Study 2, we determined whether the argument-distributions for the verbs in these antonym-pairs were sufficiently similar to predict the most frequently-identified antonym. Cosine similarity correlates moderately with the proportions of antonym-pairs identified by people (r = 0.31). More impressively, 87% of the time the most frequently-identified antonym is either the first- or second-most similar pair in the set of alternatives. The implications of these results for distributional approaches to determining metaphoric knowledge are discussed.
Submitted 4 February, 2013; v1 submitted 13 December, 2012;
originally announced December 2012.
-
Identifying Metaphor Hierarchies in a Corpus Analysis of Finance Articles
Authors:
Aaron Georw,
Mark Keane
Abstract:
Using a corpus of over 17,000 financial news reports (involving over 10M words), we perform an analysis of the argument-distributions of the UP- and DOWN-verbs used to describe movements of indices, stocks, and shares. Using measures of the overlap in the argument distributions of these verbs and k-means clustering of their distributions, we advance evidence for the proposal that the metaphors referred to by these verbs are organised into hierarchical structures of superordinate and subordinate groups.
Submitted 13 December, 2012;
originally announced December 2012.
-
Mining the Web for the Voice of the Herd to Track Stock Market Bubbles
Authors:
Aaron Gerow,
Mark Keane
Abstract:
We show that power-law analyses of financial commentaries from newspaper web-sites can be used to identify stock market bubbles, supplementing traditional volatility analyses. Using a four-year corpus of 17,713 online, finance-related articles (10M+ words) from the Financial Times, the New York Times, and the BBC, we show that week-to-week changes in power-law distributions reflect market movements of the Dow Jones Industrial Average (DJI), the FTSE-100, and the NIKKEI-225. Notably, the statistical regularities in language track the 2007 stock market bubble, showing emerging structure in the language of commentators, as progressively greater agreement arose in their positive perceptions of the market. Furthermore, during the bubble period, a marked divergence in positive language occurs as revealed by a Kullback-Leibler analysis.
Submitted 11 December, 2012;
originally announced December 2012.