-
Validation of Conformal Prediction in Cervical Atypia Classification
Authors:
Misgina Tsighe Hagos,
Antti Suutala,
Dmitrii Bychkov,
Hakan Kücükel,
Joar von Bahr,
Milda Poceviciute,
Johan Lundin,
Nina Linder,
Claes Lundström
Abstract:
Deep learning based cervical cancer classification can potentially increase access to screening in low-resource regions. However, deep learning models are often overconfident and do not reliably reflect diagnostic uncertainty. Moreover, they are typically optimized to generate maximum-likelihood predictions, which fail to convey uncertainty or ambiguity in their results. Such challenges can be addressed using conformal prediction, a model-agnostic framework for generating prediction sets that contain likely classes for trained deep-learning models. The size of these prediction sets indicates model uncertainty, contracting as model confidence increases. However, existing conformal prediction evaluation primarily focuses on whether the prediction set includes or covers the true class, often overlooking the presence of extraneous classes. We argue that prediction sets should be truthful and valuable to end users, ensuring that the listed likely classes align with human expectations rather than being overly relaxed and including false positives or unlikely classes. In this study, we comprehensively validate conformal prediction sets using expert annotation sets collected from multiple annotators. We evaluate three conformal prediction approaches applied to three deep-learning models trained for cervical atypia classification. Our expert annotation-based analysis reveals that conventional coverage-based evaluations overestimate performance and that current conformal prediction methods often produce prediction sets that are not well aligned with human labels. Additionally, we explore the capabilities of the conformal prediction methods in identifying ambiguous and out-of-distribution data.
Submitted 13 May, 2025;
originally announced May 2025.
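The prediction-set idea in the abstract above can be sketched with standard split conformal prediction. This is a minimal illustration, not necessarily one of the three approaches the paper evaluates: the nonconformity score (one minus the softmax probability of the true class), the function names, and the toy calibration data are all assumptions made for the example.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold computed from held-out calibration data."""
    # Nonconformity score (assumed): 1 - probability given to the true class.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: keep every class.
        return float("inf")
    return scores[k - 1]

def prediction_set(probs, qhat):
    """Indices of all classes whose nonconformity score is within qhat."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= qhat]

# Toy calibration data (hypothetical): softmax outputs and true labels.
cal_probs = [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1]]
cal_labels = [0, 1, 2, 1]
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

confident = prediction_set([0.85, 0.10, 0.05], qhat)  # one class survives
ambiguous = prediction_set([0.45, 0.45, 0.10], qhat)  # two classes survive
```

The confident input yields a single-class set and the ambiguous input a two-class set, which is the sense in which set size tracks model uncertainty. Comparing such sets against expert annotation sets, as the paper does, additionally asks whether any included class is one that annotators would not have listed.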
-
Rethinking Knee Osteoarthritis Severity Grading: A Few Shot Self-Supervised Contrastive Learning Approach
Authors:
Niamh Belton,
Misgina Tsighe Hagos,
Aonghus Lawlor,
Kathleen M. Curran
Abstract:
Knee Osteoarthritis (OA) is a debilitating disease affecting over 250 million people worldwide. Currently, radiologists grade the severity of OA on an ordinal scale from zero to four using the Kellgren-Lawrence (KL) system. Recent studies have raised concern in relation to the subjectivity of the KL grading system, highlighting the requirement for an automated system, while also indicating that five ordinal classes may not be the most appropriate approach for assessing OA severity. This work presents preliminary results of an automated system with a continuous grading scale. This system, namely SS-FewSOME, uses self-supervised pre-training to learn robust representations of the features of healthy knee X-rays. It then assesses the OA severity by the X-rays' distance to the normal representation space. SS-FewSOME initially trains on only 'few' examples of healthy knee X-rays, thus reducing the barriers to clinical implementation by eliminating the need for large training sets and costly expert annotations that existing automated systems require. The work reports promising initial results, obtaining a positive Spearman Rank Correlation Coefficient of 0.43, having had access to only 30 ground truth labels at training time.
Submitted 18 June, 2024;
originally announced July 2024.
-
Distance-Aware eXplanation Based Learning
Authors:
Misgina Tsighe Hagos,
Niamh Belton,
Kathleen M. Curran,
Brian Mac Namee
Abstract:
eXplanation Based Learning (XBL) is an interactive learning approach that provides a transparent method of training deep learning models by interacting with their explanations. XBL augments loss functions to penalize a model based on deviation of its explanations from user annotation of image features. The literature on XBL mostly depends on the intersection of visual model explanations and image feature annotations. We present a method to add a distance-aware explanation loss to categorical losses that trains a learner to focus on important regions of a training dataset. Distance is an appropriate approach for calculating explanation loss since visual model explanations such as Gradient-weighted Class Activation Mapping (Grad-CAMs) are not strictly bounded as annotations and their intersections may not provide complete information on the deviation of a model's focus from relevant image regions. In addition to assessing our model using existing metrics, we propose an interpretability metric for evaluating visual feature-attribution based model explanations that is more informative of the model's performance than existing metrics. We demonstrate performance of our proposed method on three image classification tasks.
Submitted 11 September, 2023;
originally announced September 2023.
-
Unlearning Spurious Correlations in Chest X-ray Classification
Authors:
Misgina Tsighe Hagos,
Kathleen M. Curran,
Brian Mac Namee
Abstract:
Medical image classification models are frequently trained using training datasets derived from multiple data sources. While leveraging multiple data sources is crucial for achieving model generalization, it is important to acknowledge that the diverse nature of these sources inherently introduces unintended confounders and other challenges that can impact both model accuracy and transparency. A notable confounding factor in medical image classification, particularly in musculoskeletal image classification, is skeletal maturation-induced bone growth observed during adolescence. We train a deep learning model using a Covid-19 chest X-ray dataset and we showcase how this dataset can lead to spurious correlations due to unintended confounding regions. eXplanation Based Learning (XBL) is a deep learning approach that goes beyond interpretability by utilizing model explanations to interactively unlearn spurious correlations. This is achieved by integrating interactive user feedback, specifically feature annotations. In our study, we employed two non-demanding manual feedback mechanisms to implement an XBL-based approach for effectively eliminating these spurious correlations. Our results underscore the promising potential of XBL in constructing robust models even in the presence of confounding factors.
Submitted 3 August, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Learning from Exemplary Explanations
Authors:
Misgina Tsighe Hagos,
Kathleen M. Curran,
Brian Mac Namee
Abstract:
eXplanation Based Learning (XBL) is a form of Interactive Machine Learning (IML) that provides a model refining approach via user feedback collected on model explanations. Although the interactivity of XBL promotes model transparency, XBL requires a huge amount of user interaction and can become expensive as feedback is in the form of detailed annotation rather than simple category labelling which is more common in IML. This expense is exacerbated in high stakes domains such as medical image classification. To reduce the effort and expense of XBL we introduce a new approach that uses two input instances and their corresponding Gradient Weighted Class Activation Mapping (GradCAM) model explanations as exemplary explanations to implement XBL. Using a medical image classification task, we demonstrate that, using minimal human input, our approach produces improved explanations (+0.02, +3%) and achieves reduced classification performance (-0.04, -4%) when compared against a model trained without interactions.
Submitted 12 July, 2023;
originally announced July 2023.
-
Interpretable Weighted Siamese Network to Predict the Time to Onset of Alzheimer's Disease from MRI Images
Authors:
Misgina Tsighe Hagos,
Niamh Belton,
Ronan P. Killeen,
Kathleen M. Curran,
Brian Mac Namee
Abstract:
Alzheimer's Disease (AD) is a progressive disease preceded by Mild Cognitive Impairment (MCI). Early detection of AD is crucial for making treatment decisions. However, most of the literature on computer-assisted detection of AD focuses on classifying brain images into one of three major categories: healthy, MCI, and AD; or categorizing MCI patients into (1) progressive: those who progress from MCI to AD at a future examination time, and (2) stable: those who stay as MCI and never progress to AD. This misses the opportunity to accurately identify the trajectory of progressive MCI patients. In this paper, we revisit the brain image classification task for AD identification and re-frame it as an ordinal classification task to predict how close a patient is to the severe AD stage. To this end, we select progressive MCI patients from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and construct an ordinal dataset with a prediction target that indicates the time to progression to AD. We train a Siamese network model to predict the time to onset of AD based on MRI brain images. We also propose a Weighted variety of Siamese network and compare its performance to a baseline model. Our evaluations show that incorporating a weighting factor to Siamese networks brings considerable performance gain at predicting how close input brain MRI images are to progressing to AD. Moreover, we complement our results with an interpretation of the learned embedding space of the Siamese networks using a model explainability technique.
Submitted 14 September, 2023; v1 submitted 14 April, 2023;
originally announced April 2023.
-
FewSOME: One-Class Few Shot Anomaly Detection with Siamese Networks
Authors:
Niamh Belton,
Misgina Tsighe Hagos,
Aonghus Lawlor,
Kathleen M. Curran
Abstract:
Recent Anomaly Detection techniques have progressed the field considerably but at the cost of increasingly complex training pipelines. Such techniques require large amounts of training data, resulting in computationally expensive algorithms that are unsuitable for settings where only a small number of normal samples are available for training. We propose 'Few Shot anOMaly detection' (FewSOME), a deep One-Class Anomaly Detection algorithm with the ability to accurately detect anomalies having trained on 'few' examples of the normal class and no examples of the anomalous class. We describe FewSOME to be of low complexity given its low data requirement and short training time. FewSOME is aided by pretrained weights with an architecture based on Siamese Networks. By means of an ablation study, we demonstrate how our proposed loss, 'Stop Loss', improves the robustness of FewSOME. Our experiments demonstrate that FewSOME performs at state-of-the-art level on benchmark datasets MNIST, CIFAR-10, F-MNIST and MVTec AD while training on only 30 normal samples, a minute fraction of the data that existing methods are trained on. Moreover, our experiments show FewSOME to be robust to contaminated datasets. We also report F1 score and balanced accuracy in addition to AUC as a benchmark for future techniques to be compared against. Code available: https://github.com/niamhbelton/FewSOME.
Submitted 12 June, 2023; v1 submitted 17 January, 2023;
originally announced January 2023.
-
Identifying Spurious Correlations and Correcting them with an Explanation-based Learning
Authors:
Misgina Tsighe Hagos,
Kathleen M. Curran,
Brian Mac Namee
Abstract:
Identifying spurious correlations learned by a trained model is at the core of refining a trained model and building a trustworthy model. We present a simple method to identify spurious correlations that have been learned by a model trained for image classification problems. We apply image-level perturbations and monitor changes in certainties of predictions made using the trained model. We demonstrate this approach using an image classification dataset that contains images with synthetically generated spurious regions and show that the trained model was overdependent on spurious regions. Moreover, we remove the learned spurious correlations with an explanation based learning approach.
Submitted 5 December, 2022; v1 submitted 15 November, 2022;
originally announced November 2022.
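The perturb-and-monitor idea in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the rectangular masking perturbation, the `confidence_drop` helper, and the toy classifier that keys on a corner pixel are all hypothetical.

```python
def confidence_drop(predict, image, region, fill=0.0):
    """Drop in the originally predicted class's confidence after masking a region."""
    r0, r1, c0, c1 = region
    base = predict(image)
    top = max(range(len(base)), key=base.__getitem__)
    # Copy the image so the caller's data is left untouched.
    perturbed = [row[:] for row in image]
    for r in range(r0, r1):
        for c in range(c0, c1):
            perturbed[r][c] = fill
    return base[top] - predict(perturbed)[top]

def toy_predict(image):
    """Hypothetical classifier that depends only on the top-left pixel."""
    p = min(1.0, 0.5 + 0.5 * image[0][0])
    return [p, 1.0 - p]

# 4x4 toy image with a bright "tag" in the corner acting as a spurious cue.
image = [[0.2] * 4 for _ in range(4)]
image[0][0] = 1.0

corner_drop = confidence_drop(toy_predict, image, (0, 1, 0, 1))
centre_drop = confidence_drop(toy_predict, image, (1, 3, 1, 3))
```

A large drop when the corner is masked and no drop when the centre is masked suggests the toy model relies on the corner tag rather than on the image content, which is the kind of over-dependence the abstract describes.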
-
Impact of Feedback Type on Explanatory Interactive Learning
Authors:
Misgina Tsighe Hagos,
Kathleen M. Curran,
Brian Mac Namee
Abstract:
Explanatory Interactive Learning (XIL) collects user feedback on visual model explanations to implement a Human-in-the-Loop (HITL) based interactive learning scenario. Different user feedback types will have different impacts on user experience and the cost associated with collecting feedback since different feedback types involve different levels of image annotation. Although XIL has been used to improve classification performance in multiple domains, the impact of different user feedback types on model performance and explanation accuracy is not well studied. To guide future XIL work we compare the effectiveness of two different user feedback types in image classification tasks: (1) instructing an algorithm to ignore certain spurious image features, and (2) instructing an algorithm to focus on certain valid image features. We use explanations from a Gradient-weighted Class Activation Mapping (GradCAM) based XIL model to support both feedback types. We show that identifying and annotating spurious image features that a model finds salient results in superior classification and explanation accuracy than user feedback that tells a model to focus on valid image features.
Submitted 26 September, 2022;
originally announced September 2022.
-
Posture Prediction for Healthy Sitting using a Smart Chair
Authors:
Tariku Adane Gelaw,
Misgina Tsighe Hagos
Abstract:
Poor sitting habits have been identified as a risk factor for musculoskeletal disorders and lower back pain, especially among the elderly, disabled people, and office workers. In the current computerized world, even while involved in leisure or work activity, people tend to spend most of their days sitting at computer desks. This can result in spinal pain and related problems. Therefore, a means to remind people about their sitting habits and provide recommendations to counterbalance, such as physical exercise, is important. Posture recognition for seated postures has not received enough attention, as most works focus on standing postures. Wearable sensors, pressure or force sensors, videos and images have been used for posture recognition in the literature. The aim of this study is to build Machine Learning models for classifying the sitting posture of a person by analyzing data collected from a chair fitted with two 32 by 32 pressure sensors at its seat and backrest. Models were built using five algorithms: Random Forest (RF), Gaussian Naïve Bayes, Logistic Regression, Support Vector Machine and Deep Neural Network (DNN). All models were evaluated using k-fold cross-validation. This paper presents experiments conducted using two separate datasets, controlled and realistic, and discusses results achieved at classifying six sitting postures. Average classification accuracies of 98% and 97% were achieved on the controlled and realistic datasets, respectively.
Submitted 5 January, 2022;
originally announced January 2022.
-
Optimising Knee Injury Detection with Spatial Attention and Validating Localisation Ability
Authors:
Niamh Belton,
Ivan Welaratne,
Adil Dahlan,
Ronan T Hearne,
Misgina Tsighe Hagos,
Aonghus Lawlor,
Kathleen M. Curran
Abstract:
This work employs a pre-trained, multi-view Convolutional Neural Network (CNN) with a spatial attention block to optimise knee injury detection. An open-source Magnetic Resonance Imaging (MRI) data set with image-level labels was leveraged for this analysis. As MRI data is acquired from three planes, we compare our technique using data from a single-plane and multiple planes (multi-plane). For multi-plane, we investigate various methods of fusing the planes in the network. This analysis resulted in the novel 'MPFuseNet' network and state-of-the-art Area Under the Curve (AUC) scores for detecting Anterior Cruciate Ligament (ACL) tears and Abnormal MRIs, achieving AUC scores of 0.977 and 0.957 respectively. We then developed an objective metric, Penalised Localisation Accuracy (PLA), to validate the model's localisation ability. This metric compares binary masks generated from Grad-Cam output and the radiologist's annotations on a sample of MRIs. We also extracted explainability features in a model-agnostic approach that were then verified as clinically relevant by the radiologist.
Submitted 18 August, 2021;
originally announced August 2021.
-
Automated Smartphone based System for Diagnosis of Diabetic Retinopathy
Authors:
Misgina Tsighe Hagos,
Shri Kant,
Surayya Ado Bala
Abstract:
Early diagnosis of diabetic retinopathy for treatment of the disease has been failing to reach diabetic people living in rural areas. A shortage of trained ophthalmologists, limited availability of healthcare centers, and the high cost of diagnostic equipment are among the reasons. Although many deep learning-based techniques for automatic diagnosis of diabetic retinopathy have been implemented in the literature, these methods still fail to provide a point-of-care diagnosis. This raises the need for an independent diagnostic of diabetic retinopathy that can be used by a non-expert. Recently the usage of smartphones has been increasing across the world. Automated diagnoses of diabetic retinopathy can be deployed on smartphones in order to provide an instant diagnosis to diabetic people residing in remote areas. In this paper, an Inception-based convolutional neural network and a binary decision tree-based ensemble of classifiers are proposed and implemented to detect and classify diabetic retinopathy. The proposed method was further imported into a smartphone application for mobile-based classification, which provides an offline and automatic system for diagnosis of diabetic retinopathy.
Submitted 7 April, 2020;
originally announced April 2020.
-
Diagnosis of Diabetic Retinopathy in Ethiopia: Before the Deep Learning based Automation
Authors:
Misgina Tsighe Hagos
Abstract:
Introducing automated Diabetic Retinopathy (DR) diagnosis into Ethiopia is still a challenging task, despite recent reports that present trained Deep Learning (DL) based DR classifiers surpassing manual graders. This is mainly because of the high cost of conventional retinal imaging devices used in DL based classifiers. This paper discusses current approaches that provide mobile based binary classification of DR, and the way towards a cheaper, offline multi-class classification of DR.
Submitted 29 April, 2020; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Point-of-Care Diabetic Retinopathy Diagnosis: A Standalone Mobile Application Approach
Authors:
Misgina Tsighe Hagos
Abstract:
Although deep learning research and applications have grown rapidly over the past decade, deep learning has shown limitations in healthcare applications and in its reach to people in remote areas. One of the challenges of incorporating deep learning in medical data classification or prediction is the shortage of annotated training data in the healthcare industry. Privacy issues around medical data sharing and limited patient population size can be stated as some of the reasons for training data insufficiency in healthcare. Methods to exploit deep learning applications in healthcare have been proposed and implemented in this dissertation.
Traditional diagnosis of diabetic retinopathy requires trained ophthalmologists and expensive imaging equipment to reach healthcare centres in order to provide facilities for treatment of preventable blindness. Diabetic people residing in remote areas with shortage of healthcare services and ophthalmologists usually fail to get periodical diagnosis of diabetic retinopathy thereby facing the probability of vision loss or impairment. Deep learning and mobile application development have been integrated in this dissertation to provide an easy to use point-of-care smartphone based diagnosis of diabetic retinopathy. In order to solve the challenge of shortage of healthcare centres and trained ophthalmologists, the standalone diagnostic service was built so as to be operated by a non-expert without an internet connection. This approach could be transferred to other areas of medical image classification.
Submitted 26 January, 2020;
originally announced February 2020.
-
Transfer Learning based Detection of Diabetic Retinopathy from Small Dataset
Authors:
Misgina Tsighe Hagos,
Shri Kant
Abstract:
Annotated training data insufficiency remains one of the challenges of applying deep learning in medical data classification problems. Transfer learning from an already trained deep convolutional network can be used to reduce the cost of training from scratch and to train with small training data for deep learning. This raises the question of whether we can use transfer learning to overcome the training data insufficiency problem in deep learning based medical data classifications. Deep convolutional networks have been achieving high performance results on the ImageNet Large Scale Visual Recognition Competition (ILSVRC) image classification challenge. One example is the Inception-V3 model that was the first runner up on the ILSVRC 2015 challenge. Inception modules that help to extract different sized features of input images in one level of convolution are the unique features of the Inception-V3. In this work, we have used a pretrained Inception-V3 model to take advantage of its Inception modules for Diabetic Retinopathy detection. In order to tackle the labelled data insufficiency problem, we sub-sampled a smaller version of the Kaggle Diabetic Retinopathy classification challenge dataset for model training, and tested the model's accuracy on a previously unseen data subset. Our technique could be used in other deep learning based medical image classification problems facing the challenge of labelled training data insufficiency.
Submitted 22 May, 2019; v1 submitted 17 May, 2019;
originally announced May 2019.