-
Towards Biomarker Discovery for Early Cerebral Palsy Detection: Evaluating Explanations Through Kinematic Perturbations
Authors:
Kimji N. Pellano,
Inga Strümke,
Daniel Groos,
Lars Adde,
Pål Haugen,
Espen Alexander F. Ihlen
Abstract:
Cerebral Palsy (CP) is a prevalent motor disability in children, for which early detection can significantly improve treatment outcomes. While skeleton-based Graph Convolutional Network (GCN) models have shown promise in automatically predicting CP risk from infant videos, their "black-box" nature raises concerns about clinical explainability. To address this, we introduce a perturbation framework…
▽ More
Cerebral Palsy (CP) is a prevalent motor disability in children, for which early detection can significantly improve treatment outcomes. While skeleton-based Graph Convolutional Network (GCN) models have shown promise in automatically predicting CP risk from infant videos, their "black-box" nature raises concerns about clinical explainability. To address this, we introduce a perturbation framework tailored for infant movement features and use it to compare two explainable AI (XAI) methods: Class Activation Mapping (CAM) and Gradient-weighted Class Activation Mapping (Grad-CAM). First, we identify significant and non-significant body keypoints in very low- and very high-risk infant video snippets based on the XAI attribution scores. We then conduct targeted velocity and angular perturbations, both individually and in combination, on these keypoints to assess how the GCN model's risk predictions change. Our results indicate that velocity-driven features of the arms, hips, and legs have a dominant influence on CP risk predictions, while angular perturbations have a more modest impact. Furthermore, CAM and Grad-CAM show partial convergence in their explanations for both low- and high-risk CP groups. Our findings demonstrate the use of XAI-driven movement analysis for early CP prediction and offer insights into potential movement-based biomarker discovery that warrant further clinical validation.
△ Less
Submitted 19 February, 2025;
originally announced March 2025.
-
Choose Your Explanation: A Comparison of SHAP and GradCAM in Human Activity Recognition
Authors:
Felix Tempel,
Daniel Groos,
Espen Alexander F. Ihlen,
Lars Adde,
Inga Strümke
Abstract:
Explaining machine learning (ML) models using eXplainable AI (XAI) techniques has become essential to make them more transparent and trustworthy. This is especially important in high-stakes domains like healthcare, where understanding model decisions is critical to ensure ethical, sound, and trustworthy outcome predictions. However, users are often confused about which explanability method to choo…
▽ More
Explaining machine learning (ML) models using eXplainable AI (XAI) techniques has become essential to make them more transparent and trustworthy. This is especially important in high-stakes domains like healthcare, where understanding model decisions is critical to ensure ethical, sound, and trustworthy outcome predictions. However, users are often confused about which explanability method to choose for their specific use case. We present a comparative analysis of widely used explainability methods, Shapley Additive Explanations (SHAP) and Gradient-weighted Class Activation Mapping (Grad-CAM), within the domain of human activity recognition (HAR) utilizing graph convolutional networks (GCNs). By evaluating these methods on skeleton-based data from two real-world datasets, including a healthcare-critical cerebral palsy (CP) case, this study provides vital insights into both approaches' strengths, limitations, and differences, offering a roadmap for selecting the most appropriate explanation method based on specific models and applications. We quantitatively and quantitatively compare these methods, focusing on feature importance ranking, interpretability, and model sensitivity through perturbation experiments. While SHAP provides detailed input feature attribution, Grad-CAM delivers faster, spatially oriented explanations, making both methods complementary depending on the application's requirements. Given the importance of XAI in enhancing trust and transparency in ML models, particularly in sensitive environments like healthcare, our research demonstrates how SHAP and Grad-CAM could complement each other to provide more interpretable and actionable model explanations.
△ Less
Submitted 13 April, 2025; v1 submitted 20 December, 2024;
originally announced December 2024.
-
Explaining Human Activity Recognition with SHAP: Validating Insights with Perturbation and Quantitative Measures
Authors:
Felix Tempel,
Espen Alexander F. Ihlen,
Lars Adde,
Inga Strümke
Abstract:
In Human Activity Recognition (HAR), understanding the intricacy of body movements within high-risk applications is essential. This study uses SHapley Additive exPlanations (SHAP) to explain the decision-making process of Graph Convolution Networks (GCNs) when classifying activities with skeleton data. We employ SHAP to explain two real-world datasets: one for cerebral palsy (CP) classification an…
▽ More
In Human Activity Recognition (HAR), understanding the intricacy of body movements within high-risk applications is essential. This study uses SHapley Additive exPlanations (SHAP) to explain the decision-making process of Graph Convolution Networks (GCNs) when classifying activities with skeleton data. We employ SHAP to explain two real-world datasets: one for cerebral palsy (CP) classification and the widely used NTU RGB+D 60 action recognition dataset. To test the explanation, we introduce a novel perturbation approach that modifies the model's edge importance matrix, allowing us to evaluate the impact of specific body key points on prediction outcomes. To assess the fidelity of our explanations, we employ informed perturbation, targeting body key points identified as important by SHAP and comparing them against random perturbation as a control condition. This perturbation enables a judgment on whether the body key points are truly influential or non-influential based on the SHAP values. Results on both datasets show that body key points identified as important through SHAP have the largest influence on the accuracy, specificity, and sensitivity metrics. Our findings highlight that SHAP can provide granular insights into the input feature contribution to the prediction outcome of GCNs in HAR tasks. This demonstrates the potential for more interpretable and trustworthy models in high-stakes applications like healthcare or rehabilitation.
△ Less
Submitted 21 March, 2025; v1 submitted 6 November, 2024;
originally announced November 2024.
-
Lightweight Neural Architecture Search for Cerebral Palsy Detection
Authors:
Felix Tempel,
Espen Alexander F. Ihlen,
Inga Strümke
Abstract:
The neurological condition known as cerebral palsy (CP) first manifests in infancy or early childhood and has a lifelong impact on motor coordination and body movement. CP is one of the leading causes of childhood disabilities, and early detection is crucial for providing appropriate treatment. However, such detection relies on assessments by human experts trained in methods like general movement…
▽ More
The neurological condition known as cerebral palsy (CP) first manifests in infancy or early childhood and has a lifelong impact on motor coordination and body movement. CP is one of the leading causes of childhood disabilities, and early detection is crucial for providing appropriate treatment. However, such detection relies on assessments by human experts trained in methods like general movement assessment (GMA). These are not widely accessible, especially in developing countries. Conventional machine learning approaches offer limited predictive performance on CP detection tasks, and the approaches developed by the few available domain experts are generally dataset-specific, restricting their applicability beyond the context for which these were created. To address these challenges, we propose a neural architecture search (NAS) algorithm applying a reinforcement learning update scheme capable of efficiently optimizing for the best architectural and hyperparameter combination to discover the most suitable neural network configuration for detecting CP. Our method performs better on a real-world CP dataset than other approaches in the field, which rely on large ensembles. As our approach is less resource-demanding and performs better, it is particularly suitable for implementation in resource-constrained settings, including rural or developing areas with limited access to medical experts and the required diagnostic tools. The resulting model's lightweight architecture and efficient computation time allow for deployment on devices with limited processing power, reducing the need for expensive infrastructure, and can, therefore, be integrated into clinical workflows to provide timely and accurate support for early CP diagnosis.
△ Less
Submitted 30 September, 2024;
originally announced September 2024.
-
Evaluating Explainable AI Methods in Deep Learning Models for Early Detection of Cerebral Palsy
Authors:
Kimji N. Pellano,
Inga Strümke,
Daniel Groos,
Lars Adde,
Espen Alexander F. Ihlen
Abstract:
Early detection of Cerebral Palsy (CP) is crucial for effective intervention and monitoring. This paper tests the reliability and applicability of Explainable AI (XAI) methods using a deep learning method that predicts CP by analyzing skeletal data extracted from video recordings of infant movements. Specifically, we use XAI evaluation metrics -- namely faithfulness and stability -- to quantitativ…
▽ More
Early detection of Cerebral Palsy (CP) is crucial for effective intervention and monitoring. This paper tests the reliability and applicability of Explainable AI (XAI) methods using a deep learning method that predicts CP by analyzing skeletal data extracted from video recordings of infant movements. Specifically, we use XAI evaluation metrics -- namely faithfulness and stability -- to quantitatively assess the reliability of Class Activation Mapping (CAM) and Gradient-weighted Class Activation Mapping (Grad-CAM) in this specific medical application. We utilize a unique dataset of infant movements and apply skeleton data perturbations without distorting the original dynamics of the infant movements. Our CP prediction model utilizes an ensemble approach, so we evaluate the XAI metrics performances for both the overall ensemble and the individual models. Our findings indicate that both XAI methods effectively identify key body points influencing CP predictions and that the explanations are robust against minor data perturbations. Grad-CAM significantly outperforms CAM in the RISv metric, which measures stability in terms of velocity. In contrast, CAM performs better in the RISb metric, which relates to bone stability, and the RRS metric, which assesses internal representation robustness. Individual models within the ensemble show varied results, and neither CAM nor Grad-CAM consistently outperform the other, with the ensemble approach providing a representation of outcomes from its constituent models.
△ Less
Submitted 13 August, 2024;
originally announced September 2024.
-
From Movements to Metrics: Evaluating Explainable AI Methods in Skeleton-Based Human Activity Recognition
Authors:
Kimji N. Pellano,
Inga Strümke,
Espen Alexander F. Ihlen
Abstract:
The advancement of deep learning in human activity recognition (HAR) using 3D skeleton data is critical for applications in healthcare, security, sports, and human-computer interaction. This paper tackles a well-known gap in the field, which is the lack of testing in the applicability and reliability of XAI evaluation metrics in the skeleton-based HAR domain. We have tested established XAI metrics…
▽ More
The advancement of deep learning in human activity recognition (HAR) using 3D skeleton data is critical for applications in healthcare, security, sports, and human-computer interaction. This paper tackles a well-known gap in the field, which is the lack of testing in the applicability and reliability of XAI evaluation metrics in the skeleton-based HAR domain. We have tested established XAI metrics namely faithfulness and stability on Class Activation Mapping (CAM) and Gradient-weighted Class Activation Mapping (Grad-CAM) to address this problem. The study also introduces a perturbation method that respects human biomechanical constraints to ensure realistic variations in human movement. Our findings indicate that \textit{faithfulness} may not be a reliable metric in certain contexts, such as with the EfficientGCN model. Conversely, stability emerges as a more dependable metric when there is slight input data perturbations. CAM and Grad-CAM are also found to produce almost identical explanations, leading to very similar XAI metric performance. This calls for the need for more diversified metrics and new XAI methods applied in skeleton-based HAR.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
AutoGCN -- Towards Generic Human Activity Recognition with Neural Architecture Search
Authors:
Felix Tempel,
Inga Strümke,
Espen Alexander F. Ihlen
Abstract:
This paper introduces AutoGCN, a generic Neural Architecture Search (NAS) algorithm for Human Activity Recognition (HAR) using Graph Convolution Networks (GCNs). HAR has gained attention due to advances in deep learning, increased data availability, and enhanced computational capabilities. At the same time, GCNs have shown promising results in modeling relationships between body key points in a sk…
▽ More
This paper introduces AutoGCN, a generic Neural Architecture Search (NAS) algorithm for Human Activity Recognition (HAR) using Graph Convolution Networks (GCNs). HAR has gained attention due to advances in deep learning, increased data availability, and enhanced computational capabilities. At the same time, GCNs have shown promising results in modeling relationships between body key points in a skeletal graph. While domain experts often craft dataset-specific GCN-based methods, their applicability beyond this specific context is severely limited. AutoGCN seeks to address this limitation by simultaneously searching for the ideal hyperparameters and architecture combination within a versatile search space using a reinforcement controller while balancing optimal exploration and exploitation behavior with a knowledge reservoir during the search process. We conduct extensive experiments on two large-scale datasets focused on skeleton-based action recognition to assess the proposed algorithm's performance. Our experimental results underscore the effectiveness of AutoGCN in constructing optimal GCN architectures for HAR, outperforming conventional NAS and GCN methods, as well as random search. These findings highlight the significance of a diverse search space and an expressive input representation to enhance the network performance and generalizability.
△ Less
Submitted 12 March, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Towards human-level performance on automatic pose estimation of infant spontaneous movements
Authors:
Daniel Groos,
Lars Adde,
Ragnhild Støen,
Heri Ramampiaro,
Espen A. F. Ihlen
Abstract:
Assessment of spontaneous movements can predict the long-term developmental disorders in high-risk infants. In order to develop algorithms for automated prediction of later disorders, highly precise localization of segments and joints by infant pose estimation is required. Four types of convolutional neural networks were trained and evaluated on a novel infant pose dataset, covering the large vari…
▽ More
Assessment of spontaneous movements can predict the long-term developmental disorders in high-risk infants. In order to develop algorithms for automated prediction of later disorders, highly precise localization of segments and joints by infant pose estimation is required. Four types of convolutional neural networks were trained and evaluated on a novel infant pose dataset, covering the large variation in 1 424 videos from a clinical international community. The localization performance of the networks was evaluated as the deviation between the estimated keypoint positions and human expert annotations. The computational efficiency was also assessed to determine the feasibility of the neural networks in clinical practice. The best performing neural network had a similar localization error to the inter-rater spread of human expert annotations, while still operating efficiently. Overall, the results of our study show that pose estimation of infant spontaneous movements has a great potential to support research initiatives on early detection of developmental disorders in children with perinatal brain injuries by quantifying infant movements from video recordings with human-level performance.
△ Less
Submitted 16 December, 2021; v1 submitted 12 October, 2020;
originally announced October 2020.
-
EfficientPose: Scalable single-person pose estimation
Authors:
Daniel Groos,
Heri Ramampiaro,
Espen A. F. Ihlen
Abstract:
Single-person human pose estimation facilitates markerless movement analysis in sports, as well as in clinical applications. Still, state-of-the-art models for human pose estimation generally do not meet the requirements of real-life applications. The proliferation of deep learning techniques has resulted in the development of many advanced approaches. However, with the progresses in the field, mo…
▽ More
Single-person human pose estimation facilitates markerless movement analysis in sports, as well as in clinical applications. Still, state-of-the-art models for human pose estimation generally do not meet the requirements of real-life applications. The proliferation of deep learning techniques has resulted in the development of many advanced approaches. However, with the progresses in the field, more complex and inefficient models have also been introduced, which have caused tremendous increases in computational demands. To cope with these complexity and inefficiency challenges, we propose a novel convolutional neural network architecture, called EfficientPose, which exploits recently proposed EfficientNets in order to deliver efficient and scalable single-person pose estimation. EfficientPose is a family of models harnessing an effective multi-scale feature extractor and computationally efficient detection blocks using mobile inverted bottleneck convolutions, while at the same time ensuring that the precision of the pose configurations is still improved. Due to its low complexity and efficiency, EfficientPose enables real-world applications on edge devices by limiting the memory footprint and computational cost. The results from our experiments, using the challenging MPII single-person benchmark, show that the proposed EfficientPose models substantially outperform the widely-used OpenPose model both in terms of accuracy and computational efficiency. In particular, our top-performing model achieves state-of-the-art accuracy on single-person MPII, with low-complexity ConvNets.
△ Less
Submitted 4 December, 2020; v1 submitted 25 April, 2020;
originally announced April 2020.