-
A Tutorial on Explainable Image Classification for Dementia Stages Using Convolutional Neural Network and Gradient-weighted Class Activation Mapping
Authors:
Kevin Kam Fung Yuen
Abstract:
This paper presents a tutorial of an explainable approach using Convolutional Neural Network (CNN) and Gradient-weighted Class Activation Mapping (Grad-CAM) to classify four progressive dementia stages based on open MRI brain images. The detailed implementation steps are demonstrated with an explanation. Whilst the proposed CNN architecture is demonstrated to achieve more than 99% accuracy for the…
▽ More
This paper presents a tutorial of an explainable approach using Convolutional Neural Network (CNN) and Gradient-weighted Class Activation Mapping (Grad-CAM) to classify four progressive dementia stages based on open MRI brain images. The detailed implementation steps are demonstrated with an explanation. Whilst the proposed CNN architecture is demonstrated to achieve more than 99% accuracy for the test dataset, the computational procedure of CNN remains a black box. The visualisation based on Grad-CAM is attempted to explain such very high accuracy and may provide useful information for physicians. Future motivation based on this work is discussed.
△ Less
Submitted 20 August, 2024;
originally announced August 2024.
-
Navigating Uncertainties in Machine Learning for Structural Dynamics: A Comprehensive Review of Probabilistic and Non-Probabilistic Approaches in Forward and Inverse Problems
Authors:
Wang-Ji Yan,
Lin-Feng Mei,
Jiang Mo,
Costas Papadimitriou,
Ka-Veng Yuen,
Michael Beer
Abstract:
In the era of big data, machine learning (ML) has become a powerful tool in various fields, notably impacting structural dynamics. ML algorithms offer advantages by modeling physical phenomena based on data, even in the absence of underlying mechanisms. However, uncertainties such as measurement noise and modeling errors can compromise the reliability of ML predictions, highlighting the need for e…
▽ More
In the era of big data, machine learning (ML) has become a powerful tool in various fields, notably impacting structural dynamics. ML algorithms offer advantages by modeling physical phenomena based on data, even in the absence of underlying mechanisms. However, uncertainties such as measurement noise and modeling errors can compromise the reliability of ML predictions, highlighting the need for effective uncertainty awareness to enhance prediction robustness. This paper presents a comprehensive review on navigating uncertainties in ML, categorizing uncertainty-aware approaches into probabilistic methods (including Bayesian and frequentist perspectives) and non-probabilistic methods (such as interval learning and fuzzy learning). Bayesian neural networks, known for their uncertainty quantification and nonlinear mapping capabilities, are emphasized for their superior performance and potential. The review covers various techniques and methodologies for addressing uncertainties in ML, discussing fundamentals and implementation procedures of each method. While providing a concise overview of fundamental concepts, the paper refrains from in-depth critical explanations. Strengths and limitations of each approach are examined, along with their applications in structural dynamic forward problems like response prediction, sensitivity assessment, and reliability analysis, and inverse problems like system identification, model updating, and damage identification. Additionally, the review identifies research gaps and suggests future directions for investigations, aiming to provide comprehensive insights to the research community. By offering an extensive overview of both probabilistic and non-probabilistic approaches, this review aims to assist researchers and practitioners in making informed decisions when utilizing ML techniques to address uncertainties in structural dynamic problems.
△ Less
Submitted 16 August, 2024;
originally announced August 2024.
-
Random vector functional link neural network based ensemble deep learning for short-term load forecasting
Authors:
Ruobin Gao,
Liang Du,
P. N. Suganthan,
Qin Zhou,
Kum Fai Yuen
Abstract:
Electricity load forecasting is crucial for the power systems' planning and maintenance. However, its un-stationary and non-linear characteristics impose significant difficulties in anticipating future demand. This paper proposes a novel ensemble deep Random Vector Functional Link (edRVFL) network for electricity load forecasting. The weights of hidden layers are randomly initialized and kept fixe…
▽ More
Electricity load forecasting is crucial for the power systems' planning and maintenance. However, its un-stationary and non-linear characteristics impose significant difficulties in anticipating future demand. This paper proposes a novel ensemble deep Random Vector Functional Link (edRVFL) network for electricity load forecasting. The weights of hidden layers are randomly initialized and kept fixed during the training process. The hidden layers are stacked to enforce deep representation learning. Then, the model generates the forecasts by ensembling the outputs of each layer. Moreover, we also propose to augment the random enhancement features by empirical wavelet transformation (EWT). The raw load data is decomposed by EWT in a walk-forward fashion, not introducing future data leakage problems in the decomposition process. Finally, all the sub-series generated by the EWT, including raw data, are fed into the edRVFL for forecasting purposes. The proposed model is evaluated on twenty publicly available time series from the Australian Energy Market Operator of the year 2020. The simulation results demonstrate the proposed model's superior performance over eleven forecasting methods in three error metrics and statistical tests on electricity load forecasting tasks.
△ Less
Submitted 29 July, 2021;
originally announced July 2021.
-
Looking at Hands in Autonomous Vehicles: A ConvNet Approach using Part Affinity Fields
Authors:
Kevan Yuen,
Mohan M. Trivedi
Abstract:
In the context of autonomous driving, where humans may need to take over in the event where the computer may issue a takeover request, a key step towards driving safety is the monitoring of the hands to ensure the driver is ready for such a request. This work, focuses on the first step of this process, which is to locate the hands. Such a system must work in real-time and under varying harsh light…
▽ More
In the context of autonomous driving, where humans may need to take over in the event where the computer may issue a takeover request, a key step towards driving safety is the monitoring of the hands to ensure the driver is ready for such a request. This work, focuses on the first step of this process, which is to locate the hands. Such a system must work in real-time and under varying harsh lighting conditions. This paper introduces a fast ConvNet approach, based on the work of original work of OpenPose for full body joint estimation. The network is modified with fewer parameters and retrained using our own day-time naturalistic autonomous driving dataset to estimate joint and affinity heatmaps for driver & passenger's wrist and elbows, for a total of 8 joint classes and part affinity fields between each wrist-elbow pair. The approach runs real-time on real-world data at 40 fps on multiple drivers and passengers. The system is extensively evaluated both quantitatively and qualitatively, showing at least 95% detection performance on joint localization and arm-angle estimation.
△ Less
Submitted 3 April, 2018;
originally announced April 2018.
-
An Occluded Stacked Hourglass Approach to Facial Landmark Localization and Occlusion Estimation
Authors:
Kevan Yuen,
Mohan M. Trivedi
Abstract:
A key step to driver safety is to observe the driver's activities with the face being a key step in this process to extracting information such as head pose, blink rate, yawns, talking to passenger which can then help derive higher level information such as distraction, drowsiness, intent, and where they are looking. In the context of driving safety, it is important for the system perform robust e…
▽ More
A key step to driver safety is to observe the driver's activities with the face being a key step in this process to extracting information such as head pose, blink rate, yawns, talking to passenger which can then help derive higher level information such as distraction, drowsiness, intent, and where they are looking. In the context of driving safety, it is important for the system perform robust estimation under harsh lighting and occlusion but also be able to detect when the occlusion occurs so that information predicted from occluded parts of the face can be taken into account properly. This paper introduces the Occluded Stacked Hourglass, based on the work of original Stacked Hourglass network for body pose joint estimation, which is retrained to process a detected face window and output 68 occlusion heat maps, each corresponding to a facial landmark. Landmark location, occlusion levels and a refined face detection score, to reject false positives, are extracted from these heat maps. Using the facial landmark locations, features such as head pose and eye/mouth openness can be extracted to derive driver attention and activity. The system is evaluated for face detection, head pose, and occlusion estimation on various datasets in the wild, both quantitatively and qualitatively, and shows state-of-the-art results.
△ Less
Submitted 5 February, 2018;
originally announced February 2018.
-
Dynamics of Driver's Gaze: Explorations in Behavior Modeling & Maneuver Prediction
Authors:
Sujitha Martin,
Sourabh Vora,
Kevan Yuen,
Mohan M. Trivedi
Abstract:
The study and modeling of driver's gaze dynamics is important because, if and how the driver is monitoring the driving environment is vital for driver assistance in manual mode, for take-over requests in highly automated mode and for semantic perception of the surround in fully autonomous mode. We developed a machine vision based framework to classify driver's gaze into context rich zones of inter…
▽ More
The study and modeling of driver's gaze dynamics is important because, if and how the driver is monitoring the driving environment is vital for driver assistance in manual mode, for take-over requests in highly automated mode and for semantic perception of the surround in fully autonomous mode. We developed a machine vision based framework to classify driver's gaze into context rich zones of interest and model driver's gaze behavior by representing gaze dynamics over a time period using gaze accumulation, glance duration and glance frequencies. As a use case, we explore the driver's gaze dynamic patterns during maneuvers executed in freeway driving, namely, left lane change maneuver, right lane change maneuver and lane keeping. It is shown that condensing gaze dynamics into durations and frequencies leads to recurring patterns based on driver activities. Furthermore, modeling these patterns show predictive powers in maneuver detection up to a few hundred milliseconds a priori.
△ Less
Submitted 31 January, 2018;
originally announced February 2018.
-
A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment
Authors:
Akshay Rangesh,
Kevan Yuen,
Ravi Kumar Satzoda,
Rakesh Nattoji Rajaram,
Pujitha Gunaratne,
Mohan M. Trivedi
Abstract:
Recent progress in autonomous and semi-autonomous driving has been made possible in part through an assortment of sensors that provide the intelligent agent with an enhanced perception of its surroundings. It has been clear for quite some while now that for intelligent vehicles to function effectively in all situations and conditions, a fusion of different sensor technologies is essential. Consequ…
▽ More
Recent progress in autonomous and semi-autonomous driving has been made possible in part through an assortment of sensors that provide the intelligent agent with an enhanced perception of its surroundings. It has been clear for quite some while now that for intelligent vehicles to function effectively in all situations and conditions, a fusion of different sensor technologies is essential. Consequently, the availability of synchronized multi-sensory data streams are necessary to promote the development of fusion based algorithms for low, mid and high level semantic tasks. In this paper, we provide a comprehensive description of LISA-A: our heavily sensorized, full-surround testbed capable of providing high quality data from a slew of synchronized and calibrated sensors such as cameras, LIDARs, radars, and the IMU/GPS. The vehicle has recorded over 100 hours of real world data for a very diverse set of weather, traffic and daylight conditions. All captured data is accurately calibrated and synchronized using timestamps, and stored safely in high performance servers mounted inside the vehicle itself. Details on the testbed instrumentation, sensor layout, sensor outputs, calibration and synchronization are described in this paper.
△ Less
Submitted 30 January, 2019; v1 submitted 21 September, 2017;
originally announced September 2017.