-
Dialz: A Python Toolkit for Steering Vectors
Authors:
Zara Siddique,
Liam D. Turner,
Luis Espinosa-Anke
Abstract:
We introduce Dialz, a framework for advancing research on steering vectors for open-source LLMs, implemented in Python. Steering vectors allow users to modify activations at inference time to amplify or weaken a 'concept', e.g. honesty or positivity, providing a more powerful alternative to prompting or fine-tuning. Dialz supports a diverse set of tasks, including creating contrastive pair datasets, computing and applying steering vectors, and visualizations. Unlike existing libraries, Dialz emphasizes modularity and usability, enabling both rapid prototyping and in-depth analysis. We demonstrate how Dialz can be used to reduce harmful outputs such as stereotypes, while also providing insights into model behaviour across different layers. We release Dialz with full documentation, tutorials, and support for popular open-source models to encourage further research in safe and controllable language generation. Dialz enables faster research cycles and facilitates insights into model interpretability, paving the way for safer, more transparent, and more reliable AI systems.
Submitted 3 June, 2025; v1 submitted 4 May, 2025;
originally announced May 2025.
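The Dialz abstract above describes computing steering vectors from contrastive pairs and applying them to activations at inference time. One common way to do this is to take the difference between the mean activations of the two sides of the contrastive pairs and add a scaled copy of it to a hidden state. The sketch below illustrates that idea with plain Python lists. It does not use the Dialz API, and the function names (`mean_vector`, `steering_vector`, `apply_steering`) and the toy activation values are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def steering_vector(positive_acts: List[Vector], negative_acts: List[Vector]) -> Vector:
    """Mean-difference direction between two sets of activations (an assumed method)."""
    pos_mean = mean_vector(positive_acts)
    neg_mean = mean_vector(negative_acts)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


def apply_steering(hidden: Vector, vector: Vector, scale: float) -> Vector:
    """Add the scaled steering vector to a hidden state; a negative scale weakens the concept."""
    return [h + scale * v for h, v in zip(hidden, vector)]


# Toy activations standing in for one layer's hidden states (made-up numbers).
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 1.0], [1.0, 1.0]]

vec = steering_vector(positive, negative)
steered = apply_steering([0.5, 0.5], vec, scale=2.0)
```

In a real model the activations would come from a chosen layer's forward pass, and the scale would be tuned empirically.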
-
Shifting Perspectives: Steering Vector Ensembles for Robust Bias Mitigation in LLMs
Authors:
Zara Siddique,
Irtaza Khalid,
Liam D. Turner,
Luis Espinosa-Anke
Abstract:
We present a novel approach to bias mitigation in large language models (LLMs) by applying steering vectors to modify model activations in forward passes. We employ Bayesian optimization to systematically identify effective contrastive pair datasets across nine bias axes. When optimized on the BBQ dataset, our individually tuned steering vectors achieve average improvements of 12.2%, 4.7%, and 3.2% over the baseline for Mistral, Llama, and Qwen, respectively. Building on these promising results, we introduce Steering Vector Ensembles (SVE), a method that averages multiple individually optimized steering vectors, each targeting a specific bias axis such as age, race, or gender. By leveraging their collective strength, SVE outperforms individual steering vectors in both bias reduction and maintaining model performance. The work presents the first systematic investigation of steering vectors for bias mitigation, and we demonstrate that SVE is a powerful and computationally efficient strategy for reducing bias in LLMs, with broader implications for enhancing AI safety.
Submitted 7 March, 2025;
originally announced March 2025.
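The abstract above describes Steering Vector Ensembles as averaging several individually optimized steering vectors, one per bias axis. A minimal sketch of that averaging step follows, assuming a plain unweighted element-wise mean. The function name `ensemble_average` and the per-axis vectors are hypothetical, and the paper may normalize or weight the vectors differently.

```python
from typing import Dict, List

Vector = List[float]


def ensemble_average(vectors: List[Vector]) -> Vector:
    """Unweighted element-wise mean of steering vectors (assumed form of the ensemble)."""
    if not vectors:
        raise ValueError("at least one steering vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all steering vectors must have the same dimension")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Made-up per-axis steering vectors; real ones would have the model's hidden size.
axis_vectors: Dict[str, Vector] = {
    "age": [1.0, -2.0, 3.0],
    "race": [3.0, 0.0, 1.5],
    "gender": [2.0, 2.0, 0.0],
}

sve = ensemble_average(list(axis_vectors.values()))
```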
-
Automatic Extraction of Metaphoric Analogies from Literary Texts: Task Formulation, Dataset Construction, and Evaluation
Authors:
Joanne Boisson,
Zara Siddique,
Hsuvas Borkakoty,
Dimosthenis Antypas,
Luis Espinosa Anke,
Jose Camacho-Collados
Abstract:
Extracting metaphors and analogies from free text requires high-level reasoning abilities such as abstraction and language understanding. Our study focuses on the extraction of the concepts that form metaphoric analogies in literary texts. To this end, we construct a novel dataset in this domain with the help of domain experts. We compare the out-of-the-box ability of recent large language models (LLMs) to structure metaphoric mappings from fragments of texts containing proportional analogies. The models are further evaluated on the generation of implicit elements of the analogy, which are indirectly suggested in the texts and inferred by human readers. The competitive results obtained by LLMs in our experiments are encouraging and open up new avenues such as automatically extracting analogies and metaphors from text instead of investing resources in domain experts to manually label data.
Submitted 19 December, 2024;
originally announced December 2024.
-
Digital Twins in Additive Manufacturing: A Systematic Review
Authors:
Md Manjurul Ahsan,
Yingtao Liu,
Shivakumar Raman,
Zahed Siddique
Abstract:
Digital Twins (DTs) are becoming popular in Additive Manufacturing (AM) due to their ability to create virtual replicas of physical components of AM machines, which helps in real-time production monitoring. Advanced techniques such as Machine Learning (ML), Augmented Reality (AR), and simulation-based models play key roles in developing intelligent and adaptable DTs in manufacturing processes. However, questions remain regarding scalability, the integration of high-quality data, and the computational power required for real-time applications in developing DTs. Understanding the current state of DTs in AM is essential to address these challenges and fully utilize their potential in advancing AM processes. Considering this opportunity, this work aims to provide a comprehensive overview of DTs in AM by addressing the following four research questions: (1) What are the key types of DTs used in AM and their specific applications? (2) What are the recent developments and implementations of DTs? (3) How are DTs employed in process improvement and hybrid manufacturing? (4) How are DTs integrated with Industry 4.0 technologies? By discussing current applications and techniques, we aim to offer a better understanding and potential future research directions for researchers and practitioners in AM and DTs.
Submitted 1 November, 2024; v1 submitted 1 September, 2024;
originally announced September 2024.
-
A Comprehensive Survey on Diffusion Models and Their Applications
Authors:
Md Manjurul Ahsan,
Shivakumar Raman,
Yingtao Liu,
Zahed Siddique
Abstract:
Diffusion Models are probabilistic models that create realistic samples by simulating the diffusion process, gradually adding and removing noise from data. These models have gained popularity in domains such as image processing, speech synthesis, and natural language processing due to their ability to produce high-quality samples. As Diffusion Models are being adopted in various domains, existing literature reviews that often focus on specific areas like computer vision or medical imaging may not serve a broader audience across multiple fields. Therefore, this review presents a comprehensive overview of Diffusion Models, covering their theoretical foundations and algorithmic innovations. We highlight their applications in diverse areas such as media quality, authenticity, synthesis, image transformation, healthcare, and more. By consolidating current knowledge and identifying emerging trends, this review aims to facilitate a deeper understanding and broader adoption of Diffusion Models and provide guidelines for future researchers and practitioners across diverse disciplines.
Submitted 1 July, 2024;
originally announced August 2024.
-
Who is better at math, Jenny or Jingzhen? Uncovering Stereotypes in Large Language Models
Authors:
Zara Siddique,
Liam D. Turner,
Luis Espinosa-Anke
Abstract:
Large language models (LLMs) have been shown to propagate and amplify harmful stereotypes, particularly those that disproportionately affect marginalised communities. To understand the effect of these stereotypes more comprehensively, we introduce GlobalBias, a dataset of 876k sentences incorporating 40 distinct gender-by-ethnicity groups alongside descriptors typically used in bias literature, which enables us to study a broad set of stereotypes from around the world. We use GlobalBias to directly probe a suite of LMs via perplexity, which we use as a proxy to determine how certain stereotypes are represented in the model's internal representations. Following this, we generate character profiles based on given names and evaluate the prevalence of stereotypes in model outputs. We find that the demographic groups associated with various stereotypes remain consistent across model likelihoods and model outputs. Furthermore, larger models consistently display higher levels of stereotypical outputs, even when explicitly instructed not to.
Submitted 9 October, 2024; v1 submitted 9 July, 2024;
originally announced July 2024.
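The abstract above uses perplexity as a proxy for how strongly a model represents a stereotype. Perplexity is the exponential of the average negative log-likelihood per token, so a sentence the model finds more likely receives a lower value. The sketch below computes it from per-token log-probabilities. The function name and the example probabilities are made up, and in practice the log-probabilities would come from a language model's forward pass.

```python
import math
from typing import List


def perplexity(token_log_probs: List[float]) -> float:
    """exp of the mean negative natural-log probability over the tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities for two sentences.
likely_sentence = [math.log(p) for p in (0.5, 0.4, 0.6)]
unlikely_sentence = [math.log(p) for p in (0.1, 0.05, 0.2)]

# The sentence the model considers more probable gets the lower perplexity.
more_expected = perplexity(likely_sentence) < perplexity(unlikely_sentence)
```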
-
Defect Analysis of 3D Printed Cylinder Object Using Transfer Learning Approaches
Authors:
Md Manjurul Ahsan,
Shivakumar Raman,
Zahed Siddique
Abstract:
Additive manufacturing (AM) is gaining attention across industries such as healthcare, aerospace, and automotive. Identifying defects early in the AM process can reduce production costs and improve productivity, but doing so remains a key challenge. This study explored the effectiveness of machine learning (ML) approaches, specifically transfer learning (TL) models, for defect detection in 3D-printed cylinders. Images of cylinders were analyzed using models including VGG16, VGG19, ResNet50, ResNet101, InceptionResNetV2, and MobileNetV2. Performance was compared across two datasets using accuracy, precision, recall, and F1-score metrics. In the first study, VGG16, InceptionResNetV2, and MobileNetV2 achieved perfect scores. In contrast, ResNet50 had the lowest performance, with an average F1-score of 0.32. Similarly, in the second study, MobileNetV2 correctly classified all instances, while ResNet50 struggled with more false positives and fewer true positives, resulting in an F1-score of 0.75. Overall, the findings suggest that certain TL models such as MobileNetV2 can deliver high accuracy for AM defect classification, although performance varies across algorithms. The results provide insights into model optimization and integration needs for reliable automated defect analysis during 3D printing. By identifying the top-performing TL techniques, this study aims to enhance AM product quality through robust image-based monitoring and inspection.
Submitted 12 October, 2023;
originally announced October 2023.
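The abstract above compares models by precision, recall, and F1-score and attributes a low F1-score to more false positives and fewer true positives. The sketch below shows how those three metrics follow from confusion-matrix counts for a binary defect classifier. The counts are invented for illustration and are not taken from the paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: few true positives, some false positives, many missed defects.
weak_model = precision_recall_f1(tp=1, fp=1, fn=3)

# Hypothetical counts for a classifier that makes no mistakes on the positive class.
perfect_model = precision_recall_f1(tp=10, fp=0, fn=0)
```

Because F1 is the harmonic mean of precision and recall, it drops sharply when either one is low.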
-
BSGAN: A Novel Oversampling Technique for Imbalanced Pattern Recognitions
Authors:
Md Manjurul Ahsan,
Shivakumar Raman,
Zahed Siddique
Abstract:
Class imbalanced problems (CIP) are one of the potential challenges in developing unbiased Machine Learning (ML) models for prediction. CIP occurs when data samples are not equally distributed between two or more classes. Borderline-Synthetic Minority Oversampling Technique (Borderline-SMOTE) is one approach that has been used to balance imbalanced data by oversampling the minority (limited) samples. A potential drawback of existing Borderline-SMOTE is that it focuses on the data samples that lie at the border and gives more attention to extreme observations, which ultimately limits the diversity of the data created by oversampling; this is the case for most Borderline-SMOTE-based oversampling strategies. As a result, marginalization occurs after oversampling. To address these issues, we propose a hybrid oversampling technique that combines Borderline-SMOTE and a Generative Adversarial Network to generate more diverse data that follow a Gaussian distribution. We named it BSGAN and tested it on four highly imbalanced datasets: Ecoli, Wine quality, Yeast, and Abalone. Our preliminary computational results reveal that BSGAN outperformed existing Borderline-SMOTE and GAN-based oversampling techniques and created a more diverse dataset that follows a normal distribution after oversampling.
Submitted 16 May, 2023;
originally announced May 2023.
-
Invariant Scattering Transform for Medical Imaging
Authors:
Md Manjurul Ahsan,
Shivakumar Raman,
Zahed Siddique
Abstract:
Over the years, the Invariant Scattering Transform (IST) technique has become popular for medical image analysis, including approaches that compute wavelet transforms with Convolutional Neural Networks (CNN) to capture the scale and orientation of patterns in the input signal. IST aims to be invariant to transformations that are common in medical images, such as translation, rotation, scaling, and deformation. It is used to improve performance in medical imaging applications such as segmentation, classification, and registration, and can be integrated into machine learning algorithms for disease detection, diagnosis, and treatment planning. Additionally, combining IST with deep learning approaches has the potential to leverage the strengths of both and enhance medical image analysis outcomes. This study provides an overview of IST in medical imaging by considering the types of IST, their applications, limitations, and potential scope for future researchers and practitioners.
Submitted 31 May, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
Imbalanced Class Data Performance Evaluation and Improvement using Novel Generative Adversarial Network-based Approach: SSG and GBO
Authors:
Md Manjurul Ahsan,
Md Shahin Ali,
Zahed Siddique
Abstract:
Class imbalance in a dataset is one of the major challenges that can significantly impact the performance of machine learning models, resulting in biased predictions. Numerous techniques have been proposed to address class imbalance problems, including, but not limited to, oversampling, undersampling, and cost-sensitive approaches. Because they can generate synthetic data, oversampling techniques such as the Synthetic Minority Oversampling Technique (SMOTE) are among the methods most widely used by researchers. However, one of SMOTE's potential disadvantages is that newly created minority samples may overlap with majority samples. As a result, the probability that ML models are biased towards the majority classes increases. Recently, generative adversarial networks (GANs) have garnered much attention due to their ability to create almost real samples. However, GANs are hard to train despite their potential. This study proposes two novel techniques, GAN-based Oversampling (GBO) and Support Vector Machine-SMOTE-GAN (SSG), to overcome the limitations of existing oversampling approaches. The preliminary computational results show that SSG and GBO performed better than the original SMOTE on eight expanded imbalanced benchmark datasets. The study also revealed that the minority samples generated by SSG follow Gaussian distributions, which is often difficult to achieve using the original SMOTE.
Submitted 23 October, 2022;
originally announced October 2022.
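For context on the baseline named in the abstract above, classic SMOTE creates a synthetic minority sample by picking a minority point, choosing one of its k nearest minority neighbours, and interpolating a random fraction of the way between them. The sketch below illustrates only that interpolation step. It is not the SSG or GBO method, and the function names, the toy points, and the choice of k are assumptions.

```python
import math
import random
from typing import List

Point = List[float]


def nearest_neighbours(point: Point, others: List[Point], k: int) -> List[Point]:
    """The k points in `others` closest to `point` by Euclidean distance."""
    return sorted(others, key=lambda other: math.dist(point, other))[:k]


def smote_sample(minority: List[Point], k: int, rng: random.Random) -> Point:
    """One synthetic sample on the segment between a minority point and a near neighbour."""
    base = rng.choice(minority)
    candidates = [p for p in minority if p is not base]
    neighbour = rng.choice(nearest_neighbours(base, candidates, k))
    fraction = rng.random()  # uniform in [0, 1)
    return [b + fraction * (n - b) for b, n in zip(base, neighbour)]


# Toy minority class: the four corners of the unit square (made-up data).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rng = random.Random(0)
synthetic = [smote_sample(minority_points, k=2, rng=rng) for _ in range(5)]
```

Because every synthetic point lies on a segment between two existing minority points, it can land in a region occupied by the majority class, which is the overlap problem the abstract mentions.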
-
The development of a portable elbow exoskeleton with a Twisted Strings Actuator to assist patients with upper limb inhabitation
Authors:
Rupal Roy,
MM Rashid,
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Over the years, the number of exoskeleton devices used for upper-limb rehabilitation has increased dramatically, each with its own set of pros and cons. Most exoskeletons are not portable, which limits their usefulness for daily use by patients at home. Additionally, some grounded exoskeletons are very large, take up space, have a sophisticated structure, and require more expensive materials. In other words, to maintain affordability, the device's structure must be simple. Thus, in this work, a portable elbow exoskeleton is developed using SolidWorks to incorporate a Twisted Strings Actuator (TSA) to aid in upper-limb rehabilitation and to provide an alternative for those with compromised limbs to recuperate. Experiments are conducted to identify the optimal values for building a more flexible elbow exoskeleton prototype by analyzing stress, strain conditions, torque, forces, and strings. Preliminary computational findings reveal that, for the proposed prototype, a string length of 0.033 m and a torque value ranging from 1.5 Nm to 3 Nm are optimal.
Submitted 25 January, 2022;
originally announced February 2022.
-
Design and Development of an Autonomous Surface Vehicle for Water Quality Monitoring
Authors:
MM Rashid,
Rupal Roy,
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Manually monitoring water quality is exhausting and requires several hours of sampling and laboratory testing for a particular body of water. This article presents a solution to test water properties such as electrical conductivity and pH with a remote-controlled floating vehicle that minimizes time intervals. An autonomous surface vehicle (ASV) has been designed mathematically and operated via MATLAB & Simulink simulation, in which a proportional-integral-derivative (PID) controller has been considered. A PVC model with small waterplane area twin-hull (SWATH) technology is used to develop this vehicle. Manually collected data is compared to data from online sensors, suggesting a better solution for determining water properties such as dissolved oxygen (DO), biochemical oxygen demand (BOD), temperature, conductivity, total alkalinity, and bacteria. Preliminary computational results are promising: water tested from the Sungai Pasu river using the developed ASV falls in the safe pH range (~6.8-7.14).
Submitted 25 January, 2022;
originally announced January 2022.
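The abstract above mentions a PID controller in the vehicle's simulation. As a generic illustration of what such a controller computes, the sketch below implements a textbook discrete-time PID update: the output is a weighted sum of the current error, its accumulated integral, and its rate of change. The gains, setpoint, and time step are arbitrary, and nothing here reproduces the paper's MATLAB & Simulink model.

```python
from typing import Optional


class PID:
    """Textbook discrete PID controller; gains are supplied by the caller."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0
        self.prev_error: Optional[float] = None

    def update(self, setpoint: float, measurement: float, dt: float) -> float:
        """Return the control output for one time step of length dt."""
        error = setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first step, since there is no previous error yet.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical heading controller: arbitrary gains and a 0.5 s time step.
controller = PID(kp=2.0, ki=0.5, kd=0.1)
first_output = controller.update(setpoint=1.0, measurement=0.0, dt=0.5)
second_output = controller.update(setpoint=1.0, measurement=0.5, dt=0.5)
```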
-
Industry 4.0 in Health care: A systematic review
Authors:
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Industry 4.0 in health care has evolved drastically over the past century. In fact, it is evolving every day, with new tools and strategies being developed by physicians and researchers alike. Health care and technology have become intertwined with the advancement of cloud computing and big data. This study aims to analyze the impact of Industry 4.0 on health care systems. To do so, a systematic literature review was carried out considering peer-reviewed articles extracted from two popular databases: Scopus and Web of Science (WoS). The PRISMA 2015 statement was used to include and exclude data. First, a bibliometric analysis was carried out using 346 articles considering the following factors: publication by year, journal, authors, countries, institutions, authors' keywords, and citations. Then, a qualitative analysis was carried out based on 32 selected articles considering the following factors: a conceptual framework, schedule problems, security, COVID-19, digital supply chain, and blockchain technology. The study findings suggest that during the onset of COVID-19, health care and Industry 4.0 merged and evolved jointly in response to challenges such as data security, resource allocation, and data transparency. Industry 4.0 enables many technologies, such as the internet of things (IoT), blockchain, big data, cloud computing, machine learning, deep learning, and information and communication technologies (ICT), to track patients' records and help reduce social transmission of COVID-19. The study findings will give future researchers and practitioners some insights regarding the integration of health care and Industry 4.0.
Submitted 13 January, 2022;
originally announced January 2022.
-
Machine Learning-Based Disease Diagnosis: A Bibliometric Analysis
Authors:
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Machine Learning (ML) has garnered considerable attention from researchers and practitioners as a new and adaptable tool for disease diagnosis. With the advancement of ML and the proliferation of papers and research in this field, a complete examination of Machine Learning-Based Disease Diagnosis (MLBDD) is required. From a bibliometrics standpoint, this article comprehensively studies MLBDD papers from 2012 to 2021. Using particular keywords, 1710 papers with associated information were extracted from the Scopus and Web of Science (WOS) databases and compiled into an Excel sheet for further analysis. First, we examine the publication structures based on yearly publications and the most productive countries/regions, institutions, and authors. Second, the co-citation networks of countries/regions, institutions, authors, and articles are visualized using RStudio. They are further examined in terms of citation structure and the most influential ones. This article gives an overview of MLBDD for researchers interested in the subject and provides a thorough study of MLBDD for those interested in conducting more research in this field.
Submitted 7 January, 2022;
originally announced January 2022.
-
Machine learning based disease diagnosis: A comprehensive review
Authors:
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Globally, there is a substantial unmet need to diagnose various diseases effectively. The complexity of the different disease mechanisms and the underlying symptoms of the patient population presents massive challenges to developing early diagnosis tools and effective treatments. Machine Learning (ML), an area of Artificial Intelligence (AI), enables researchers, physicians, and patients to solve some of these issues. Based on relevant research, this review explains how Machine Learning (ML) and Deep Learning (DL) are being used to help in the early identification of numerous diseases. To begin, a bibliometric study of the publications is given using data from the Scopus and Web of Science (WOS) databases. The bibliometric study of 1216 publications was undertaken to determine the most prolific authors, nations, organizations, and most cited articles. The review then summarizes the most recent trends and approaches in Machine Learning-based Disease Diagnosis (MLBDD), considering the following factors: algorithm, disease types, data type, application, and evaluation metrics. Finally, the paper highlights key results and provides insight into future trends and opportunities in the MLBDD area.
Submitted 31 December, 2021;
originally announced December 2021.
-
Driver Drowsiness Detection Using Ensemble Convolutional Neural Networks on YawDD
Authors:
Rais Mohammad Salman,
Mahbubur Rashid,
Rupal Roy,
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Driver drowsiness detection using videos/images is currently one of the most essential areas for driver safety. The development of deep learning techniques, notably Convolutional Neural Networks (CNN), applied in computer vision applications such as drowsiness detection, has shown promising results thanks to rapid technological progress in recent decades. Closed or excessively blinking eyes, yawning, nodding, and occlusion are all key aspects of drowsiness. In this work, we applied four different CNN techniques to the YawDD dataset to detect and examine the extent of drowsiness based on yawning frequency under specific pose and occlusion variations. Preliminary computational results show that our proposed Ensemble Convolutional Neural Network (ECNN) outperformed the traditional CNN-based approaches by achieving an F1 score of 0.935, whereas the other three CNNs (CNN1, CNN2, and CNN3) achieved F1 scores of 0.92, 0.90, and 0.912, respectively.
Submitted 19 December, 2021;
originally announced December 2021.
-
Machine Learning-Based Heart Disease Diagnosis: A Systematic Literature Review
Authors:
Md Manjurul Ahsan,
Zahed Siddique
Abstract:
Heart disease is one of the significant challenges in today's world and one of the leading causes of death worldwide. Recent advances in machine learning (ML) applications demonstrate that detecting heart disease at an early stage using electrocardiogram (ECG) and patient data is feasible. However, both ECG and patient data are often imbalanced, which makes it challenging for traditional ML to perform without bias. Over the years, several data-level and algorithm-level solutions have been proposed by researchers and practitioners. To provide a broader view of the existing literature, this study takes a systematic literature review (SLR) approach to uncover the challenges associated with imbalanced data in heart disease prediction. Before that, we conducted a meta-analysis using 451 references acquired from reputed journals between 2012 and November 15, 2021. For in-depth analysis, 49 references were considered and studied, taking into account the following factors: heart disease type, algorithms, applications, and solutions. Our SLR study revealed that the current approaches encounter various open problems/issues when dealing with imbalanced data, eventually hindering their practical applicability and functionality.
Submitted 13 December, 2021;
originally announced December 2021.