-
Modified Genetic Algorithm for Feature Selection and Hyper Parameter Optimization: Case of XGBoost in Spam Prediction
Authors:
Nazeeh Ghatasheh,
Ismail Altaharwa,
Khaled Aldebei
Abstract:
Recently, spam on online social networks has attracted attention in the research and business world. Twitter has become the preferred medium to spread spam content. Many research efforts have attempted to counter social network spam. Twitter brings extra challenges, namely the size of the feature space and imbalanced data distributions. Related research works usually focus on only part of these main challenges or produce black-box models. In this paper, we propose a modified genetic algorithm for simultaneous dimensionality reduction and hyperparameter optimization over imbalanced datasets. The algorithm initializes an eXtreme Gradient Boosting classifier and reduces the feature space of a tweets dataset to generate a spam prediction model. The model is validated using 50-times repeated 10-fold stratified cross-validation and analyzed using nonparametric statistical tests. The resulting prediction model attains on average 82.32% and 92.67% in terms of geometric mean and accuracy, respectively, utilizing less than 10% of the total feature space. The empirical results show that the modified genetic algorithm outperforms the $\chi^2$ and PCA feature selection methods. In addition, eXtreme Gradient Boosting outperforms many machine learning algorithms, including a BERT-based deep learning model, in spam prediction. Furthermore, the proposed approach is applied to SMS spam modeling and compared to related works.
Submitted 30 October, 2023;
originally announced October 2023.
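The abstract above reports both geometric mean and accuracy on imbalanced data. As a minimal sketch of why the two can differ, the following assumes the common definition of geometric mean as the square root of sensitivity times specificity; the abstract does not spell out its formula, and the labels below are hypothetical toy data, not results from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def geometric_mean(y_true, y_pred):
    """sqrt(sensitivity * specificity); assumed definition, see lead-in."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        return 0.0  # one class is absent, so the score is not meaningful
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced sample: 2 spam (1) and 8 non-spam (0) messages.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# The classifier misses one of the two spam messages.
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("geometric mean:", geometric_mean(y_true, y_pred))
```

In this toy sample, missing one of the two minority-class messages barely moves accuracy but lowers the geometric mean noticeably, which is why a class-balanced metric is useful alongside accuracy on imbalanced datasets.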
-
Modeling the Telemarketing Process using Genetic Algorithms and Extreme Boosting: Feature Selection and Cost-Sensitive Analytical Approach
Authors:
Nazeeh Ghatasheh,
Ismail Altaharwa,
Khaled Aldebei
Abstract:
Currently, almost all direct marketing activities take place virtually rather than in person, weakening interpersonal skills at an alarming pace. Furthermore, businesses have been striving to sense and foster the tendency of their clients to accept a marketing offer. Digital transformation and increased virtual presence have forced firms to seek novel marketing research approaches. This research aims to leverage telemarketing data to model the willingness of clients to make a term deposit and to find the most significant characteristics of those clients. Real-world data from a Portuguese bank and national socio-economic metrics are used to model the telemarketing decision-making process. This research makes two key contributions. First, it proposes a novel genetic algorithm-based classifier that selects the best discriminating features and tunes classifier parameters simultaneously. Second, it builds an explainable prediction model. The best generated classification models were intensively validated using 50-times repeated 10-fold stratified cross-validation, and the selected features were analyzed. The models significantly outperform related works in terms of accuracy on the class of interest; they attained an average of 89.07% and 0.059 in terms of geometric mean and type I error, respectively. The model is expected to maximize the potential profit margin at the least possible cost and to provide more insights to support marketing decision-making.
Submitted 30 October, 2023;
originally announced October 2023.
-
Optimizing Software Effort Estimation Models Using Firefly Algorithm
Authors:
Nazeeh Ghatasheh,
Hossam Faris,
Ibrahim Aljarah,
Rizik M. H. Al-Sayyed
Abstract:
Software development effort estimation is considered a fundamental task in the software development life cycle as well as in managing project cost, time, and quality. Accurate estimation is therefore a substantial factor in project success and in reducing risks. In recent years, software effort estimation has received a considerable amount of attention from researchers and has become a challenge for the software industry. In the last two decades, many researchers and practitioners have proposed statistical and machine learning-based models for software effort estimation. In this work, the Firefly Algorithm is proposed as a metaheuristic optimization method for optimizing the parameters of three COCOMO-based models. These models include the basic COCOMO model and two other models proposed in the literature as extensions of the basic COCOMO model. The developed estimation models are evaluated using different evaluation metrics. Experimental results show high accuracy and significant error minimization for the Firefly Algorithm compared with other metaheuristic optimization algorithms, including Genetic Algorithms and Particle Swarm Optimization.
Submitted 8 January, 2019;
originally announced March 2019.
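To make the optimization target in the abstract above concrete, here is a minimal sketch of the basic COCOMO effort equation, effort = a * KLOC^b, together with one possible objective that a metaheuristic such as the Firefly Algorithm could minimize over (a, b). The abstract does not name its evaluation metrics, so the use of mean magnitude of relative error (MMRE) is an assumption, the project data are hypothetical, and the Firefly search itself is not implemented here.

```python
def cocomo_effort(kloc, a, b):
    """Basic COCOMO: estimated effort in person-months for a size in KLOC."""
    return a * kloc ** b


def mmre(params, projects):
    """Mean magnitude of relative error of the model over known projects.

    params   -- (a, b) candidate coefficients proposed by an optimizer
    projects -- list of (kloc, actual_effort) pairs
    Assumed objective for illustration; the paper may use other metrics.
    """
    a, b = params
    total = 0.0
    for kloc, actual in projects:
        total += abs(actual - cocomo_effort(kloc, a, b)) / actual
    return total / len(projects)


# Hypothetical historical projects: (size in KLOC, actual person-months).
projects = [(10.0, 30.0), (25.0, 70.0), (50.0, 160.0)]

# Textbook coefficients for "organic" projects in basic COCOMO.
organic = (2.4, 1.05)

# A second, hand-picked candidate, standing in for an optimizer's proposal.
candidate = (2.6, 1.05)

print("MMRE with organic coefficients:", mmre(organic, projects))
print("MMRE with candidate coefficients:", mmre(candidate, projects))
```

An optimizer would call `mmre` repeatedly with different `(a, b)` pairs and keep the pair with the lowest error; the two hard-coded pairs above only illustrate how candidates are compared.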
-
Robotics Evolution: from Remote Brain to Cloud
Authors:
Alaa F. Sheta,
Nazeeh Ghatasheh,
Hossam Faris,
Ali Rodan
Abstract:
Robotic systems have been evolving for decades and touch almost all aspects of life, whether for leisure or critical applications. Most traditional robotic systems operate in well-defined environments using pre-configured on-board processing units. However, modern and foreseen robotic applications have complex processing requirements that exceed the limits of on-board computing power. Cloud computing and related technologies have high potential to overcome on-board hardware restrictions and can improve performance efficiency. This research highlights the advancements in robotic systems, with a focus on cloud robotics as an emerging trend. Extensive effort has gone into leveraging the potential of robotic systems and handling the shortcomings that arise. Moreover, there are promising insights for a future breed of intelligent, flexible, and autonomous robotic systems in the Internet of Things era.
Submitted 8 January, 2019;
originally announced January 2019.