-
Factors affecting the COVID-19 risk in the US counties: an innovative approach by combining unsupervised and supervised learning
Authors:
Samira Ziyadidegan,
Moein Razavi,
Homa Pesarakli,
Amir Hossein Javid,
Madhav Erraguntla
Abstract:
COVID-19 spreads swiftly: nearly three months after the first positive case was confirmed in China, the coronavirus began to spread all over the United States. Some states and counties reported a high number of positive cases and deaths, while others reported lower COVID-19-related cases and mortality. In this paper, the factors that could affect the risk of COVID-19 infection and mortality are analyzed at the county level. An innovative method combining K-means clustering with several classification models is used to determine the most critical factors. Results showed that mean temperature, percent of people below poverty, percent of adults with obesity, air pressure, population density, wind speed, longitude, and percent of uninsured people were the most significant attributes.
Submitted 5 December, 2021; v1 submitted 24 June, 2021;
originally announced June 2021.
-
Coronary Artery Disease Diagnosis; Ranking the Significant Features Using Random Trees Model
Authors:
Javad Hassannataj Joloudari,
Edris Hassannataj Joloudari,
Hamid Saadatfar,
Mohammad GhasemiGol,
Seyyed Mohammad Razavi,
Amir Mosavi,
Narjes Nabipour,
Shahaboddin Shamshirband,
Laszlo Nadai
Abstract:
Heart disease is one of the most common diseases among middle-aged people. Among the many forms of heart disease, coronary artery disease (CAD) is a common cardiovascular disease with a high death rate. The most popular tool for diagnosing CAD is medical imaging, e.g., angiography. However, angiography is costly and associated with a number of side effects. Hence, the purpose of this study is to increase the accuracy of coronary heart disease diagnosis by selecting significant predictive features in order of their ranking. In this study, we propose an integrated method using machine learning. The machine learning methods used are random trees (RTs), the C5.0 decision tree, support vector machine (SVM), and the Chi-squared automatic interaction detection (CHAID) decision tree. The proposed method shows promising results, and the study confirms that the RTs model outperforms the other models.
Submitted 16 January, 2020;
originally announced January 2020.
-
Parameter Selection Algorithm For Continuous Variables
Authors:
Peyman Tavallali,
Marianne Razavi,
Sean Brady
Abstract:
In this article, we propose a new algorithm for supervised learning methods that both captures the non-linearity in data and finds the best subset model. To produce an enhanced subset of the original variables, an ideal selection method should be able to add a supplementary level of regression analysis that captures complex relationships in the data via mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method presented here has the potential to produce an optimal subset of variables, making the overall process of model selection more efficient. The core objective of this paper is to introduce a new estimation technique for the classical least squares regression framework. This new automatic variable transformation and model selection method could offer an optimal and stable model that minimizes the mean square error and variability, while combining the all-possible-subsets selection methodology with variable transformations and interactions. Moreover, this novel method controls multicollinearity, leading to an optimal set of explanatory variables.
Submitted 19 January, 2017;
originally announced January 2017.