-
Fuzzy Gene Selection and Cancer Classification Based on Deep Learning Model
Authors:
Mahmood Khalsan,
Mu Mu,
Eman Salih Al-Shamery,
Lee Machado,
Suraj Ajit,
Michael Opoku Agyeman
Abstract:
Machine learning (ML) approaches have been used to develop highly accurate and efficient applications in many fields, including biomedical science. However, even with advanced ML techniques, cancer classification using gene expression data remains difficult because of the high dimensionality of the datasets employed. We developed a new fuzzy gene selection technique (FGS) to identify informative genes, which facilitates cancer classification and reduces the dimensionality of the available gene expression data. Three feature selection methods (Mutual Information, F-ClassIf, and Chi-squared) were evaluated and employed to obtain a score and rank for each gene. Fuzzification and defuzzification methods were then used to obtain the best single score for each gene, which aids in the identification of significant genes. To evaluate the proposed algorithm, we applied the fuzzy measures to six gene expression datasets, comprising four microarray and two RNA-seq datasets. With our FGS-enhanced method, the cancer classification model achieved 96.5%, 96.2%, 96%, and 95.9% for accuracy, precision, recall, and F1-score respectively, which is significantly higher than the 69.2% accuracy, 57.8% precision, 66% recall, and 58.2% F1-score obtained when the standard MLP method was used. Across the six datasets examined, the proposed model demonstrates its capacity to classify cancer effectively.
Submitted 4 May, 2023;
originally announced May 2023.
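The score-combination step described in the abstract above (three per-gene scores fused into one) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which membership functions or defuzzification rule FGS uses, so min-max scaling stands in for fuzzification and a plain average stands in for defuzzification. The function names and the example scores are hypothetical.

```python
from statistics import mean


def min_max(scores):
    """Map raw scores to [0, 1] membership values (assumed fuzzification)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuzzy_gene_scores(score_table):
    """Combine per-method scores into one score per gene.

    score_table maps a method name to a list of per-gene scores,
    with genes in the same order for every method.
    """
    memberships = [min_max(scores) for scores in score_table.values()]
    # Assumed defuzzification: average the memberships across methods.
    return [mean(per_gene) for per_gene in zip(*memberships)]


def select_genes(score_table, k):
    """Return the indices of the k genes with the highest combined score."""
    combined = fuzzy_gene_scores(score_table)
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return ranked[:k]


# Hypothetical scores for four genes from the three selection methods.
scores = {
    "mutual_information": [0.10, 0.40, 0.05, 0.30],
    "f_classif": [2.0, 9.0, 1.0, 5.0],
    "chi_squared": [1.5, 6.0, 0.5, 6.5],
}
combined = fuzzy_gene_scores(scores)
top = select_genes(scores, k=2)
print(top)
```

Rescaling each method's scores before averaging keeps a method with a large numeric range (such as an F statistic) from dominating the combined ranking.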
-
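Abordagem probabilística para análise de confiabilidade de dados gerados em sequenciamentos multiplex na plataforma ABI SOLiD (Probabilistic approach to the reliability analysis of data generated in multiplex sequencing runs on the ABI SOLiD platform)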
Authors:
Fabio M. F. Lobato,
Carlos D. N. Damasceno,
Péricles L. Machado,
Nandamudi L. Vijaykumar,
André R. dos Santos,
Sylvain H. Darnet,
André N. A. Gonçalves,
Dayse O. de Alencar,
Ádamo L. de Santana
Abstract:
Next-generation sequencers such as the Illumina and SOLiD platforms generate large amounts of data, commonly more than 10 gigabytes of text files. In particular, the SOLiD platform allows multiple samples to be sequenced in a single run, called a multiplex run, through a tagging system called Barcode. This feature requires a computational step to separate the data by sample, because the sequencer delivers a mixture of all samples in a single output. This step must be reliable, since errors in it can compromise all downstream analysis. In this context, we identified the need for a probabilistic model capable of assigning a degree of confidence to the tagging system used in multiplex sequencing. The results confirmed the adequacy of the resulting model, which can, among other things, guide data filtering and the evaluation of the sequencing protocol used.
Submitted 11 August, 2021; v1 submitted 27 July, 2021;
originally announced July 2021.
-
Optimal quarantine strategies for the COVID-19 pandemic in a population with a discrete age structure
Authors:
João A. M. Gondim,
Larissa Machado
Abstract:
The goal of this work is to study optimal controls for the COVID-19 epidemic in Brazil. We consider an age-structured SEIRQ model with a quarantine compartment, where the controls are the quarantine entrance parameters. We then compare the optimal controls for different quarantine lengths and distributions of the total control cost by assessing their respective reductions in deaths relative to the same period without quarantine. The best strategy provides a calendar of when to relax the isolation measures for each age group. Finally, we analyse how a delay in the start of the quarantine affects this calendar by changing the initial conditions.
Submitted 19 May, 2020;
originally announced May 2020.