-
Clonal-Based Cellular Automata in Bioinformatics
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
This paper surveys the problems in bioinformatics that can be readily addressed by clonal-based cellular automata. Researchers tend to address each problem in bioinformatics independently of the others. No researchers have tried to relate the major problems in bioinformatics and solve them within a common framework. We looked for problems in bioinformatics that can be addressed easily by clonal-based cellular automata. An extensive literature survey was conducted, drawing on papers from various journals and conferences. This paper offers intuition for relating various problems in bioinformatics logically and works towards a common framework, built on a clonal-based cellular automata classifier, for addressing them.
Submitted 13 May, 2014;
originally announced May 2014.
-
A Fast Multiple Attractor Cellular Automata with Modified Clonal Classifier for Splicing Site Prediction in Human Genome
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
Bioinformatics encompasses storing, analyzing and interpreting biological data. A major challenge for machine learning methods such as cellular automata is to associate functional information with the corresponding biological sequences. In eukaryotes, DNA is divided into introns and exons. The introns are removed to form the coding region by a process called splicing. By identifying a splice site we can easily specify the DNA sequence category (Donor/Acceptor/Neither). Splicing sites play an important role in understanding genes. A class of CA that can handle fuzzy logic, combined with a modified clonal algorithm, is proposed to identify the splicing site. This classifier is tested with the Irvine Primate Splice Junction Database. It is compared with NNspIICE, GENIO, HSPL and SPIICE VIEW. The reported accuracy and efficiency of prediction are quite promising.
Submitted 23 April, 2014;
originally announced April 2014.
-
AIS-MACA- Z: MACA based Clonal Classifier for Splicing Site, Protein Coding and Promoter Region Identification in Eukaryotes
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
Bioinformatics incorporates information regarding biological data storage, accessing mechanisms and presentation of characteristics within this data. Most problems in bioinformatics can be addressed efficiently by computing techniques. This paper aims at building a classifier based on Multiple Attractor Cellular Automata (MACA) which uses fuzzy logic with version Z to predict splicing sites and to identify protein coding and promoter regions in eukaryotes. It is strengthened with an artificial immune system (AIS) technique, the clonal algorithm, for choosing the rules of best fitness. The proposed classifier can handle DNA sequences of lengths 54, 108, 162, 252 and 354. It gives the exact boundaries of both protein and promoter regions with an average accuracy of 90.6%, and it can predict the splicing site with 97% accuracy. The classifier was tested with 1,97,000 data components taken from Fickett & Tung, EPDnew, and other sequences from a renowned medical university.
Submitted 3 April, 2014;
originally announced April 2014.
-
Cellular Automata and Its Applications in Bioinformatics: A Review
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
This paper surveys the problems in bioinformatics that can be readily addressed by cellular automata. Some authors have proposed algorithms for individual problems in bioinformatics, but the application of cellular automata to bioinformatics remains a largely unexplored research field. No researchers have tried to relate the major problems in bioinformatics and find a common solution. Extensive literature surveys were conducted, drawing on papers from various journals and conferences. This paper offers intuition for relating various problems in bioinformatics logically and works towards a common framework for addressing them.
Submitted 2 April, 2014;
originally announced April 2014.
-
AIS-INMACA: A Novel Integrated MACA Based Clonal Classifier for Protein Coding and Promoter Region Prediction
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
Most problems in bioinformatics are now challenges in computing. This paper aims at building a classifier based on Multiple Attractor Cellular Automata (MACA) that uses fuzzy logic. It is strengthened with an artificial immune system (AIS) technique, the clonal algorithm, for identifying protein coding and promoter regions in a given DNA sequence. The proposed classifier, named AIS-INMACA, introduces a novel concept: combining CA with an artificial immune system to produce a better classifier that can address major problems in bioinformatics. This will be the first integrated algorithm that can predict both promoter and protein coding regions. The basic concept of the clonal selection algorithm was used to obtain rules of good fitness. The proposed classifier can handle DNA sequences of lengths 54, 108, 162, 252 and 354. It gives the exact boundaries of both protein and promoter regions with an average accuracy of 89.6%. It was tested with 97,000 data components taken from Fickett & Tung, MPromDb, and other sequences from a renowned medical university. The classifier can handle huge data sets and can find protein and promoter regions even in mixed and overlapped DNA sequences. This work also aims at identifying the logical relationships between the major problems in bioinformatics and tries to obtain a common framework for addressing them, including protein structure prediction, RNA structure prediction, predicting the splicing pattern of any primary transcript, and analysis of the information content of DNA, RNA and protein sequences and structures. This work will attract more researchers towards applying CA as a potential pattern classifier to many important problems in bioinformatics.
Submitted 24 March, 2014;
originally announced March 2014.
-
An Extensive Repot on the Efficiency of AIS-INMACA (A Novel Integrated MACA based Clonal Classifier for Protein Coding and Promoter Region Prediction)
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
This paper exclusively reports the efficiency of AIS-INMACA. AIS-INMACA has had a good impact on solving major problems in bioinformatics, such as protein region identification and promoter region prediction, in less time (Pokkuluri Kiran Sree, 2014). AIS-INMACA now comes in several variations (Pokkuluri Kiran Sree, 2014), with the aim of establishing it as a tool for solving many problems in bioinformatics. This paper should therefore be useful to researchers working on bioinformatics with cellular automata.
Submitted 5 March, 2014;
originally announced March 2014.
-
Identification of Protein Coding Regions in Genomic DNA Using Unsupervised FMACA Based Pattern Classifier
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
Genes carry the instructions for making the proteins found in a cell, encoded as a specific sequence of nucleotides in DNA molecules. However, the regions of these genes that code for proteins may occupy only a small part of the sequence. Identifying the coding regions plays a vital role in understanding these genes. In this paper we propose an unsupervised Fuzzy Multiple Attractor Cellular Automata (FMACA) based pattern classifier to identify the coding region of a DNA sequence. We propose a distinct K-Means algorithm for designing the FMACA classifier which is simple and efficient, and which produces a more accurate classifier than has previously been obtained for a range of different sequence lengths. Experimental results confirm the scalability of the proposed unsupervised FCA based classifier in handling large volumes of data, irrespective of the number of classes, tuples and attributes. Good classification accuracy has been established.
Submitted 24 January, 2014;
originally announced January 2014.
-
HMACA: Towards Proposing a Cellular Automata Based Tool for Protein Coding, Promoter Region Identification and Protein Structure Prediction
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
The human body consists of a large number of cells, and each cell contains deoxyribonucleic acid (DNA). Identifying genes from DNA sequences is a very difficult task, and identifying the coding regions is more complex still. Identifying the proteins, which occupy little space in genes, is a really challenging issue. Coding region analysis plays an important role in understanding genes. Proteins are macromolecules responsible for a wide range of vital biochemical functions, including acting as oxygen carriers, cell signaling, antibody production, nutrient transport and building up muscle fibers. Promoter region identification and protein structure prediction have gained remarkable attention in recent years. Even though there are some identification techniques addressing this problem, the accuracy in identifying the promoter region is roughly 68% to 72%. We have developed a cellular automata based tool, built with a hybrid multiple attractor cellular automata (HMACA) classifier, for protein coding region identification, promoter region identification and protein structure prediction. It predicts the protein and promoter regions with an accuracy of 76%, and it predicts the structure of a protein with an accuracy of 80%.
Submitted 21 January, 2014;
originally announced January 2014.
-
Towards a Cellular Automata Based Network Intrusion Detection System with Power Level Metric in Wireless Adhoc Networks (IDFADNWCA)
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
Ad hoc wireless networks, with their changing topology and distributed nature, are more prone to intruders. The efficiency of an intrusion detection system in an ad hoc network is determined not only by how dynamically it monitors but also by how flexibly it uses the power available in each of its nodes. In this paper we propose a hybrid intrusion detection system based on a power level metric for potential ad hoc hosts, which is used to determine the duration for which a particular node can support a network-monitoring node. The detection of intrusions in the network is done with the help of Cellular Automata (CA). IDFADNWCA (Intrusion Detection for Adhoc Networks with Cellular Automata) focuses on the available power level in each of the nodes and determines the network monitors. The power level metric helps maintain power for network monitoring, with monitors changing often, since it is an iterative, power-optimal solution for identifying nodes for distributed agent based intrusion detection. The advantage of this approach is the inherent flexibility it provides, since only a few nodes need to be considered when re-establishing network monitors.
Submitted 16 January, 2014;
originally announced January 2014.
-
Investigating Cellular Automata Based Network Intrusion Detection System For Fixed Networks (NIDWCA)
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
Network Intrusion Detection Systems (NIDS) are computer systems which monitor a network with the aim of discerning malicious from benign activity on that network. With the recent growth of the Internet, such security limitations are becoming more and more pressing. Most current network intrusion detection systems rely on labeled training data. An unsupervised CA based anomaly detection technique trained with unlabeled data is capable of detecting previously unseen attacks. This new approach, based on the Cellular Automata Classifier (CAC) with Genetic Algorithms (GA), is used to classify program behavior as normal or intrusive. The parameters and the evolution process for CAC with GA are discussed in detail. The implementation considers both temporal and spatial information about network connections when encoding that information into rules in the NIDS. Preliminary experiments with the KDD Cup data set show that the CAC classifier with Genetic Algorithms can effectively detect intrusive attacks and achieve a low false positive rate. Training a NIDWCA (Network Intrusion Detection with Cellular Automata) classifier takes significantly less time than conventional techniques.
Submitted 13 January, 2014;
originally announced January 2014.
-
PSMACA: An Automated Protein Structure Prediction Using MACA (Multiple Attractor Cellular Automata)
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
Protein structure prediction from amino acid sequences has gained remarkable attention in recent years. Even though there are some prediction techniques addressing this problem, the accuracy in predicting the protein structure is close to 75%. An automated procedure was developed with MACA (Multiple Attractor Cellular Automata) for predicting the structure of the protein. Most existing approaches are sequential, classify the input into four major classes, and are designed for similar sequences. PSMACA is designed to identify ten classes from sequences that share twilight-zone similarity and identity with the training sequences. The method also predicts three states (helix, strand, and coil) for the structure. Our comprehensive design considers 10 feature selection methods and 4 classifiers to develop MACA based classifiers that are built for each of the ten classes. Tests of the proposed classifier on twilight-zone and 1-high-similarity benchmark datasets, against over three dozen modern competing predictors, show that PSMACA provides the best overall accuracy, ranging between 77% and 88.7% depending on the dataset.
Submitted 12 January, 2014;
originally announced January 2014.
-
Improving Quality of Clustering using Cellular Automata for Information retrieval
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
Clustering has been widely applied to Information Retrieval (IR) on the grounds of its potential for improved effectiveness over inverted file search. Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. A clustering quality measure is a function that, given a data set and its partition into clusters, returns a non-negative real number representing the quality of that clustering. Moreover, clustering algorithms may behave differently depending on the features of the data set and the values of their input parameters. Therefore, in most applications the resulting clustering scheme requires some sort of evaluation of its validity. The quality of clustering can be enhanced by using a Cellular Automata Classifier for information retrieval. In this study we take the view that if cellular automata with clustering is applied to search results (query-specific clustering), then it has the potential to increase retrieval effectiveness compared with both static clustering and conventional inverted file search. We conducted a number of experiments using ten document collections and eight hierarchic clustering methods. Our results show that the effectiveness of query-specific clustering with cellular automata is indeed higher and suggest that there is scope for its application to IR.
Submitted 12 January, 2014;
originally announced January 2014.
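The abstract above defines a clustering quality measure as a function that maps a data set and its partition into clusters to a non-negative real number. A minimal sketch of one such function follows, assuming the within-cluster sum of squared distances as the measure (lower is better). This choice and the names `wcss`, `points` and `labels` are illustrative assumptions, not the measure used in the paper.

```python
def wcss(points, labels):
    """Within-cluster sum of squared distances to each cluster centroid.

    points: list of equal-length numeric tuples (the data set).
    labels: list of cluster ids, one per point (the partition).
    Returns a non-negative float; smaller values mean tighter clusters.
    """
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid = coordinate-wise mean of the cluster members.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Add each member's squared Euclidean distance to the centroid.
        for p in members:
            total += sum((p[d] - centroid[d]) ** 2 for d in range(dim))
    return total


# Hypothetical toy data: two well-separated pairs of points.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good_partition = [0, 0, 1, 1]   # groups nearby points together
bad_partition = [0, 1, 0, 1]    # groups distant points together
```

On this toy data the measure ranks `good_partition` ahead of `bad_partition`, which is the kind of comparison a quality measure is meant to support when evaluating alternative clusterings of the same search results.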
-
Face Detection from still and Video Images using Unsupervised Cellular Automata with K means clustering algorithm
Authors:
P. Kiran Sree,
I. Ramesh Babu
Abstract:
Pattern recognition problems rely upon the features inherent in the pattern of images. Face detection and recognition is one of the challenging research areas in the field of computer vision. In this paper, we present a method to identify skin pixels in still and video images using skin color. Face regions are identified from this skin pixel region. Facial features such as the eyes, nose and mouth are then located. Faces are recognized from color images using an RBF based neural network. Unsupervised Cellular Automata with the K-means clustering algorithm is used to locate different facial elements. Orientation is corrected using the eyes. Parameters such as inter-eye distance, nose length, mouth position and Discrete Cosine Transform (DCT) coefficients are computed and used as input to a Radial Basis Function (RBF) based neural network. The approach works reliably for face sequences with variation in head orientation, expression and so on.
Submitted 15 December, 2013;
originally announced December 2013.
-
Power-Aware Hybrid Intrusion Detection System (PHIDS) using Cellular Automata in Wireless AdHoc Networks
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu
Abstract:
Ad hoc wireless networks, with their changing topology and distributed nature, are more prone to intruders. The network monitoring functionality should be in operation for as long as the network exists, with nil constraints. The efficiency of an intrusion detection system in an ad hoc network is determined not only by how dynamically it monitors but also by how flexibly it uses the power available in each of its nodes. In this paper we propose a hybrid intrusion detection system based on a power level metric for potential ad hoc hosts, which is used to determine the duration for which a particular node can support a network-monitoring node. The power-aware hybrid intrusion detection system focuses on the available power level in each of the nodes and determines the network monitors. Power awareness helps maintain power for network monitoring, with monitors changing often, since it is an iterative, power-optimal solution for identifying nodes for distributed agent based intrusion detection. The advantage of this approach is the inherent flexibility it provides, since only a few nodes need to be considered when re-establishing network monitors. The detection of intrusions in the network is done with the help of Cellular Automata (CA). The CAs classify a packet routed through the network as either normal or an intrusion. The use of CAs enables the identification of intrusions that have already occurred as well as new intrusions.
Submitted 6 December, 2013;
originally announced December 2013.
-
FELFCNCA: Fast & Efficient Log File Compression Using Non Linear Cellular Automata Classifier
Authors:
P. Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
Log files are created for traffic analysis, maintenance, software debugging and customer management in many places, such as system services, user monitoring applications, network servers and database management systems, and they must be kept for long periods of time. These log files may grow to huge sizes in such complex systems and environments. For storage and convenience, log files must be compressed. Most existing algorithms do not take the temporal redundancy specific to log files into consideration. We propose a non-linear cellular automata based classifier which introduces a multidimensional log file compression scheme, described in eight variants that differ in complexity and attained compression ratio. The FELFCNCA scheme introduces a transformation for log files whose output compresses far better than the output of general-purpose algorithms. The proposed method was found to be lossless and fully automatic. It does not impose any constraint on the size of the log file.
Submitted 18 November, 2013;
originally announced December 2013.
-
CAVDM: Cellular Automata Based Video Cloud Mining Framework for Information Retrieval
Authors:
P. Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi N
Abstract:
Cloud Mining techniques can be applied to various documents. Acquiring and storing video data is easy, but retrieving information from video data is challenging. Video Cloud Mining therefore plays an important role in efficient video data management for information retrieval. This paper proposes a Cellular Automata based framework for video Cloud Mining to extract information from video data. This includes developing a technique for shot detection, followed by key frame analysis, which compares the frames of each shot with one another to define the relationships between shots. A cellular automata based hierarchical clustering technique is adopted to group similar shots in order to detect a particular event according to user requirements.
Submitted 18 November, 2013;
originally announced November 2013.
-
Multiple Attractor Cellular Automata (MACA) for Addressing Major Problems in Bioinformatics
Authors:
Pokkuluri Kiran Sree,
Inampudi Ramesh Babu,
SSSN Usha Devi Nedunuri
Abstract:
CA has grown into a potential classifier for addressing major problems in bioinformatics. Many bioinformatics problems, such as predicting the protein coding region, finding the promoter region and predicting the structure of a protein, can be addressed through Cellular Automata. Even though there are some prediction techniques addressing these problems, their accuracy is very low. An automated procedure based on MACA (Multiple Attractor Cellular Automata) was proposed which can address all these problems. A genetic algorithm is also used to find rules with good fitness values. Extensive experiments are conducted to report the accuracy of the proposed tool. The average accuracy of MACA when tested with the ENCODE, BG570, HMR195, Fickett and Tung, and ASP67 datasets is 78%.
Submitted 16 October, 2013;
originally announced October 2013.
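The abstract above relies on the attractor structure of a Multiple Attractor Cellular Automaton. As a rough illustration of that idea only, the sketch below runs a small linear CA with null boundaries and groups every state by the attractor it settles into. The 4-cell rule vector (150, 102, 60, 150) and all function names are illustrative assumptions; this is not the classifier, rule set or training procedure described in the paper.

```python
from itertools import product

# Linear 3-neighbourhood rules written as (use_left, use_self, use_right)
# XOR masks. Only the rules needed for this illustration are listed.
RULE_MASKS = {
    60: (1, 1, 0),    # left XOR self
    102: (0, 1, 1),   # self XOR right
    150: (1, 1, 1),   # left XOR self XOR right
}

# Illustrative rule vector, one rule per cell (an assumption for this sketch).
RULE_VECTOR = (150, 102, 60, 150)


def step(state, rule_vector=RULE_VECTOR):
    """Apply one synchronous update with null (zero) boundary cells."""
    n = len(state)
    nxt = []
    for i, rule in enumerate(rule_vector):
        use_left, use_self, use_right = RULE_MASKS[rule]
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        nxt.append((use_left & left) ^ (use_self & state[i]) ^ (use_right & right))
    return tuple(nxt)


def attractor(state, rule_vector=RULE_VECTOR):
    """Iterate until a state repeats, then return the smallest state on that cycle."""
    seen = set()
    while state not in seen:
        seen.add(state)
        state = step(state, rule_vector)
    # `state` now lies on a cycle; walk the cycle once to collect its members.
    cycle = [state]
    current = step(state, rule_vector)
    while current != state:
        cycle.append(current)
        current = step(current, rule_vector)
    return min(cycle)


def basins(n=4, rule_vector=RULE_VECTOR):
    """Group all 2**n states by the attractor they eventually reach."""
    groups = {}
    for state in product((0, 1), repeat=n):
        groups.setdefault(attractor(state, rule_vector), []).append(state)
    return groups
```

With this rule vector the 16 states split into four basins, each draining into a fixed point. In a MACA-style classifier, each basin would be associated with a class label, so that classifying a pattern amounts to running the CA from that pattern and reading off the attractor it reaches.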
-
Network Intrusion Detection Using FP Tree Rules
Authors:
P. Srinivasulu,
J. Ranga Rao,
I. Ramesh Babu
Abstract:
In the faceless world of the Internet, online fraud is one of the greatest causes of loss for web merchants. Advanced solutions are needed to protect e-businesses from the constant problem of fraud. Many popular fraud detection algorithms require supervised training, which needs human intervention to prepare training cases. Since an online transaction database quite often has terabyte-level storage, human investigation to identify fraudulent transactions is very costly. This paper describes the automatic design of a user profiling method for the purpose of fraud detection. We use an FP (Frequent Pattern) Tree rule learning algorithm to adaptively profile legitimate customer behavior in a transaction database. Incoming transactions are then compared against the user profile to uncover anomalies. The anomaly outputs are used as input to an accumulation system that combines evidence to generate a high-confidence fraud alert value. Favorable experimental results are presented.
Submitted 14 June, 2010;
originally announced June 2010.