Bioinformatics / Biomedical Informatics / Medical Informatics / Health Informatics
Hossein Bankikoshki; Seyed Ali Seyyedsalehi; Fatemeh Zare Mirakabad
Volume 11, Issue 3 , September 2017, , Pages 219-230
Abstract
The use of genomic nucleotide sequences as biochemical signals in machine learning methods is made possible by converting these sequences into numerical codes. This conversion results in an unrealistic increase in the dimension of the data and constrains data analysis operations such as visualization and feature extraction. Therefore, dimensionality reduction techniques should be used to return the data to its real dimension. In this study, a deep autoencoder neural network is used to reduce the dimension of binding-site sequence data from the human genome. To determine whether the information of the real data is preserved in the compressed data, we perform a two-class classification using a support vector machine. The results show that the information is almost entirely preserved under compression. The compressed data is then used for visualization as well as for feature selection by analysis of variance. The results show that the first, tenth, and eighth positions in the sequences are the most informative. While the majority of previous works deal with gene expression data from microarrays and compare only a few dimension reduction algorithms, this paper uses an autoencoder on nucleotide sequence data for the first time and provides a comprehensive comparison between the performance of dimension reduction techniques and machine learning algorithms.
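The pipeline this abstract describes (numeric encoding inflates the dimension, then a learned encoder compresses it) can be sketched as follows. This is an illustrative sketch, not the paper's model: the sequences are made up, and the deep autoencoder is replaced by its closest closed-form analogue, PCA via SVD (a linear autoencoder trained with MSE loss recovers the same subspace, up to rotation).

```python
import numpy as np

def one_hot(seq, alphabet="ACGT"):
    # Each nucleotide becomes a 4-dim indicator vector, so an L-length
    # sequence inflates to a 4*L-dimensional numeric code.
    idx = {c: i for i, c in enumerate(alphabet)}
    v = np.zeros(len(seq) * len(alphabet))
    for p, c in enumerate(seq):
        v[p * len(alphabet) + idx[c]] = 1.0
    return v

# Four invented 10-nucleotide sequences -> a (4, 40) data matrix.
seqs = ["ACGTACGTAC", "ACGTACGAAC", "TTGTACGTAC", "ACGCACGTAC"]
X = np.stack([one_hot(s) for s in seqs])

# Linear stand-in for the autoencoder: PCA via SVD on centered data.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
Z = Xc @ Vt[:k].T                       # "encoder": compressed (4, 2) codes
X_rec = Z @ Vt[:k] + X.mean(axis=0)     # "decoder": back to 40 dimensions
```

The compressed codes `Z` are what would then feed the SVM classification and the ANOVA-based feature selection the abstract describes.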
Bioinformatics / Biomedical Informatics / Medical Informatics / Health Informatics
Amin Janghorbani; Mohammad Hasan Moradi
Volume 10, Issue 3 , October 2016, , Pages 197-209
Abstract
Babies born weighing under 2,500 g are defined as low birth weight (LBW) babies. They are exposed to higher risks of mortality, congenital malformations, mental retardation, and other physical and neurological impairments; 15.5% of births around the world are LBW. Reducing the rate of LBW births by one-third is one of the aims of the United Nations Children’s Fund program. Prognosis of LBW births can play a critical role in reducing these cases, and it helps clinicians make timely and efficient clinical decisions to save these babies' lives. In this study, a hybrid framework called the fuzzy evidential network, with a good ability to manage different aspects of uncertainty, is selected as the LBW prognosis model. The accuracy of prognosis and the performance of the fuzzy evidential network in managing missing values in the clinical database were investigated and compared with well-known LBW prognosis models. The results showed that the fuzzy evidential network has higher prognosis accuracy (84.8%) than the other prognosis models, and that fusing the outputs of naïve Bayes and the fuzzy evidential network yields still higher accuracy (85.2%). In addition, the fuzzy evidential network managed the uncertainty induced by the imputation method better than the other prognosis models in this study: its performance loss as missing data increase is smaller than that of the other models.
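The decision-level fusion of naïve Bayes and fuzzy-evidential-network outputs mentioned above can be illustrated with a minimal sketch. All numbers are hypothetical, and the mean rule below is only one common fusion choice; the abstract does not specify which rule the paper uses.

```python
# Hypothetical posterior probabilities P(LBW | evidence) for five cases.
# Neither set of numbers comes from the paper; they are illustrative only.
p_fen = [0.9, 0.2, 0.7, 0.4, 0.8]   # fuzzy evidential network outputs
p_nb  = [0.8, 0.3, 0.6, 0.6, 0.9]   # naive Bayes outputs

# Mean-rule fusion: average the two posteriors, then threshold at 0.5.
p_fused = [(a + b) / 2 for a, b in zip(p_fen, p_nb)]
labels = [int(p >= 0.5) for p in p_fused]   # 1 = predicted LBW
```

Fusion at the probability level like this can outvote a single model's borderline error (here the fourth case), which is the mechanism behind the small accuracy gain the abstract reports.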
Bioinformatics / Biomedical Informatics / Medical Informatics / Health Informatics
Mina Jafari; Behnam Ghavami; Vahid Sattari Naeini
Volume 9, Issue 4 , February 2015, , Pages 375-386
Abstract
The inference of Gene Regulatory Networks (GRNs) from gene expression data is significantly important for understanding gene dependencies, regulatory functions among genes, biological processes, the way processes occur, and the avoidance of unplanned processes (disease). Accurate inference of a GRN requires accurate inference of the predictor set. Generally, the main limitations on predictor set inference are the small number of samples, the large number of genes, and the possible influence of noise in gene expression data. Hence, efficient methods that infer the predictor set with high reliability are seriously needed. In this paper, an efficient method is proposed to infer the predictor set using the Gravitational Search Algorithm (GSA). A GSA is run for each target gene to infer that gene's predictor subset; in a population, a mass represents a predictor subset of the associated gene. The initial population for each target gene is generated by the Pearson Correlation Coefficient (PCC), and Mean Conditional Entropy (MCE) is used as the assessment criterion to guide the GSA. Experimental results show that the proposed method has a good ability to infer the predictor set with high reliability. We also compared the proposed algorithm with a recent similar method based on a genetic algorithm; the comparison reveals the advantage of the proposed algorithm on biological datasets with small data volumes and large network scales.
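The conditional-entropy criterion that guides the search can be sketched for discretized expression data. The function below is a generic plug-in estimate of H(target | predictors), written as one reading of the criterion rather than the paper's exact code; the toy profiles are invented.

```python
import math
from collections import Counter

def conditional_entropy(predictors, target):
    """H(target | predictors) for discretized expression profiles.

    `predictors` holds one tuple of candidate-predictor values per sample,
    `target` the target gene's (binarized) value per sample. A lower value
    means the subset explains the target better; a GSA fitness would
    average this over all target genes (Mean Conditional Entropy).
    """
    n = len(target)
    joint = Counter(zip(predictors, target))
    marginal = Counter(predictors)
    h = 0.0
    for (x, _), c in joint.items():
        h -= (c / n) * math.log2(c / marginal[x])
    return h

# Toy binarized profiles over four samples: this two-gene subset fully
# determines the target, so the conditional entropy is zero.
subset = [(0, 1), (1, 0), (0, 1), (1, 1)]
target = [1, 0, 1, 1]
print(conditional_entropy(subset, target))   # 0.0
```

In a GSA, each mass would encode one such candidate subset, and masses with lower conditional entropy would attract the others toward better predictor sets.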
Biomedical Signal Processing / Medical Signal Processing / Biosignal Processing
Ali Khadem; Gholam Ali Hossein-Zadeh
Volume 6, Issue 1 , June 2012, , Pages 57-69
Abstract
Exploring causal (delayed) brain relations is an important topic in neuroscience. Traditional estimators of causal (delayed) brain relations are mainly model-based and place restrictive assumptions on brain dynamics. In recent years, some nonparametric measures have been introduced to solve this problem. Among them, the most important is Transfer Entropy (TE), which is based on information theory and the concept of Conditional Mutual Information. However, in the presence of significant instantaneous relations, which are observed extensively in brain functional datasets, TE may estimate the causal (delayed) relations inaccurately. In this paper, two information-theoretic measures called Instantaneous Interaction (II) and Modified Transfer Entropy (MTE) are introduced to estimate the instantaneous and causal (delayed) brain relations, respectively; MTE is used instead of TE whenever II is significant. These measures are evaluated on three simulated models and on eyes-closed resting-state EEG data. The simulation results show a high ability of II to estimate linear and nonlinear instantaneous relations, and they show that MTE outperforms TE in estimating causal (delayed) relations in the presence of significant instantaneous relations (significant II). For the real EEG data, II detects a significant instantaneous relation between posterior and frontal EEG channels, and MTE detects the information flow from posterior EEG channels to frontal ones more significantly than TE does. So, in the presence of significant instantaneous relations in real EEG data, MTE outperforms TE.
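The TE quantity underlying both TE and MTE can be sketched for discrete series. The function below is a plug-in estimator of TE(X → Y) with history length 1, i.e. I(Y_{t+1}; X_t | Y_t); it illustrates the standard definition only, not the paper's estimator (which additionally handles the instantaneous term via II and MTE).

```python
import math
from collections import Counter

def transfer_entropy(x, y):
    # Plug-in estimate of TE(X -> Y) = I(Y_{t+1}; X_t | Y_t) for two
    # equal-length, discrete-valued time series, with history length 1.
    triples = list(zip(y[1:], y[:-1], x[:-1]))   # (y_next, y_prev, x_prev)
    n = len(triples)
    c_full = Counter(triples)
    c_yy = Counter((yn, yp) for yn, yp, _ in triples)
    c_yx = Counter((yp, xp) for _, yp, xp in triples)
    c_y = Counter(yp for _, yp, _ in triples)
    te = 0.0
    for (yn, yp, xp), c in c_full.items():
        p_cond_full = c / c_yx[(yp, xp)]        # p(y_next | y_prev, x_prev)
        p_cond_hist = c_yy[(yn, yp)] / c_y[yp]  # p(y_next | y_prev)
        te += (c / n) * math.log2(p_cond_full / p_cond_hist)
    return te

# Toy example: y copies x with a one-step lag, so information flows
# from X to Y but not back.
x = [0, 1, 0, 1, 0, 1, 0, 1]
y = [0] + x[:-1]
assert transfer_entropy(x, y) > transfer_entropy(y, x)
```

The abstract's concern is exactly the case this sketch ignores: when x and y also share a zero-lag (instantaneous) coupling, the conditioning above is insufficient, which is what motivates testing II first and switching to MTE.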