The Ability of Different Imputation Methods to Preserve the Significant Genes and Pathways in Cancer
Deciphering important genes and pathways from incomplete gene expression data could facilitate a better understanding of cancer. Different imputation methods can be applied to estimate the missing values. In this study, we evaluated how well various imputation methods preserve significant genes and pathways. In the first step, 5% of the genes were selected at random, and missing values were introduced under two types of missingness mechanism, ignorable and non-ignorable, at various missing rates. Next, 10 well-known imputation methods were applied to the datasets with simulated missing values. To showcase the utility of imputation approaches in preserving significant genes, the significance analysis of microarrays (SAM) method was used to detect the significant genes in rectal and lung cancers. To determine the impact of the different imputation methods on the identification of important genes, the chi-squared test was used to compare the proportions of overlap between the significant genes detected from the original data and those detected from the imputed datasets. Additionally, the significant genes were tested for enrichment in important pathways using ConsensusPathDB. Our results showed that almost all the significant genes and pathways of the original dataset could be detected in all imputed datasets, indicating that there is no significant difference in performance among the imputation methods tested. The source code and selected datasets are available at http://profiles.bs.ipm.ir/softwares/imputation_methods/.
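The evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' code, and it rests on several assumptions: only the ignorable case is simulated, as values missing completely at random; row-mean filling stands in for an imputation method and may not be one of the ten methods evaluated; SAM is not implemented; and the overlap counts passed to the chi-squared test are hypothetical. All function names are illustrative.

```python
import math
import random


def mask_mcar(matrix, gene_fraction=0.05, missing_rate=0.1, seed=0):
    """Set values to None, completely at random, in a random subset of genes (rows)."""
    rng = random.Random(seed)
    n_genes = len(matrix)
    n_target = max(1, round(gene_fraction * n_genes))
    target_rows = set(rng.sample(range(n_genes), n_target))
    masked = []
    for i, row in enumerate(matrix):
        if i in target_rows:
            masked.append([None if rng.random() < missing_rate else v for v in row])
        else:
            masked.append(list(row))
    return masked, target_rows


def impute_row_mean(matrix):
    """Stand-in imputation: fill each missing value with the mean of its row."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled


def chi2_two_proportions(hits_a, total_a, hits_b, total_b):
    """Pearson chi-squared test (1 df, no continuity correction) on a 2x2 table."""
    table = [[hits_a, total_a - hits_a], [hits_b, total_b - hits_b]]
    n = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [hits_a + hits_b, n - hits_a - hits_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy expression matrix: 200 genes (rows) by 8 samples (columns).
data_rng = random.Random(1)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]

masked, target_rows = mask_mcar(data, gene_fraction=0.05, missing_rate=0.2)
imputed = impute_row_mean(masked)

# Hypothetical counts: of 100 genes significant in the original data,
# one imputed dataset recovers 96 and another recovers 94.
stat, p_value = chi2_two_proportions(96, 100, 94, 100)
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}")
```

A large p-value in this comparison would be consistent with the abstract's conclusion that the methods do not differ significantly in how many significant genes they preserve. A non-ignorable mechanism would need a masking rule that depends on the values themselves, which this sketch does not attempt.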