Variable Selection and Pattern Recognition with Gene Expression Data Generated by the Microarray Technology
Lack of adequate statistical methods for the analysis of microarray data remains the most critical deterrent to uncovering the true potential of these promising techniques in basic and translational biological studies. The popular practice of drawing important biological conclusions from just one replicate (slide) should be discouraged. In this paper, we discuss some modern trends in statistical analysis of microarray data with a special focus on statistical classification (pattern recognition) and variable selection. In addressing these issues we consider the utility of some distances between random vectors and their nonparametric estimates obtained from gene expression data. Performance of the proposed distances is tested by computer simulations and analysis of gene expression data on two different types of human leukemia. In experimental settings, the error rate is estimated by cross-validation, while a control sample is generated in computer simulation experiments aimed at testing the proposed gene selection procedures and associated classification rules.
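As a rough illustration of estimating a classification error rate by cross-validation, the following minimal sketch runs leave-one-out cross-validation on a two-class expression matrix. It is not the procedure proposed in the paper: the nearest-centroid rule with Euclidean distance is a stand-in for the distances the authors study, and the function names (`loo_error_rate`, `simulate`) and the Gaussian simulation settings are hypothetical choices made only for this example.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length expression vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def euclidean(a, b):
    """Euclidean distance; an assumed stand-in for the paper's distances."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose training centroid is closest."""
    labels = sorted(set(train_y))
    centroids = {
        lab: centroid([x for x, y in zip(train_x, train_y) if y == lab])
        for lab in labels
    }
    return min(labels, key=lambda lab: euclidean(sample, centroids[lab]))


def loo_error_rate(xs, ys):
    """Leave-one-out estimate: hold out each sample once, train on the rest."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


def simulate(n_per_class, n_genes, shift, rng):
    """Hypothetical simulation: unit-variance Gaussian genes, class 1 shifted."""
    xs, ys = [], []
    for label, mu in ((0, 0.0), (1, shift)):
        for _ in range(n_per_class):
            xs.append([rng.gauss(mu, 1.0) for _ in range(n_genes)])
            ys.append(label)
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = simulate(n_per_class=10, n_genes=5, shift=1.0, rng=rng)
    print("leave-one-out error rate:", loo_error_rate(xs, ys))
```

In a simulation setting the class labels are known by construction, so an independently generated control sample can replace the held-out samples when measuring error. With real expression data, held-out estimates such as the one above are the usual substitute.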