Statistical Significance Threshold Criteria for Analysis of Microarray Gene Expression Data
Methodological advances in microarray data analysis based on false discovery rate (FDR) control, such as q-value plots, allow the investigator to examine the FDR from several perspectives. However, when FDR control at the "customary" levels of 0.01, 0.05, or 0.1 does not yield fruitful findings, there is little guidance on how to trade off the significance threshold against the FDR level on sound statistical or biological grounds. Meaningful statistical significance criteria that complement the existing FDR methods for large-scale multiple tests are therefore desirable. Three such criteria are developed in this research: the profile information criterion, the total error proportion, and the guide-gene driven selection. The first two are general significance threshold criteria for large-scale multiple tests. The profile information criterion is related to recent theoretical studies of the connection between FDR control and minimax estimation, and the total error proportion is closely related to the asymptotic properties of FDR control in terms of the total error risk. The guide-gene driven selection is an approach to combining statistical significance with the existing biological knowledge of the study at hand. Error properties of these criteria are investigated theoretically and by simulation. The proposed methods are illustrated and compared using an example of genomic screening for novel Arf gene targets. Operating characteristics of the q-value and the proposed significance threshold criteria are investigated and compared in a simulation study that employs a model mimicking a gene regulatory pathway. A guideline for using these criteria is provided. Splus/R code is available from the corresponding author upon request.