Internal Validation Inferences of Significant Genomic Features in Genome-wide Screening

Overview
Date 2010 Jan 20
PMID 20084293
Citations 3
Abstract

Although validation of classification and prediction models has been a long-standing topic in statistics and computer learning, the concept of statistical validation in genome-wide screening studies has been vague. Internal validation generally refers to validation procedures based solely on the study dataset. A popular approach to internal validation of identified genomic features has been split-dataset validation. In contrast to this approach, internal validation in genome-wide association screening studies is here defined precisely through the concepts of association profile and profile significance. A general procedure and two specific profile significance measures are developed and compared with the split-dataset validation approach in a simulation study. The simulation results clearly demonstrate the strengths and limitations of the profile significance approach to internal validation, especially its enormous gain in sensitivity (power) and stability over split-dataset validation. The proposed methodology is illustrated by an example of genome-wide SNP association analysis in genetic epidemiology.
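The abstract does not spell out how the split-dataset comparator was implemented. As a rough illustration only, the generic idea can be sketched as follows: split the samples at random into a discovery half and a validation half, screen all features on the discovery half, and retest only the selected features on the validation half. Everything in this sketch is an assumption made for illustration and is not taken from the paper: the function names, the correlation-based test with a normal approximation, the two significance thresholds, the Bonferroni correction in the validation step, and the simulated data. It does not implement the profile significance measures.

```python
import math
import numpy as np

def assoc_pvalues(X, y):
    """Two-sided p-values for the correlation of each column of X with y.

    Uses a normal approximation to the t statistic (an assumption made to
    keep the sketch dependency-free). Assumes no column of X is constant.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc * yc[:, None]).sum(axis=0) / np.sqrt(
        (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    )
    z = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
    return np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])

def split_dataset_validation(X, y, alpha_discovery=0.001,
                             alpha_validation=0.05, seed=0):
    """Return indices of features selected on one half and confirmed on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    half = len(idx) // 2
    disc, val = idx[:half], idx[half:]

    # Step 1: screen every feature on the discovery half.
    p_disc = assoc_pvalues(X[disc], y[disc])
    selected = np.flatnonzero(p_disc < alpha_discovery)
    if selected.size == 0:
        return selected

    # Step 2: retest only the selected features on the validation half,
    # with a Bonferroni correction over the number of selected features.
    p_val = assoc_pvalues(X[val][:, selected], y[val])
    return selected[p_val < alpha_validation / selected.size]

# Hypothetical simulated data: 400 samples, 200 features, and a phenotype
# that depends on feature 0 only.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 200))
y = X[:, 0] + rng.normal(size=400)
validated = split_dataset_validation(X, y)
```

Because each step uses only half of the samples, a procedure of this kind gives up power relative to an analysis of the full dataset, and its result depends on the random split. These are the sensitivity and stability issues the abstract refers to.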

Citing Articles

Evaluation of a two-step iterative resampling procedure for internal validation of genome-wide association studies.

Kang G, Liu W, Cheng C, Wilson C, Neale G, Yang J. J Hum Genet. 2015; 60(12):729-38.

PMID: 26377241 PMC: 4859941. DOI: 10.1038/jhg.2015.110.


A statistical approach to selecting and confirming validation targets in -omics experiments.

Leek J, Taub M, Rasgon J. BMC Bioinformatics. 2012; 13:150.

PMID: 22738145 PMC: 3568710. DOI: 10.1186/1471-2105-13-150.


A Phenotype-Driven Dimension Reduction (PhDDR) approach to integrated genomic association analyses.

Gao C, Cheng C. Annu Int Conf IEEE Eng Med Biol Soc. 2012; 2011:6837-40.

PMID: 22255909 PMC: 3652376. DOI: 10.1109/IEMBS.2011.6091686.
