Application of Two Machine Learning Algorithms to Genetic Association Studies in the Presence of Covariates
Molecular Biology
Background: Population-based investigations aimed at uncovering genotype-trait associations often involve high-dimensional genetic polymorphism data as well as information on multiple environmental and clinical parameters. Machine learning (ML) algorithms offer a straightforward analytic approach for selecting the subsets of these inputs that are most predictive of a pre-defined trait. However, the performance of these algorithms in the presence of covariates is not well characterized.
Methods and Results: In this manuscript, we investigate two approaches: Random Forests (RFs) and Multivariate Adaptive Regression Splines (MARS). Through multiple simulation studies, we evaluate their performance under several underlying models. We also provide an application to a cohort of HIV-1 infected individuals receiving anti-retroviral therapies.
Conclusion: Consistent with more traditional regression modeling theory, our findings highlight the importance of considering the nature of underlying gene-covariate-trait relationships before applying ML algorithms, particularly when there is potential confounding or effect mediation.
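The confounding concern raised in the conclusion can be illustrated with a small simulation. The sketch below is not the simulation design used in the manuscript and uses neither RFs nor MARS; it is a minimal, hypothetical example in plain Python. It assumes a binary covariate C that shifts both the carrier frequency of a binary genotype G (0.7 versus 0.3) and the trait Y (by 2 units), while G has no effect on Y. All names and parameter values are assumptions chosen for illustration. Under these assumptions the crude carrier versus non-carrier difference in mean trait is 2 × (0.7 − 0.3) = 0.8 in expectation, whereas the difference within levels of C is zero in expectation.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Simulate (genotype, covariate, trait) rows under pure confounding.

    Hypothetical model: the covariate C drives both the genotype
    frequency and the trait; the genotype G has no effect on the trait.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0          # binary covariate
        p_carrier = 0.7 if c == 1 else 0.3          # G is correlated with C
        g = 1 if rng.random() < p_carrier else 0    # binary genotype
        y = 2.0 * c + rng.gauss(0.0, 1.0)           # trait depends on C only
        rows.append((g, c, y))
    return rows


def mean_diff(rows):
    """Mean trait in carriers (g=1) minus mean trait in non-carriers (g=0)."""
    carriers = [y for g, _, y in rows if g == 1]
    non_carriers = [y for g, _, y in rows if g == 0]
    return mean(carriers) - mean(non_carriers)


rows = simulate()

# Crude comparison ignores the covariate and picks up its effect.
crude = mean_diff(rows)

# Stratified comparison: compute the difference within each covariate
# level, then average the two stratum-specific differences.
adjusted = mean(mean_diff([r for r in rows if r[1] == level]) for level in (0, 1))

print(f"crude difference:    {crude:.3f}")
print(f"adjusted difference: {adjusted:.3f}")
```

A variable-selection procedure that sees G but not C would likely rank G as predictive in data like these, which is one way to read the recommendation to consider the gene-covariate-trait structure before applying an ML algorithm.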