
Feature Selection with Interactions in Logistic Regression Models Using Multivariate Synergies for a GWAS Application

Overview
Journal BMC Genomics
Publisher BioMed Central
Specialty Genetics
Date 2018 Mar 29
PMID 29589561
Citations 4
Abstract

Background: Genotype-phenotype association is a long-standing problem in bioinformatics. Identifying both the marginal and the epistatic effects of genetic markers, such as Single Nucleotide Polymorphisms (SNPs), is widely used in Genome-Wide Association Studies (GWAS) to help derive "causal" genetic risk factors and their interactions, which play critical roles in life and disease systems. Identifying "synergistic" interactions with respect to the outcome of interest can improve phenotypic prediction and help explain the underlying mechanisms of system behavior. Many statistical measures for estimating synergistic interactions have been proposed in the literature. However, beyond reports of empirical performance, there is still no theoretical analysis of the power and limitations of these measures.

Results: This paper shows that the existing information-theoretic multivariate synergy depends on only a small subset of the interaction parameters in the model, sometimes on a single interaction parameter. In addition, an adjusted version of multivariate synergy is proposed as a new measure for estimating interactive effects, and experiments on both simulated data sets and a real-world GWAS data set demonstrate its effectiveness.
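
To make the notion of synergy concrete, the sketch below estimates a two-variable synergy from samples. The abstract does not state the formula, so this assumes the commonly used bivariate form Syn(X1, X2; Y) = I(X1, X2; Y) - I(X1; Y) - I(X2; Y); it is neither the paper's general multivariate definition nor its adjusted measure. The function names `mutual_information` and `bivariate_synergy` are hypothetical, chosen for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    # Sum over observed pairs of p(x, y) * log2( p(x, y) / (p(x) p(y)) ).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def bivariate_synergy(x1, x2, y):
    """Assumed bivariate form: I(X1, X2; Y) - I(X1; Y) - I(X2; Y)."""
    pair = list(zip(x1, x2))  # treat (X1, X2) as a single joint variable
    return (
        mutual_information(pair, y)
        - mutual_information(x1, y)
        - mutual_information(x2, y)
    )


# Toy example: Y is the XOR of two binary markers, so neither marker is
# informative on its own, but together they determine Y completely.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]
print(bivariate_synergy(x1, x2, y))
```

Under this assumed definition, a positive value means the pair carries more information about the outcome than the two markers do individually, while a negative value indicates redundancy (for example, when both markers are copies of the outcome).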

Conclusions: We provide rigorous theoretical analysis and empirical evidence explaining why the information-theoretic multivariate synergy helps identify genetic risk factors via synergistic interactions. We further establish a rigorous sample complexity analysis for detecting interactive effects, confirmed on both simulated and real-world data sets.

Citing Articles

Artificial Intelligence for Antimicrobial Resistance Prediction: Challenges and Opportunities towards Practical Implementation.

Ali T, Ahmed S, Aslam M. Antibiotics (Basel). 2023; 12(3).

PMID: 36978390 PMC: 10044311. DOI: 10.3390/antibiotics12030523.


Machine Learning for Antimicrobial Resistance Prediction: Current Practice, Limitations, and Clinical Perspective.

Kim J, Maguire F, Tsang K, Gouliouris T, Peacock S, McAllister T. Clin Microbiol Rev. 2022; 35(3):e0017921.

PMID: 35612324 PMC: 9491192. DOI: 10.1128/cmr.00179-21.


Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework.

Zhang Y, Xie R, Wang J, Leier A, Marquez-Lago T, Akutsu T. Brief Bioinform. 2018; 20(6):2185-2199.

PMID: 30351377 PMC: 6954445. DOI: 10.1093/bib/bby079.


Selected research articles from the 2017 International Workshop on Computational Network Biology: Modeling, Analysis, and Control (CNB-MAC).

Yoon B, Qian X, Kahveci T, Pal R. BMC Bioinformatics. 2018; 19(Suppl 3):69.

PMID: 29589557 PMC: 5872518. DOI: 10.1186/s12859-018-2058-9.
