Clique-Based Clustering of Correlated SNPs in a Gene Can Improve Performance of Gene-Based Multi-Bin Linear Combination Test

Overview
Journal Biomed Res Int
Publisher Wiley
Date 2015 Sep 9
PMID 26346579
Citations 2
Abstract

Gene-based analysis of multiple single nucleotide polymorphisms (SNPs) in a gene region is an alternative to single-SNP analysis. The multi-bin linear combination (MLC) test proposed in previous studies uses the correlation among SNPs within a gene to construct a gene-based global test. SNPs are partitioned into clusters of highly correlated SNPs, and the MLC test statistic quadratically combines the linear combination statistics constructed for each cluster. The test has degrees of freedom equal to the number of clusters and can be more powerful than a fully quadratic or fully linear test statistic. In this study, we develop a new SNP clustering algorithm, CLQ, designed to find cliques, which are complete subnetworks of SNPs in which all pairwise correlations exceed a threshold. We evaluate the performance of the MLC test using the clique-based CLQ algorithm versus the tag-SNP-based LDSelect algorithm. In our numerical power calculations, the two clustering algorithms produced identical clusters about 40% to 60% of the time, yielding similar power on average. However, because the CLQ algorithm tends to produce smaller clusters with stronger positive correlation, the MLC test is less likely to be affected by opposing signs in the individual SNP effect coefficients.
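The two ideas in the abstract, clique-based clustering and a quadratic combination of per-cluster linear statistics, can be illustrated with a minimal sketch. This is not the published CLQ algorithm or the published MLC weighting. It assumes a greedy partition that repeatedly removes one largest clique, an equal-weight sum of z-scores within each cluster, and that the SNP correlation matrix is also the correlation of the z-scores under the null. The function names, the threshold of 0.5, and the toy data are all hypothetical.

```python
def largest_clique(adj, nodes):
    """Return one largest clique within `nodes` (basic Bron-Kerbosch search)."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:          # r is a maximal clique
            if len(r) > len(best):
                best = sorted(r)
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nodes), set())
    return best


def clique_partition(corr, threshold=0.5):
    """Greedy partition: repeatedly peel off a largest clique (assumed heuristic)."""
    n = len(corr)
    # Edge between SNPs i and j when their correlation exceeds the threshold.
    adj = {i: {j for j in range(n) if j != i and corr[i][j] > threshold}
           for i in range(n)}
    remaining, clusters = set(range(n)), []
    while remaining:
        clique = largest_clique(adj, remaining)
        clusters.append(clique)
        remaining -= set(clique)
    return clusters


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * w for u, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mlc_statistic(z, corr, clusters):
    """Equal-weight sketch: L' Cov(L)^-1 L with one linear statistic per cluster.

    Under the null this has a chi-square distribution with len(clusters)
    degrees of freedom, assuming z ~ N(0, corr).
    """
    lin = [sum(z[i] for i in c) for c in clusters]
    cov = [[sum(corr[i][j] for i in c for j in d) for d in clusters]
           for c in clusters]
    return sum(l * s for l, s in zip(lin, solve(cov, lin)))


# Toy data (hypothetical): SNP 2 is correlated with SNP 3, but SNPs 0 and 1 are not.
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.85, 0.2],
    [0.8, 0.85, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]
z = [2.0, 1.5, 1.8, -0.5]

clusters = clique_partition(corr, threshold=0.5)
print(clusters)                      # [[0, 1, 2], [3]]
print(mlc_statistic(z, corr, clusters), "on", len(clusters), "df")
```

In the toy data, SNP 3 ends up in its own cluster because it is weakly correlated with SNPs 0 and 1, so its negative z-score does not cancel the positive signals in the first cluster's linear combination.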

Citing Articles

A new haplotype block detection method for dense genome sequencing data based on interval graph modeling of clusters of highly correlated SNPs.

Kim S, Cho C, Kim S, Bull S, Yoo Y. Bioinformatics. 2017; 34(3):388-397.

PMID: 29028986 PMC: 5860363. DOI: 10.1093/bioinformatics/btx609.


Multiple linear combination (MLC) regression tests for common variants adapted to linkage disequilibrium structure.

Yoo Y, Sun L, Poirier J, Paterson A, Bull S. Genet Epidemiol. 2016; 41(2):108-121.

PMID: 27885705 PMC: 5245123. DOI: 10.1002/gepi.22024.
