
Bayesian Hierarchical Mixture Modeling to Assign Copy Number from a Targeted CNV Array

Overview
Journal: Genet Epidemiol
Specialties: Genetics, Public Health
Date: 2011 Jul 20
PMID: 21769931
Citations: 10
Abstract

Accurate assignment of copy number at known copy number variant (CNV) loci is important both for understanding the structural evolution of genomes and for carrying out association studies of copy number with disease. As with calling SNP genotypes, the task can be framed as a clustering problem, but for several reasons assigning copy number is much more challenging. CNV assays have lower signal-to-noise ratios than SNP assays, often display heavy-tailed and asymmetric intensity distributions, contain outlying observations, and may exhibit systematic technical differences among cohorts. In addition, the number of copy-number classes at a CNV in the population may be unknown a priori. Because of these complications, automatic and robust assignment of copy number from array data remains a challenging problem. We have developed a copy number assignment algorithm, CNVCALL, for a targeted CNV array, such as that used by the Wellcome Trust Case Control Consortium's recent CNV association study. We use a Bayesian hierarchical mixture model that robustly identifies both the number of different copy number classes at a specific locus and the relative copy number of each individual in the sample. The approach is fully automated, which is a critical requirement when analyzing large numbers of CNVs. We illustrate the method's performance using simulated data and real data from the Wellcome Trust Case Control Consortium's CNV association study.
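
To make the clustering framing concrete, the following is a minimal sketch of copy number calling as a one-dimensional mixture problem. It is not CNVCALL and not the authors' Bayesian hierarchical model: as a deliberately simpler stand-in it fits plain Gaussian mixtures by EM and picks the number of classes with BIC, and it ignores heavy tails, asymmetry, outliers, and cohort effects. All function names, the simulated intensities, and the default settings are assumptions made for illustration only.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_densities(x, weights, means, variances):
    """Weighted component densities at x, one entry per component."""
    return [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]


def fit_mixture(xs, k, n_iter=100, min_var=1e-4):
    """Fit a k-component 1-D Gaussian mixture by EM (hypothetical helper)."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic start: component means at evenly spaced sample quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [max(grand_var / (k * k), min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each observation.
        resp = []
        for x in xs:
            dens = mixture_densities(x, weights, means, variances)
            total = sum(dens) + 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
    loglik = sum(
        math.log(sum(mixture_densities(x, weights, means, variances)) + 1e-300)
        for x in xs
    )
    return weights, means, variances, loglik


def call_copy_number(xs, max_k=4):
    """Choose the number of classes by BIC and label each sample.

    Labels are ranks of the component means (0 = lowest intensity), i.e. a
    relative copy number class, not an absolute copy number.
    """
    n = len(xs)
    best = None
    for k in range(1, max_k + 1):
        weights, means, variances, loglik = fit_mixture(xs, k)
        # A k-component mixture has k means, k variances, and k - 1 free weights.
        bic = -2.0 * loglik + (3 * k - 1) * math.log(n)
        if best is None or bic < best[0]:
            best = (bic, k, weights, means, variances)
    _, k, weights, means, variances = best
    order = sorted(range(k), key=lambda j: means[j])
    rank = {j: r for r, j in enumerate(order)}
    labels = []
    for x in xs:
        dens = mixture_densities(x, weights, means, variances)
        labels.append(rank[max(range(k), key=lambda j: dens[j])])
    return k, labels


def simulate_intensities(sizes, centres, sd, seed=0):
    """Simulated (not real) intensities: one Gaussian cluster per class."""
    rng = random.Random(seed)
    return [rng.gauss(c, sd) for size, c in zip(sizes, centres) for _ in range(size)]


if __name__ == "__main__":
    xs = simulate_intensities([90, 150, 60], [0.0, 1.0, 2.0], 0.1)
    k, labels = call_copy_number(xs)
    print("classes chosen:", k)
    print("samples per class:", [labels.count(c) for c in range(k)])
```

On clean, well-separated simulated clusters this kind of model selection is easy. The difficulties listed in the abstract (low signal-to-noise, heavy tails, outliers, cohort differences) are exactly the cases where a plain Gaussian mixture with BIC tends to split or merge classes, which is the motivation given for a more robust hierarchical model.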

Citing Articles

Bayesian copy number detection and association in large-scale studies.

Cristiano S, McKean D, Carey J, Bracci P, Brennan P, Chou M. BMC Cancer. 2020; 20(1):856.

PMID: 32894098 PMC: 7487704. DOI: 10.1186/s12885-020-07304-3.


Genome-Wide Association of Copy Number Polymorphisms and Kidney Function.

Li M, Carey J, Cristiano S, Susztak K, Coresh J, Boerwinkle E. PLoS One. 2017; 12(1):e0170815.

PMID: 28135296 PMC: 5279752. DOI: 10.1371/journal.pone.0170815.


Whole exome association of rare deletions in multiplex oral cleft families.

Fu J, Beaty T, Scott A, Hetmanski J, Parker M, Wilson J. Genet Epidemiol. 2016; 41(1):61-69.

PMID: 27910131 PMC: 5154821. DOI: 10.1002/gepi.22010.


Genome-wide association study of copy number variation with lung function identifies a novel signal of association near BANP for forced vital capacity.

Shrine N, Tobin M, Schurmann C, Artigas M, Hui J, Lehtimaki T. BMC Genet. 2016; 17(1):116.

PMID: 27514831 PMC: 4981989. DOI: 10.1186/s12863-016-0423-0.


Association analysis of copy numbers of FC-gamma receptor genes for rheumatoid arthritis and other immune-mediated phenotypes.

Franke L, El Bannoudi H, Jansen D, Kok K, Trynka G, Diogo D. Eur J Hum Genet. 2015; 24(2):263-70.

PMID: 25966632 PMC: 4717214. DOI: 10.1038/ejhg.2015.95.

