Correcting Coalescent Analyses for Panel-based SNP Ascertainment
Single-nucleotide polymorphism (SNP) data are routinely obtained by sequencing a region of interest in a small panel, constructing a chip with probes specific to sites found to vary in the panel, and using the chip to assay subsequent samples. The size of the chip is often reduced by removing low-frequency alleles from the set of SNPs. Using coalescent estimation of the scaled population size parameter, Θ, as a test case, we demonstrate the loss of information inherent in this procedure and develop corrections for coalescent analysis of SNPs obtained via a panel. We show that more accurate estimates of Θ can be recovered if the panel size is known, but at considerable computational cost, because the panel individuals must be explicitly modeled in the analysis. We extend this technique to the case where rare alleles have been omitted from the SNP panel. We find that when appropriate corrections for panel ascertainment and rare-allele omission are used, the biases introduced by ascertainment are largely correctable, but the recovered estimates are less accurate than those obtained from fully sequenced data. The method is then applied to recombinant, multiple-population data to investigate the effects of recombination and migration on the estimate of Θ.
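The ascertainment effect described above can be illustrated with a small simulation. The sketch below is only a toy, not the likelihood-based coalescent correction the abstract refers to: it assumes a single non-recombining locus, a constant-size population, and infinite-sites mutation, and it substitutes Watterson's estimator of Θ for the full coalescent analysis. All function and parameter names (`simulate_mutations`, `compare_estimates`, `min_panel_count`) are hypothetical and chosen for illustration. It compares the estimate from every SNP segregating in the sample with the estimate from only those SNPs that also vary in a small panel, optionally requiring a minimum minor-allele count in the panel to mimic rare-allele omission.

```python
import random


def simulate_mutations(n_leaves, theta, rng):
    """Drop infinite-sites mutations on a simulated Kingman coalescent tree.

    Returns a list of frozensets, one per mutation, holding the leaves that
    carry the derived allele. Time is scaled so that mutations fall on each
    lineage at rate theta / 2 (theta must be positive).
    """
    lineages = [frozenset([i]) for i in range(n_leaves)]
    mutations = []
    while len(lineages) > 1:
        k = len(lineages)
        t = rng.expovariate(k * (k - 1) / 2.0)  # time to next coalescence
        for lineage in lineages:
            # Poisson number of mutations on this branch segment, drawn by
            # counting exponential arrivals that land before time t.
            elapsed = rng.expovariate(theta / 2.0)
            while elapsed < t:
                mutations.append(lineage)
                elapsed += rng.expovariate(theta / 2.0)
        i, j = rng.sample(range(k), 2)  # choose two lineages to merge
        merged = lineages[i] | lineages[j]
        lineages = [x for idx, x in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    return mutations


def watterson(num_segregating, n):
    """Watterson's estimator: segregating sites over the harmonic sum."""
    return num_segregating / sum(1.0 / i for i in range(1, n))


def compare_estimates(theta, n_panel, n_sample, min_panel_count=1,
                      reps=200, seed=1):
    """Return (mean full-data estimate, mean uncorrected ascertained estimate).

    Leaves 0..n_panel-1 form the panel; the remaining leaves form the sample.
    A SNP is ascertained when its minor-allele count in the panel is at least
    min_panel_count (1 = merely variable in the panel; 2 or more mimics
    dropping rare alleles from the chip).
    """
    rng = random.Random(seed)
    panel = frozenset(range(n_panel))
    full_total = 0.0
    asc_total = 0.0
    for _ in range(reps):
        seg_full = 0
        seg_asc = 0
        for carriers in simulate_mutations(n_panel + n_sample, theta, rng):
            in_panel = len(carriers & panel)
            in_sample = len(carriers) - in_panel
            if not 0 < in_sample < n_sample:
                continue  # site is monomorphic in the sample
            seg_full += 1
            if min(in_panel, n_panel - in_panel) >= min_panel_count:
                seg_asc += 1
        full_total += watterson(seg_full, n_sample)
        asc_total += watterson(seg_asc, n_sample)
    return full_total / reps, asc_total / reps


if __name__ == "__main__":
    full, ascertained = compare_estimates(theta=10.0, n_panel=4, n_sample=20)
    print("full data:", full, "ascertained, uncorrected:", ascertained)
```

Because ascertained SNPs are a subset of the SNPs segregating in the sample, the uncorrected estimate can never exceed the full-data estimate in this toy setting, and raising `min_panel_count` can only shrink it further. A correction of the kind the abstract describes would instead condition the analysis on the ascertainment scheme, which this sketch does not attempt.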