
Detecting Selection in Low-coverage High-throughput Sequencing Data Using Principal Component Analysis

Overview
Publisher Biomed Central
Specialty Biology
Date 2021 Sep 30
PMID 34587903
Citations 6
Abstract

Background: Identification of selection signatures between populations is often an important part of a population genetic study. With high-throughput DNA sequencing, larger sample sizes of populations with similar ancestries have become increasingly common. This has led to the need for methods capable of identifying signals of selection in populations with a continuous cline of genetic differentiation. Individuals from continuous populations are inherently challenging to group into meaningful units, which is why existing methods rely on principal component analysis for inference of the selection signals. These existing methods require called genotypes as input, which is problematic for studies based on low-coverage sequencing data.

Materials And Methods: We have extended two selection statistics based on principal component analysis to genotype likelihood data. We applied them to low-coverage sequencing data from the 1000 Genomes Project, using populations of European and East Asian ancestry, to detect signals of selection in samples with continuous population structure.
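To illustrate what working from genotype likelihoods rather than called genotypes can look like, the following minimal sketch computes a posterior expected genotype (a dosage between 0 and 2) for one individual at one site. It assumes a Hardy-Weinberg prior built from a known alternate-allele frequency; the function name `expected_genotype` and its parameters are hypothetical and are not taken from the paper or from PCAngsd.

```python
def expected_genotype(likelihoods, freq):
    """Posterior mean genotype (0 to 2 copies of the alternate allele).

    likelihoods: (L0, L1, L2), i.e. P(reads | genotype = 0, 1, 2)
    freq: assumed population frequency of the alternate allele
    """
    # Assumption: Hardy-Weinberg prior on the genotype. With population
    # structure, an individual-specific frequency could be used instead.
    prior = ((1 - freq) ** 2, 2 * freq * (1 - freq), freq ** 2)

    # Bayes' rule: posterior is proportional to likelihood times prior.
    weights = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(weights)
    if total == 0:
        raise ValueError("likelihoods and prior have no overlapping support")
    posterior = [w / total for w in weights]

    # Expected number of alternate alleles: 0*P(0) + 1*P(1) + 2*P(2).
    return posterior[1] + 2 * posterior[2]
```

With uninformative likelihoods, as at a site with no reads, the dosage falls back to the prior mean of twice the allele frequency, so uncertain data are shrunk towards the population average rather than forced into a hard genotype call.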

Results: Here, we present two selection statistics which we have implemented in the PCAngsd framework. These methods account for genotype uncertainty, opening up the opportunity to conduct selection scans in continuous populations from low and/or variable coverage sequencing data. To illustrate their use, we applied the methods to low-coverage sequencing data from human populations of East Asian and European ancestries. We show that the implemented selection statistics can control the false positive rate and that they identify the same signatures of selection from low-coverage sequencing data as state-of-the-art software using high quality called genotypes.
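As a rough illustration of a selection scan based on principal components, the sketch below standardizes a matrix of expected genotypes, takes its singular value decomposition, and scales the squared site loadings by the number of sites. Under neutrality each scaled value is then approximately chi-square distributed with one degree of freedom. The abstract does not specify the form of the two statistics, so this should be read as one generic statistic of this family under stated assumptions, not as the PCAngsd implementation; `pc_selection_scan` and its arguments are hypothetical names.

```python
import numpy as np


def pc_selection_scan(dosages, n_components=2):
    """Per-site, per-component selection statistics from expected genotypes.

    dosages: array of shape (n_sites, n_individuals), values in [0, 2]
    Returns an array of shape (n_sites, n_components).
    """
    g = np.asarray(dosages, dtype=float)
    n_sites = g.shape[0]

    # Allele frequency per site, estimated from the mean dosage.
    freq = g.mean(axis=1) / 2.0
    scale = np.sqrt(2.0 * freq * (1.0 - freq))
    if np.any(scale == 0):
        raise ValueError("remove monomorphic sites before scanning")

    # Center and scale each site (assumption: binomial variance scaling).
    y = (g - 2.0 * freq[:, None]) / scale[:, None]

    # Left singular vectors hold the site loadings on each component.
    u, _, _ = np.linalg.svd(y, full_matrices=False)

    # Each column of u has unit norm, so a loading has variance of about
    # 1 / n_sites under the null; scaling by n_sites gives values that
    # are roughly chi-square(1) distributed.
    return n_sites * u[:, :n_components] ** 2
```

Sites with unusually large values on a component are candidates for selection along that axis of population structure. In practice the input would be expected genotypes derived from genotype likelihoods, and the values would be converted to p-values against a chi-square distribution with one degree of freedom.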

Conclusion: We show that selection scans of low-coverage sequencing data from populations with similar ancestry perform on par with those obtained from high quality genotype data. Moreover, we demonstrate that PCAngsd outperforms selection statistics obtained from called genotypes from low-coverage sequencing data, without the need for ad-hoc filtering.

Citing Articles

Genetic and morphological shifts associated with climate change in a migratory bird.

Adams N, Dias T, Skeen H, Pegan T, Willard D, Winger B BMC Biol. 2025; 23(1):3.

PMID: 39773181 PMC: 11705884. DOI: 10.1186/s12915-024-02107-5.


Impact of putatively beneficial genomic loci on gene expression in little brown bats (Myotis lucifugus, Le Conte, 1831) affected by white-nose syndrome.

Kwait R, Pinsky M, Gignoux-Wolfsohn S, Eskew E, Kerwin K, Maslo B Evol Appl. 2024; 17(9):e13748.

PMID: 39310794 PMC: 11413065. DOI: 10.1111/eva.13748.


Unravelling reference bias in ancient DNA datasets.

Dolenz S, van der Valk T, Jin C, Oppenheimer J, Sharif M, Orlando L Bioinformatics. 2024; 40(7).

PMID: 38960861 PMC: 11254355. DOI: 10.1093/bioinformatics/btae436.


Variable parallelism in the genomic basis of age at maturity across spatial scales in Atlantic Salmon.

Kess T, Lehnert S, Bentzen P, Duffy S, Messmer A, Dempson J Ecol Evol. 2024; 14(4):e11068.

PMID: 38584771 PMC: 10995719. DOI: 10.1002/ece3.11068.


Panmixia in the American eel extends to its tropical range of distribution: Biological implications and policymaking challenges.

Ulmo-Diaz G, Engman A, McLarney W, Lasso Alcala C, Hendrickson D, Bezault E Evol Appl. 2023; 16(12):1872-1888.

PMID: 38143897 PMC: 10739100. DOI: 10.1111/eva.13599.

