
Learning the Properties of Adaptive Regions with Functional Data Analysis

Overview
Journal PLoS Genet
Specialty Genetics
Date 2020 Aug 28
PMID 32853200
Citations 12
Abstract

Identifying regions of positive selection in genomic data remains a challenge in population genetics. Most current approaches rely on comparing values of summary statistics calculated in windows. We present an approach termed SURFDAWave, which translates measures of genetic diversity calculated in genomic windows to functional data. By transforming our discrete data points to be outputs of continuous functions defined over genomic space, we are able to learn the features of these functions that signify selection. This enables us to confidently identify complex modes of natural selection, including adaptive introgression. We are also able to predict important selection parameters that are responsible for shaping the inferred selection events. By applying our model to human population-genomic data, we recapitulate previously identified regions of selective sweeps, such as OCA2 in Europeans, and predict that its beneficial mutation reached a frequency of 0.02 before it swept 1,802 generations ago, a time when humans were relatively new to Europe. In addition, we identify BNC2 in Europeans as a target of adaptive introgression, and predict that it harbors a beneficial mutation that arose in an archaic human population that split from modern humans within the hypothesized modern human-Neanderthal divergence range.
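The idea of turning windowed summary statistics into functional data can be illustrated with a minimal sketch. It is not the SURFDAWave implementation: the cosine basis, the number of basis functions, the function names, and the toy diversity values are all assumptions made for illustration. A statistic computed in genomic windows is fitted by least squares as a linear combination of smooth basis functions, and the fitted coefficients then serve as features that a downstream classifier or regression model could learn from.

```python
import numpy as np

def cosine_basis(positions, n_basis):
    """Evaluate cos(pi * k * x), k = 0..n_basis-1, with positions rescaled to [0, 1]."""
    positions = np.asarray(positions, dtype=float)
    x = (positions - positions.min()) / (positions.max() - positions.min())
    return np.cos(np.pi * np.outer(x, np.arange(n_basis)))

def functional_features(positions, values, n_basis=8):
    """Least-squares coefficients of the windowed statistic in the cosine basis."""
    basis = cosine_basis(positions, n_basis)
    coef, *_ = np.linalg.lstsq(basis, np.asarray(values, dtype=float), rcond=None)
    return coef

def evaluate_curve(coef, positions):
    """Evaluate the fitted function at the positions that were used for fitting."""
    return cosine_basis(positions, len(coef)) @ coef

# Toy data (hypothetical): 21 window centers across 1 Mb, with a dip in
# diversity around the middle of the region, as might be seen near a sweep.
window_centers = np.linspace(0, 1_000_000, 21)
diversity = 1.0 - 0.8 * np.exp(-((window_centers - 500_000) / 150_000) ** 2)

features = functional_features(window_centers, diversity)  # one feature vector per region
smooth = evaluate_curve(features, window_centers)          # fitted curve at the window centers
```

Here `features` summarizes the spatial shape of the statistic across the region rather than its value in any single window, which is the property the abstract describes as learning the features of functions defined over genomic space.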

Citing Articles

Digital Image Processing to Detect Adaptive Evolution.

Amin M, Hasan M, DeGiorgio M Mol Biol Evol. 2024; 41(12).

PMID: 39565932 PMC: 11631197. DOI: 10.1093/molbev/msae242.


Tree Sequences as a General-Purpose Tool for Population Genetic Inference.

Whitehouse L, Ray D, Schrider D Mol Biol Evol. 2024; 41(11).

PMID: 39460991 PMC: 11600592. DOI: 10.1093/molbev/msae223.


Tree sequences as a general-purpose tool for population genetic inference.

Whitehouse L, Ray D, Schrider D bioRxiv. 2024.

PMID: 39185244 PMC: 11343121. DOI: 10.1101/2024.02.20.581288.


Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data.

Amin M, Hasan M, Arnab S, DeGiorgio M Mol Biol Evol. 2023; 40(10).

PMID: 37772983 PMC: 10581699. DOI: 10.1093/molbev/msad216.


Assessing the Presence of Recent Adaptation in the Human Genome With Mixture Density Regression.

Salazar-Tortosa D, Huang Y, Enard D Genome Biol Evol. 2023; 15(10).

PMID: 37713622 PMC: 10563788. DOI: 10.1093/gbe/evad170.

