
Uncovering Footprints of Natural Selection Through Spectral Analysis of Genomic Summary Statistics

Overview
Journal Mol Biol Evol
Specialty Biology
Date 2023 Jul 11
PMID 37433019
Abstract

Natural selection leaves a spatial pattern along the genome, with a haplotype distribution distortion near the selected locus that fades with distance. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows for patterns of natural selection to be distinguished from neutrality. Considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that consider genomic spatial distributions across summary statistics, utilizing both classical machine learning and deep learning architectures. However, better predictions may be attainable by improving the way in which features are extracted from these summary statistics. We apply wavelet transform, multitaper spectral analysis, and S-transform to summary statistic arrays to achieve this goal. Each analysis method converts one-dimensional summary statistic arrays to two-dimensional images of spectral analysis, allowing simultaneous temporal and spectral assessment. We feed these images into convolutional neural networks and consider combining models using ensemble stacking. Our modeling framework achieves high accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets of varying sweep strength, softness, and timing. A scan of central European whole-genome sequences recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing genomic segments, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data.
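The step in which a one-dimensional summary statistic array becomes a two-dimensional image can be illustrated with a continuous wavelet transform. The following is a minimal sketch, not the authors' implementation: it assumes a complex Morlet wavelet applied by direct convolution, and the function name `morlet_scalogram`, the parameter `w0`, the chosen scales, and the toy input array are all hypothetical choices made for illustration. Rows of the output correspond to scales (spectral axis) and columns to genomic windows (positional axis).

```python
import numpy as np

def morlet_scalogram(signal, scales, w0=6.0):
    """Magnitude of a Morlet continuous wavelet transform.

    signal : 1D array, e.g. one summary statistic computed in consecutive windows.
    scales : iterable of positive wavelet scales, in units of windows.
    Returns an array of shape (len(scales), len(signal)).
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the constant offset
    n = x.size
    image = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        half = int(np.ceil(4 * s))        # truncate the wavelet at +/- 4 scale units
        t = np.arange(-half, half + 1) / s
        wavelet = np.pi ** -0.25 * np.exp(1j * w0 * t - 0.5 * t ** 2) / np.sqrt(s)
        # Correlating with the conjugate wavelet equals convolving with its reversal.
        full = np.convolve(x, np.conj(wavelet)[::-1], mode="full")
        image[i] = np.abs(full[half:half + n])   # keep the part aligned with the input
    return image

# Toy input (illustrative only): a flat statistic with a localized dip in the
# middle, loosely mimicking reduced diversity around a selected locus.
positions = np.arange(128)
toy_statistic = 1.0 - 0.8 * np.exp(-0.5 * ((positions - 64) / 6.0) ** 2)
toy_image = morlet_scalogram(toy_statistic, scales=[2, 4, 8, 16])
```

An image like `toy_image` is the kind of two-dimensional input that a convolutional neural network can consume. Multitaper spectral analysis or the S-transform would yield analogous position-by-frequency arrays by different means.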

Citing Articles

Sweeps in space: leveraging geographic data to identify beneficial alleles in .

Rehmann C, Small S, Ralph P, Kern A bioRxiv. 2025; .

PMID: 39975147 PMC: 11839090. DOI: 10.1101/2025.02.07.637123.


iHDSel software: The price equation and the population stability index to detect genomic patterns compatible with selective sweeps. An example with SARS-CoV-2.

Carvajal-Rodriguez A Biol Methods Protoc. 2024; 9(1):bpae089.

PMID: 39679303 PMC: 11646571. DOI: 10.1093/biomethods/bpae089.


Digital Image Processing to Detect Adaptive Evolution.

Amin M, Hasan M, DeGiorgio M Mol Biol Evol. 2024; 41(12).

PMID: 39565932 PMC: 11631197. DOI: 10.1093/molbev/msae242.


Tree Sequences as a General-Purpose Tool for Population Genetic Inference.

Whitehouse L, Ray D, Schrider D Mol Biol Evol. 2024; 41(11).

PMID: 39460991 PMC: 11600592. DOI: 10.1093/molbev/msae223.


Tree sequences as a general-purpose tool for population genetic inference.

Whitehouse L, Ray D, Schrider D bioRxiv. 2024; .

PMID: 39185244 PMC: 11343121. DOI: 10.1101/2024.02.20.581288.

