
A Population Genetic Hidden Markov Model for Detecting Genomic Regions Under Selection

Overview
Journal Mol Biol Evol
Specialty Biology
Date 2010 Feb 27
PMID 20185453
Citations 9
Abstract

Recently, hidden Markov models have been applied to numerous problems in genomics. Here, we introduce an explicit population genetics hidden Markov model (popGenHMM) that uses single nucleotide polymorphism (SNP) frequency data to identify genomic regions that have experienced recent selection. Our popGenHMM assumes that SNP frequencies are emitted independently following diffusion approximation expectations but that neighboring SNP frequencies are partially correlated by selective state. We give results from the training and application of our popGenHMM to a set of early release data from the Drosophila Population Genomics Project (dpgp.org) that consists of approximately 7.8 Mb of resequencing from 32 North American Drosophila melanogaster lines. These results demonstrate the potential utility of our model, making predictions based on the site frequency spectrum (SFS) for regions of the genome that represent selected elements.
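The structure described in the abstract (hidden selective states, SNP frequencies emitted independently given the state, neighboring sites coupled only through the state) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not something the abstract states: the two-state layout (neutral vs. selected), the scaled selection coefficient `gamma`, the `stay` transition probability, the uniform initial distribution, and the use of the standard Poisson random field stationary density as the diffusion-approximation emission. The function names `sfs_emission` and `viterbi` are hypothetical and do not correspond to the authors' software.

```python
import math


def sfs_emission(n, gamma, grid=2000):
    """Probability of each derived-allele count i = 1..n-1 in a sample of n
    chromosomes, conditional on the site being polymorphic in the sample.

    Assumption: the population frequency density is proportional to
        (1 - exp(-2*gamma*(1 - x))) / ((1 - exp(-2*gamma)) * x * (1 - x)),
    which reduces to 1/x when gamma = 0 (neutrality).
    """
    probs = []
    for i in range(1, n):
        total = 0.0
        for k in range(grid):
            x = (k + 0.5) / grid  # midpoint rule on (0, 1)
            if abs(gamma) < 1e-8:
                density = 1.0 / x
            else:
                density = (1.0 - math.exp(-2.0 * gamma * (1.0 - x))) / (
                    (1.0 - math.exp(-2.0 * gamma)) * x * (1.0 - x)
                )
            # Binomial sampling of i derived alleles out of n
            total += math.comb(n, i) * x**i * (1.0 - x) ** (n - i) * density
        probs.append(total / grid)
    norm = sum(probs)
    return [p / norm for p in probs]


def viterbi(obs, emissions, stay=0.95):
    """Most likely state path for a sequence of derived-allele counts.

    obs       : list of counts, each in 1..n-1
    emissions : one probability list per state, as returned by sfs_emission
    stay      : assumed probability of remaining in the same state
    """
    n_states = len(emissions)
    switch = (1.0 - stay) / (n_states - 1)
    log_trans = [
        [math.log(stay if a == b else switch) for b in range(n_states)]
        for a in range(n_states)
    ]
    # Uniform initial distribution (an assumption)
    score = [
        math.log(1.0 / n_states) + math.log(emissions[s][obs[0] - 1])
        for s in range(n_states)
    ]
    back = []
    for count in obs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(
                range(n_states), key=lambda a: score[a] + log_trans[a][s]
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev]
                + log_trans[best_prev][s]
                + math.log(emissions[s][count - 1])
            )
        score = new_score
        back.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

As a usage example, `viterbi(counts, [sfs_emission(8, 0.0), sfs_emission(8, -5.0)])` labels each SNP in `counts` as 0 (neutral) or 1 (selected) for a sample of 8 chromosomes. With a negative `gamma`, the selected state favors rare derived alleles, so runs of low-frequency SNPs tend to be assigned to it. A trained model would estimate the transition and selection parameters from data rather than fixing them as done here.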

Citing Articles

Identification of natural selection in genomic data with deep convolutional neural network.

Fadja A, Riguzzi F, Bertorelle G, Trucchi E BioData Min. 2021; 14(1):51.

PMID: 34863217 PMC: 8642854. DOI: 10.1186/s13040-021-00280-9.


Detecting Selection from Linked Sites Using an F-Model.

Galimberti M, Leuenberger C, Wolf B, Szilagyi S, Foll M, Wegmann D Genetics. 2020; 216(4):1205-1215.

PMID: 33067324 PMC: 7768260. DOI: 10.1534/genetics.120.303780.


The Unreasonable Effectiveness of Convolutional Neural Networks in Population Genetic Inference.

Flagel L, Brandvain Y, Schrider D Mol Biol Evol. 2018; 36(2):220-238.

PMID: 30517664 PMC: 6367976. DOI: 10.1093/molbev/msy224.


Supervised Machine Learning for Population Genetics: A New Paradigm.

Schrider D, Kern A Trends Genet. 2018; 34(4):301-312.

PMID: 29331490 PMC: 5905713. DOI: 10.1016/j.tig.2017.12.005.


zipHMMlib: a highly optimised HMM library exploiting repetitions in the input to speed up the forward algorithm.

Sand A, Kristiansen M, Pedersen C, Mailund T BMC Bioinformatics. 2013; 14:339.

PMID: 24266924 PMC: 4222747. DOI: 10.1186/1471-2105-14-339.

