
Bayesian Inference of Ancestral Recombination Graphs

Overview
Specialty Biology
Date 2022 Mar 9
PMID 35263345
Abstract

We present a novel algorithm, implemented in the software ARGinfer, for probabilistic inference of the Ancestral Recombination Graph (ARG) under the Coalescent with Recombination. Our Markov Chain Monte Carlo algorithm takes advantage of the Succinct Tree Sequence data structure, which has allowed great advances in simulation and point estimation but has not yet been used for probabilistic inference. Unlike previous methods, which employ the Sequentially Markov Coalescent approximation, ARGinfer uses the Coalescent with Recombination, allowing more accurate inference of key evolutionary parameters. Using simulations, we show that ARGinfer can accurately estimate many properties of the evolutionary history of the sample, including the topology and branch lengths of the genealogical tree at each sequence site, and the times and locations of mutation and recombination events. ARGinfer approximates posterior probability distributions for these and other quantities, providing interpretable assessments of uncertainty that we show to be well calibrated. ARGinfer is currently limited to tens of DNA sequences of several hundred kilobases each, but there is scope for further computational improvements to increase its applicability.
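To make the idea of a "genealogical tree at each sequence site" concrete, the following is a minimal sketch of a tree-sequence-style edge list, in which each edge records the genomic interval over which a child inherits from a parent. It is a toy illustration only: the `Edge` class, the helper functions, and all node times and intervals are hypothetical, and it is neither ARGinfer's code nor the API of any tree sequence library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    left: float   # start of the genomic interval (inclusive)
    right: float  # end of the genomic interval (exclusive)
    parent: int
    child: int


# Hypothetical toy data: samples are nodes 0, 1, 2; ancestors are 3, 4, 5.
NODE_TIME = {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.4, 4: 0.9, 5: 1.5}

# A recombination breakpoint at position 60 changes where node 2 joins the tree.
EDGES = [
    Edge(0, 100, 3, 0),
    Edge(0, 100, 3, 1),
    Edge(0, 60, 4, 2),
    Edge(0, 60, 4, 3),
    Edge(60, 100, 5, 2),
    Edge(60, 100, 5, 3),
]


def local_tree(edges, position):
    """Return a {child: parent} map for the tree covering `position`."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def tmrca(tree, node_time, a, b):
    """Time of the most recent common ancestor of nodes a and b in one local tree."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = tree.get(node)  # None once the root is passed
    node = b
    while node not in ancestors:
        node = tree[node]  # raises KeyError if a and b share no ancestor
    return node_time[node]


if __name__ == "__main__":
    for site in (10, 80):
        tree = local_tree(EDGES, site)
        print(site, tree, tmrca(tree, NODE_TIME, 0, 2))
```

In this toy example the two local trees share the same topology, but the branch above node 2 is longer to the right of the breakpoint. Site-specific tree topologies and branch lengths are among the quantities for which the abstract says ARGinfer approximates posterior distributions.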

Citing Articles

Estimating evolutionary and demographic parameters via ARG-derived IBD.

Huang Z, Kelleher J, Chan Y, Balding D. PLoS Genet. 2025; 21(1):e1011537.

PMID: 39778081 PMC: 11750106. DOI: 10.1371/journal.pgen.1011537.


Tree Sequences as a General-Purpose Tool for Population Genetic Inference.

Whitehouse L, Ray D, Schrider D. Mol Biol Evol. 2024; 41(11).

PMID: 39460991 PMC: 11600592. DOI: 10.1093/molbev/msae223.


Inference and applications of ancestral recombination graphs.

Nielsen R, Vaughn A, Deng Y. Nat Rev Genet. 2024; 26(1):47-58.

PMID: 39349760 DOI: 10.1038/s41576-024-00772-4.


Improved inference of population histories by integrating genomic and epigenomic data.

Sellinger T, Johannes F, Tellier A. Elife. 2024; 12.

PMID: 39264367 PMC: 11392530. DOI: 10.7554/eLife.89470.


Tree sequences as a general-purpose tool for population genetic inference.

Whitehouse L, Ray D, Schrider D. bioRxiv. 2024.

PMID: 39185244 PMC: 11343121. DOI: 10.1101/2024.02.20.581288.

