Biobank-scale Inference of Multi-individual Identity by Descent and Gene Conversion

Overview

Journal: Am J Hum Genet

Publisher: Cell Press

Specialty: Genetics

Date: 2024 Mar 21

PMID: 38513668

Authors

Sharon R Browning

Brian L Browning

Abstract

We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more computationally efficient inference of identity by descent (IBD) than approaches that infer pairwise IBD segments and provides locus-specific IBD clusters rather than IBD segments. Our method's computation time, memory requirements, and output size scale linearly with the number of individuals in the dataset. We also present a method for using multi-individual IBD to detect alleles changed by gene conversion. Application of our methods to the autosomal sequence data for 125,361 White British individuals in the UK Biobank detects more than 9 million converted alleles. This is 2,900 times more alleles changed by gene conversion than were detected in a previous analysis of familial data. We estimate that more than 250,000 sequenced probands and a much larger number of additional genomes from multi-generational family members would be required to find a similar number of alleles changed by gene conversion using a family-based approach. Our IBD clustering method is implemented in the open-source ibd-cluster software package.
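The scaling claim can be illustrated with a small sketch. It is not the ibd-cluster algorithm or its output format. It only assumes that, at a single locus, each haplotype carries one cluster label, and it counts how many haplotype pairs a pairwise representation would need to report for the same sharing. The function names and the example labels are hypothetical.

```python
from collections import Counter
from math import comb


def cluster_output_size(cluster_ids):
    """One cluster label per haplotype, so output grows linearly with sample size."""
    return len(cluster_ids)


def implied_pairwise_count(cluster_ids):
    """Number of haplotype pairs that share a cluster at this locus."""
    sizes = Counter(cluster_ids).values()
    # A cluster of k haplotypes implies k * (k - 1) / 2 IBD pairs.
    return sum(comb(k, 2) for k in sizes)


# Hypothetical locus with 8 haplotypes; the labels are made up for illustration.
locus_clusters = [0, 0, 0, 0, 1, 1, 2, 3]

print(cluster_output_size(locus_clusters))     # 8 labels
print(implied_pairwise_count(locus_clusters))  # 6 + 1 + 0 + 0 = 7 pairs
```

In this toy locus the two counts are close, but a single cluster of 1,000 haplotypes would need 1,000 labels versus 499,500 pairs, which is the intuition behind reporting locus-specific clusters rather than pairwise segments.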

Citing Articles

Mean gene conversion tract length in humans estimated to be 459 bp from UK Biobank sequence data.

Masaki N, Browning S. bioRxiv. 2025.

PMID: 39868294 PMC: 11761487. DOI: 10.1101/2024.12.30.630818.

Complete human recombination maps.

Palsson G, Hardarson M, Jonsson H, Steinthorsdottir V, Stefansson O, Eggertsson H. Nature. 2025.

PMID: 39843742 DOI: 10.1038/s41586-024-08450-5.

Fast simulation of identity-by-descent segments.

Temple S, Browning S, Thompson E. bioRxiv. 2025.

PMID: 39829821 PMC: 11741331. DOI: 10.1101/2024.12.13.628449.

Identity-by-descent segments in large samples.

Temple S, Thompson E. bioRxiv. 2024.

PMID: 38895476 PMC: 11185678. DOI: 10.1101/2024.06.05.597656.
