Whole-genome Haplotyping Approaches and Genomic Medicine
Genomic information reported as haplotypes rather than genotypes will be increasingly important for personalized medicine. Current technologies generate diploid sequence data that are rarely resolved into their constituent haplotypes. Furthermore, paradigms for thinking about genomic information are based on interpreting genotypes rather than haplotypes. Nevertheless, haplotypes have historically been useful in contexts ranging from population genetics to disease-gene mapping efforts. The main approaches for phasing genomic sequence data are molecular haplotyping, genetic haplotyping, and population-based inference. Long-read sequencing technologies are yielding longer molecular haplotypes, and decreases in the cost of whole-genome sequencing are making it feasible to sequence whole-chromosome genetic haplotypes. Hybrid approaches that combine high-throughput short-read assembly with strategies for physical or virtual binning of reads into haplotypes allow multi-gene haplotypes to be generated from single individuals. These techniques can be further combined with genetic and population approaches. Here, we review advances in whole-genome haplotyping approaches and discuss the importance of haplotypes for genomic medicine. Clinical applications include diagnosis by recognition of compound heterozygosity and by phasing regulatory variation to coding variation. Haplotypes, which are more specific than less complex variants such as single nucleotide variants, also have applications in prognostics and diagnostics, in the analysis of tumors, and in typing tissue for transplantation. Future advances will include technological innovations, the application of standard metrics for evaluating haplotype quality, and the development of databases that link haplotypes to disease.
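To illustrate why phase matters for recognizing compound heterozygosity, the following is a minimal sketch rather than a method from the review. It assumes VCF-style genotype strings, where `|` marks a phased call and `/` an unphased one, and it assumes both variants are biallelic, heterozygous, and phased within the same phase block. The function name `phase_relation` is hypothetical.

```python
def phase_relation(gt_a: str, gt_b: str) -> str:
    """Classify two heterozygous variants as 'cis', 'trans', or 'unknown'.

    Assumption: genotypes are VCF-style strings such as '0|1' (phased)
    or '0/1' (unphased), and both calls belong to the same phase block.
    """
    # Unphased calls cannot be assigned to a haplotype of origin.
    if "|" not in gt_a or "|" not in gt_b:
        return "unknown"

    hap_a = gt_a.split("|")
    hap_b = gt_b.split("|")

    # This sketch only handles biallelic, diploid, heterozygous calls.
    if sorted(hap_a) != ["0", "1"] or sorted(hap_b) != ["0", "1"]:
        raise ValueError("expected two heterozygous biallelic genotypes")

    # Same allele order: both alternate alleles sit on one haplotype (cis).
    # Opposite order: the alternate alleles sit on different haplotypes (trans).
    return "cis" if hap_a == hap_b else "trans"


if __name__ == "__main__":
    # Two variants in the same gene, each on a different haplotype:
    # consistent with compound heterozygosity.
    print(phase_relation("0|1", "1|0"))
    # Both variants on the same haplotype, leaving one copy intact.
    print(phase_relation("0|1", "0|1"))
    # Genotype-only data: the two cases above are indistinguishable.
    print(phase_relation("0/1", "0/1"))
```

With unphased genotypes the cis and trans configurations look identical, which is the gap that haplotype-resolved data closes. Real data would also require checking that both calls share a phase set before comparing them.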