PMID: 30894395

Resolving the Full Spectrum of Human Genome Variation Using Linked-Reads

Abstract

Large-scale population analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole-genome sequencing. However, these short-read approaches fail to give a complete picture of a genome. They struggle to identify structural events, cannot access repetitive regions, and fail to resolve the human genome into haplotypes. Here, we describe an approach that retains long-range information while maintaining the advantages of short reads. Starting from ∼1 ng of high-molecular-weight DNA, we produce barcoded short-read libraries. Novel informatic approaches allow the barcoded short reads to be associated with their original long molecules, producing a novel data type known as "Linked-Reads". This approach allows for simultaneous detection of small and large variants from a single library. In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. Linked-Reads allow mapping to 38 Mb of sequence not accessible to short reads, adding sequence in 423 difficult-to-sequence genes, including disease-relevant genes. Both Linked-Read whole-genome and whole-exome sequencing identify complex structural variations, including balanced events and single exon deletions and duplications. Further, Linked-Reads extend the region of high-confidence calls by 68.9 Mb. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.

Citing Articles

Advancing the Indian cattle pangenome: characterizing non-reference sequences in Bos indicus.

Azam S, Sahu A, Pandey N, Neupane M, Van Tassell C, Rosen B J Anim Sci Biotechnol. 2025; 16(1):21.

PMID: 39915889 PMC: 11804092. DOI: 10.1186/s40104-024-01133-1.


Assembly of the salt-secreting mangrove Avicennia rumphiana.

Shearman J, Naktang C, Sonthirod C, Kongkachana W, U-Thoomporn S, Jomchai N PLoS One. 2025; 20(2):e0318091.

PMID: 39908310 PMC: 11798505. DOI: 10.1371/journal.pone.0318091.


The known structural variations in hearing loss and their diagnostic approaches: a comprehensive review.

Naghinejad M, Parvizpour S, Shekari Khaniani M, Mehri M, Mansoori Derakhshan S, Amirfiroozy A Mol Biol Rep. 2025; 52(1):131.

PMID: 39821465 DOI: 10.1007/s11033-025-10231-w.


Long-read structural and epigenetic profiling of a kidney tumor-matched sample with nanopore sequencing and optical genome mapping.

Margalit S, Tulpova Z, Zur T, Michaeli Y, Deek J, Nifker G NAR Genom Bioinform. 2025; 7(1):lqae190.

PMID: 39781516 PMC: 11704781. DOI: 10.1093/nargab/lqae190.


A personalized multi-platform assessment of somatic mosaicism in the human frontal cortex.

Zhou W, Mumm C, Gan Y, Switzenberg J, Wang J, De Oliveira P bioRxiv. 2025.

PMID: 39763954 PMC: 11702624. DOI: 10.1101/2024.12.18.629274.

