
Rapid Low-Cost Assembly of the Drosophila melanogaster Reference Genome Using Low-Coverage, Long-Read Sequencing

Overview
Journal G3 (Bethesda)
Date 2018 Jul 19
PMID 30018084
Citations 52
Abstract

Accurate and comprehensive characterization of genetic variation is essential for deciphering the genetic basis of diseases and other phenotypes. A vast amount of genetic variation stems from large-scale sequence changes arising from the duplication, deletion, inversion, and translocation of sequences. In the past 10 years, high-throughput short reads have greatly expanded our ability to assay sequence variation due to single nucleotide polymorphisms. However, a recent assembly of a second reference genome has revealed that short read genotyping methods miss hundreds of structural variants, including those affecting phenotypes. While genomes assembled using high-coverage long reads can achieve high levels of contiguity and completeness, concerns about cost, errors, and low yield have limited widespread adoption of such sequencing approaches. Here we resequenced the reference strain of Drosophila melanogaster (ISO1) on a single Oxford Nanopore MinION flow cell run for 24 hr. Using only reads longer than 1 kb or with at least 30x coverage, we assembled a highly contiguous genome. The addition of inexpensive paired reads and subsequent scaffolding using an optical map technology achieved an assembly with completeness and contiguity comparable to the reference assembly. Comparison of our assembly to the reference assembly of ISO1 uncovered a number of structural variants (SVs), including novel LTR transposable element insertions and duplications affecting genes with developmental, behavioral, and metabolic functions. Collectively, these SVs provide a snapshot of the dynamics of genome evolution. Furthermore, our assembly and comparison to the reference genome demonstrate that high-quality assembly of reference genomes and comprehensive variant discovery using such assemblies are now possible by a single lab for under $1,000 (USD).
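
The read-length cutoff and coverage figures mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy read lengths, and the toy genome size below are all assumptions chosen for illustration. It shows how one might drop reads at or below a length threshold, estimate average coverage as total bases divided by genome size, and compute N50, a common contiguity summary.

```python
from typing import Iterable, List


def filter_reads(lengths: Iterable[int], min_length: int = 1000) -> List[int]:
    """Keep only read lengths strictly greater than min_length (in bp)."""
    return [n for n in lengths if n > min_length]


def coverage(lengths: Iterable[int], genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size."""
    return sum(lengths) / genome_size


def n50(lengths: Iterable[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for n in ordered:
        running += n
        if running >= half:
            return n
    return 0  # empty input


# Toy values, purely illustrative (not data from the paper).
toy_reads = [500, 1200, 8000, 15000, 900, 30000]
toy_genome_size = 10_000  # hypothetical genome size in bp

kept = filter_reads(toy_reads)             # reads longer than 1 kb
depth = coverage(kept, toy_genome_size)    # average coverage of kept reads
print(kept, round(depth, 2), n50(kept))
```

For a real run, the read lengths would come from the sequencing output and the genome size from an independent estimate; both are left as inputs here rather than assumed.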

Citing Articles

The Impact of Oxford Nanopore Technologies Based Methodologies on the Genome Sequencing and Assembly of Romanian Strains of .

Ratiu A, Ionascu A, Constantin N Insects. 2025; 16(1).

PMID: 39859583 PMC: 11766098. DOI: 10.3390/insects16010002.


An updated reference genome of Barbatula barbatula (Linnaeus, 1758).

Laczko L, Nagy N, Nagy A, Maroda A, Saly P Sci Data. 2025; 12(1):137.

PMID: 39843539 PMC: 11754907. DOI: 10.1038/s41597-025-04469-z.


Chromosome-level genome assembly of Tritrichomonas foetus, the causative agent of Bovine Trichomonosis.

Abdel-Glil M, Solle J, Wibberg D, Neubauer H, Sprague L Sci Data. 2024; 11(1):1030.

PMID: 39304666 PMC: 11415386. DOI: 10.1038/s41597-024-03818-8.


Range-wide population genomic structure of the Karner blue butterfly, () .

Zhang J, Aunins A, King T, Cong Q, Shen J, Song L Ecol Evol. 2024; 14(9):e70044.

PMID: 39279793 PMC: 11392825. DOI: 10.1002/ece3.70044.


Reclassification of Botryococcus braunii chemical races into separate species based on a comparative genomics analysis.

Boland D, Cornejo-Corona I, Browne D, Murphy R, Mullet J, Okada S PLoS One. 2024; 19(7):e0304144.

PMID: 39074348 PMC: 11286282. DOI: 10.1371/journal.pone.0304144.

