
Evaluation and Validation of De Novo and Hybrid Assembly Techniques to Derive High-quality Genome Sequences

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2014 Jun 16
PMID: 24930142
Citations: 60
Abstract

Motivation: To assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences.

Results: Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies.
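The summary statistics named above (number of contigs, N50) can be illustrated with a minimal sketch. The function names and the contig lengths below are hypothetical and are not taken from the study. N50 is computed under its usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bases)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def summarize(contig_lengths):
    """Collect the basic statistics often used to compare assemblies."""
    lengths = list(contig_lengths)
    return {
        "num_contigs": len(lengths),
        "total_bases": sum(lengths),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    # Hypothetical contig lengths for two assemblies of the same genome.
    fragmented = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    contiguous = [24, 30]
    print(summarize(fragmented))
    print(summarize(contiguous))
```

Fewer contigs and a higher N50 generally indicate a more contiguous assembly, but neither statistic measures correctness, which is why the study also validated rDNA operon predictions experimentally.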

Availability and Implementation: All assembly tools except CLC Genomics Workbench are freely available under the GNU General Public License.

Contact: brownsd@ornl.gov

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

NanoCore: core-genome-based bacterial genomic surveillance and outbreak detection in healthcare facilities from Nanopore and Illumina data.

Fuchs S, Hulse L, Tamayo T, Kolbe-Busch S, Pfeffer K, Dilthey A. mSystems. 2024; 9(11):e0108024.

PMID: 39373471 PMC: 11575142. DOI: 10.1128/msystems.01080-24.


Genome sequencing of Elaeocarpus spp. stem blight pathogen Pseudocryphonectria elaeocarpicola reveals potential adaptations to colonize woody bark.

Yang Y, Xiong D, Zhao D, Huang H, Tian C. BMC Genomics. 2024; 25(1):714.

PMID: 39048950 PMC: 11267912. DOI: 10.1186/s12864-024-10615-5.


Detection of spp. in farmed deer (Artiodactyla: Cervidae) using multiplex assays in the Qinghai-Tibet Plateau, China.

Miao Y, Guo W, Zhang W, Chen Z, Mian D, Li R. Microbiol Spectr. 2024; 12(7):e0412023.

PMID: 38785439 PMC: 11218516. DOI: 10.1128/spectrum.04120-23.


Phylogenomics and genetic analysis of solvent-producing Clostridium species.

Jensen R, Schulz F, Roux S, Klingeman D, Mitchell W, Udwary D. Sci Data. 2024; 11(1):432.

PMID: 38693191 PMC: 11063209. DOI: 10.1038/s41597-024-03210-6.


De novo genome assembly and comparative genomics for the colonial ascidian Botrylloides violaceus.

Sumner J, Andrasz C, Johnson C, Wax S, Anderson P, Keeling E. G3 (Bethesda). 2023; 13(10).

PMID: 37555394 PMC: 10542563. DOI: 10.1093/g3journal/jkad181.

