PacBio But Not Illumina Technology Can Achieve Fast, Accurate and Complete Closure of the High GC, Complex Two-Chromosome Genome

Overview
Journal: Front Microbiol
Specialty: Microbiology
Date: 2017 Aug 22
PMID: 28824579
Citations: 21
Abstract

Although PacBio third-generation sequencers have increased sequencing read lengths, which facilitates the assembly of complete genomes, no study has reported success in using PacBio data alone to completely sequence a two-chromosome bacterial genome from a single library in a single run. Previous studies using earlier sequencing chemistries have at most finished bacterial genomes containing only one chromosome by assembly. In this study, we compared the robustness of PacBio RS II, using one SMRT cell and the latest P6-C4 chemistry, with that of Illumina HiSeq 1500 in sequencing the genome of a bacterium that contains two large circular chromosomes, a very high G+C content of 68-69%, highly repetitive regions and substantial genomic diversity, and that represents one of the largest and most complex bacterial genomes sequenced. The comparison used a reference genome generated by hybrid assembly of PacBio and Illumina datasets with subsequent manual validation. Results showed that assembly of the PacBio data, but not of the Illumina data, completely sequenced the genome without any gaps or mis-assemblies. The two large contigs of the PacBio assembly aligned unambiguously to the reference genome, sharing >99.9% nucleotide identity. Conversely, Illumina data assembled using three different assemblers resulted in fragmented assemblies (201-366 contigs) sharing only 92.2-100% and 92.0-100% nucleotide identity with the chromosome I and II reference sequences, respectively, with no indication that the genome consisted of two chromosomes with four copies of ribosomal operons. Among all assemblies, the PacBio assembly recovered the highest number of core and virulence proteins, and of housekeeping genes based on whole-genome multilocus sequence typing (wgMLST). Most notably, assembly based solely on PacBio data outperformed even hybrid assembly using both PacBio and Illumina datasets: the hybrid approach generated 74 contigs, while the PacBio data alone achieved complete closure of the two-chromosome genome without additional costly bench work or further sequencing. PacBio RS II with P6-C4 chemistry is highly robust and cost-effective and should be the platform of choice for sequencing bacterial genomes, particularly those well known to be difficult to sequence.
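
The abstract compares assemblies by contig count and notes the genome's G+C content. As a rough illustration of how those two figures can be read off an assembly, the sketch below summarises a FASTA-formatted assembly in plain Python. It is not the pipeline used in the study: the function names `read_fasta` and `assembly_summary` and the toy sequences are assumptions made for this example only.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def assembly_summary(handle):
    """Return contig count, total length and G+C percentage of an assembly."""
    contigs = list(read_fasta(handle))
    total_bp = sum(len(seq) for _, seq in contigs)
    gc_bp = sum(seq.count("G") + seq.count("C") for _, seq in contigs)
    gc_percent = 100.0 * gc_bp / total_bp if total_bp else 0.0
    return {"contigs": len(contigs), "total_bp": total_bp, "gc_percent": gc_percent}


# Toy two-contig assembly (hypothetical sequences, not data from the study).
toy = StringIO(">contig1\nGGCCGGCCAT\n>contig2\nGCGCGCGCGA\nTT\n")
summary = assembly_summary(toy)
print(summary)
```

On a real assembly, a fully closed two-chromosome genome would be expected to show two contigs here, whereas a fragmented assembly would show hundreds. Contig count alone does not establish correctness, which is why the study also aligned the contigs to a validated reference.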

Citing Articles

Full-Length Sequencing of Circular DNA Viruses Using CIDER-Seq.

Zaidi S, Golyaev V, Mehta D, Vanderschuren H Methods Mol Biol. 2025; 2912:191-204.

PMID: 40064783 DOI: 10.1007/978-1-0716-4454-6_17.


Isolation, molecular identification, and genomic analysis of strain ASIOC01 from activated sludge harboring the bioremediation prowess of glycerol and organic pollutants in high-salinity.

Chin H, Ravi Varadharajulu N, Lin Z, Chen W, Zhang Z, Arumugam S Front Microbiol. 2024; 15:1415723.

PMID: 38983623 PMC: 11231211. DOI: 10.3389/fmicb.2024.1415723.


Comparative genome analysis reveals high-level drug resistance markers in a clinical isolate of subsp MF GZ001.

Alam M, Guan P, Zhu Y, Zeng S, Fang X, Wang S Front Cell Infect Microbiol. 2023; 12:1056007.

PMID: 36683685 PMC: 9846761. DOI: 10.3389/fcimb.2022.1056007.


The Development of Technology to Prevent, Diagnose, and Manage Antimicrobial Resistance in Healthcare-Associated Infections.

Elbehiry A, Marzouk E, Abalkhail A, El-Garawany Y, Anagreyyah S, Alnafea Y Vaccines (Basel). 2022; 10(12).

PMID: 36560510 PMC: 9780923. DOI: 10.3390/vaccines10122100.


TSA32-1 as a Promising Agent for Biocontrol of Plant Pathogenic Fungi.

Kim J, Song J, Kim P, Kim D, Kim Y J Fungi (Basel). 2022; 8(10).

PMID: 36294618 PMC: 9604864. DOI: 10.3390/jof8101053.

