PacBio But Not Illumina Technology Can Achieve Fast, Accurate and Complete Closure of the High GC, Complex Two-Chromosome Genome
Although PacBio third-generation sequencers have increased read lengths, which facilitates the assembly of complete genomes, no study has reported success in using PacBio data alone to completely sequence a two-chromosome bacterial genome from a single library in a single run. Previous studies using earlier versions of sequencing chemistries have at most been able to finish bacterial genomes containing only one chromosome with de novo assembly. In this study, we compared the robustness of PacBio RS II, using one SMRT cell and the latest P6-C4 chemistry, with that of Illumina HiSeq 1500 in sequencing the genome of a bacterium that contains two large circular chromosomes, a very high G+C content of 68-69%, highly repetitive regions and substantial genomic diversity, and that represents one of the largest and most complex bacterial genomes sequenced. The comparison used a reference genome generated by hybrid assembly of PacBio and Illumina datasets with subsequent manual validation. Results showed that PacBio data with de novo assembly, but not Illumina data, completely sequenced the genome without any gaps or mis-assemblies. The two large contigs of the PacBio assembly aligned unambiguously to the reference genome, sharing >99.9% nucleotide identity. Conversely, Illumina data assembled using three different assemblers resulted in fragmented assemblies (201-366 contigs), sharing only 92.2-100% and 92.0-100% nucleotide identity with the chromosome I and chromosome II reference sequences, respectively, with no indication that the genome consisted of two chromosomes with four copies of ribosomal operons. Among all assemblies, the PacBio assembly recovered the highest number of core and virulence proteins, and of housekeeping genes based on whole-genome multilocus sequence typing (wgMLST). Most notably, assembly based solely on PacBio data outperformed even hybrid assembly using both PacBio and Illumina datasets. The hybrid approach generated only 74 contigs, whereas the PacBio data alone with de novo assembly achieved complete closure of the two-chromosome genome without additional costly bench work or further sequencing. PacBio RS II using P6-C4 chemistry is highly robust and cost-effective and should be the platform of choice for sequencing bacterial genomes, particularly those well known to be difficult to sequence.
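The contiguity comparison reported above (two closed contigs versus 74 or 201-366 fragments) can be illustrated with a minimal Python sketch. It is not the study's pipeline; the abstract does not name the tools used. The function names (`parse_fasta_lengths`, `n50`, `summarize`) and the toy sequences are hypothetical, and the sketch assumes each assembly is available as plain FASTA text.

```python
from typing import Dict, List


def parse_fasta_lengths(fasta_text: str) -> List[int]:
    """Return the length of every sequence record in a FASTA-formatted string."""
    lengths: List[int] = []
    current = 0
    in_record = False
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if in_record:
                lengths.append(current)
            current = 0
            in_record = True
        elif in_record:
            current += len(line)
    if in_record:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length of the contig at which the running total of contigs, sorted from
    longest to shortest, first reaches half of the total assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(fasta_text: str) -> Dict[str, int]:
    """Basic contiguity statistics for one assembly."""
    lengths = parse_fasta_lengths(fasta_text)
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "largest": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Toy data only (hypothetical): a "closed" two-contig assembly and a
# fragmented assembly of the same total size.
closed = ">chr1\nACGTACGTAC\n>chr2\nACGTAC\n"
fragmented = ">c1\nACGT\n>c2\nACG\n>c3\nACGTA\n>c4\nAC\n>c5\nAC\n"

closed_stats = summarize(closed)
fragmented_stats = summarize(fragmented)
print(closed_stats)
print(fragmented_stats)
```

Contig count and N50 describe contiguity only. Judging completeness and mis-assemblies, as the study did, additionally requires alignment against a validated reference.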