PMID: 29048594

Neptune: a Bioinformatics Tool for Rapid Discovery of Genomic Variation in Bacterial Populations

Abstract

The ready availability of vast amounts of genomic sequence data has created the need to rethink comparative genomics algorithms using 'big data' approaches. Neptune is an efficient system for rapidly locating differentially abundant genomic content in bacterial populations using an exact k-mer matching strategy, while accommodating k-mer mismatches. Neptune's loci discovery process uses probabilistic models to identify sequences that are sufficiently common to a group of target sequences and sufficiently absent from non-targets. Neptune uses parallel computing to efficiently identify and extract these loci from draft genome assemblies without requiring multiple sequence alignments or other computationally expensive comparative sequence analyses. In tests on simulated and real datasets, Neptune rapidly identified regions with both high sensitivity and high specificity. We demonstrate that this system can identify trait-specific loci from different bacterial lineages. Neptune is broadly applicable for comparative bacterial analyses, and will particularly benefit pathogenomic applications, owing to its efficient and sensitive discovery of differentially abundant genomic loci. The software is available for download at: http://github.com/phac-nml/neptune.
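The general idea of finding content that is common to targets and absent from non-targets by exact k-mer matching can be illustrated with a minimal Python sketch. This is not Neptune's implementation: the function names (`kmers`, `candidate_kmers`, `extract_loci`) and the parameters `min_inclusion` and `max_exclusion` are hypothetical. Fixed fraction thresholds stand in for the probabilistic models mentioned in the abstract, and k-mer mismatches, reverse complements, and parallelism are all left out.

```python
from collections import Counter
from typing import List, Sequence, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in a sequence (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_kmers(targets: Sequence[str], non_targets: Sequence[str], k: int,
                    min_inclusion: float = 1.0,
                    max_exclusion: float = 0.0) -> Set[str]:
    """Keep k-mers present in enough targets and in few enough non-targets.

    The thresholds are illustrative assumptions, not Neptune's
    probabilistic model.
    """
    target_counts: Counter = Counter()
    for seq in targets:
        target_counts.update(kmers(seq, k))  # one count per genome

    non_target_counts: Counter = Counter()
    for seq in non_targets:
        non_target_counts.update(kmers(seq, k))

    keep = set()
    for kmer, n in target_counts.items():
        in_frac = n / len(targets)
        ex_frac = non_target_counts[kmer] / len(non_targets) if non_targets else 0.0
        if in_frac >= min_inclusion and ex_frac <= max_exclusion:
            keep.add(kmer)
    return keep


def extract_loci(reference: str, keep: Set[str], k: int) -> List[str]:
    """Merge overlapping or adjacent kept k-mers along a reference into loci."""
    reference = reference.upper()
    loci = []
    start, end = None, None
    for i in range(len(reference) - k + 1):
        if reference[i:i + k] not in keep:
            continue
        if start is not None and i <= end:
            end = i + k  # extend the current locus
        else:
            if start is not None:
                loci.append(reference[start:end])
            start, end = i, i + k  # open a new locus
    if start is not None:
        loci.append(reference[start:end])
    return loci


if __name__ == "__main__":
    targets = ["AAACCCGGGTTT", "TTACCCGGGTAA"]
    non_targets = ["AAATTTAAATTT"]
    keep = candidate_kmers(targets, non_targets, k=4)
    print(extract_loci(targets[0], keep, k=4))  # prints ['ACCCGGGT']
```

Because each genome is reduced to a set of k-mers and compared by exact lookup, no multiple sequence alignment is needed. Lowering `min_inclusion` or raising `max_exclusion` in this sketch trades specificity for sensitivity.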

Citing Articles

AutoPVPrimer: A comprehensive AI-Enhanced pipeline for efficient plant virus primer design and assessment.

Ghorbani A, Rostami M, Ashrafi-Dehkordi E, Guzzi P. PLoS One. 2025; 20(1):e0317918.

PMID: 39883659 PMC: 11781739. DOI: 10.1371/journal.pone.0317918.


Virulence factor discovery identifies associations between the Fic gene family and Fap2 fusobacteria in colorectal cancer microbiomes.

Nakatsu G, Ko D, Michaud M, Franzosa E, Morgan X, Huttenhower C. mBio. 2025; 16(2):e0373224.

PMID: 39807864 PMC: 11796403. DOI: 10.1128/mbio.03732-24.


Identification of conserved genomic signatures specific to species colonising the human gut.

Arjun O, Prakash T. 3 Biotech. 2023; 13(3):97.

PMID: 36852175 PMC: 9958220. DOI: 10.1007/s13205-023-03492-4.


Development of reverse-transcriptase, real-time PCR assays to distinguish the Southern African Territories (SAT) serotypes 1 and 3 and topotype VII of SAT2 of Foot-and-Mouth Disease Virus.

Chestley T, Sroga P, Nebroski M, Hole K, Ularamu H, Lung O. Front Vet Sci. 2022; 9:977761.

PMID: 36204292 PMC: 9530708. DOI: 10.3389/fvets.2022.977761.


Genomic Characterization of From Beef Cattle Feedlots and Associated Environmental Continuum.

Zaidi S, Zaheer R, Barbieri R, Cook S, Hannon S, Booker C. Front Microbiol. 2022; 13:859990.

PMID: 35832805 PMC: 9271880. DOI: 10.3389/fmicb.2022.859990.

