
Cognac: Rapid Generation of Concatenated Gene Alignments for Phylogenetic Inference from Large, Bacterial Whole Genome Sequencing Datasets

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2021 Feb 16
PMID: 33588753
Citations: 6
Abstract

Background: The quantity of genomic data is expanding at an increasing rate. Tools for phylogenetic analysis that scale to the quantity of available data are required. To address this need, we present cognac, a user-friendly software package to rapidly generate concatenated gene alignments for phylogenetic analysis.

Results: We illustrate that cognac is able to rapidly identify phylogenetic marker genes using a data-driven approach and efficiently generate concatenated gene alignments for very large genomic datasets. To benchmark our tool, we generated core gene alignments for eight unique genera of bacteria, including a dataset of over 11,000 genomes from the genus Escherichia, which produced an alignment of 1353 genes and was constructed in less than 17 h.

Conclusions: We demonstrate that cognac presents an efficient method for generating concatenated gene alignments for phylogenetic analysis. We have released cognac as an R package (https://github.com/rdcrawford/cognac) with customizable parameters for adaptation to diverse applications.
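
To make the term "concatenated gene alignment" concrete, the following is a minimal Python sketch of the general idea only. It is not cognac's implementation (cognac is an R package, and its algorithm is not described in this abstract). The function names `core_genes` and `concatenate`, the input layout, and the rule that a core gene is one present in every genome are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical input layout: gene name -> {genome id -> aligned sequence}.
Alignments = Dict[str, Dict[str, str]]


def core_genes(alignments: Alignments, genomes: List[str]) -> List[str]:
    """Return, in sorted order, the genes that have an aligned row for every genome.

    Assumption for this sketch: a "core" gene is one present in all genomes.
    """
    return sorted(
        gene
        for gene, rows in alignments.items()
        if all(genome in rows for genome in genomes)
    )


def concatenate(alignments: Alignments, genomes: List[str]) -> Dict[str, str]:
    """Join the per-gene alignments of the core genes into one row per genome."""
    genes = core_genes(alignments, genomes)
    for gene in genes:
        # Every row of a single gene alignment must have the same length.
        lengths = {len(alignments[gene][genome]) for genome in genomes}
        if len(lengths) != 1:
            raise ValueError(f"rows of {gene!r} differ in length: {sorted(lengths)}")
    # Genes are appended in the same order for every genome, so columns stay aligned.
    return {
        genome: "".join(alignments[gene][genome] for gene in genes)
        for genome in genomes
    }
```

Given per-gene alignments for three genomes in which one gene is missing from one genome, `concatenate` would skip that gene and return one sequence per genome whose length is the sum of the remaining gene alignment lengths. Such a matrix is the usual input to tree-building software.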

Citing Articles

Longitudinal genomic surveillance of carriage and transmission of Clostridioides difficile in an intensive care unit.

Miles-Jay A, Snitkin E, Lin M, Shimasaki T, Schoeny M, Fukuda C. Nat Med. 2023; 29(10):2526-2534.

PMID: 37723252 PMC: 10579090. DOI: 10.1038/s41591-023-02549-4.


Distinct Origins and Transmission Pathways of across Three U.S. States.

Lapp Z, Octaria R, O'Malley S, Nguyen T, Wolford H, Crawford R. J Clin Microbiol. 2023; 61(8):e0025923.

PMID: 37439675 PMC: 10446861. DOI: 10.1128/jcm.00259-23.


The survivor strain: isolation and characterization of AB48, a filamentous phototactic cyanobacterium with biotechnological potential.

Koch M, Noonan A, Qiu Y, Dofher K, Kieft B, Mottahedeh S. Front Bioeng Biotechnol. 2022; 10:932695.

PMID: 36046667 PMC: 9420970. DOI: 10.3389/fbioe.2022.932695.


Phenotypic and Genomic Diversification in Complex Carbohydrate-Degrading Human Gut Bacteria.

Pudlo N, Urs K, Crawford R, Pirani A, Atherly T, Jimenez R. mSystems. 2022; 7(1):e0094721.

PMID: 35166563 PMC: 8845570. DOI: 10.1128/msystems.00947-21.


Genomic Update of Phenotypic Prediction Rule for Methicillin-Resistant Staphylococcus aureus (MRSA) USA300 Discloses Jail Transmission Networks with Increased Resistance.

Sansom S, Benedict E, Thiede S, Hota B, Aroutcheva A, Payne D. Microbiol Spectr. 2021; 9(1):e0037621.

PMID: 34287060 PMC: 8552710. DOI: 10.1128/Spectrum.00376-21.

