
Toward an Efficient Method of Identifying Core Genes for Evolutionary and Functional Microbial Phylogenies

Overview
Journal PLoS One
Date 2011 Sep 21
PMID 21931822
Citations 45
Abstract

Microbial community metagenomes and individual microbial genomes are becoming increasingly accessible by means of high-throughput sequencing. Organismal membership within a community is typically assessed using one or a few taxonomic marker genes such as the 16S rDNA, and these same genes are also used to reconstruct molecular phylogenies. There is thus a growing need to bioinformatically catalog strongly conserved core genes that can serve as effective taxonomic markers, to assess the agreement among phylogenies generated from different core genes, and to characterize the biological functions that are enriched within core genes and thus conserved throughout large microbial clades. We present a high-throughput method to recursively identify core genes (i.e. genes ubiquitous within a microbial clade) from a large number of complete input genomes. We analyzed over 1,100 genomes to produce core gene sets spanning 2,861 bacterial and archaeal clades. These core sets range in size from one to >2,000 genes, and their size is inversely correlated with the α-diversity (total phylogenetic branch length) spanned by each clade. As expected, the cores are enriched for housekeeping functions including translation, transcription, and replication, and they also contain significant representations of regulatory, chaperone, and conserved uncharacterized proteins. In agreement with previous manually curated core gene sets, phylogenies constructed from one or more of these core genes agree with those built using 16S rDNA sequence similarity, suggesting that systematic core gene selection can be used to optimize both comparative genomics and the determination of microbial community structure. Finally, we examine functional phylogenies constructed by clustering genomes by the presence or absence of orthologous gene families, and show that they provide an informative complement to standard sequence-based molecular phylogenies.
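The two ideas in the abstract, recursive core gene identification and presence/absence comparison of genomes, can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a strict definition of "core" (a gene family present in every genome of a clade), a clade tree given as nested tuples, and Jaccard distance as the presence/absence measure. The function names, the toy genomes and the gene family labels are all hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple, Union

# Assumed representation: a tree is a genome name (leaf) or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def core_genes(
    tree: Tree,
    genomes: Dict[str, Set[str]],
    cores: Dict[FrozenSet[str], FrozenSet[str]],
) -> Tuple[FrozenSet[str], FrozenSet[str]]:
    """Return (leaf names, core set) for `tree` and record every clade in `cores`.

    The core of a clade is the intersection of the cores of its children,
    so each clade is resolved once, bottom-up, by recursion.
    """
    if isinstance(tree, str):
        leaves = frozenset([tree])
        core = frozenset(genomes[tree])
    else:
        if not tree:
            raise ValueError("an internal clade needs at least one child")
        leaves = frozenset()
        core: Optional[FrozenSet[str]] = None
        for child in tree:
            child_leaves, child_core = core_genes(child, genomes, cores)
            leaves |= child_leaves
            core = child_core if core is None else core & child_core
    cores[leaves] = core
    return leaves, core


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Presence/absence distance between two genomes' gene family sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy input: three genomes described by gene family labels.
toy_genomes = {
    "A": {"rpoB", "recA", "dnaK", "lacZ"},
    "B": {"rpoB", "recA", "dnaK"},
    "C": {"rpoB", "recA", "nifH"},
}
toy_tree = (("A", "B"), "C")

toy_cores: Dict[FrozenSet[str], FrozenSet[str]] = {}
core_genes(toy_tree, toy_genomes, toy_cores)
```

After the call, `toy_cores` maps each clade (keyed by its set of leaf genomes) to its core set. The core shrinks as the clade widens, which matches the inverse relationship between core size and clade diversity described above. A pairwise matrix of `jaccard_distance` values could then be passed to any standard hierarchical clustering routine to build a presence/absence tree. A real pipeline would also need ortholog assignment and probably some tolerance for missing or misannotated genes, both of which this sketch omits.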

Citing Articles

Antimicrobial peptide AP2 ameliorates Salmonella Typhimurium infection by modulating gut microbiota.

Li L, Mo Q, Wan Y, Zhou Y, Li W, Li W BMC Microbiol. 2025; 25(1):64.

PMID: 39910418 PMC: 11796240. DOI: 10.1186/s12866-025-03776-0.


The members of zinc finger-homeodomain (ZF-HD) transcription factors are associated with abiotic stresses in soybean: insights from genomics and expression analysis.

Rizwan H, He J, Nawaz M, Lu K, Wang M BMC Plant Biol. 2025; 25(1):56.

PMID: 39810081 PMC: 11730174. DOI: 10.1186/s12870-024-06028-x.


Linkage-based ortholog refinement in bacterial pangenomes with CLARC.

Gonzalez Ojeda I, Palace S, Martinez P, Azarian T, Grant L, Hammitt L bioRxiv. 2025; .

PMID: 39763808 PMC: 11702680. DOI: 10.1101/2024.12.18.629228.


An Integrated Neuromuscular Training Intervention Applied in Primary School Induces Epigenetic Modifications in Disease-Related Genes: A Genome-Wide DNA Methylation Study.

Vasileva F, Font-Llado R, Lopez-Ros V, Barretina J, Noguera-Castells A, Esteller M Scand J Med Sci Sports. 2025; 35(1):e70012.

PMID: 39757698 PMC: 11701344. DOI: 10.1111/sms.70012.


Characterization of meningitis-causing bacteria, with focus on genomic and pangenomic study of multi-drug resistant Streptococcus pneumoniae from cerebrospinal fluid.

Ali R, Ali K, Aurongzeb M, Al-Regaiey K, Kori J, Irfan M Antonie Van Leeuwenhoek. 2024; 118(1):16.

PMID: 39382798 DOI: 10.1007/s10482-024-02016-1.

