
Predicting Phenotypic Traits of Prokaryotes from Protein Domain Frequencies

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2010 Sep 28
PMID: 20868492
Citations: 6
Abstract

Background: Establishing the relationship between an organism's genome sequence and its phenotype is a fundamental challenge that remains largely unsolved. Accurately predicting microbial phenotypes based solely on genomic features will allow us to infer relevant phenotypic characteristics when a genome sequence becomes available before experimental characterization, a scenario that is favored by the advent of novel high-throughput and single-cell sequencing techniques.

Results: We present a novel approach to predict the phenotype of prokaryotes directly from their protein domain frequencies. Our discriminative machine learning approach provides high prediction accuracy of relevant phenotypes such as motility, oxygen requirement or spore formation. Moreover, the set of discriminative domains provides biological insight into the underlying phenotype-genotype relationship and enables deriving hypotheses on the possible functions of uncharacterized domains.
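As a rough illustration of predicting a phenotype from domain frequencies, the sketch below trains a small logistic regression on toy data and ranks domains by their learned weights. The abstract does not name the classifier used, so logistic regression is a stand-in here; the genomes, domain names, labels, and hyperparameters are all hypothetical.

```python
import math

# Hypothetical toy data: protein domain counts per genome.
# Genome and domain names are placeholders, not real annotations.
DOMAIN_COUNTS = {
    "genome_A": {"flagellin_like": 4, "motor_like": 2, "kinase_like": 10},
    "genome_B": {"flagellin_like": 3, "motor_like": 1, "kinase_like": 12},
    "genome_C": {"kinase_like": 11, "sporulation_like": 2},
    "genome_D": {"kinase_like": 9, "sporulation_like": 1},
}
MOTILE = {"genome_A": 1, "genome_B": 1, "genome_C": 0, "genome_D": 0}


def to_frequencies(counts, vocabulary):
    """Convert raw domain counts to relative frequencies over a fixed vocabulary."""
    total = sum(counts.values())
    return [counts.get(domain, 0) / total for domain in vocabulary]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Probability of the positive phenotype for one frequency vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, epochs=5000, lr=2.0, l2=1e-5):
    """Batch gradient descent on L2-regularized logistic loss (assumed settings)."""
    n = len(X)
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


VOCAB = sorted({d for counts in DOMAIN_COUNTS.values() for d in counts})
NAMES = sorted(DOMAIN_COUNTS)
X = [to_frequencies(DOMAIN_COUNTS[name], VOCAB) for name in NAMES]
y = [MOTILE[name] for name in NAMES]

WEIGHTS, BIAS = train_logistic(X, y)

# Domains with the largest positive weights are the most indicative of the
# positive phenotype in this toy model; the most negative ones argue against it.
RANKED = sorted(zip(VOCAB, WEIGHTS), key=lambda pair: pair[1], reverse=True)
for domain, weight in RANKED:
    print(f"{domain}\t{weight:+.2f}")
```

Inspecting the ranked weights is one simple way to read off which domains a linear model treats as discriminative; on real data, the frequencies would come from a domain annotation of each genome and the model would be evaluated on held-out genomes.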

Conclusions: Fast and accurate prediction of microbial phenotypes based on genomic protein domain content is feasible and has the potential to provide novel biological insights. First results of a systematic check for annotation errors indicate that our approach may also be applied to semi-automatic correction and completion of the existing phenotype annotation.
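One way a semi-automatic annotation check could work is to flag genomes where a confident prediction contradicts the recorded phenotype and pass them to a curator. The abstract does not describe the actual procedure, so the function name, threshold, and data below are hypothetical; in practice the probabilities should come from held-out (for example cross-validated) predictions.

```python
def flag_annotation_candidates(probabilities, annotations, threshold=0.9):
    """Return (genome, annotated_label, predicted_probability) tuples where the
    prediction strongly disagrees with the annotation, most extreme first.

    probabilities: genome -> predicted probability of the positive phenotype
    annotations:   genome -> recorded label (1 or 0); unannotated genomes are skipped
    threshold:     minimum |probability - label| to flag (assumed value)
    """
    flagged = []
    for genome, p in probabilities.items():
        label = annotations.get(genome)
        if label is None:
            continue
        if abs(p - label) >= threshold:
            flagged.append((genome, label, p))
    return sorted(flagged, key=lambda item: abs(item[2] - item[1]), reverse=True)


# Hypothetical example values.
example_probabilities = {"g1": 0.97, "g2": 0.05, "g3": 0.60, "g4": 0.02}
example_annotations = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
candidates = flag_annotation_candidates(example_probabilities, example_annotations)
```

Flagged entries are candidates for manual review rather than automatic corrections, since a strong disagreement can reflect a model error as easily as an annotation error.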

Citing Articles

From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry.

Karlsen S, Rau M, Sanchez B, Jensen K, Zeidan A. FEMS Microbiol Rev. 2023; 47(4).

PMID: 37286882 PMC: 10337747. DOI: 10.1093/femsre/fuad030.


From Genomes to Phenotypes: Traitar, the Microbial Trait Analyzer.

Weimann A, Mooren K, Frank J, Pope P, Bremges A, McHardy A. mSystems. 2017; 1(6).

PMID: 28066816 PMC: 5192078. DOI: 10.1128/mSystems.00101-16.


Bayesian prediction of microbial oxygen requirement.

Jensen D, Ussery D. F1000Res. 2016; 2:184.

PMID: 26913185 PMC: 4743139. DOI: 10.12688/f1000research.2-184.v1.


Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders.

Konietzny S, Pope P, Weimann A, McHardy A. Biotechnol Biofuels. 2014; 7(1):124.

PMID: 25342967 PMC: 4189754. DOI: 10.1186/s13068-014-0124-8.


A domain sequence approach to pangenomics: applications to Escherichia coli.

Snipen L, Ussery D. F1000Res. 2014; 1:19.

PMID: 24555018 PMC: 3901455. DOI: 10.12688/f1000research.1-19.v2.

