
Protein Molecular Function Prediction by Bayesian Phylogenomics

Overview
Specialty: Biology
Date: 2005 Oct 12
PMID: 16217548
Citations: 81
Abstract

We present a statistical graphical model to infer specific molecular function for unannotated protein sequences using homology. Based on phylogenomic principles, SIFTER (Statistical Inference of Function Through Evolutionary Relationships) accurately predicts molecular function for members of a protein family given a reconciled phylogeny and available function annotations, even when the data are sparse or noisy. Our method produced specific and consistent molecular function predictions across 100 Pfam families in comparison to the Gene Ontology annotation database, BLAST, GOtcha, and Orthostrapper. We performed a more detailed exploration of functional predictions on the adenosine-5'-monophosphate/adenosine deaminase family and the lactate/malate dehydrogenase family, in the former case comparing the predictions against a gold standard set of published functional characterizations. Given function annotations for 3% of the proteins in the deaminase family, SIFTER achieves 96% accuracy in predicting molecular function for experimentally characterized proteins as reported in the literature. The accuracy of SIFTER on this dataset is a significant improvement over other currently available methods such as BLAST (75%), GeneQuiz (64%), GOtcha (89%), and Orthostrapper (11%). We also experimentally characterized the adenosine deaminase from Plasmodium falciparum, confirming SIFTER's prediction. The results illustrate the predictive power of exploiting a statistical model of function evolution in phylogenomic problems. A software implementation of SIFTER is available from the authors.
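The idea of propagating sparse function annotations through a phylogeny can be illustrated with a toy calculation. The sketch below is not the SIFTER model: it assumes a simple symmetric K-state transition model on each branch, a uniform prior at the root, and a hypothetical four-leaf tree with made-up branch lengths and annotations. It computes the posterior over functions for an unannotated leaf by clamping that leaf to each candidate state and running a standard pruning (sum-product) pass over the tree.

```python
import math

def transition(k, branch_length, rate=1.0):
    """Symmetric K-state transition matrix for one branch (assumed model)."""
    stay = math.exp(-rate * branch_length)
    off = (1.0 - stay) / k
    return [[off + (stay if i == j else 0.0) for j in range(k)] for i in range(k)]

def partial_likelihood(node, evidence, k):
    """P(annotations observed below `node` | state of `node`), for each state."""
    children = node.get("children", [])
    if not children:
        observed = evidence.get(node["name"])
        if observed is None:
            return [1.0] * k  # unannotated leaf: no constraint
        return [1.0 if s == observed else 0.0 for s in range(k)]
    result = [1.0] * k
    for child in children:
        below = partial_likelihood(child, evidence, k)
        p = transition(k, child["length"])
        for s in range(k):
            result[s] *= sum(p[s][c] * below[c] for c in range(k))
    return result

def leaf_posterior(tree, evidence, target, k):
    """Posterior over states for leaf `target`, given the other annotations."""
    scores = []
    for s in range(k):
        clamped = dict(evidence)
        clamped[target] = s
        root = partial_likelihood(tree, clamped, k)
        scores.append(sum(root) / k)  # uniform prior over root states
    total = sum(scores)
    return [x / total for x in scores]

# Hypothetical tree ((A, B), (C, D)); branch lengths are illustrative only.
tree = {"children": [
    {"length": 0.5, "children": [
        {"name": "A", "length": 0.1},
        {"name": "B", "length": 0.1}]},
    {"length": 0.5, "children": [
        {"name": "C", "length": 0.1},
        {"name": "D", "length": 0.1}]},
]}

# Two of four proteins are annotated: function 0 for A, function 1 for C.
evidence = {"A": 0, "C": 1}
posterior_b = leaf_posterior(tree, evidence, "B", k=2)
posterior_d = leaf_posterior(tree, evidence, "D", k=2)
print("B:", posterior_b)
print("D:", posterior_d)
```

In this toy setting each unannotated leaf inherits most of its probability mass from its nearest annotated neighbor in the tree, which is the intuition behind phylogeny-based function transfer. A real analysis would also need a reconciled tree and a more realistic model of how function changes along branches.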

Citing Articles

Improving enzyme functional annotation by integrating in vitro and in silico approaches: The example of histidinol phosphate phosphatases.

Kinateder T, Mayer C, Nazet J, Sterner R. Protein Sci. 2024; 33(2):e4899.

PMID: 38284491 PMC: 10804674. DOI: 10.1002/pro.4899.


Phylogenetic inference of the emergence of sequence modules and protein-protein interactions in the ADAMTS-TSL family.

Dennler O, Coste F, Blanquart S, Belleannee C, Theret N. PLoS Comput Biol. 2023; 19(8):e1011404.

PMID: 37651409 PMC: 10499240. DOI: 10.1371/journal.pcbi.1011404.


Bioinformatics approaches for classification and investigation of the evolution of the Na/K-ATPase alpha-subunit.

Shahnazari M, Zakipour Z, Razi H, Moghadam A, Alemzadeh A. BMC Ecol Evol. 2022; 22(1):122.

PMID: 36289471 PMC: 9609216. DOI: 10.1186/s12862-022-02071-0.


Multiple Profile Models Extract Features from Protein Sequence Data and Resolve Functional Diversity of Very Different Protein Families.

Vicedomini R, Bouly J, Laine E, Falciatore A, Carbone A. Mol Biol Evol. 2022; 39(4).

PMID: 35353898 PMC: 9016551. DOI: 10.1093/molbev/msac070.


Schistosomiasis Drug Discovery in the Era of Automation and Artificial Intelligence.

Moreira-Filho J, Silva A, Dantas R, Gomes B, Souza Neto L, Brandao-Neto J. Front Immunol. 2021; 12:642383.

PMID: 34135888 PMC: 8203334. DOI: 10.3389/fimmu.2021.642383.

