
Inferring Functional Modules of Protein Families with Probabilistic Topic Models

Overview
Publisher Biomed Central
Specialty Biology
Date 2011 May 11
PMID 21554720
Citations 5
Abstract

Background: Genome and metagenome studies have identified thousands of protein families whose functions are poorly understood and for which functional characterization techniques provide only partial information. For such proteins, genomic context can provide further information about their function.

Results: We describe a Bayesian method, based on a probabilistic topic model, which directly identifies functional modules of protein families. The method explores the co-occurrence patterns of protein families across a collection of sequence samples to infer a probabilistic model of arbitrarily-sized functional modules.

Conclusions: We show that our method identifies protein modules - some of which correspond to well-known biological processes - that are tightly interconnected with known functional interactions and are different from the interactions identified by pairwise co-occurrence. The modules are not specific to any given organism and may combine different realizations of a protein complex or pathway within different taxa.
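The abstract does not spell out the model, its priors, or the inference procedure, so the following is only a minimal sketch of the general idea described under Results: treat each sequence sample as a "document" and each protein-family annotation as a "word", then fit a topic model so that each topic plays the role of a functional module. It uses standard latent Dirichlet allocation with collapsed Gibbs sampling, which is an assumption and not necessarily the authors' method. The function name `gibbs_lda`, the hyperparameters `alpha` and `beta`, and the toy data are all hypothetical.

```python
import random


def gibbs_lda(samples, n_families, n_modules, alpha=0.1, beta=0.01,
              n_iter=200, seed=0):
    """Collapsed Gibbs sampler for plain LDA (illustrative assumption).

    samples: list of lists of protein-family ids in range(n_families).
    Returns (phi, theta): module-by-family and sample-by-module
    probability tables.
    """
    rng = random.Random(seed)
    sample_module = [[0] * n_modules for _ in samples]
    module_family = [[0] * n_families for _ in range(n_modules)]
    module_total = [0] * n_modules
    assignments = []

    # Random initial module assignment for every family occurrence.
    for s, fams in enumerate(samples):
        z_s = []
        for f in fams:
            k = rng.randrange(n_modules)
            z_s.append(k)
            sample_module[s][k] += 1
            module_family[k][f] += 1
            module_total[k] += 1
        assignments.append(z_s)

    for _ in range(n_iter):
        for s, fams in enumerate(samples):
            for i, f in enumerate(fams):
                # Remove the current assignment from the counts.
                k = assignments[s][i]
                sample_module[s][k] -= 1
                module_family[k][f] -= 1
                module_total[k] -= 1
                # Conditional probability of each module, up to a constant.
                weights = [
                    (sample_module[s][j] + alpha)
                    * (module_family[j][f] + beta)
                    / (module_total[j] + n_families * beta)
                    for j in range(n_modules)
                ]
                k = rng.choices(range(n_modules), weights=weights)[0]
                assignments[s][i] = k
                sample_module[s][k] += 1
                module_family[k][f] += 1
                module_total[k] += 1

    # Smoothed point estimates from the final counts.
    phi = [
        [(module_family[k][f] + beta) / (module_total[k] + n_families * beta)
         for f in range(n_families)]
        for k in range(n_modules)
    ]
    theta = [
        [(sample_module[s][k] + alpha) / (len(fams) + n_modules * alpha)
         for k in range(n_modules)]
        for s, fams in enumerate(samples)
    ]
    return phi, theta


# Hypothetical toy data: families 0-2 tend to co-occur, as do families 3-5.
samples = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 1],
    [3, 4, 5, 3, 4, 5],
    [3, 4, 5, 4],
    [0, 1, 2, 3, 4, 5],
]
phi, theta = gibbs_lda(samples, n_families=6, n_modules=2)

for k, row in enumerate(phi):
    top = sorted(range(len(row)), key=lambda f: -row[f])[:3]
    print("module", k, "top families:", top)
```

Each row of `phi` is a probability distribution over protein families and can be read as one candidate module of arbitrary size; each row of `theta` gives the mixture of modules in one sample. Because modules are inferred from all samples jointly, this differs from scoring family pairs by co-occurrence one pair at a time.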

Citing Articles

An overview of topic modeling and its current applications in bioinformatics.

Liu L, Tang L, Dong W, Yao S, Zhou W. Springerplus. 2016; 5(1):1608.

PMID: 27652181 PMC: 5028368. DOI: 10.1186/s40064-016-3252-8.


Understanding Genotype-Phenotype Effects in Cancer via Network Approaches.

Kim Y, Cho D, Przytycka T. PLoS Comput Biol. 2016; 12(3):e1004747.

PMID: 26963104 PMC: 4786343. DOI: 10.1371/journal.pcbi.1004747.


Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders.

Konietzny S, Pope P, Weimann A, McHardy A. Biotechnol Biofuels. 2014; 7(1):124.

PMID: 25342967 PMC: 4189754. DOI: 10.1186/s13068-014-0124-8.


A Module Analysis Approach to Investigate Molecular Mechanism of TCM Formula: A Trial on Shu-feng-jie-du Formula.

Song J, Zhang F, Tang S, Liu X, Gao Y, Lu P. Evid Based Complement Alternat Med. 2013; 2013:731370.

PMID: 24376467 PMC: 3860149. DOI: 10.1155/2013/731370.


Metagenomic annotation networks: construction and applications.

Vey G, Moreno-Hagelsieb G. PLoS One. 2012; 7(8):e41283.

PMID: 22879885 PMC: 3413691. DOI: 10.1371/journal.pone.0041283.
