PMID: 19081747

A Two-Step Approach for Clustering Proteins Based on Protein Interaction Profile

Abstract

High-throughput methods for detecting protein-protein interactions (PPI) have given researchers an initial global picture of protein interactions on a genomic scale. The huge data sets generated by such experiments pose new challenges in data analysis. Although clustering methods have been applied successfully in many areas of bioinformatics, many clustering algorithms cannot be readily applied to protein interaction data sets. One main problem is that the similarity between two proteins cannot be easily defined. This paper proposes a probabilistic model that defines this similarity in terms of conditional probabilities. We then propose a two-step method for estimating the similarity between two proteins based on their protein interaction profiles. In the first step, the model is trained on proteins with known annotation. In the second step, similarities are calculated from the trained model. Experiments show that our method improves performance.
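
The two-step idea can be illustrated with a minimal Python sketch. The abstract does not give the actual model, so everything below is an assumption made for illustration rather than the authors' method: an interaction profile is taken to be a protein's set of interaction partners; step 1 estimates, for each protein k, a smoothed conditional probability that two annotated partners of k share an annotation; step 2 combines the probabilities of shared partners with a noisy-OR rule. The names `train_partner_weights` and `similarity`, the smoothing parameters, and the toy data are all hypothetical.

```python
from itertools import combinations


def train_partner_weights(interactions, annotations, prior=0.5, pseudo=1.0):
    """Step 1 (assumed form): for each protein k, estimate the conditional
    probability that two annotated partners of k share an annotation."""
    weights = {}
    for k, partners in interactions.items():
        labeled = sorted(p for p in partners if p in annotations)
        same = total = 0
        for a, b in combinations(labeled, 2):
            total += 1
            if annotations[a] & annotations[b]:
                same += 1
        # Additive smoothing: with no annotated pairs the estimate is `prior`.
        weights[k] = (same + prior * pseudo) / (total + pseudo)
    return weights


def similarity(i, j, interactions, weights, default=0.5):
    """Step 2 (assumed form): noisy-OR over the partners shared by i and j."""
    shared = interactions.get(i, set()) & interactions.get(j, set())
    miss = 1.0
    for k in shared:
        miss *= 1.0 - weights.get(k, default)
    return 1.0 - miss


# Hypothetical toy data: protein -> set of interaction partners.
interactions = {
    "A": {"X", "Y"},
    "B": {"X", "Y"},
    "C": {"Y"},
    "X": {"A", "B"},
    "Y": {"A", "B", "C"},
}
# Hypothetical annotations for the proteins whose function is known.
annotations = {"A": {"kinase"}, "B": {"kinase"}, "C": {"transport"}}

weights = train_partner_weights(interactions, annotations)
print(similarity("A", "B", interactions, weights))  # 0.84375
print(similarity("A", "C", interactions, weights))  # 0.375
```

In this toy example, A and B share two partners, one of which (X) links only proteins with the same annotation, so their similarity is higher than that of A and C, which share only the less informative partner Y. The resulting pairwise scores could then be passed to any standard clustering algorithm that accepts a similarity matrix.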
