Predicting Protein Functions with Message Passing Algorithms
Motivation: In the last few years, growing interest in biology has shifted toward the problem of optimal information extraction from the huge amount of data generated by large-scale, high-throughput techniques. One of the most relevant issues to emerge recently is that of correctly and reliably predicting the functions of a functionally undetermined protein by exploiting information from the whole network of proteins that physically interact with it. In the present work, we refer to an 'observed' protein as one present in the protein-protein interaction networks published in the literature.
Methods: The method proposed in this paper is based on a message passing algorithm known as Belief Propagation, which takes as input the network of physical interactions between proteins and a catalog of known protein functions, and returns, for each unclassified protein, the probability of having a chosen function. The implementation of the algorithm allows for fast online analysis and can easily be generalized to more complex graph topologies such as hypergraphs, i.e. complexes of more than two interacting proteins.
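As an illustration of the kind of computation described in the Methods, the following is a minimal sketch of sum-product Belief Propagation on a pairwise interaction graph. It is not the authors' implementation: the Potts-style potential (interacting proteins are more likely to share a function), the `coupling` parameter, the soft clamping constant `eps`, the function name `belief_propagation`, and the protein labels in the usage example are all assumptions made for illustration.

```python
import math


def belief_propagation(edges, known, n_functions, coupling=1.0, n_iters=50):
    """Loopy sum-product BP; returns marginals for unannotated proteins.

    edges: list of (u, v) physical interactions (undirected)
    known: dict mapping annotated proteins to a function index
    n_functions: number of candidate functions (at least 2)
    """
    # Adjacency sets of the undirected interaction graph.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    eps = 1e-6  # assumed softening of the evidence on annotated proteins

    def local_evidence(node):
        if node in known:
            return [1.0 - eps if f == known[node] else eps / (n_functions - 1)
                    for f in range(n_functions)]
        return [1.0 / n_functions] * n_functions

    phi = {node: local_evidence(node) for node in neighbors}
    same = math.exp(coupling)  # assumed weight when two neighbors agree

    # messages[(i, j)][f]: message from i to j about j having function f.
    messages = {(i, j): [1.0 / n_functions] * n_functions
                for i in neighbors for j in neighbors[i]}

    for _ in range(n_iters):
        updated = {}
        for (i, j) in messages:
            # Evidence at i times messages from all neighbors of i except j.
            incoming = list(phi[i])
            for k in neighbors[i]:
                if k != j:
                    for f in range(n_functions):
                        incoming[f] *= messages[(k, i)][f]
            # Sum over the states of i, weighted by the pairwise potential.
            total = sum(incoming)
            out = [total + (same - 1.0) * incoming[f]
                   for f in range(n_functions)]
            norm = sum(out)
            updated[(i, j)] = [x / norm for x in out]
        messages = updated

    # Beliefs: local evidence times all incoming messages, normalized.
    marginals = {}
    for i in neighbors:
        if i in known:
            continue
        belief = list(phi[i])
        for k in neighbors[i]:
            for f in range(n_functions):
                belief[f] *= messages[(k, i)][f]
        norm = sum(belief)
        marginals[i] = [b / norm for b in belief]
    return marginals


# Hypothetical four-protein chain: A and D are annotated, B and C are not.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
known = {"A": 0, "D": 1}
print(belief_propagation(edges, known, n_functions=2))
```

On a tree-shaped network such as this chain the messages converge to the exact marginals; on networks with loops the result is an approximation, and a fixed iteration count stands in here for a proper convergence check.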
Results: We benchmark our method on two Saccharomyces cerevisiae protein-protein interaction networks and on the Database of Interacting Proteins. The validity of our approach is successfully tested against other available techniques.
Contact: leone@isiosf.isi.it
Supplementary Information: http://isiosf.isi.it/~pagnani