
The Impact of Multifunctional Genes on "Guilt by Association" Analysis

Overview
Journal PLoS One
Date 2011 Mar 3
PMID 21364756
Citations 111
Abstract

Many previous studies have shown that variants of "guilt-by-association" can yield gene function predictions with very high statistical confidence. These studies assume that a gene's "associations" in the data (e.g., its protein interaction partners) are necessary for establishing "guilt". In this paper we show that multifunctionality, rather than association, is a primary driver of gene function prediction. We first show that knowledge of the degree of multifunctionality alone can produce astonishingly strong performance when used as a predictor of gene function. We then demonstrate how multifunctionality is encoded in gene interaction data (such as protein interactions and coexpression networks) and how this can feed forward into gene function prediction algorithms. We find that high-quality gene function predictions can be made using data that possess no information on which gene interacts with which. By examining a wide range of networks from mouse, human and yeast, as well as multiple prediction methods and evaluation metrics, we provide evidence that this problem is pervasive and does not reflect the failings of any particular algorithm or data type. We propose computational controls that allow more meaningful estimates of gene function prediction performance. We suggest that this source of bias due to multifunctionality is important to control for, with widespread implications for the interpretation of genomics studies.
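
The central claim, that a gene's degree of multifunctionality can by itself predict function membership, can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper: the annotation matrix is synthetic, multifunctionality is approximated as a plain count of a gene's other annotations (the paper's exact scoring is not given in the abstract), and the helper names `average_ranks` and `auroc` are hypothetical. No network or interaction data are used at any point.

```python
import numpy as np

def average_ranks(x):
    """1-based ranks of x, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)               # highest rank within each tie group
    avg_rank = last_rank - (counts - 1) / 2.0   # midpoint of each tie group
    return avg_rank[inverse]

def auroc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")
    ranks = average_ranks(np.asarray(scores, dtype=float))
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

# Synthetic data (assumption): each gene has its own propensity to be
# annotated, so a minority of genes end up with many functions.
rng = np.random.default_rng(0)
n_genes, n_funcs = 500, 50
propensity = rng.beta(0.5, 5.0, size=n_genes)
annotations = rng.random((n_genes, n_funcs)) < propensity[:, None]

# Multifunctionality proxy: number of annotations per gene.
total = annotations.sum(axis=1)

mf_aucs, shuffled_aucs = [], []
shuffled = rng.permutation(total)  # control: same scores, gene identity scrambled
for f in range(n_funcs):
    labels = annotations[:, f]
    # Exclude the function being scored so its own label does not leak in.
    mf_aucs.append(auroc(total - labels, labels))
    shuffled_aucs.append(auroc(shuffled, labels))

mean_auc = float(np.nanmean(mf_aucs))
mean_shuffled_auc = float(np.nanmean(shuffled_aucs))
print(f"mean AUROC, multifunctionality ranking: {mean_auc:.3f}")
print(f"mean AUROC, shuffled ranking:           {mean_shuffled_auc:.3f}")
```

In this toy setting a single gene ranking, reused for every function, scores well above chance, while the shuffled ranking stays near 0.5. A comparison of this kind is one simple way to picture what a multifunctionality control might look like, under the assumptions stated above.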

Citing Articles

Single-cell transcriptomes reveal cell-type-specific and sample-specific gene function in human cancer.

Yuan H, Liang X, Zhang X, Cao Y Heliyon. 2025; 11(3):e42218.

PMID: 39959484 PMC: 11830296. DOI: 10.1016/j.heliyon.2025.e42218.


Gene and transcript expression patterns, coupled with isoform switching and long non-coding RNA dynamics in adipose tissue, underlie the longevity of Ames dwarf mice.

Cano-Besquet S, Park M, Berkley N, Wong M, Ashiqueali S, Noureddine S Geroscience. 2024.

PMID: 39405012 DOI: 10.1007/s11357-024-01383-x.


Node-degree aware edge sampling mitigates inflated classification performance in biomedical random walk-based graph representation learning.

Cappelletti L, Rekerle L, Fontana T, Hansen P, Casiraghi E, Ravanmehr V Bioinform Adv. 2024; 4(1):vbae036.

PMID: 38577542 PMC: 10994718. DOI: 10.1093/bioadv/vbae036.


scDrugPrio: a framework for the analysis of single-cell transcriptomics to address multiple problems in precision medicine in immune-mediated inflammatory diseases.

Schafer S, Smelik M, Sysoev O, Zhao Y, Eklund D, Lilja S Genome Med. 2024; 16(1):42.

PMID: 38509600 PMC: 10956347. DOI: 10.1186/s13073-024-01314-7.


CryptoCEN: A Co-Expression Network for Cryptococcus neoformans reveals novel proteins involved in DNA damage repair.

O'Meara M, Rapala J, Nichols C, Alexandre A, Billmyre R, Steenwyk J PLoS Genet. 2024; 20(2):e1011158.

PMID: 38359090 PMC: 10901339. DOI: 10.1371/journal.pgen.1011158.

