
Hierarchical Deep Learning for Predicting GO Annotations by Integrating Protein Knowledge

Overview
Journal: Bioinformatics
Specialty: Biology
Date: 2022 Aug 5
PMID: 35929781
Abstract

Motivation: Experimental testing and manual curation are the most precise ways of assigning Gene Ontology (GO) terms that describe protein functions. However, they are expensive and time-consuming, and they cannot keep pace with the exponential growth of data generated by high-throughput sequencing methods. Hence, researchers need reliable computational systems to help fill the gap with automatic function prediction. The results of the latest Critical Assessment of Function Annotation (CAFA) challenge revealed that GO term prediction remains a very challenging task. Recent developments in deep learning are pushing the frontiers of protein research and yielding new knowledge, thanks to the integration of data from multiple sources. However, the deep models developed so far for function prediction focus mainly on sequence data and have not yet achieved breakthrough performance.

Results: We propose DeeProtGO, a novel deep-learning model for predicting GO annotations by integrating protein knowledge. DeeProtGO was trained to solve 18 different prediction problems, defined by the three GO sub-ontologies, the type of proteins and the taxonomic kingdom. In our experiments, prediction quality increased as more protein knowledge was integrated. We also benchmarked DeeProtGO against state-of-the-art methods on public datasets and showed that it can effectively improve the prediction of GO annotations.

Availability and Implementation: DeeProtGO and a use case are available at https://github.com/gamerino/DeeProtGO.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

The CABANA model 2017-2022: research and training synergy to facilitate bioinformatics applications in Latin America.

Campos-Sanchez R, Willis I, Gopalasingam P, Lopez-Juarez D, Cristancho M, Brooksbank C. Front Educ (Lausanne). 2024; 9.

PMID: 39686965 PMC: 7617245. DOI: 10.3389/feduc.2024.1358620.


Optimizing Scorpion Toxin Processing through Artificial Intelligence.

Psenicnik A, Ojanguren-Affilastro A, Graham M, Hassan M, Abdel-Rahman M, Sharma P. Toxins (Basel). 2024; 16(10).

PMID: 39453213 PMC: 11511117. DOI: 10.3390/toxins16100437.


Osmoprotectants play a major role in the resistance to high levels of salinity stress - insights from a metabolomics and proteomics integrated approach.

Rodrigues Neto J, Salgado F, Braga I, Carvalho da Silva T, Belo Silva V, Leao A. Front Plant Sci. 2023; 14:1187803.

PMID: 37384354 PMC: 10296175. DOI: 10.3389/fpls.2023.1187803.


PFresGO: an attention mechanism-based deep-learning approach for protein annotation by integrating gene ontology inter-relationships.

Pan T, Li C, Bi Y, Wang Z, Gasser R, Purcell A. Bioinformatics. 2023; 39(3).

PMID: 36794913 PMC: 9978587. DOI: 10.1093/bioinformatics/btad094.


Elucidating the functional roles of prokaryotic proteins using big data and artificial intelligence.

Ardern Z, Chakraborty S, Lenk F, Kaster A. FEMS Microbiol Rev. 2023; 47(1).

PMID: 36725215 PMC: 9960493. DOI: 10.1093/femsre/fuad003.
