
Reduction Strategies for Hierarchical Multi-label Classification in Protein Function Prediction

Overview
Publisher BioMed Central
Specialty Biology
Date 2016 Sep 16
PMID 27627880
Citations 8
Abstract

Background: Hierarchical multi-label classification is a classification task in which the classes to be predicted are hierarchically organized, and each instance can be assigned to classes belonging to more than one path in the hierarchy. This scenario is typical of protein function prediction, since each protein may perform many functions, which can be further specialized into sub-functions. We present a new hierarchical multi-label classification method based on multiple neural networks for the task of protein function prediction. A set of neural networks is trained incrementally, each being responsible for predicting the classes that belong to a given level of the hierarchy.

Results: The proposed method extends our previous work: the output of the neural network at one level complements the feature vectors used as input to train the neural network at the next level. We experimentally compare this novel method with several other reduction strategies, showing that it obtains the best predictive performance. Empirical results also show that the proposed method achieves better or comparable predictive performance when compared with state-of-the-art methods for hierarchical multi-label classification in the context of protein function prediction.
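The level-by-level scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: each "network" is simplified to a single layer of sigmoid units trained by batch gradient descent, the input to each level is assumed to be the original features concatenated with the previous level's outputs only, and the function names (`train_level`, `train_chain`, `predict_chain`) and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    # Clamp z so math.exp cannot overflow for large |z|.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict_level(weights, inputs):
    """Return one probability per class for every input row."""
    return [[sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
             for w in weights] for x in inputs]

def train_level(inputs, targets, epochs=500, lr=0.5):
    """Fit one sigmoid unit per class with batch gradient descent."""
    n_features, n_classes = len(inputs[0]), len(targets[0])
    # weights[c][0] is the bias of class c; the rest are feature weights.
    weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        preds = predict_level(weights, inputs)
        for x, y, p in zip(inputs, targets, preds):
            for c in range(n_classes):
                step = lr * (p[c] - y[c]) / len(inputs)
                weights[c][0] -= step
                for j, xj in enumerate(x):
                    weights[c][j + 1] -= step * xj
    return weights

def train_chain(features, targets_per_level):
    """Train one model per hierarchy level, feeding outputs forward."""
    models, inputs = [], features
    for targets in targets_per_level:
        weights = train_level(inputs, targets)
        models.append(weights)
        outputs = predict_level(weights, inputs)
        # Assumption: next level sees original features + this level's outputs.
        inputs = [x + p for x, p in zip(features, outputs)]
    return models

def predict_chain(models, features):
    """Return a list of per-level probability matrices."""
    per_level, inputs = [], features
    for weights in models:
        outputs = predict_level(weights, inputs)
        per_level.append(outputs)
        inputs = [x + p for x, p in zip(features, outputs)]
    return per_level

# Hypothetical toy data: 4 instances, 2 features, a 2-level class hierarchy.
features = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
level1 = [[1, 0], [1, 0], [0, 1], [0, 1]]              # classes A, B
level2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]  # classes A.1, A.2, B.1

models = train_chain(features, [level1, level2])
predictions = predict_chain(models, features)
```

In this sketch the second-level model has two extra input weights per class, one for each first-level output, which is how information about the parent classes reaches the child-level predictor.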

Conclusions: The experiments showed that using the output of one level as input to the next level contributed to better classification results. We believe the method was able to learn the relationships between the protein functions during training, and that this information was useful for classification. We also identified the functional classes for which our method performed better.

Citing Articles

A Novel Piano Arrangement Timbre Intelligent Recognition System Using Multilabel Classification Technology and KNN Algorithm.

Lu Y, Chu C Comput Intell Neurosci. 2022; 2022:2205936.

PMID: 35855792 PMC: 9288348. DOI: 10.1155/2022/2205936.


Evaluating hierarchical machine learning approaches to classify biological databases.

Rezende P, Xavier J, Ascher D, Fernandes G, Pires D Brief Bioinform. 2022; 23(4).

PMID: 35724625 PMC: 9310517. DOI: 10.1093/bib/bbac216.


Survey of Image Processing Techniques for Brain Pathology Diagnosis: Challenges and Opportunities.

Cenek M, Hu M, York G, Dahl S Front Robot AI. 2021; 5:120.

PMID: 33500999 PMC: 7805910. DOI: 10.3389/frobt.2018.00120.


Learning important features from multi-view data to predict drug side effects.

Liang X, Zhang P, Li J, Fu Y, Qu L, Chen Y J Cheminform. 2021; 11(1):79.

PMID: 33430979 PMC: 6916463. DOI: 10.1186/s13321-019-0402-3.


PSIONplus Server for Accurate Multi-Label Prediction of Ion Channels and Their Types.

Gao J, Wei H, Cano A, Kurgan L Biomolecules. 2020; 10(6).

PMID: 32517331 PMC: 7355608. DOI: 10.3390/biom10060876.

