Protein Fold Recognition Based on Multi-view Modeling

Overview

Journal: Bioinformatics

Publisher: Oxford University Press

Specialty: Biology

Date: 2019 Jan 23

PMID: 30668845

Citations: 19

Authors

Ke Yan

Xiaozhao Fang

Yong Xu

Bin Liu

Abstract

Motivation: Protein fold recognition has attracted increasing attention because it is critical for studies of the 3D structures of proteins and drug design. Researchers have been extensively studying this important task, and several features with high discriminative power have been proposed. However, the development of methods that efficiently combine these features to improve the predictive performance remains a challenging problem.

Results: In this study, we propose two algorithms: MV-fold and MT-fold. MV-fold is a new computational predictor for fold recognition based on a multi-view learning model. Different protein features, including evolutionary information, secondary structure information and physicochemical properties, are treated as different views of a protein, and these views together constitute the latent space. The ε-dragging technique is employed to enlarge the margins between different protein folds, which improves the predictive performance of MV-fold. MV-fold is then combined with two template-based methods, HHblits and HMMER. The resulting ensemble method, MT-fold, incorporates the advantages of both discriminative and template-based methods. Experimental results on five widely used benchmark datasets (DD, RDD, EDD, TG and LE) show that the proposed methods outperform some state-of-the-art methods in this field, indicating that MV-fold and MT-fold are useful computational tools for protein fold recognition, protein homology detection and protein sequence analysis. Finally, we constructed an updated and rigorous benchmark dataset based on SCOPe (version 2.07) to fairly evaluate the proposed method, which achieved stable performance on this new dataset. We expect this dataset to become a widely used benchmark for fairly evaluating different fold recognition methods.
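To make the ε-dragging idea concrete, the following is a minimal sketch of the generic technique as it is commonly used in discriminative least squares regression: one-hot targets are pushed apart by a learned non-negative slack matrix, so true-class targets move above 1 and other-class targets move below 0. It is not the MV-fold implementation. In particular, the views are simply concatenated here rather than projected into a latent space, and the function names (`epsilon_dragging_fit`, `predict`), the regularization weight `lam` and the toy data are all assumptions made for illustration.

```python
import numpy as np

def epsilon_dragging_fit(X, labels, n_classes, lam=1.0, n_iter=20):
    """Ridge regression onto one-hot targets relaxed by epsilon-dragging.

    Hypothetical helper: alternates between solving for the weights W
    and updating the non-negative dragging matrix M.
    """
    n, d = X.shape
    Y = np.zeros((n, n_classes))
    Y[np.arange(n), labels] = 1.0
    # Dragging direction: +1 for the true class, -1 for all other classes.
    B = np.where(Y == 1.0, 1.0, -1.0)
    M = np.zeros_like(Y)

    Xb = np.hstack([X, np.ones((n, 1))])   # append a bias column
    reg = lam * np.eye(d + 1)
    reg[-1, -1] = 0.0                      # leave the bias unpenalized
    A = Xb.T @ Xb + reg

    for _ in range(n_iter):
        T = Y + B * M                      # relaxed regression targets
        W = np.linalg.solve(A, Xb.T @ T)   # closed-form ridge solution
        R = Xb @ W - Y                     # residual w.r.t. one-hot targets
        M = np.maximum(B * R, 0.0)         # optimal non-negative slack
    return W

def predict(W, X):
    """Assign each sample to the class with the largest regression output."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.argmax(Xb @ W, axis=1)

# Toy stand-in for two feature views (e.g. evolutionary and structural).
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 30)
view1 = np.array([[0, 0], [5, 0], [0, 5]], float)[labels]
view2 = np.array([[0, 0, 0], [4, 0, 0], [0, 4, 0]], float)[labels]
X = np.hstack([view1 + 0.5 * rng.standard_normal((90, 2)),
               view2 + 0.5 * rng.standard_normal((90, 3))])
W = epsilon_dragging_fit(X, labels, n_classes=3)
print("training accuracy:", (predict(W, X) == labels).mean())
```

Because the slack only ever moves targets away from the decision boundary, samples that are already classified with a comfortable margin stop contributing to the loss, which is the sense in which the margins between classes are enlarged.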

Supplementary Information: Supplementary data are available at Bioinformatics online.
