
Improving Protein Tertiary Structure Prediction by Deep Learning and Distance Prediction in CASP14

Overview
Journal Proteins
Date 2021 Jul 22
PMID 34291486
Citations 16
Abstract

Substantial progress in protein structure prediction has been made since CASP13 by utilizing deep learning and residue-residue distance prediction. Inspired by these advances, we improve our CASP14 MULTICOM protein structure prediction system by incorporating three new components: (a) a new deep learning-based protein inter-residue distance predictor to improve template-free (ab initio) tertiary structure prediction, (b) an enhanced template-based tertiary structure prediction method, and (c) distance-based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, the MULTICOM predictor ranked seventh out of 146 predictors in tertiary structure prediction and third out of 136 predictors in inter-domain structure prediction. The results demonstrate that template-free modeling based on deep learning and residue-residue distance prediction can predict the correct topology for almost all template-based modeling targets and a majority of hard targets (template-free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. Moreover, template-free modeling performs better than template-based modeling not only on hard targets but also on targets that have homologous templates. The performance of template-free modeling depends largely on the accuracy of distance prediction, which in turn is closely related to the quality of the multiple sequence alignments. The structural model quality assessment works well on targets for which enough good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed. MULTICOM is available at https://github.com/jianlin-cheng/MULTICOM_Human_CASP14/tree/CASP14_DeepRank3 and https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.
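
The quality-assessment failure mode described above can be illustrated with a minimal Python sketch. It is not taken from MULTICOM; the function names `skewness` and `ranking_loss`, and the toy scores, are assumptions invented for illustration. The sketch measures how skewed a pool's true quality scores are and how much quality is lost when the model ranked first by a predicted score is not the best one in the pool.

```python
from statistics import mean, pstdev


def skewness(values):
    """Population skewness (third standardized moment) of a list of scores."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return 0.0  # all scores identical: no asymmetry to measure
    return mean([((v - mu) / sigma) ** 3 for v in values])


def ranking_loss(predicted, true):
    """True score of the best model minus the true score of the model
    that the predicted scores rank first (0.0 means a perfect pick)."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


# Toy numbers invented for illustration only (NOT CASP14 data):
# five poor models and a single good one for a hypothetical hard target.
true_scores = [0.22, 0.24, 0.25, 0.23, 0.26, 0.71]
# Hypothetical predicted quality scores that fail to single out the good model.
predicted_scores = [0.30, 0.31, 0.35, 0.29, 0.33, 0.32]

if __name__ == "__main__":
    print("skewness of true scores:", round(skewness(true_scores), 2))
    print("ranking loss:", round(ranking_loss(predicted_scores, true_scores), 2))
```

In this toy pool the true scores are strongly right-skewed because only one model is good, and a predictor that misses that single model incurs a large ranking loss. This is the situation the abstract flags as difficult.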

Citing Articles

Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models.

Yue T, Wang Y, Zhang L, Gu C, Xue H, Wang W Int J Mol Sci. 2023; 24(21).

PMID: 37958843 PMC: 10649223. DOI: 10.3390/ijms242115858.


Precision Oncology Comes of Age: Designing Best-in-Class Small Molecules by Integrating Two Decades of Advances in Chemistry, Target Biology, and Data Science.

Stuart D, Guzman-Perez A, Brooijmans N, Jackson E, Kryukov G, Friedman A Cancer Discov. 2023; 13(10):2131-2149.

PMID: 37712571 PMC: 10551669. DOI: 10.1158/2159-8290.CD-23-0280.


Improving AlphaFold2-based protein tertiary structure prediction with MULTICOM in CASP15.

Liu J, Guo Z, Wu T, Roy R, Chen C, Cheng J Commun Chem. 2023; 6(1):188.

PMID: 37679431 PMC: 10484931. DOI: 10.1038/s42004-023-00991-6.


Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15.

Roy R, Liu J, Giri N, Guo Z, Cheng J Proteins. 2023; 91(12):1889-1902.

PMID: 37357816 PMC: 10749984. DOI: 10.1002/prot.26542.


Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15.

Roy R, Liu J, Giri N, Guo Z, Cheng J bioRxiv. 2023.

PMID: 36945536 PMC: 10028888. DOI: 10.1101/2023.03.08.531814.

