
Critical Assessment of High-throughput Standalone Methods for Secondary Structure Prediction

Overview
Journal Brief Bioinform
Specialty Biology
Date 2011 Jan 22
PMID 21252072
Citations 22
Abstract

Sequence-based prediction of protein secondary structure (SS) enjoys widespread and increasing use for the analysis and prediction of numerous structural and functional characteristics of proteins. The lack of a recent, comprehensive and large-scale comparison of the numerous prediction methods results in an often arbitrary selection of an SS predictor. To address this void, we compare and analyze 12 popular, standalone and high-throughput predictors on a large set of 1975 proteins to provide in-depth, novel and practical insights. We show that there is no universally best predictor and thus detailed comparative studies are needed to support informed selection of SS predictors for a given application. Our study shows that the three-state accuracy (Q3) and segment overlap (SOV3) of SS prediction currently reach 82% and 81%, respectively. We demonstrate that carefully designed consensus-based predictors improve the Q3 by an additional 2% and that homology modeling-based methods are significantly better, by 1.5% in Q3, than ab initio approaches. Our empirical analysis reveals that solvent-exposed and flexible coils are predicted with higher quality than buried and rigid coils, while the inverse is true for strands and helices. We also show that longer helices are easier to predict, in contrast to longer strands, which are harder to find. The current methods confuse 1-6% of strand residues with helical residues and vice versa, and they perform poorly for residues in the β-bridge and 3(10)-helix conformations. Finally, we compare predictions of the standalone implementations of four well-performing methods with their corresponding web servers.
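For readers unfamiliar with the Q3 measure quoted above, it is commonly computed as the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed one. The sketch below illustrates this under stated assumptions: the 8-to-3 state reduction shown (H, G, I to H; E, B to E; everything else to C) is one common convention and is not necessarily the one used in the study, and the function names and example strings are hypothetical.

```python
# Illustrative sketch only: the state reduction below is an assumed, commonly
# used convention, not necessarily the mapping applied in the study.
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_to_three_states(dssp: str) -> str:
    """Map DSSP-style 8-state labels to helix (H), strand (E) or coil (C)."""
    return "".join(DSSP_TO_3.get(state, "C") for state in dssp)


def q3(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose three-state label is correct."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical 16-residue example (made-up labels, not data from the study).
observed = reduce_to_three_states("HHHHGGG-EEEEBTS-")
predicted = "HHHHHCCCEEEECCCC"
print(f"Q3 = {q3(predicted, observed):.2f}%")
```

In this made-up example the two 3(10)-helix residues predicted as coil and the β-bridge residue predicted as coil count as errors, which mirrors the kind of residues the abstract reports as hard to predict. SOV3 is a segment-level measure and is not covered by this sketch.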
