SPICKER: a Clustering Approach to Identify Near-native Protein Folds
We have developed SPICKER, a simple and efficient strategy to identify near-native folds by clustering protein structures generated during computer simulations. In general, the most populated clusters tend to be closer to the native conformation than the lowest energy structures. To assess the generality of the approach, we applied SPICKER to 1489 representative benchmark proteins of ≤200 residues that cover the PDB at the level of 35% sequence identity; each protein has up to 280,000 structure decoys generated using the recently developed TASSER (Threading ASSembly Refinement) algorithm. The best of the top five identified folds has a root-mean-square deviation from native (RMSD) in the top 1.4% of all decoys. For 78% of the proteins, the RMSD from native of the identified models differs from that of the best individual decoy by less than 1 Å; most of these proteins belong to the targets with converged conformational distributions. Although native fold identification from divergent decoy structures remains a challenge, our overall results show significant improvement over our previous clustering algorithms.
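The idea of ranking folds by cluster population rather than by energy can be illustrated with a minimal sketch. The abstract does not describe SPICKER's actual procedure, so the code below is only a generic greedy scheme under stated assumptions: a precomputed pairwise RMSD matrix, a fixed cutoff, and up to five clusters. The function name `cluster_by_population` and its parameters are hypothetical and are not taken from the SPICKER program.

```python
import numpy as np

def cluster_by_population(rmsd, cutoff, max_clusters=5):
    """Greedy population-based clustering on a pairwise RMSD matrix.

    Illustrative only: a generic scheme, not the published SPICKER code.
    Returns a list of (center_index, member_indices) in extraction order,
    so the first entry is the most populated cluster.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    remaining = np.ones(rmsd.shape[0], dtype=bool)  # decoys not yet assigned
    clusters = []
    while remaining.any() and len(clusters) < max_clusters:
        # Neighbor relation restricted to decoys that are still unassigned.
        neighbors = (rmsd <= cutoff) & remaining[None, :] & remaining[:, None]
        counts = neighbors.sum(axis=1)
        # The decoy with the most neighbors becomes the cluster center.
        center = int(np.argmax(counts))
        members = np.flatnonzero(neighbors[center])
        clusters.append((center, members.tolist()))
        # Remove the whole cluster before searching for the next one.
        remaining[members] = False
    return clusters
```

In this sketch the cluster center, not the lowest-energy decoy, is what gets reported as a candidate fold. A production tool would also need to compute the RMSD matrix after optimal superposition and to choose the cutoff sensibly for each target; both steps are omitted here.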