Prediction of Protein Domain Boundaries from Inverse Covariances
It has been known since the days when relatively few structures had been solved that longer protein chains often contain multiple domains, which may fold separately and act as reusable functional modules found in many contexts. In many structural biology tasks, structure prediction in particular, it is very useful to be able to identify domains within a structure and analyze these regions separately. However, this task has proven exceptionally difficult when only sequence data are used, with relatively little improvement over the naive method of choosing boundaries based on the size distributions of observed domains. The recent significant improvement in contact prediction provides a new source of information for domain prediction. We test several methods for using this information, including a kernel-smoothing approach and methods based on building alpha-carbon models, and compare their performance with a length-based predictor, a homology search method, and four published sequence-based predictors: DOMCUT, DomPRO, DLP-SVM, and SCOOBY-DOmain. We show that the kernel-smoothing method is significantly better than the other ab initio predictors when both single-domain and multidomain targets are considered, and is not significantly different from the homology-based method. When only multidomain targets are considered, the kernel-smoothing method outperforms all of the published methods except DLP-SVM. The kernel-smoothing method therefore represents a potentially useful improvement to ab initio domain prediction.
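The abstract does not spell out the kernel-smoothing procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each predicted contact (i, j) contributes a Gaussian kernel centred on its midpoint, and that a domain boundary is a sufficiently deep local minimum of the resulting contact-density profile. The function names and the parameters `bandwidth`, `min_domain_len`, and `depth_fraction` are hypothetical choices made for illustration.

```python
import math


def smoothed_contact_density(contacts, length, bandwidth=5.0):
    """Gaussian-kernel-smoothed density of contact midpoints along the chain.

    contacts: iterable of (i, j) residue index pairs (0-based), e.g. the
    top-ranked pairs from a contact predictor.
    Assumption: each contact is represented by its midpoint (i + j) / 2.
    """
    density = [0.0] * length
    for i, j in contacts:
        mid = (i + j) / 2.0
        for k in range(length):
            z = (k - mid) / bandwidth
            density[k] += math.exp(-0.5 * z * z)
    return density


def predict_boundaries(density, min_domain_len=30, depth_fraction=0.5):
    """Return positions that are deep local minima of the density profile.

    A position k is reported when it is a strict local minimum, lies at
    least min_domain_len residues from either terminus, and its density is
    below depth_fraction times the mean density (hypothetical thresholds).
    An empty list means the chain is predicted to be single-domain.
    """
    if not density:
        return []
    mean = sum(density) / len(density)
    boundaries = []
    for k in range(min_domain_len, len(density) - min_domain_len):
        is_local_min = density[k] < density[k - 1] and density[k] < density[k + 1]
        if is_local_min and density[k] < depth_fraction * mean:
            boundaries.append(k)
    return boundaries


# Toy example: two blocks of local contacts separated by a contact-poor gap.
toy_contacts = [(i, i + 10) for i in range(0, 40)] + \
               [(i, i + 10) for i in range(51, 91)]
profile = smoothed_contact_density(toy_contacts, length=101)
print(predict_boundaries(profile))  # → [50]
```

In this sketch the depth threshold lets the predictor return no boundary at all for a chain with uniformly dense contacts, which matters because the evaluation described above includes single-domain as well as multidomain targets.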