Identification and Analysis of Domains in Proteins
An automatic algorithm based on inter-residue contacts is presented to identify domains in proteins. The results of the algorithm are compared with an assignment made by inspection, guided by the authors' descriptions in the literature. The authors' and the algorithm's assignments for a chain were considered to agree if the same number of domains was identified and if the assignments were the same for at least 95% of the residues. By this criterion, the algorithm agreed with the authors' assignment for 78% of the 284 non-redundant chains considered. When some of the authors' assignments were re-evaluated in light of the algorithm's results, agreement rose to 84%. The algorithm is therefore a useful tool for data validation in domain assignment. The authors' assignments of domains were analysed for structural principles of domains. The numbers of chains forming one, two, three, four and five domains are 197, 67, 13, 6 and 1, respectively. Most domains in multidomain proteins are formed from continuous segments, and the domains within a protein usually adopt the same structural class. Distributions of the number of residues and the ellipticity of domains and chains are presented. The relationship between accessible surface area and molecular weight for domains and chains is examined.
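The agreement criterion described above (same number of domains, and identical assignment for at least 95% of residues) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `assignments_agree`, the per-residue label lists, and the step that tries every relabelling of the algorithm's domains are assumptions made for illustration; the abstract does not say how domain labels were matched between the two assignments.

```python
from itertools import permutations


def assignments_agree(author, algorithm, threshold=0.95):
    """Return True if two per-residue domain assignments agree.

    author, algorithm: equal-length sequences with one domain label per
    residue (hypothetical input format). Labels need not use the same
    names in both assignments.
    """
    if len(author) != len(algorithm) or len(author) == 0:
        raise ValueError("assignments must be non-empty and of equal length")

    author_domains = sorted(set(author))
    algorithm_domains = sorted(set(algorithm))

    # Condition 1: the same number of domains must be identified.
    if len(author_domains) != len(algorithm_domains):
        return False

    # Condition 2: at least `threshold` of residues assigned identically.
    # Assumption: labels are arbitrary, so try every relabelling of the
    # algorithm's domains and keep the best match. This is cheap because
    # chains in the data set have at most five domains.
    best_matches = 0
    for perm in permutations(algorithm_domains):
        mapping = dict(zip(perm, author_domains))
        matches = sum(1 for a, b in zip(author, algorithm) if a == mapping[b])
        best_matches = max(best_matches, matches)

    return best_matches / len(author) >= threshold


# Toy 100-residue chain with a domain boundary shifted by two residues.
author = [1] * 60 + [2] * 40
algorithm = ["B"] * 58 + ["A"] * 42
print(assignments_agree(author, algorithm))
```

In this toy chain, 98 of the 100 residues receive the same domain once the labels are matched, so the two assignments count as agreeing. Shifting the boundary by ten residues, or splitting the chain into a different number of domains, would fail the criterion.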