
Characterization of Intrinsically Disordered Regions in Proteins Informed by Human Genetic Diversity

Overview
Specialty Biology
Date 2022 Mar 11
PMID 35275927
Abstract

All proteomes contain both proteins and polypeptide segments that do not form a defined three-dimensional structure yet are biologically active; these are called intrinsically disordered proteins and regions (IDPs and IDRs). Most of these IDPs/IDRs lack useful functional annotation, limiting our understanding of their importance for organism fitness. Here we characterized IDRs using protein sequence annotations of functional sites and regions available in the UniProt knowledgebase ("UniProt features": active site, ligand-binding pocket, regions mediating protein-protein interactions, etc.). By measuring the statistical enrichment of twenty-five UniProt features in 981 IDRs of 561 human proteins, we identified eight features that are commonly located in IDRs. We then collected genetic variant data from general-population and patient-based databases and evaluated the prevalence of population and pathogenic variations in IDPs/IDRs. We observed that some IDRs tolerate 2 to 12 times more single amino acid-substituting missense mutations than synonymous changes in the general population. However, we also found that 37% of all germline pathogenic mutations are located in disordered regions of 96 proteins. Based on the observed-to-expected frequency of mutations, we categorized 34 IDRs in 20 proteins (DDX3X, KIT, RB1, etc.) as intolerant to mutation. Finally, using statistical analysis and a machine learning approach, we demonstrate that mutation-intolerant IDRs carry a distinct signature of functional features. Our study presents a novel approach to assign functional importance to IDRs by leveraging the wealth of available genetic data, which will aid in a deeper understanding of the role of IDRs in biological processes and disease mechanisms.
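As an illustration of what "statistical enrichment" of a feature in IDRs can mean, the sketch below runs a one-sided Fisher's exact test on a 2x2 table of residue counts. The abstract does not say which test the authors used, so the choice of test, the function name `fisher_enrichment`, and all counts are assumptions made for illustration only.

```python
from math import comb


def fisher_enrichment(idr_with, idr_without, other_with, other_without):
    """One-sided Fisher's exact test for over-representation in IDRs.

    Arguments are hypothetical residue counts:
      idr_with      - residues inside IDRs that carry the feature
      idr_without   - residues inside IDRs that do not
      other_with    - residues outside IDRs that carry the feature
      other_without - residues outside IDRs that do not
    Returns (odds_ratio, p_value).
    """
    a, b, c, d = idr_with, idr_without, other_with, other_without
    n_idr = a + b            # residues in IDRs
    n_feat = a + c           # residues carrying the feature
    n_total = a + b + c + d  # all residues considered

    # Hypergeometric tail: probability of seeing at least `a` feature
    # residues inside IDRs if the feature were placed independently of disorder.
    upper = min(n_idr, n_feat)
    tail = sum(
        comb(n_feat, k) * comb(n_total - n_feat, n_idr - k)
        for k in range(a, upper + 1)
    )
    p_value = tail / comb(n_total, n_idr)

    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value


if __name__ == "__main__":
    # Made-up counts, not taken from the study.
    odds, p = fisher_enrichment(120, 880, 300, 8700)
    print(f"odds ratio = {odds:.2f}, one-sided p = {p:.3g}")
```

An odds ratio above 1 with a small p-value would mark the feature as enriched in IDRs under these assumptions. When many features are tested, as with the twenty-five here, a multiple-testing correction would normally follow.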

Citing Articles

Discordance between a deep learning model and clinical-grade variant pathogenicity classification in a rare disease cohort.

Kong S, Lee I, Collen L, Field M, Manrai A, Snapper S. NPJ Genom Med. 2025; 10(1):17.

PMID: 40021654 PMC: 11871343. DOI: 10.1038/s41525-025-00480-w.


Conservation of OFD1 Protein Motifs: Implications for Discovery of Novel Interactors and the OFD1 Function.

Jagodzik P, Zietkiewicz E, Bukowy-Bieryllo Z. Int J Mol Sci. 2025; 26(3).

PMID: 39940934 PMC: 11818881. DOI: 10.3390/ijms26031167.


Prediction and assessment of deleterious and disease causing nonsynonymous single nucleotide polymorphisms (nsSNPs) in human gene: An study.

Kamal M, Teeya S, Rahman M, Talukder M, Sarmin S, Wani T. Heliyon. 2024; 10(12):e32791.

PMID: 38994097 PMC: 11237951. DOI: 10.1016/j.heliyon.2024.e32791.


Molecular Docking of Intrinsically Disordered Proteins: Challenges and Strategies.

Patel K, Chavda D, Manna M. Methods Mol Biol. 2024; 2780:165-201.

PMID: 38987470 DOI: 10.1007/978-1-0716-3985-6_11.


Discordance between a deep learning model and clinical-grade variant pathogenicity classification in a rare disease cohort.

Kong S, Lee I, Collen L, Manrai A, Snapper S, Mandl K. medRxiv. 2024.

PMID: 38826236 PMC: 11142383. DOI: 10.1101/2024.05.22.24307756.

