The Information Content of Glutamine-Rich Sequences Define Protein Functional Characteristics
Overview
Abnormally expanded glutamine (Q) repeats within specific proteins (e.g., huntingtin) are the well-established cause of several neurodegenerative diseases, including Huntington disease and spinocerebellar ataxias. However, the impact of "expanded Q" stretches on protein function is not well understood, mostly because little is known about the physiological role of Q repeats and the mechanism by which these repeats achieve functional specificity. Indeed, it is intriguing that regions with such low complexity (low information content) can display exquisite functional specificity, which prompts the question: where is this information stored? By applying biochemical/structural constraints and statistical analysis of protein composition, we identified Q-rich regions present in coiled coils of yeast transcription factors and endocytic proteins. Our analysis indicated the existence of non-Q amino acids differentially enriched in or excluded from the Q-rich regions of one protein group versus the other. Importantly, when the non-Q amino acids from an endocytic protein were exchanged for the ones enriched in the Q-rich regions of transcription factors, the resulting protein was unable to localize to the plasma membrane and was instead found in the nucleus. These results indicate that while Q repeats can efficiently engage in binding, the non-Q amino acids provide essential specificity information. We speculate that coupling low-complexity regions with information-intensive determinants might be a strategy used in many protein systems involved in different biological processes.