Inferring Repeat-protein Energetics from Evolutionary Information
Natural protein sequences contain a record of their history. A common constraint within a given protein family is the ability to fold into specific structures, and it has been shown that the main native ensemble can be inferred by analyzing covariations in extant sequences. Still, many natural proteins that fold into the same structural topology have different stabilization energies, and these differences are often related to their physiological behavior. We propose a description of the energetic variation produced by sequence modifications in repeat proteins, systems in which the overall problem is simplified by their inherent symmetry. We explicitly account for single amino acid and pairwise interactions, and treat higher-order correlations with a single term. We show that the resulting evolutionary field can be interpreted in structural detail. We trace the variations in the energetic scores of natural proteins and relate them to their experimental characterization. The resulting energetic evolutionary field allows prediction of the folding free energy change for several mutants, and can be used to generate synthetic sequences that are statistically indistinguishable from their natural counterparts.
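The kind of score described in the abstract (single-site terms, pairwise terms, and one lumped term for higher-order correlations) can be illustrated with a minimal Potts-like sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the 21-letter alphabet, the parameter arrays `h` and `J`, the sign convention, and in particular the form of the single higher-order term, which is modeled hypothetically as a penalty `lam` times the number of identical positions between consecutive repeats. The parameters would have to be fitted to a family alignment, which this sketch does not attempt.

```python
import numpy as np

# Assumed alphabet: 20 amino acids plus a gap symbol (not specified in the source).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"
AA_INDEX = {aa: k for k, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Map a sequence string to an integer array of alphabet indices."""
    return np.array([AA_INDEX[aa] for aa in seq], dtype=int)


def energy(seq, h, J, lam=0.0, repeat_len=None):
    """Potts-like score: site terms + pair terms + one lumped term.

    h has shape (L, q) and J has shape (L, L, q, q); only pairs i < j are used.
    lam and repeat_len define a HYPOTHETICAL single higher-order term that
    counts identical positions between consecutive repeats.
    """
    s = encode(seq)
    L = len(s)
    # Single amino acid contribution.
    e = -h[np.arange(L), s].sum()
    # Pairwise contribution over all position pairs i < j.
    for i in range(L):
        for j in range(i + 1, L):
            e -= J[i, j, s[i], s[j]]
    # Single lumped term (assumed form): similarity between adjacent repeats.
    if repeat_len:
        n_rep = L // repeat_len
        reps = s[: n_rep * repeat_len].reshape(n_rep, repeat_len)
        identical = int((reps[:-1] == reps[1:]).sum())
        e += lam * identical
    return float(e)


def delta_energy(wild_type, position, new_aa, h, J, **kwargs):
    """Score difference (mutant minus wild type) for a point substitution."""
    mutant = wild_type[:position] + new_aa + wild_type[position + 1:]
    return energy(mutant, h, J, **kwargs) - energy(wild_type, h, J, **kwargs)


if __name__ == "__main__":
    L, q = 4, len(AMINO_ACIDS)
    h = np.zeros((L, q))
    J = np.zeros((L, L, q, q))
    h[0, AA_INDEX["A"]] = 1.0  # toy parameter, not a fitted value
    print(energy("ACAC", h, J, lam=0.5, repeat_len=2))
    print(delta_energy("ACAC", 0, "D", h, J, lam=0.5, repeat_len=2))
```

Under these assumptions, `delta_energy` is the quantity one would compare against measured folding free energy changes of point mutants; whether and how the two correlate depends on the fitted parameters and is not something this toy example can show.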