Modeling Sequence-Space Exploration and Emergence of Epistatic Signals in Protein Evolution
Overview
During their evolution, proteins explore sequence space through an interplay between random mutations and phenotypic selection. Here, we build upon recent progress in reconstructing data-driven fitness landscapes for families of homologous proteins to propose stochastic models of experimental protein evolution. These models quantitatively predict important features of experimentally evolved sequence libraries, such as fitness distributions and position-specific mutational spectra. They also allow us to efficiently simulate sequence libraries for a vast array of combinations of experimental parameters, such as sequence divergence, selection strength, and library size. We showcase the potential of the approach by reanalyzing two recent experiments that aimed to determine protein structure from signals of epistasis emerging in experimental sequence libraries. To be detectable, these signals require libraries that are both sufficiently large and sufficiently diverged. Our modeling framework offers a quantitative explanation for the different outcomes of recently published experiments. Furthermore, we can forecast the outcome of time- and resource-intensive evolution experiments, thereby opening a way to computationally optimize experimental protocols.
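To make the mutation-selection picture concrete, the following is a minimal sketch of how rounds of experimental evolution could be simulated on a pairwise (Potts-like) landscape. It is not the authors' model: the random fields and couplings stand in for a data-driven landscape inferred from homologous sequences, and the logistic survival rule, the resampling step, and all names (`make_toy_landscape`, `evolve_library`, `mu`, `beta`) are assumptions chosen for illustration.

```python
import math
import random

AMINO = 20  # alphabet size; gaps are ignored in this toy example


def make_toy_landscape(length, rng, scale=0.5):
    """Random fields h and couplings J: a toy stand-in for an inferred landscape."""
    h = [[rng.gauss(0.0, scale) for _ in range(AMINO)] for _ in range(length)]
    J = {}
    for i in range(length):
        for j in range(i + 1, length):
            J[(i, j)] = [[rng.gauss(0.0, scale / length) for _ in range(AMINO)]
                         for _ in range(AMINO)]
    return h, J


def energy(seq, h, J):
    """Potts-like energy; lower energy is treated as higher fitness (assumption)."""
    e = -sum(h[i][a] for i, a in enumerate(seq))
    for (i, j), block in J.items():
        e -= block[seq[i]][seq[j]]
    return e


def mutate(seq, mu, rng):
    """With probability mu per site, redraw the residue uniformly (may redraw the same one)."""
    return tuple(rng.randrange(AMINO) if rng.random() < mu else a for a in seq)


def survival_probability(e, e_ref, beta):
    """Logistic selection relative to a reference energy; beta sets selection strength."""
    x = beta * (e - e_ref)
    if x > 700.0:  # avoid overflow in math.exp
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def evolve_library(wildtype, h, J, rounds, library_size, mu, beta, rng):
    """Alternate mutation, selection, and resampling back to a fixed library size."""
    library = [tuple(wildtype)] * library_size
    e_ref = energy(wildtype, h, J)
    for _ in range(rounds):
        mutated = [mutate(s, mu, rng) for s in library]
        survivors = [s for s in mutated
                     if rng.random() < survival_probability(energy(s, h, J), e_ref, beta)]
        if not survivors:  # keep the previous library if nothing passes selection
            survivors = library
        library = [rng.choice(survivors) for _ in range(library_size)]
    return library


def mean_hamming(library, wildtype):
    """Average number of positions at which library members differ from the wildtype."""
    return sum(sum(a != b for a, b in zip(s, wildtype)) for s in library) / len(library)


if __name__ == "__main__":
    rng = random.Random(0)
    h, J = make_toy_landscape(10, rng)
    wt = tuple(rng.randrange(AMINO) for _ in range(10))
    lib = evolve_library(wt, h, J, rounds=5, library_size=100, mu=0.05, beta=1.0, rng=rng)
    print("mean divergence from wildtype:", mean_hamming(lib, wt))
```

In a sketch like this, the number of rounds and `mu` control sequence divergence, `beta` controls selection strength, and `library_size` controls sampling depth, which are the experimental parameters the abstract refers to.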
Entrenchment and contingency in neutral protein evolution with epistasis.
Schmelkin L, Carnevale V, Haldane A, Townsend J, Chung S, Levy R bioRxiv. 2025.
PMID: 39868204 PMC: 11761135. DOI: 10.1101/2025.01.09.632266.
Mazzocato Y, Frasson N, Sample M, Fregonese C, Pavan A, Caregnato A ACS Cent Sci. 2024; 10(12):2242-2252.
PMID: 39735311 PMC: 11672547. DOI: 10.1021/acscentsci.4c01428.
Chen J, Bisardi M, Lee D, Cotogno S, Zamponi F, Weigt M Nat Commun. 2024; 15(1):8441.
PMID: 39349467 PMC: 11442494. DOI: 10.1038/s41467-024-52614-w.
Emergent time scales of epistasis in protein evolution.
Di Bari L, Bisardi M, Cotogno S, Weigt M, Zamponi F Proc Natl Acad Sci U S A. 2024; 121(40):e2406807121.
PMID: 39325427 PMC: 11459137. DOI: 10.1073/pnas.2406807121.
Machine learning in biological physics: From biomolecular prediction to design.
Martin J, Lequerica Mateos M, Onuchic J, Coluzza I, Morcos F Proc Natl Acad Sci U S A. 2024; 121(27):e2311807121.
PMID: 38913893 PMC: 11228481. DOI: 10.1073/pnas.2311807121.