
Site-Specific Amino Acid Distributions Follow a Universal Shape

Overview
Journal: J Mol Evol
Specialty: Biochemistry
Date: 2020 Nov 24
PMID: 33230664
Citations: 3
Abstract

In many applications of evolutionary inference, a model of protein evolution needs to be fitted to the amino acid variation at individual sites in a multiple sequence alignment. Most existing models fall into one of two extremes: Either they provide a coarse-grained description that lacks biophysical realism (e.g., dN/dS models), or they require a large number of parameters to be fitted (e.g., mutation-selection models). Here, we ask whether a middle ground is possible: Can we obtain a realistic description of site-specific amino acid frequencies while severely restricting the number of free parameters in the model? We show that a distribution with a single free parameter can accurately capture the variation in amino acid frequency at most sites in an alignment, as long as we are willing to restrict our analysis to predicting amino acid frequencies by rank rather than by amino acid identity. This result holds equally well both in alignments of empirical protein sequences and of sequences evolved under a biophysically realistic all-atom force field. Our analysis reveals a near universal shape of the frequency distributions of amino acids. This insight has the potential to lead to new models of evolution that have both increased realism and a limited number of free parameters.
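The idea of describing a site by rank-ordered frequencies and a single free parameter can be sketched in a few lines of Python. The abstract does not name the distribution, so the exponential decay in rank used here, p_k proportional to exp(-lambda * k), is an assumption chosen for illustration, as are the function names and the toy alignment column. The sketch counts amino acids in one column, sorts the counts from most to least frequent, and fits lambda by maximum likelihood, which for this family reduces to matching the observed mean rank.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ranked_counts(column):
    """Count the 20 amino acids in one alignment column, sorted high to low.

    Gaps and other non-standard characters are ignored.
    """
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    return sorted((counts.get(a, 0) for a in AMINO_ACIDS), reverse=True)


def predicted_frequencies(lam, n=20):
    """Assumed one-parameter family: p_k proportional to exp(-lam * k), k = 0..n-1."""
    weights = [math.exp(-lam * k) for k in range(n)]
    z = sum(weights)
    return [w / z for w in weights]


def expected_rank(lam, n=20):
    """Mean rank under the assumed distribution."""
    return sum(k * p for k, p in enumerate(predicted_frequencies(lam, n)))


def fit_lambda(ranked, lo=0.0, hi=50.0, iters=100):
    """Maximum-likelihood lambda for ranked counts, found by bisection.

    For this exponential family the likelihood is maximized where the model's
    expected rank equals the observed mean rank; expected_rank decreases
    monotonically in lambda, so bisection on [lo, hi] is sufficient.
    """
    total = sum(ranked)
    if total == 0:
        raise ValueError("column contains no standard amino acids")
    mean_rank = sum(k * n for k, n in enumerate(ranked)) / total
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_rank(mid, len(ranked)) > mean_rank:
            lo = mid  # model is too flat: increase lambda
        else:
            hi = mid  # model is too peaked: decrease lambda
    return (lo + hi) / 2


# Hypothetical alignment column (one character per sequence, '-' is a gap).
column = "AAAAAAAALLLLVVI-"
ranked = ranked_counts(column)
lam = fit_lambda(ranked)
model = predicted_frequencies(lam)
for rank in range(5):
    observed = ranked[rank] / sum(ranked)
    print(f"rank {rank}: observed {observed:.3f}, model {model[rank]:.3f}")
```

Because the fit uses only ranks, two sites whose most common residues differ (say, alanine at one and leucine at the other) can share the same lambda; this is the restriction to "rank rather than amino acid identity" that the abstract describes. A fully conserved column drives lambda to the upper bound `hi`, and a perfectly uniform column drives it to zero.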

Citing Articles

A fitness distribution law for amino-acid replacements.

Sun M, Stoltzfus A, McCandlish D bioRxiv. 2024.

PMID: 39464166 PMC: 11507765. DOI: 10.1101/2024.10.11.617952.


Substitution Models of Protein Evolution with Selection on Enzymatic Activity.

Ferreiro D, Khalil R, Sousa S, Arenas M Mol Biol Evol. 2024; 41(2).

PMID: 38314876 PMC: 10873502. DOI: 10.1093/molbev/msae026.


Multiple mechanisms explain loss of anthocyanins from betalain-pigmented Caryophyllales, including repeated wholesale loss of a key anthocyanidin synthesis enzyme.

Pucker B, Walker-Hale N, Dzurlic J, Yim W, Cushman J, Crum A New Phytol. 2023; 241(1):471-489.

PMID: 37897060 PMC: 10952170. DOI: 10.1111/nph.19341.
