
Population Genetics Based Phylogenetics Under Stabilizing Selection for an Optimal Amino Acid Sequence: A Nested Modeling Approach

Overview
Journal: Mol Biol Evol
Specialty: Biology
Date: 2018 Dec 7
PMID: 30521036
Citations: 2
Abstract

We present a new phylogenetic approach, selection on amino acids and codons (SelAC), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models that assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost-benefit approach. This cost-benefit approach allows us to generate a set of 20 optimal amino acid-specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast data set of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models by 10^4 to 10^5 Akaike information criterion units adjusted for small sample bias. Our results also indicate that nested, mechanistic models better predict observed data patterns, highlighting the improvement in biological realism in amino acid sequence evolution that our model provides. Additional parameters estimated by SelAC indicate that a large amount of nonphylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC predictions of gene-specific protein synthesis rates correlate well with both empirical (r=0.33-0.48) and other theoretical predictions (r=0.45-0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
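The model comparison in the abstract rests on the information criterion adjusted for small sample bias (commonly written AICc). As a minimal sketch of how such a comparison is typically computed, assuming the standard correction term 2k(k+1)/(n-k-1), the Python below scores models from their log-likelihoods. The function names and the numbers in the usage lines are hypothetical illustrations, not values or code from the paper.

```python
def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Akaike information criterion with the small-sample correction.

    AIC  = 2k - 2 ln L
    AICc = AIC + 2k(k + 1) / (n - k - 1)
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + correction


def delta_aicc(scores: dict[str, float]) -> dict[str, float]:
    """Difference between each model's AICc and the best (lowest) AICc."""
    best = min(scores.values())
    return {name: score - best for name, score in scores.items()}


# Hypothetical log-likelihoods and parameter counts, for illustration only.
scores = {
    "model_a": aicc(log_likelihood=-1000.0, n_params=10, n_obs=500),
    "model_b": aicc(log_likelihood=-1200.0, n_params=4, n_obs=500),
}
print(delta_aicc(scores))
```

A lower AICc indicates a better trade-off between fit and parameter count, so the model with a difference of 0 is the preferred one in such a comparison.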

Citing Articles

Next-generation development and application of codon model in evolution.

Gupta M, Vadde R. Front Genet. 2023; 14:1091575.

PMID: 36777719 PMC: 9911445. DOI: 10.3389/fgene.2023.1091575.


A Spatially Explicit Model of Stabilizing Selection for Improving Phylogenetic Inference.

Beaulieu J, O'Meara B, Gilchrist M. Mol Biol Evol. 2020; 38(4):1641-1652.

PMID: 33306127 PMC: 8042768. DOI: 10.1093/molbev/msaa318.
