PMID: 36826345

Early Selection of the Amino Acid Alphabet Was Adaptively Shaped by Biophysical Constraints of Foldability

Abstract

Whereas modern proteins rely on a quasi-universal repertoire of 20 canonical amino acids (AAs), numerous lines of evidence suggest that ancient proteins relied on a limited alphabet of 10 "early" AAs and that the 10 "late" AAs were products of biosynthetic pathways. However, many nonproteinogenic AAs were also prebiotically available, which raises two fundamental questions: why do we have the current amino acid alphabet, and would proteins fold into globular structures equally well if different amino acids made up the genetic code? Here, we experimentally evaluate the solubility and secondary structure propensities of several prebiotically relevant amino acids in the context of synthetic combinatorial 25-mer peptide libraries. The most prebiotically abundant linear aliphatic and basic residues were incorporated along with or in place of other early amino acids to explore these alternative sequence spaces. The results show that foldability was likely a critical factor in the selection of the canonical alphabet. Unbranched aliphatic amino acids were purged from the proteinogenic alphabet despite their high prebiotic abundance because they generate polypeptides that are oversolubilized and have low packing efficiency. Surprisingly, we find that the inclusion of a short-chain basic amino acid also decreases polypeptides' secondary structure potential, for which we suggest a biophysical model. Our results support the view that, despite lacking basic residues, the early canonical alphabet was remarkably adaptive at supporting protein folding, and they explain why basic residues were only incorporated at a later stage of protein evolution.
