
Unraveling the Hidden Universe of Small Proteins in Bacterial Genomes

Overview
Journal Mol Syst Biol
Specialty Molecular Biology
Date 2019 Feb 24
PMID 30796087
Citations 62
Abstract

Identification of small open reading frames (smORFs) encoding small proteins (≤ 100 amino acids; SEPs) is a challenge in the fields of genome annotation and protein discovery. Here, by combining a novel bioinformatics tool (RanSEPs) with "-omics" approaches, we were able to describe 109 bacterial small ORFomes. Predictions were first validated by performing an exhaustive search of SEPs present in the Mycoplasma pneumoniae proteome via mass spectrometry, which illustrated the limitations of shotgun approaches. Then, RanSEPs predictions were validated and compared with other tools using proteomic datasets from different bacterial species and SEPs from the literature. We found that up to 16 ± 9% of proteins in an organism could be classified as SEPs. Integration of RanSEPs predictions with transcriptomics data showed that some annotated non-coding RNAs could in fact encode SEPs. A functional study of SEPs highlighted an enrichment in the membrane, translation, metabolism, and nucleotide-binding categories. Additionally, 9.7% of the SEPs included a predicted N-terminal signal peptide. We envision RanSEPs as a tool to unmask the hidden universe of small bacterial proteins.
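To make the size criterion in the abstract concrete, the sketch below enumerates candidate smORFs (start codon to in-frame stop, at most 100 amino acids) on both strands of a DNA sequence. It is a naive six-frame scan for illustration only and is not the RanSEPs method. The function names, the `min_aa` lower bound, and the restriction to ATG start codons are assumptions made for this example.

```python
# Naive smORF candidate scan; illustrative only, NOT the RanSEPs algorithm.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}  # assumption: alternative bacterial starts (GTG, TTG) are ignored
MAX_AA = 100            # SEP size cutoff stated in the abstract


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_smorfs(seq, min_aa=10, max_aa=MAX_AA):
    """Return (strand, start, end, length_aa) for each candidate smORF.

    Coordinates are 0-based on the scanned strand; `end` is exclusive and
    includes the stop codon. `min_aa` is a hypothetical lower bound used
    here only to suppress trivially short hits.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first start codon since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    length_aa = (i - start) // 3  # codons before the stop, incl. Met
                    if min_aa <= length_aa <= max_aa:
                        hits.append((strand, start, i + 3, length_aa))
                    start = None
    return hits


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 12 + "TAA"  # toy sequence: Met + 12 Ala + stop
    for hit in find_smorfs(toy):
        print(hit)
```

A scan of this kind only lists candidates. Deciding which of them are likely to be translated is the harder problem that the abstract describes, and it is addressed there by combining predictions with proteomic and transcriptomic evidence.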

Citing Articles

Insights into the diverse roles of the terminal oxidases in Burkholderia cenocepacia H111.

Paszti S, Biner O, Liu Y, Bolli K, Jeggli S, Pessi G Sci Rep. 2025; 15(1):2390.

PMID: 39827173 PMC: 11742914. DOI: 10.1038/s41598-025-86211-8.


Functional characterization of the DUF1127-containing small protein YjiS of Salmonella Typhimurium.

Venturini E, Maass S, Bischler T, Becher D, Vogel J, Westermann A Microlife. 2025; 6:uqae026.

PMID: 39790481 PMC: 11707872. DOI: 10.1093/femsml/uqae026.


LncSL: A Novel Stacked Ensemble Computing Tool for Subcellular Localization of lncRNA by Amino Acid-Enhanced Features and Two-Stage Automated Selection Strategy.

Zhu L, Chen H, Yang S Int J Mol Sci. 2025; 25(24).

PMID: 39769496 PMC: 11678684. DOI: 10.3390/ijms252413734.


Uncovering the small proteome of Methanosarcina mazei using Ribo-seq and peptidomics under different nitrogen conditions.

Tufail M, Jordan B, Hadjeras L, Gelhausen R, Cassidy L, Habenicht T Nat Commun. 2024; 15(1):8659.

PMID: 39370430 PMC: 11456600. DOI: 10.1038/s41467-024-53008-8.


Transposon mutagenesis screen in Klebsiella pneumoniae identifies genetic determinants required for growth in human urine and serum.

Gray J, Torres V, Goodall E, McKeand S, Scales D, Collins C Elife. 2024; 12.

PMID: 39189918 PMC: 11349299. DOI: 10.7554/eLife.88971.

