The Use of Proteotypic Peptide Libraries for Protein Identification
Overview
This paper describes an algorithm that applies proteotypic peptide sequence libraries to protein identification by tandem mass spectrometry (MS/MS). Proteotypic peptides are the peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. Libraries of proteotypic peptide sequences were compiled from the Global Proteome Machine Database for the proteomes of two model species, Homo sapiens and Saccharomyces cerevisiae. These libraries were used to scan collections of tandem mass spectra to discover which proteins were represented in the data sets; the spectra were then analyzed in detail against the full protein sequences corresponding to the discovered proteotypic peptides. This algorithm (Proteotypic Peptide Profiling, or P3) produced sequence-to-spectrum matches comparable to those obtained by conventional protein identification algorithms that use only full protein sequences, with a 20-fold reduction in the time required to perform the identification calculations. Approximately 4% of the residues in the H. sapiens proteome were required in the proteotypic peptide library to successfully identify proteins. The proteotypic peptide libraries, the open source code for the implementation of the search algorithm and a website for using the software have been made freely available.
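The two-stage idea in the abstract (scan spectra against a small proteotypic library, then analyze only the discovered proteins in full) can be illustrated with a minimal Python sketch. This is not the P3 implementation. It assumes each spectrum is reduced to a neutral precursor mass, matches on mass alone within a hypothetical tolerance, and uses a simplified tryptic digest with no missed cleavages; a real search engine scores fragment ions. The function names and the toy data layout are illustrative assumptions.

```python
# Monoisotopic residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def digest(sequence):
    """Simplified tryptic digest: cleave after every K or R.

    Assumption: no missed cleavages and no proline rule.
    """
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def find_candidates(spectra, library, tol=0.01):
    """Stage 1: scan spectra against the proteotypic library only.

    `library` maps proteotypic peptide -> protein identifier.
    Returns the set of proteins with at least one library peptide matched.
    """
    return {
        protein
        for peptide, protein in library.items()
        if any(abs(peptide_mass(peptide) - m) <= tol for m in spectra)
    }


def refine(spectra, proteins, candidates, tol=0.01):
    """Stage 2: match spectra against full sequences of candidates only.

    Returns sorted (spectrum index, protein identifier, peptide) tuples.
    """
    hits = []
    for protein in candidates:
        for peptide in digest(proteins[protein]):
            mass = peptide_mass(peptide)
            for idx, m in enumerate(spectra):
                if abs(mass - m) <= tol:
                    hits.append((idx, protein, peptide))
    return sorted(hits)
```

In this sketch the time saving comes from stage 1 touching only the library peptides and stage 2 digesting only the proteins that stage 1 discovered, rather than every sequence in the proteome.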