
The Generating Function of CID, ETD, and CID/ETD Pairs of Tandem Mass Spectra: Applications to Database Search

Overview
Date 2010 Sep 11
PMID 20829449
Citations 109
Abstract

The recent emergence of new mass spectrometry techniques (e.g., electron transfer dissociation, ETD) and the improved availability of additional proteases (e.g., Lys-N) for protein digestion in high-throughput experiments have raised the challenge of designing new algorithms for interpreting the resulting new types of tandem mass (MS/MS) spectra. Traditional MS/MS database search algorithms such as SEQUEST and Mascot were originally designed for collision-induced dissociation (CID) of tryptic peptides, and their CID-specific scoring functions are largely based on expert knowledge about the fragmentation of tryptic peptides rather than on machine learning techniques. As a result, the performance of these algorithms is suboptimal for new mass spectrometry technologies or nontryptic peptides. We recently proposed the generating function approach (MS-GF) for CID spectra of tryptic peptides. In this study, we extend MS-GF to automatically derive scoring parameters from a set of annotated MS/MS spectra of any type (e.g., CID or ETD) and present a new database search tool, MS-GFDB, based on MS-GF. We show that MS-GFDB outperforms Mascot for ETD spectra or peptides digested with Lys-N. For example, in the case of ETD spectra, the numbers of tryptic and Lys-N peptides identified by MS-GFDB increased by factors of 2.7 and 2.6, respectively, compared with Mascot. Moreover, even after a decade of Mascot development for analyzing CID spectra of tryptic peptides, MS-GFDB (which is not specifically tailored to CID spectra or tryptic peptides) yielded a 28% increase over Mascot in the number of peptide identifications. Finally, we propose a statistical framework for analyzing multiple spectra from the same precursor (e.g., CID/ETD spectral pairs) and assigning p values to peptide-spectrum-spectrum matches.
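The generating-function idea named in the abstract can be illustrated with a toy sketch. It is not the MS-GF scoring model or code: it assumes integer residue masses, a hypothetical two-letter alphabet, equally likely residues, and a made-up additive score assigned to each interior prefix mass. Under those assumptions, a dynamic program over (mass, score) pairs gives the score distribution of all residue strings with a given parent mass, and summing its tail gives a p-value-like quantity for a score threshold.

```python
from collections import defaultdict


def score_distribution(node_score, parent_mass, aa_masses):
    """Return {score: probability} over all residue strings whose masses sum
    to parent_mass. Toy model: each residue is drawn with probability
    1/len(aa_masses), and a string's score is the sum of node_score values
    at its interior prefix masses."""
    p = 1.0 / len(aa_masses)
    # dist[m] maps score -> total probability of strings with total mass m
    dist = [defaultdict(float) for _ in range(parent_mass + 1)]
    dist[0][0] = 1.0  # the empty string has mass 0 and score 0
    for m in range(1, parent_mass + 1):
        # interior prefix masses earn their node score; the full mass earns nothing
        gain = node_score.get(m, 0) if m < parent_mass else 0
        for a in aa_masses:  # masses are assumed to be positive integers
            if m - a >= 0:
                for s, prob in dist[m - a].items():
                    dist[m][s + gain] += prob * p
    return dict(dist[parent_mass])


def spectral_probability(dist, threshold):
    """Total probability of strings scoring at least threshold."""
    return sum(prob for s, prob in dist.items() if s >= threshold)


# Hypothetical inputs: residue masses 1 and 2, parent mass 3,
# and node scores at prefix masses 1 and 2.
dist = score_distribution({1: 1, 2: 2}, parent_mass=3, aa_masses=[1, 2])
print(spectral_probability(dist, 2))  # 0.375
```

In this example three strings reach mass 3: (1,1,1) with score 3 and probability 1/8, (2,1) with score 2 and probability 1/4, and (1,2) with score 1 and probability 1/4, so the probability of scoring at least 2 is 0.375. The table grows with the product of parent mass and score range rather than with the number of candidate strings, which is the appeal of the approach.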

Citing Articles

A Review of Protein Inference.

Uszkoreit J, Marcus K, Eisenacher M. Methods Mol Biol. 2024; 2859:53-64.

PMID: 39436596. DOI: 10.1007/978-1-0716-4152-1_4.


Characterization of peptide-protein relationships in protein ambiguity groups via bipartite graphs.

Schork K, Turewicz M, Uszkoreit J, Rahnenfuhrer J, Eisenacher M. PLoS One. 2022; 17(10):e0276401.

PMID: 36269744 PMC: 9586388. DOI: 10.1371/journal.pone.0276401.


Systematic exploration of dynamic splicing networks reveals conserved multistage regulators of neurogenesis.

Han H, Best A, Braunschweig U, Mikolajewicz N, Li J, Roth J. Mol Cell. 2022; 82(16):2982-2999.e14.

PMID: 35914530 PMC: 10686216. DOI: 10.1016/j.molcel.2022.06.036.


Dataset containing physiological amounts of spike-in proteins into murine C2C12 background as a ground truth quantitative LC-MS/MS reference.

Uszkoreit J, Barkovits K, Pacharra S, Pfeiffer K, Steinbach S, Marcus K. Data Brief. 2022; 43:108435.

PMID: 35845101 PMC: 9283871. DOI: 10.1016/j.dib.2022.108435.


Endofin is required for HD-PTP and ESCRT-0 interdependent endosomal sorting of ubiquitinated transmembrane cargoes.

Kazan J, Desrochers G, Martin C, Jeong H, Kharitidi D, Apaja P. iScience. 2021; 24(11):103274.

PMID: 34761192 PMC: 8567383. DOI: 10.1016/j.isci.2021.103274.

