
A Simulated MS/MS Library for Spectrum-to-spectrum Searching in Large Scale Identification of Proteins

Overview
Date 2008 Dec 25
PMID 19106086
Citations 17
Abstract

Identifying peptides from mass spectrometric fragmentation data (MS/MS spectra) using search strategies that map protein sequences to spectra is computationally expensive. An alternative strategy matches spectra directly against a reference library of previously observed MS/MS spectra, which has the advantage of evaluating matches using fragment ion intensities and ion types beyond the simple set normally used. However, this approach is limited by the small size of the available peptide MS/MS libraries and by the inability to evaluate the rate of false assignments. In this study, we found that simulated spectra generated by the kinetic model implemented in MassAnalyzer (Zhang, Z. (2004) Prediction of low-energy collision-induced dissociation spectra of peptides. Anal. Chem. 76, 3908-3922; Zhang, Z. (2005) Prediction of low-energy collision-induced dissociation spectra of peptides with three or more charges. Anal. Chem. 77, 6364-6373) performed well as a substitute for the reference libraries used by the spectrum-to-spectrum search programs X!Hunter and BiblioSpec, and gave results similar to those of the spectrum-to-sequence program Mascot. We also demonstrate the use of simulated spectra for searching against decoy sequences to estimate false discovery rates. Although we found lower score discrimination with spectrum-to-spectrum searches than with Mascot, particularly for higher charge forms, comparable peptide assignments with a low false discovery rate were achieved by examining consensus between X!Hunter and Mascot, filtering results by mass accuracy, and ignoring score thresholds. Protein identification results are comparable to those achieved when evaluating consensus between Sequest and Mascot. With large scale data sets and the human International Protein Index (IPI) database, run times using X!Hunter with the simulated spectral library are 7 times faster than Mascot and 80 times faster than Sequest. We conclude that simulated spectral libraries greatly expand the search space available for spectrum-to-spectrum searching while enabling principled analyses, and that the approach can be used in consensus strategies for large scale studies while reducing search times.
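The two ideas the abstract relies on, scoring a query spectrum against a library spectrum and estimating a false discovery rate from decoy matches, can be illustrated with a minimal Python sketch. This is a generic illustration, not the scoring used by X!Hunter, BiblioSpec, or Mascot: the binned cosine similarity, the decoys-over-targets FDR estimate, and the names `bin_spectrum`, `cosine_similarity`, and `estimate_fdr` are all assumptions made for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (hypothetical helper)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_similarity(query, reference, bin_width=1.0):
    """Normalized dot product of two binned spectra.

    Assumption: a generic intensity-aware similarity, not any specific
    program's scoring function. Returns a value in [0, 1].
    """
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(value * r.get(key, 0.0) for key, value in q.items())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


def estimate_fdr(matches, threshold):
    """Estimate FDR as accepted decoy matches / accepted target matches.

    `matches` is a list of (score, is_decoy) pairs. Assumption: a simple
    target-decoy estimate; the study's exact procedure may differ.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    # Toy peak lists of (m/z, intensity); values are made up for illustration.
    query = [(100.2, 10.0), (200.7, 5.0)]
    library_entry = [(100.4, 8.0), (300.1, 2.0)]
    print(cosine_similarity(query, library_entry))

    scored = [(0.9, False), (0.8, False), (0.7, True), (0.6, False), (0.2, True)]
    print(estimate_fdr(scored, threshold=0.5))
```

In a consensus workflow like the one described above, the acceptance test inside `estimate_fdr` could be swapped for other filters (for example, agreement between two search programs or a precursor mass accuracy window) while keeping the same decoy-based estimate.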

Citing Articles

MSBooster: improving peptide identification rates using deep learning-based features.

Yang K, Yu F, Teo G, Li K, Demichev V, Ralser M. Nat Commun. 2023; 14(1):4539.

PMID: 37500632 PMC: 10374903. DOI: 10.1038/s41467-023-40129-9.


CIDer: A Statistical Framework for Interpreting Differences in CID and HCD Fragmentation.

Wilburn D, Richards A, Swaney D, Searle B. J Proteome Res. 2021; 20(4):1951-1965.

PMID: 33729787 PMC: 8256874. DOI: 10.1021/acs.jproteome.0c00964.


Molecular Surgery: Proteomics of a Rare Genetic Disease Gives Insight into Common Causes of Blindness.

Velez G, Mahajan V. iScience. 2020; 23(11):101667.

PMID: 33134897 PMC: 7586135. DOI: 10.1016/j.isci.2020.101667.


ProSave: an application for restoring quantitative data to manipulated subsets of protein lists.

Machlab D, Velez G, Bassuk A, Mahajan V. Source Code Biol Med. 2018; 13:3.

PMID: 30459825 PMC: 6233572. DOI: 10.1186/s13029-018-0070-0.


Personalized Proteomics for Precision Health: Identifying Biomarkers of Vitreoretinal Disease.

Velez G, Tang P, Cabral T, Cho G, Machlab D, Tsang S. Transl Vis Sci Technol. 2018; 7(5):12.

PMID: 30271679 PMC: 6159735. DOI: 10.1167/tvst.7.5.12.

