
An Efficient Data Format for Mass Spectrometry-based Proteomics

Overview
Specialty: Chemistry
Date: 2010 Aug 3
PMID: 20674389
Citations: 11
Abstract

The diverse range of mass spectrometry (MS) instrumentation, along with the corresponding proprietary and nonproprietary data formats, has led the proteomics community to call for a standardized format to facilitate management, processing, storage, visualization, and exchange of both experimental and processed data. To date, significant effort has been expended on standardizing XML-based formats for mass spectrometry data representation, despite the recognized inefficiencies associated with storing large numeric datasets in XML. The proteomics community has periodically entertained alternate strategies for data exchange, e.g., using a common application programming interface or a database-derived format. However, these efforts have yet to gain significant attention, mostly because they have not demonstrated significant performance benefits over existing standards, but also because of issues such as extensibility to multidimensional separation systems, robustness of operation, and incomplete or mismatched vocabulary. Here, we describe a format based on standard database principles that offers multiple benefits over existing formats in terms of storage size, ease of processing, data retrieval times, and extensibility to accommodate multidimensional separation systems.
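
To make the idea of a database-derived format more concrete, the following is a minimal sketch in Python using SQLite. It is not the format described in the article: the table name `spectra`, the columns `frame` and `scan`, and the helper functions are all hypothetical and chosen only for illustration. The sketch stores each intensity array as a packed binary BLOB keyed by two separation dimensions, retrieves a single spectrum through the primary-key index, and compares the raw binary size with the base64 text that an XML document would need to carry the same bytes.

```python
import base64
import sqlite3
import struct


def pack_intensities(values):
    """Pack floats as little-endian 32-bit binary (4 bytes per value)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_intensities(blob):
    """Inverse of pack_intensities."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def create_store(conn):
    # Hypothetical schema (an assumption, not the article's): one row per
    # (frame, scan) pair, so a second separation dimension is just a column.
    conn.execute(
        "CREATE TABLE spectra ("
        " frame INTEGER NOT NULL,"
        " scan INTEGER NOT NULL,"
        " intensities BLOB NOT NULL,"
        " PRIMARY KEY (frame, scan))"
    )


def add_spectrum(conn, frame, scan, values):
    conn.execute(
        "INSERT INTO spectra (frame, scan, intensities) VALUES (?, ?, ?)",
        (frame, scan, pack_intensities(values)),
    )


def get_spectrum(conn, frame, scan):
    # The composite primary key lets SQLite locate one spectrum directly
    # instead of parsing a whole file.
    row = conn.execute(
        "SELECT intensities FROM spectra WHERE frame = ? AND scan = ?",
        (frame, scan),
    ).fetchone()
    return None if row is None else unpack_intensities(row[0])


conn = sqlite3.connect(":memory:")
create_store(conn)
values = [0.0, 12.5, 340.0, 7.25]
add_spectrum(conn, 1, 42, values)

blob = pack_intensities(values)
print(get_spectrum(conn, 1, 42))
print(len(blob), len(base64.b64encode(blob)))
```

Base64 encoding turns every 3 bytes into 4 characters, so embedding binary arrays in XML text costs roughly a third more space before any markup is counted. Whether a database layout also improves retrieval time in practice depends on the schema, indexing, and access pattern, which this sketch does not measure.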

Citing Articles

Proteomics Standards Initiative at Twenty Years: Current Activities and Future Work.

Deutsch E, Vizcaino J, Jones A, Binz P, Lam H, Klein J. J Proteome Res. 2023; 22(2):287-301.

PMID: 36626722 PMC: 9903322. DOI: 10.1021/acs.jproteome.2c00637.


Optical Microscopy-Guided Laser Ablation Electrospray Ionization Ion Mobility Mass Spectrometry: Ambient Single Cell Metabolomics with Increased Confidence in Molecular Identification.

Taylor M, Mattson S, Liyu A, Stopka S, Ibrahim Y, Vertes A. Metabolites. 2021; 11(4).

PMID: 33801673 PMC: 8065410. DOI: 10.3390/metabo11040200.


PIXiE: an algorithm for automated ion mobility arrival time extraction and collision cross section calculation using global data association.

Ma J, Casey C, Zheng X, Ibrahim Y, Wilkins C, Renslow R. Bioinformatics. 2017; 33(17):2715-2722.

PMID: 28505286 PMC: 5860068. DOI: 10.1093/bioinformatics/btx305.


Development of an Ion Mobility Spectrometry-Orbitrap Mass Spectrometer Platform.

Ibrahim Y, Garimella S, Prost S, Wojcik R, Norheim R, Baker E. Anal Chem. 2017; 88(24):12152-12160.

PMID: 28193022 PMC: 6211177. DOI: 10.1021/acs.analchem.6b03027.


Development of a New Ion Mobility (Quadrupole) Time-of-Flight Mass Spectrometer.

Ibrahim Y, Baker E, Danielson 3rd W, Norheim R, Prior D, Anderson G. Int J Mass Spectrom. 2015; 377:655-662.

PMID: 26185483 PMC: 4501404. DOI: 10.1016/j.ijms.2014.07.034.

