
Democratizing Data-independent Acquisition Proteomics Analysis on Public Cloud Infrastructures Via the Galaxy Framework

Overview
Journal: Gigascience
Specialties: Biology, Genetics
Date: 2022 Feb 15
PMID: 35166338
Abstract

Background: Data-independent acquisition (DIA) has become an important approach in global, mass spectrometric proteomic studies because it provides in-depth insights into the molecular variety of biological systems. However, DIA data analysis remains challenging owing to the high complexity of the data and the large data volumes and sample numbers, which require specialized software and vast computing infrastructures. Most available open-source DIA software requires basic programming skills and covers only a fraction of a complete DIA data analysis. Consequently, DIA data analysis often requires several software tools that must be compatible with one another, which severely limits usability and reproducibility.

Findings: To overcome this hurdle, we have integrated a suite of open-source DIA tools in the Galaxy framework for reproducible and version-controlled data processing. The DIA suite includes OpenSwath, PyProphet, diapysef, and swath2stats. We have compiled functional Galaxy pipelines for DIA processing, which provide a web-based graphical user interface to these pre-installed and pre-configured tools so that they can be used on the freely accessible, powerful computational resources of the Galaxy framework. This approach also enables seamless sharing of workflows with their full configuration, in addition to sharing raw data and results. We demonstrate the usability of an all-in-one DIA pipeline in Galaxy by analyzing a spike-in case study dataset. Additionally, extensive training material is provided to further increase access for the proteomics community.

Conclusion: The integration of an open-source DIA analysis suite in the web-based and user-friendly Galaxy framework, in combination with extensive training material, empowers a broad community of researchers to perform reproducible and transparent DIA data analysis.

Citing Articles

Data-Independent Acquisition: A Milestone and Prospect in Clinical Mass Spectrometry-Based Proteomics.

Frohlich K, Fahrner M, Brombacher E, Seredynska A, Maldacker M, Kreutz C. Mol Cell Proteomics. 2024; 23(8):100800.

PMID: 38880244 PMC: 11380018. DOI: 10.1016/j.mcpro.2024.100800.


Galaxy Training: A powerful framework for teaching!

Hiltemann S, Rasche H, Gladman S, Hotz H, Lariviere D, Blankenberg D. PLoS Comput Biol. 2023; 19(1):e1010752.

PMID: 36622853 PMC: 9829167. DOI: 10.1371/journal.pcbi.1010752.


Proteomic repository data submission, dissemination, and reuse: key messages.

Perez-Riverol Y. Expert Rev Proteomics. 2022; 19(7-12):297-310.

PMID: 36529941 PMC: 7614296. DOI: 10.1080/14789450.2022.2160324.


Characterization of serum protein expression profiles in the early sarcopenia older adults with low grip strength: a cross-sectional study.

Wu J, Cao L, Wang J, Wang Y, Hao H, Huang L. BMC Musculoskelet Disord. 2022; 23(1):894.

PMID: 36192674 PMC: 9528053. DOI: 10.1186/s12891-022-05844-2.


Implementing the reuse of public DIA proteomics datasets: from the PRIDE database to Expression Atlas.

Walzer M, Garcia-Seisdedos D, Prakash A, Brack P, Crowther P, Graham R. Sci Data. 2022; 9(1):335.

PMID: 35701420 PMC: 9197839. DOI: 10.1038/s41597-022-01380-9.

