
Dual Gene Set Enrichment Analysis (dualGSEA); an R Function That Enables More Robust Biological Discovery and Pre-clinical Model Alignment from Transcriptomics Data

Overview
Journal: Sci Rep
Specialty: Science
Date: 2024 Dec 5
PMID: 39632911
Abstract

Gene set enrichment analysis (GSEA) tools can identify biological insights within gene expression-based studies. Although their statistical performance has been compared, the downstream biological implications of choosing between the pairwise and single-sample forms of GSEA remain understudied. We compare the statistical and biological results obtained from various pre-ranking methods and options for pairwise GSEA, followed by a stand-alone comparison of GSEA, single-sample GSEA (ssGSEA) and gene set variation analysis (GSVA). Pairwise GSEA and fGSEA provide similar results when deployed using a range of gene pre-ranking methods. However, pairwise GSEA can overgeneralise biological enrichment: when the most statistically significant signatures were assessed using single-sample approaches, there was a complete absence of biological distinction between these groups. To avoid these issues, we developed a new tool, dualGSEA, which provides users with multiple statistics and visuals to aid interpretation of results. This tool removes the possibility of users inadvertently interpreting statistical findings as equating to biological distinction between samples within groups of interest. dualGSEA provides a more robust basis for discovery research, one that allows users to compare statistical significance alongside biological distinctions in their data.
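
The contrast drawn in the abstract, between a group-level (pairwise) enrichment statistic and per-sample scores, can be illustrated with a small toy sketch. This is not the dualGSEA implementation, nor the exact GSEA, fGSEA, ssGSEA or GSVA algorithms: it assumes a signal-to-noise pre-ranking metric, an unweighted running-sum enrichment score, and a simplified per-sample analogue that ranks genes by expression within each sample. All function names, gene names and expression values are hypothetical and were chosen so that the effect is visible.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """Pre-ranking metric for one gene: difference in group means over summed SDs."""
    denom = stdev(a) + stdev(b)
    return (mean(a) - mean(b)) / denom if denom > 0 else 0.0


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: the signed maximum deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def pairwise_score(expr, group_a, group_b, gene_set):
    """Rank genes by a between-group metric, then score the gene set once."""
    metric = {
        g: signal_to_noise([v[s] for s in group_a], [v[s] for s in group_b])
        for g, v in expr.items()
    }
    ranked = sorted(metric, key=metric.get, reverse=True)
    return enrichment_score(ranked, gene_set)


def per_sample_scores(expr, samples, gene_set):
    """Simplified per-sample analogue: rank genes by expression within each sample."""
    return {
        s: enrichment_score(
            sorted(expr, key=lambda g: expr[g][s], reverse=True), gene_set
        )
        for s in samples
    }


# Hypothetical toy data: six genes, three samples per group.
group_a, group_b = ["a1", "a2", "a3"], ["b1", "b2", "b3"]
values = {
    "g1": [5.0, 5.2, 5.1, 4.8, 4.9, 4.7],
    "g2": [3.0, 3.1, 3.2, 2.8, 2.7, 2.9],
    "g3": [9.0, 9.5, 8.5, 9.1, 9.4, 8.6],
    "g4": [7.0, 7.2, 6.8, 7.1, 7.0, 6.9],
    "g5": [1.0, 1.2, 1.1, 1.3, 1.4, 1.2],
    "g6": [8.0, 8.1, 7.9, 8.2, 8.0, 8.1],
}
expr = {g: dict(zip(group_a + group_b, v)) for g, v in values.items()}
gene_set = {"g1", "g2"}

group_score = pairwise_score(expr, group_a, group_b, gene_set)
sample_scores = per_sample_scores(expr, group_a + group_b, gene_set)

print("pairwise score:", group_score)
for sample, score in sample_scores.items():
    print(sample, score)
```

In this contrived example the pairwise score is maximal (1.0), because the two set genes show small but consistent differences between groups, while every sample in both groups receives the same per-sample score (-0.75). That is a toy version of the situation the abstract describes, in which a statistically strong group-level result does not translate into a distinction between individual samples.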
