Gene Set Enrichment for Reproducible Science: Comparison of CERNO and Eight Other Algorithms

Overview
Journal Bioinformatics
Specialty Biology
Date 2019 Jun 6
PMID 31165139
Citations 58
Abstract

Motivation: Analysis of gene set (GS) enrichment is an essential part of functional omics studies. Here, we complement the established evaluation metrics of GS enrichment algorithms with a novel approach to assess the practical reproducibility of scientific results obtained from GS enrichment tests when applied to related data from different studies.

Results: We evaluated eight established algorithms and one novel algorithm for reproducibility, sensitivity, prioritization, false positive rate and computational time. The novel algorithm, Coincident Extreme Ranks in Numerical Observations (CERNO), is a flexible and fast method based on modified Fisher P-value integration. Using real-world datasets, we demonstrate that CERNO is robust to the choice of ranking metric, as well as to sample size and GS size. CERNO had the highest reproducibility while remaining sensitive, specific and fast. In the overall ranking, Pathway Analysis with Down-weighting of Overlapping Genes, CERNO and over-representation analysis performed best, while CERNO and GeneSetTest scored high in terms of reproducibility.
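To make "modified Fisher P-value integration" concrete, the following is a minimal sketch of a rank-based Fisher-style test. It assumes the commonly described form of the CERNO statistic, F = -2 * sum(ln(r_i / N)), where r_i are the ranks of the gene set members in a list of N ranked genes, compared against a chi-square distribution with 2k degrees of freedom for a set with k members present. The abstract does not spell out this formula, so treat it as an assumption. The function names are hypothetical and this is not the tmod implementation.

```python
import math


def cerno_statistic(ranks, n_genes):
    """Fisher-style statistic on ranks: F = -2 * sum(ln(r_i / N)).

    Assumed form of the statistic; ranks are 1-based, rank 1 = strongest gene.
    """
    return -2.0 * sum(math.log(r / n_genes) for r in ranks)


def chi2_sf_even_df(x, k):
    """Upper tail P(X > x) for a chi-square variable with 2k degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!, so no external library is needed.
    """
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return math.exp(-half) * total


def cerno_test(ranked_genes, gene_set):
    """Return (statistic, p_value) for one gene set against a ranked gene list."""
    members = set(gene_set)
    # Collect 1-based ranks of the genes that belong to the set.
    ranks = [i for i, g in enumerate(ranked_genes, start=1) if g in members]
    if not ranks:
        raise ValueError("no gene from the set occurs in the ranked list")
    stat = cerno_statistic(ranks, len(ranked_genes))
    return stat, chi2_sf_even_df(stat, len(ranks))


if __name__ == "__main__":
    # Hypothetical data: 100 genes ordered by some ranking metric.
    ranked = [f"g{i}" for i in range(1, 101)]
    print(cerno_test(ranked, ["g1", "g2", "g3"]))      # set at the top of the list
    print(cerno_test(ranked, ["g98", "g99", "g100"]))  # set at the bottom
```

Because the statistic depends only on ranks, any ordering of genes (by P-value, fold change or another metric) can be used as input, which is one reading of the robustness to ranking metrics reported above. A set with a single member at rank r yields a P-value of r/N, a quick sanity check for the sketch.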

Availability and Implementation: The tmod package implementing the CERNO algorithm is available from CRAN (cran.r-project.org/web/packages/tmod/index.html), and an online implementation can be found at http://tmod.online/. The datasets analyzed in this study are available in the KEGGdzPathwaysGEO and KEGGandMetacoreDzPathwaysGEO R packages and in the GEO repository.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Citing Articles

Gene Set Enrichment Analysis in Zebrafish Embryos Is Susceptible to False-Positive Results in the Absence of Differentially Expressed Genes.

Stead J, Lee H, Williams A, Ramirez S, Atlas E, Mennigen J Bioinform Biol Insights. 2025; 19:11779322251321071.

PMID: 40040651 PMC: 11877468. DOI: 10.1177/11779322251321071.


Divergent endothelial mechanisms drive arteriovenous malformations in Alk1 and SMAD4 loss-of-function.

Oppenheim O, Giese W, Park H, Baumann E, Ivanov A, Beule D bioRxiv. 2025; .

PMID: 39829872 PMC: 11741317. DOI: 10.1101/2025.01.03.631070.


Integrative multiomics reveals common endotypes across PSEN1, PSEN2, and APP mutations in familial Alzheimer's disease.

Valdes P, Caldwell A, Liu Q, Fitzgerald M, Ramachandran S, Karch C Alzheimers Res Ther. 2025; 17(1):5.

PMID: 39754192 PMC: 11699654. DOI: 10.1186/s13195-024-01659-6.


Molecular profiles, sources and lineage restrictions of stem cells in an annelid regeneration model.

Stockinger A, Adelmann L, Fahrenberger M, Ruta C, Ozpolat B, Milivojev N Nat Commun. 2024; 15(1):9882.

PMID: 39557833 PMC: 11574210. DOI: 10.1038/s41467-024-54041-3.


Insights into phylogenetic relationships and gene rearrangements: complete mitogenomes of two sympatric species in the genus (Anura, Ranidae).

Li J, Xie M, Zhang F, Shu J, Zhang J, Zhang Z Zookeys. 2024; 1216:63-82.

PMID: 39474245 PMC: 11519660. DOI: 10.3897/zookeys.1216.131847.

