
Benchmarking Enrichment Analysis Methods with the Disease Pathway Network

Overview
Journal: Brief Bioinform
Specialty: Biology
Date: 2024 Mar 4
PMID: 38436561
Abstract

Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is best for a given dataset. The main issues with previous benchmarks include the complexity of correctly assigning true pathways to a test dataset and the lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. Here we provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. To address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes pathways are linked. We introduce a novel approach to evaluate pathway EA by combining sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P-values.
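
As a rough illustration of two ideas mentioned in the abstract, the following is a minimal Python sketch and not the authors' implementation. It assumes the standard hypergeometric test as a stand-in for an overlap-based EA method, and it assumes the Kolmogorov-Smirnov distance from the uniform distribution as one possible way to quantify skew in P-values obtained from randomized datasets. The function names and all numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric P-value, P(X >= n_overlap).

    Assumption: this is the textbook overlap test, used here only as an
    example of an overlap-based method; it is not taken from the paper.
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def ks_uniform_statistic(p_values):
    """Kolmogorov-Smirnov distance between the empirical distribution of
    p_values and Uniform(0, 1).

    Assumption: under a well-calibrated null, P-values are uniform, so a
    large distance suggests skew. The paper may measure bias differently.
    """
    xs = sorted(p_values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


if __name__ == "__main__":
    # Hypothetical example: 20000 genes, a pathway of 100 genes,
    # 500 differentially expressed genes, 10 of them in the pathway.
    print(hypergeom_enrichment_p(20000, 100, 500, 10))

    # Hypothetical null P-values piled up near zero (a skewed method).
    print(ks_uniform_statistic([0.01, 0.02, 0.03, 0.04]))
```

A statistic close to 0 from `ks_uniform_statistic` would indicate null P-values that are roughly uniform, while larger values indicate skew toward one end of the interval.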

