A Comparison of Marker Gene Selection Methods for Single-cell RNA Sequencing Data
Background: The development of single-cell RNA sequencing (scRNA-seq) has enabled scientists to catalog and probe the transcriptional heterogeneity of individual cells in unprecedented detail. A common step in the analysis of scRNA-seq data is the selection of so-called marker genes, most commonly to enable annotation of the biological cell types present in the sample. In this paper, we benchmark 59 computational methods for selecting marker genes in scRNA-seq data.
Results: We compare the performance of the methods using 14 real scRNA-seq datasets and over 170 additional simulated datasets. Methods are compared on their ability to recover simulated and expert-annotated marker genes, the predictive performance and characteristics of the gene sets they select, their memory usage and speed, and their implementation quality. In addition, various case studies are used to scrutinize the most commonly used methods, highlighting issues and inconsistencies.
Conclusions: Overall, we present a comprehensive evaluation of methods for selecting marker genes in scRNA-seq data. Our results highlight the efficacy of simple methods, especially the Wilcoxon rank-sum test, Student's t-test, and logistic regression.
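To make the idea of a simple marker gene selection method concrete, the following is a minimal sketch of a one-vs-rest Wilcoxon rank-sum comparison. It is not the benchmarking code from the paper: the function names, the toy expression values, and the choice to rank genes by AUC (the U statistic divided by the product of the group sizes) are all assumptions made for illustration. The p-value uses the tie-corrected normal approximation without continuity correction, and no multiple-testing adjustment is applied.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (auc, p_value), where auc = U / (len(x) * len(y)).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    auc = u / (n1 * n2)

    # Tie correction for the variance of U.
    tie_counts = defaultdict(int)
    for v in pooled:
        tie_counts[v] += 1
    tie_term = sum(t**3 - t for t in tie_counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # every value identical: no evidence either way
        return auc, 1.0

    z = (u - n1 * n2 / 2) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return auc, p_value


def one_vs_rest_markers(expr, labels, cluster):
    """Rank genes for one cluster against all remaining cells.

    expr maps gene name -> list of expression values, one per cell.
    """
    inside = [i for i, lab in enumerate(labels) if lab == cluster]
    rest = [i for i, lab in enumerate(labels) if lab != cluster]
    results = []
    for gene, values in expr.items():
        auc, p_value = rank_sum_test(
            [values[i] for i in inside], [values[i] for i in rest]
        )
        results.append((gene, auc, p_value))
    # Most strongly up-regulated genes first: high AUC, then low p-value.
    return sorted(results, key=lambda r: (-r[1], r[2]))


# Hypothetical toy data: 8 cells in two clusters, 3 genes.
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
expr = {
    "gene_up": [5, 6, 7, 8, 0, 1, 2, 3],
    "gene_flat": [1, 2, 3, 4, 1, 2, 3, 4],
    "gene_down": [0, 1, 0, 1, 4, 5, 6, 7],
}
ranked = one_vs_rest_markers(expr, labels, "A")
for gene, auc, p_value in ranked:
    print(f"{gene}: AUC={auc:.2f}, p={p_value:.3f}")
```

On real data one would normally rely on an established analysis package rather than a hand-rolled test; the sketch only shows how little machinery the approach needs.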