A Comparison of Marker Gene Selection Methods for Single-cell RNA Sequencing Data

Overview
Journal: Genome Biol
Specialties: Biology, Genetics
Date: 2024 Feb 26
PMID: 38409056
Abstract

Background: The development of single-cell RNA sequencing (scRNA-seq) has enabled scientists to catalog and probe the transcriptional heterogeneity of individual cells in unprecedented detail. A common step in the analysis of scRNA-seq data is the selection of so-called marker genes, most commonly to enable annotation of the biological cell types present in the sample. In this paper, we benchmark 59 computational methods for selecting marker genes in scRNA-seq data.

Results: We compare the performance of the methods using 14 real scRNA-seq datasets and over 170 additional simulated datasets. Methods are compared on their ability to recover simulated and expert-annotated marker genes, the predictive performance and characteristics of the gene sets they select, their memory usage and speed, and their implementation quality. In addition, various case studies are used to scrutinize the most commonly used methods, highlighting issues and inconsistencies.

Conclusions: Overall, we present a comprehensive evaluation of methods for selecting marker genes in scRNA-seq data. Our results highlight the efficacy of simple methods, especially the Wilcoxon rank-sum test, Student's t-test, and logistic regression.
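To make the kind of simple method highlighted above concrete, the following is a minimal sketch of marker selection with the Wilcoxon rank-sum test. It is not the benchmark code from the paper. It assumes a one-vs-rest comparison (cells in one cluster against all other cells), uses the normal approximation with a tie correction, and ranks genes by the resulting z-score. The function names `rank_with_ties`, `rank_sum_z`, and `select_markers` are hypothetical, and the expression values are invented toy numbers, not real measurements.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(in_group, out_group):
    """z-score of the rank-sum statistic (normal approximation, tie-corrected)."""
    n1, n2 = len(in_group), len(out_group)
    n = n1 + n2
    combined = list(in_group) + list(out_group)
    ranks = rank_with_ties(combined)
    w = sum(ranks[:n1])  # sum of ranks of the in-group cells
    mean_w = n1 * (n + 1) / 2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_w <= 0:
        return 0.0  # all values identical: no evidence either way
    return (w - mean_w) / math.sqrt(var_w)


def select_markers(expr, labels, top_k=1):
    """Pick up to top_k up-regulated genes per cluster, one-vs-rest.

    expr maps gene name -> list of expression values (one per cell);
    labels gives the cluster label of each cell, in the same order.
    """
    markers = {}
    for cluster in sorted(set(labels)):
        scores = []
        for gene, values in expr.items():
            inside = [v for v, lab in zip(values, labels) if lab == cluster]
            outside = [v for v, lab in zip(values, labels) if lab != cluster]
            scores.append((rank_sum_z(inside, outside), gene))
        scores.sort(reverse=True)
        # Keep only genes that are higher inside the cluster (z > 0).
        markers[cluster] = [gene for z, gene in scores[:top_k] if z > 0]
    return markers


# Toy data (invented values): three "T" cells followed by three "B" cells.
labels = ["T", "T", "T", "B", "B", "B"]
expr = {
    "CD3E": [5, 6, 7, 0, 1, 0],
    "MS4A1": [0, 1, 0, 6, 5, 7],
    "ACTB": [4, 5, 4, 5, 4, 5],
}
markers = select_markers(expr, labels, top_k=1)
print(markers)
```

In practice an established, tested implementation would be preferable to hand-written statistics. The sketch only illustrates the ranking logic, and with so few cells the normal approximation is crude.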

Citing Articles

Integrated Bioinformatic Analyses Reveal Thioredoxin as a Putative Marker of Cancer Stem Cells and Prognosis in Prostate Cancer.
Sugiki S, Horie T, Kunii K, Sakamoto T, Nakamura Y, Chikazawa I. Cancer Inform. 2025; 24:11769351251319872.
PMID: 40008390 PMC: 11851766. DOI: 10.1177/11769351251319872.

Advancements in single-cell RNA sequencing and spatial transcriptomics: transforming biomedical research.
Molla Desta G, Birhanu A. Acta Biochim Pol. 2025; 72:13922.
PMID: 39980637 PMC: 11835515. DOI: 10.3389/abp.2025.13922.

tagtango: an application to compare single-cell annotations.
Mora B, Lindsay H, Thiebaut A, Stuart K, Gottardo R. Bioinformatics. 2025; 41(2).
PMID: 39798134 PMC: 11814489. DOI: 10.1093/bioinformatics/btaf012.

CORTADO: Hill Climbing Optimization for Cell-Type Specific Marker Gene Discovery.
Lodi M, Clark L, Roy S, Ghosh P. bioRxiv. 2025; .
PMID: 39763976 PMC: 11703242. DOI: 10.1101/2024.12.23.630040.

Considerations for building and using integrated single-cell atlases.
Hrovatin K, Sikkema L, Shitov V, Heimberg G, Shulman M, Oliver A. Nat Methods. 2024; 22(1):41-57.
PMID: 39672979 DOI: 10.1038/s41592-024-02532-y.

