A Benchmark of Batch-effect Correction Methods for Single-cell RNA Sequencing Data
Background: Large-scale single-cell transcriptomic datasets generated using different technologies contain batch-specific systematic variations that present a challenge to batch-effect removal and data integration. With continued growth expected in scRNA-seq data, achieving effective batch integration with available computational resources is crucial. Here, we perform an in-depth benchmark study on available batch correction methods to determine the most suitable method for batch-effect removal.
Results: We compare 14 methods in terms of computational runtime, the ability to handle large datasets, and batch-effect correction efficacy while preserving cell type purity. Five scenarios are designed for the study: identical cell types with different technologies, non-identical cell types, multiple batches, big data, and simulated data. Performance is evaluated using four benchmarking metrics: the k-nearest-neighbor batch-effect test (kBET), local inverse Simpson's index (LISI), average silhouette width (ASW), and adjusted Rand index (ARI). We also investigate the use of batch-corrected data to study differential gene expression.
Conclusion: Based on our results, Harmony, LIGER, and Seurat 3 are the recommended methods for batch integration. Due to its significantly shorter runtime, Harmony is recommended as the first method to try, with the other methods as viable alternatives.
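To make two of the metrics named in the Results more concrete, the following is a minimal Python sketch, not the benchmark's own implementation. It computes the adjusted Rand index from two label vectors and a simplified inverse Simpson's index over the batch labels of one cell's neighbors. The function names and the toy labels are hypothetical. The published LISI weights neighbors with a perplexity-based Gaussian kernel; this sketch assumes unweighted neighbors, so it only illustrates the idea that a well-mixed neighborhood scores close to the number of batches.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: the labelings trivially agree

    # Pairs of cells grouped together in both labelings, and in each one alone.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = rows * cols / total_pairs  # agreement expected by chance
    maximum = (rows + cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


def inverse_simpson_index(neighbor_labels):
    """Unweighted inverse Simpson's index of the labels in one neighborhood.

    Assumption: every neighbor counts equally (no kernel weighting).
    """
    n = len(neighbor_labels)
    if n == 0:
        raise ValueError("neighborhood must contain at least one cell")
    return 1.0 / sum((c / n) ** 2 for c in Counter(neighbor_labels).values())


if __name__ == "__main__":
    # Hypothetical toy data: cell types versus clusters found after correction.
    cell_types = ["T", "T", "B", "B"]
    clusters = [1, 1, 0, 0]
    print(adjusted_rand_index(cell_types, clusters))

    # Hypothetical batch labels of one cell's four nearest neighbors.
    print(inverse_simpson_index(["batch1", "batch2", "batch1", "batch2"]))
```

In this reading, a high ARI against known cell types indicates preserved cell type purity, while a batch-label index near the number of batches indicates good mixing.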