
Robust Integration of Multiple Single-cell RNA Sequencing Datasets Using a Single Reference Space

Overview
Journal Nat Biotechnol
Specialty Biotechnology
Date 2021 Mar 26
PMID 33767393
Citations 23
Abstract

In many biological applications of single-cell RNA sequencing (scRNA-seq), an integrated analysis of data from multiple batches or studies is necessary. Current methods typically achieve integration using shared cell types or covariance correlation between datasets, which can distort biological signals. Here we introduce an algorithm that uses the gene eigenvectors from a reference dataset to establish a global frame for integration. Using simulated and real datasets, we demonstrate that this approach, called Reference Principal Component Integration (RPCI), consistently outperforms other methods by multiple metrics, with clear advantages in preserving genuine cross-sample gene expression differences in matching cell types, such as those present in cells at distinct developmental stages or in perturbed versus control studies. Moreover, RPCI maintains this robust performance when multiple datasets are integrated. Finally, we applied RPCI to scRNA-seq data for mouse gut endoderm development and revealed temporal emergence of genetic programs helping establish the anterior-posterior axis in visceral endoderm.
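
The central idea of the abstract, taking the gene eigenvectors of one reference dataset and using them as a shared coordinate frame for every other dataset, can be illustrated with a minimal NumPy sketch. This is not the authors' RPCI implementation. The function names (`reference_pcs`, `project`), the simulated data, and the choice to center every dataset with the reference gene means are all assumptions made for illustration only.

```python
import numpy as np


def reference_pcs(ref, n_pcs):
    """Return gene means and the top gene eigenvectors of a cells x genes matrix.

    Hypothetical helper: the eigenvectors come from an SVD of the centered
    reference matrix, whose right singular vectors are the eigenvectors of
    the gene-gene covariance matrix.
    """
    mean = ref.mean(axis=0)
    _, _, vt = np.linalg.svd(ref - mean, full_matrices=False)
    return mean, vt[:n_pcs].T  # shape: genes x n_pcs


def project(data, mean, loadings):
    """Project a cells x genes matrix into the reference eigenvector space.

    Assumption: every dataset is centered with the reference gene means,
    so all datasets share a single origin as well as a single set of axes.
    """
    return (data - mean) @ loadings


# Simulated data (illustrative only): two cell types, 50 genes.
rng = np.random.default_rng(0)
n_genes = 50
centers = rng.normal(0.0, 3.0, size=(2, n_genes))


def simulate(n_cells):
    labels = rng.integers(0, 2, size=n_cells)
    expr = centers[labels] + rng.normal(0.0, 1.0, size=(n_cells, n_genes))
    return expr, labels


ref, ref_labels = simulate(200)
query, query_labels = simulate(150)

# The axes are learned from the reference alone, then reused for the query.
mean, loadings = reference_pcs(ref, n_pcs=5)
ref_scores = project(ref, mean, loadings)
query_scores = project(query, mean, loadings)
```

Because the axes are fixed by the reference and never re-estimated from the query, the query cells cannot reshape the embedding. In this toy setting the two simulated cell types in `query` fall on opposite sides of the first reference component, just as they do in `ref`.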

Citing Articles

Atlas of multilineage stem cell differentiation reveals TMEM88 as a developmental regulator of blood pressure.

Shen S, Werner T, Lukowski S, Andersen S, Sun Y, Shim W. Nat Commun. 2025; 16(1):1356.

PMID: 39904980 PMC: 11794859. DOI: 10.1038/s41467-025-56533-2.


A Single-Cell RNA Sequencing Atlas of the Chronic Obstructive Pulmonary Disease Distal Lung to Predict Cell-Cell Communication.

Blackburn J, Tufenkjian T, Liu Y, Nichols D, Blackwell T, Richmond B. Am J Respir Cell Mol Biol. 2024; 72(3):332-335.

PMID: 39356793 PMC: 11890073. DOI: 10.1165/rcmb.2024-0232LE.


scDAPP: a comprehensive single-cell transcriptomics analysis pipeline optimized for cross-group comparison.

Ferrena A, Zheng X, Jackson K, Hoang B, Morrow B, Zheng D. NAR Genom Bioinform. 2024; 6(4):lqae134.

PMID: 39345754 PMC: 11437360. DOI: 10.1093/nargab/lqae134.


Molecular and network disruptions in neurodevelopment uncovered by single cell transcriptomics analysis of heterozygous cerebral organoids.

Astorkia M, Liu Y, Pedrosa E, Lachman H, Zheng D. Heliyon. 2024; 10(14):e34862.

PMID: 39149047 PMC: 11325375. DOI: 10.1016/j.heliyon.2024.e34862.


Macrophages in the infarcted heart acquire a fibrogenic phenotype, expressing matricellular proteins, but do not undergo fibroblast conversion.

Li R, Hanna A, Huang S, Hernandez S, Tuleta I, Kubota A. J Mol Cell Cardiol. 2024; 196:152-167.

PMID: 39089570 PMC: 11534516. DOI: 10.1016/j.yjmcc.2024.07.010.

