
Atlas-scale Single-cell Multi-sample Multi-condition Data Integration Using scMerge2

Overview
Journal: Nat Commun
Specialty: Biology
Date: 2023 Jul 17
PMID: 37460600
Abstract

The recent emergence of multi-sample multi-condition single-cell multi-cohort studies allows researchers to investigate different cell states. The effective integration of multiple large-cohort studies promises biological insights into cells under different conditions that individual studies cannot provide. Here, we present scMerge2, a scalable algorithm that allows data integration of atlas-scale multi-sample multi-condition single-cell studies. We have generalized scMerge2 to enable the merging of millions of cells from single-cell studies generated by various single-cell technologies. Using a large COVID-19 data collection with over five million cells from 1000+ individuals, we demonstrate that scMerge2 enables multi-sample multi-condition scRNA-seq data integration from multiple cohorts and reveals signatures derived from cell-type expression that are more accurate in discriminating disease progression. Further, we demonstrate that scMerge2 can remove dataset variability in CyTOF, imaging mass cytometry and CITE-seq experiments, demonstrating its applicability to a broad spectrum of single-cell profiling technologies.

Citing Articles

A Current Perspective of Medical Informatics Developments for a Clinical Translation of (Non-coding)RNAs and Single-Cell Technologies.

Baumann A, Ahmadi N, Wolfien M. Methods Mol Biol. 2024; 2883:31-51.

PMID: 39702703. DOI: 10.1007/978-1-0716-4290-0_2.


Capture of Totipotency in Mouse Embryonic Stem Cells in the Absence of Pdzk1.

Zhang W, Zhao Y, Yang Z, Yan J, Wang H, Nie S. Adv Sci (Weinh). 2024; 12(6):e2408852.

PMID: 39630006 PMC: 11809344. DOI: 10.1002/advs.202408852.


scCTS: identifying the cell type-specific marker genes from population-level single-cell RNA-seq.

Chen L, Guo Z, Deng T, Wu H. Genome Biol. 2024; 25(1):269.

PMID: 39402623 PMC: 11472465. DOI: 10.1186/s13059-024-03410-8.


C-ziptf: stable tensor factorization for zero-inflated multi-dimensional genomics data.

Chafamo D, Shanmugam V, Tokcan N. BMC Bioinformatics. 2024; 25(1):323.

PMID: 39369208 PMC: 11456250. DOI: 10.1186/s12859-024-05886-4.


Spatiotemporal metabolomic approaches to the cancer-immunity panorama: a methodological perspective.

Xiao Y, Li Y, Zhao H. Mol Cancer. 2024; 23(1):202.

PMID: 39294747 PMC: 11409752. DOI: 10.1186/s12943-024-02113-9.

