
SwarmMAP: Swarm Learning for Decentralized Cell Type Annotation in Single Cell Sequencing Data

Overview
Journal bioRxiv
Date 2025 Jan 27
PMID 39868099
Abstract

Rapid technological advancements have made it possible to generate single-cell data at a large scale. Several laboratories around the world can now generate single-cell transcriptomic data from different tissues. Unsupervised clustering, followed by annotation of the cell type of the identified clusters, is a crucial step in single-cell analyses. However, there is no consensus on the marker genes to use for annotation, and cell-type annotation is currently done mostly by manual inspection of marker genes, which is irreproducible and poorly scalable. Patient privacy is also a critical issue with human datasets. There is therefore a critical need to standardize and automate cell-type annotation across datasets in a privacy-preserving manner. Here, we developed SwarmMAP, which uses Swarm Learning to train machine learning models for cell-type classification based on single-cell sequencing data in a decentralized way. SwarmMAP does not require any exchange of raw data between data centers. SwarmMAP achieves F1-scores of 0.93, 0.98, and 0.88 for cell-type classification in human heart, lung, and breast datasets, respectively. Swarm Learning-based models yield an average performance on par with that of models trained on centralized data (Mann-Whitney test). We also find that increasing the number of datasets increases cell-type prediction accuracy and enables handling higher cell-type diversity. Together, these findings demonstrate that Swarm Learning is a viable approach to automate cell-type annotation. SwarmMAP is available at https://github.com/hayatlab/SwarmMAP.
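
To illustrate the idea of training without exchanging raw data, the following is a minimal sketch of one parameter-merging step between sites. It is not taken from the SwarmMAP code base: the function name `merge_parameters`, the parameter layout, and the choice of a sample-size-weighted average are all assumptions made for illustration. Each site is assumed to have trained a classifier locally and to share only its parameter values and its number of cells.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def merge_parameters(node_params: List[Params], node_sizes: List[int]) -> Params:
    """Average per-site parameters, weighted by each site's cell count.

    Only parameters and cell counts are passed in; no expression data
    leaves a site. The weighting scheme is an assumption of this sketch.
    """
    if not node_params or len(node_params) != len(node_sizes):
        raise ValueError("expected one cell count per site")
    total = sum(node_sizes)
    if total <= 0:
        raise ValueError("total cell count must be positive")

    merged: Params = {}
    for name, reference in node_params[0].items():
        merged[name] = [
            sum(p[name][i] * n for p, n in zip(node_params, node_sizes)) / total
            for i in range(len(reference))
        ]
    return merged


# Two hypothetical sites with 100 and 300 cells respectively.
site_a = {"weights": [1.0, 2.0], "bias": [0.0]}
site_b = {"weights": [3.0, 4.0], "bias": [1.0]}
merged = merge_parameters([site_a, site_b], [100, 300])
print(merged)
```

In this toy setting the larger site contributes three times the weight of the smaller one. A real deployment would repeat local training and merging over many rounds and would coordinate the sites through the Swarm Learning framework rather than a single function call.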
