
A Cell Atlas Foundation Model for Scalable Search of Similar Human Cells

Abstract

Single-cell RNA sequencing has profiled hundreds of millions of human cells across organs, diseases, development and perturbations to date. Mining these growing atlases could reveal cell-disease associations, identify cell states in unexpected tissue contexts and relate in vivo biology to in vitro models. These require a common measure of cell similarity across the body and an efficient way to search. Here we develop SCimilarity, a metric-learning framework to learn a unified and interpretable representation that enables rapid queries of tens of millions of cell profiles from diverse studies for cells that are transcriptionally similar to an input cell profile or state. We use SCimilarity to query a 23.4-million-cell atlas of 412 single-cell RNA-sequencing studies for macrophage and fibroblast profiles from interstitial lung disease and reveal similar cell profiles across other fibrotic diseases and tissues. The top scoring in vitro hit for the macrophage query was a 3D hydrogel system, which we experimentally demonstrated reproduces this cell state. SCimilarity serves as a foundation model for single-cell profiles that enables researchers to query for similar cellular states across the human body, providing a powerful tool for generating biological insights from the Human Cell Atlas.
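The query step described above, finding cells whose learned representation lies close to that of an input profile, can be illustrated with a minimal sketch. This is not SCimilarity's API or implementation. It assumes each cell has already been mapped to a low-dimensional embedding and uses an exhaustive cosine-similarity ranking, where a tool searching tens of millions of cells would need an efficient index. The function names, cell identifiers and embedding values are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def search_similar_cells(
    query: Vector, atlas: Dict[str, Vector], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k atlas cells most similar to the query, best first.

    Exhaustive scan for illustration only; an atlas-scale search would
    replace this loop with an approximate nearest-neighbor index.
    """
    scored = [(cell_id, cosine_similarity(query, emb)) for cell_id, emb in atlas.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings (3 dimensions); the values are made up.
toy_atlas = {
    "lung_macrophage_1": [0.9, 0.1, 0.0],
    "liver_macrophage_1": [0.8, 0.2, 0.1],
    "lung_fibroblast_1": [0.1, 0.9, 0.2],
    "t_cell_1": [0.0, 0.1, 0.9],
}

query_embedding = [1.0, 0.1, 0.0]  # hypothetical macrophage-like query
hits = search_similar_cells(query_embedding, toy_atlas, k=2)
for cell_id, score in hits:
    print(f"{cell_id}\t{score:.3f}")
```

In this toy example the two macrophage-like embeddings rank above the fibroblast and T cell embeddings, which mirrors the idea of retrieving a cell state across tissues from a single query profile.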

Citing Articles

Editorial: The Human Cell Atlas. What Is It and Where Could It Take Us?

Parums D. Med Sci Monit. 2025; 30:e947707.

PMID: 39741433 PMC: 11702441. DOI: 10.12659/MSM.947707.


A cell atlas foundation model for scalable search of similar human cells.

Heimberg G, Kuo T, DePianto D, Salem O, Heigl T, Diamant N. Nature. 2024; 638(8052):1085-1094.

PMID: 39566551 PMC: 11864978. DOI: 10.1038/s41586-024-08411-y.
