CZ CELLxGENE Discover: a Single-cell Data Platform for Scalable Exploration, Analysis and Modeling of Aggregated Data
Overview
Hundreds of millions of single cells have been analyzed using high-throughput transcriptomic methods. The cumulative knowledge within these datasets provides an exciting opportunity for unlocking insights into health and disease at the level of single cells. Meta-analyses that span diverse datasets, building on recent advances in large language models and other machine-learning approaches, open new directions for modeling and extracting insight from single-cell data. Despite the promise of these and emerging tools for analyzing large amounts of data, the sheer number of datasets and data models, as well as their accessibility, remains a challenge. Here, we present CZ CELLxGENE Discover (cellxgene.cziscience.com), a data platform that provides curated and interoperable single-cell data. Available via a free-to-use online data portal, CZ CELLxGENE hosts a growing corpus of community-contributed data comprising over 93 million unique cells. Curated, standardized and associated with consistent cell-level metadata, this collection of single-cell transcriptomic data is the largest of its kind and is growing rapidly via community contributions. A suite of tools and features makes the data accessible and reusable through both computational and visual interfaces, allowing researchers to explore individual datasets, perform cross-corpus analysis, and run meta-analyses of tens of millions of cells across studies and tissues at single-cell resolution.