GEMLI: Gene Expression Memory-Based Lineage Inference from Single-Cell RNA-Sequencing Datasets
Gene expression memory-based lineage inference (GEMLI) is a computational tool that predicts cell lineages solely from single-cell RNA-sequencing (scRNA-seq) datasets; it is publicly available as an R package on GitHub. GEMLI relies on gene expression memory, i.e., the gene-specific maintenance of expression levels through cell divisions. This represents a shift away from experimental lineage tracing techniques based on genetic marks or physical cell lineage separation, and it greatly eases and expands lineage annotation. GEMLI makes it possible to study cell lineages during differentiation in development, homeostasis, and regeneration, as well as during disease onset and progression, across various physiological and pathological contexts. It can be used to dissect cell type-specific gene expression memory, to discriminate between symmetric and asymmetric cell fate decisions, and to reconstruct individual multicellular structures from pooled scRNA-seq datasets. GEMLI is particularly promising for its ability to identify small lineages in human samples, a context in which no other lineage tracing methods are applicable. In this chapter, we provide a detailed protocol for using the GEMLI R package on gene expression matrices derived from standard scRNA-seq on various platforms. We cover the use of the main function to predict cell lineages and how to adjust its parameters for different tasks. We also show how lineage information is extracted, visualized, and fine-tuned. Finally, we describe the use of the package's functions for the detailed analysis of the predicted cell lineages. This includes the analysis of gene expression memory, the analysis of the cell type composition of individual large lineages, and the identification of lineages at the transition point between two cell types.