
Inferring Single-cell Copy Number Profiles Through Cross-cell Segmentation of Read Counts

Overview
Journal: BMC Genomics
Publisher: BioMed Central
Specialty: Genetics
Date: 2024 Jan 3
PMID: 38166601
Abstract

Background: Copy number alterations (CNAs) are among the major genomic variations that frequently occur in cancers, and accurate inference of CNAs is essential for unmasking intra-tumor heterogeneity (ITH) and reconstructing tumor evolutionary history. Single-cell DNA sequencing (scDNA-seq) makes it possible to profile CNAs at single-cell resolution and thus aids in better characterization of ITH. Although several computational methods have been proposed to decipher single-cell CNAs, their performance is limited in either breakpoint detection or copy number estimation because read count data are high-dimensional and noisy.

Results: By treating breakpoint detection as the segmentation of a high-dimensional read count sequence, we develop a novel method called DeepCNA for cross-cell segmentation of the read count sequence and per-cell inference of CNAs. To cope with the difficulty of segmentation, DeepCNA employs an autoencoder (AE) network to project the original data into a low-dimensional space, where breakpoints can be efficiently detected along each latent dimension and then merged to obtain the final breakpoints. Unlike existing methods that manually calculate certain statistics of read counts to find breakpoints, the AE model learns the representations automatically. Based on the inferred breakpoints, we employ a mixture model to predict the copy numbers of segments for each cell, and leverage an expectation-maximization algorithm to efficiently estimate cell ploidy by exploring the most abundant copy number state. Benchmarking results on simulated and real data demonstrate that our method accurately infers breakpoints as well as absolute copy numbers and surpasses existing methods under different test conditions. DeepCNA can be accessed at: https://github.com/zhyu-lab/deepcna .
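
The cross-cell segmentation idea (project the bins into a low-dimensional space, detect breakpoints along each latent dimension, then merge them) can be illustrated with a minimal sketch. This is not DeepCNA's implementation: a truncated SVD stands in for the autoencoder as a simple linear projection, and the sliding-window jump test, the function names, and the parameters `n_latent`, `window`, `z`, and `min_gap` are all assumptions made for illustration.

```python
import numpy as np

def embed_bins(counts, n_latent=2):
    """Project each genomic bin into a low-dimensional space.

    counts: cells x bins read count matrix.
    ASSUMPTION: a truncated SVD is used here as a linear stand-in
    for the autoencoder described in the abstract.
    """
    x = np.asarray(counts, dtype=float).T        # bins x cells
    x = x - x.mean(axis=0, keepdims=True)        # center each cell's profile
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_latent] * s[:n_latent]        # bins x n_latent

def candidate_breakpoints(signal, window=5, z=4.0):
    """Score positions along one latent dimension (hypothetical detector).

    A position i is a candidate when the means of the `window` bins on
    either side of it differ by more than z times a robust noise level.
    Returns (normalized score, position) pairs.
    """
    n = len(signal)
    noise = np.median(np.abs(np.diff(signal))) + 1e-9
    found = []
    for i in range(window, n - window + 1):
        jump = abs(signal[i:i + window].mean() - signal[i - window:i].mean())
        if jump > z * noise:
            found.append((jump / noise, i))
    return found

def merge_breakpoints(candidates, min_gap=5):
    """Merge candidates from all dimensions: keep the strongest first and
    drop any candidate closer than min_gap bins to one already kept."""
    kept = []
    for _, pos in sorted(candidates, reverse=True):
        if all(abs(pos - k) >= min_gap for k in kept):
            kept.append(pos)
    return sorted(kept)

def cross_cell_breakpoints(counts, n_latent=2, window=5, z=4.0):
    """Detect breakpoints per latent dimension, then merge them."""
    latent = embed_bins(counts, n_latent)
    candidates = []
    for d in range(latent.shape[1]):
        candidates += candidate_breakpoints(latent[:, d], window, z)
    return merge_breakpoints(candidates, min_gap=window)

# Toy data (simulated for illustration only): 20 cells x 100 bins,
# one subclone with a gain, another with a loss.
rng = np.random.default_rng(0)
copy_number = np.full((20, 100), 2.0)
copy_number[:10, 30:60] = 3      # gain in cells 0-9
copy_number[10:, 70:] = 1        # loss in cells 10-19
toy_counts = 50 * copy_number + rng.normal(0, 2, copy_number.shape)
print(cross_cell_breakpoints(toy_counts))
```

Each reported position is the index of the first bin of a new segment. Because the breakpoints are found jointly across cells, a change present in only a subset of cells still yields a shared segment boundary, after which copy numbers can be estimated per cell within each segment.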

Conclusions: Profiling single-cell CNAs based on deep learning is becoming a new paradigm of scDNA-seq data analysis, and DeepCNA is an enhancement to the current arsenal of computational methods for investigating cancer genomics.

Citing Articles

Improved allele-specific single-cell copy number estimation in low-coverage DNA-sequencing.

Weiner S, Li B, Nabavi S Bioinformatics. 2024; 40(8).

PMID: 39133157 PMC: 11346770. DOI: 10.1093/bioinformatics/btae506.
