Integrating Copy Number Polymorphisms into Array CGH Analysis Using a Robust HMM

Overview
Journal Bioinformatics
Specialty Biology
Date 2006 Jul 29
PMID 16873504
Citations 67
Abstract

Motivation: Array comparative genomic hybridization (aCGH) is a widely used technique for identifying chromosomal aberrations in human diseases, including cancer. Aberrations are defined as regions of increased or decreased DNA copy number, relative to a normal sample. Accurately identifying the locations of these aberrations has many important medical applications. Unfortunately, the observed copy number changes are often corrupted by various sources of noise, making the boundaries hard to detect. One popular technique uses hidden Markov models (HMMs) to divide the signal into regions of constant copy number called segments; a subsequent classification phase labels each segment as a gain, a loss or neutral. However, standard HMMs are sensitive to outliers, causing over-segmentation, where segments erroneously span very short regions.

Results: We propose a simple modification that makes the HMM robust to such outliers. More importantly, this modification allows us to exploit prior knowledge about the likely location of "outliers", which are often due to copy number polymorphisms (CNPs). By "explaining away" these outliers with prior knowledge about the locations of CNPs, we can focus attention on the more clinically relevant aberrant regions. We show significant improvements over the current state-of-the-art technique (DNAcopy with MergeLevels) on previously published data from mantle cell lymphoma cell lines, and on published benchmark synthetic data augmented with outliers.
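
To make the idea of "explaining away" outliers concrete, the following is a minimal sketch, not the authors' Matlab implementation. It assumes a three-state HMM (loss, neutral, gain) with known Gaussian state means, and it swaps in a simple device: each probe's emission is a mixture of the state's Gaussian and a flat outlier density, weighted by a per-probe outlier prior that can be raised at known CNP locations. The function name `robust_viterbi`, the parameters `outlier_prior` and `outlier_width`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(y, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def robust_viterbi(obs, means, sigma, stay_prob, outlier_prior, outlier_width=4.0):
    """Most likely state path under an outlier-tolerant emission model.

    Assumed (illustrative) emission for probe t in state k:
        (1 - eps_t) * N(y_t; means[k], sigma) + eps_t * Uniform(outlier_width)
    where eps_t = outlier_prior[t] is higher at known CNP probes.
    """
    n_states = len(means)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_states - 1))
    outlier_density = 1.0 / outlier_width  # flat density over the log-ratio range

    def log_emission(t, k):
        eps = outlier_prior[t]
        inlier = (1.0 - eps) * gaussian_pdf(obs[t], means[k], sigma)
        return math.log(inlier + eps * outlier_density)

    def log_trans(j, k):
        return log_stay if j == k else log_switch

    # Initialise with a uniform prior over states.
    score = [-math.log(n_states) + log_emission(0, k) for k in range(n_states)]
    backpointers = []
    for t in range(1, len(obs)):
        new_score, pointers = [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: score[j] + log_trans(j, k))
            new_score.append(score[best_j] + log_trans(best_j, k) + log_emission(t, k))
            pointers.append(best_j)
        score = new_score
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy log-ratios: a neutral region with one isolated high probe at index 3.
    obs = [0.0, 0.05, -0.03, 0.9, 0.02, -0.04, 0.01]
    means = [-0.6, 0.0, 0.6]  # states 0 = loss, 1 = neutral, 2 = gain

    flat_prior = [1e-6] * len(obs)      # outliers considered very unlikely everywhere
    cnp_prior = list(flat_prior)
    cnp_prior[3] = 0.5                  # probe 3 assumed to overlap a known CNP

    print(robust_viterbi(obs, means, 0.1, 0.99, flat_prior))
    print(robust_viterbi(obs, means, 0.1, 0.99, cnp_prior))
```

With these toy numbers, the isolated probe is called as a one-probe gain segment under the flat prior, and is absorbed into the surrounding neutral segment once the prior marks it as a likely CNP. Marginalizing the outlier indicator inside the emission keeps the decoder a plain Viterbi pass; a fuller model would infer the indicators jointly and estimate the state means and variances from data.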

Availability: Source code written in Matlab is available from http://www.cs.ubc.ca/~sshah/acgh.
