PSE-HMM: Genome-wide CNV Detection from NGS Data Using an HMM with Position-Specific Emission Probabilities

Overview
Publisher Biomed Central
Specialty Biology
Date 2016 Nov 5
PMID 27809781
Citations 1
Abstract

Background: Copy number variation (CNV) is considered a major source of large structural variation in the human genome. In recent years, many studies have applied Next Generation Sequencing (NGS) data to CNV detection. However, more accurate computational tools are still needed.

Results: In this study, mate-pair NGS data are used for CNV detection with a Hidden Markov Model (HMM). The proposed HMM has position-specific emission probabilities, given by a Gaussian mixture distribution. Each component of the mixture captures a different type of aberration observed in the mate pairs after they are mapped to the reference genome. These aberrations may include an increase or decrease in insert size, or a change in the orientation of the mapped mate pairs. This HMM with Position-Specific Emission probabilities (PSE-HMM) is used for genome-wide detection of deletions and tandem duplications. The performance of PSE-HMM is evaluated on a simulated dataset and on real data from a Yoruban HapMap individual, NA18507.
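The idea of position-specific Gaussian mixture emissions can be illustrated with a small Viterbi decoder. This is a minimal sketch under stated assumptions, not the published PSE-HMM: it uses only two states (normal and deletion), takes the mean insert size of the mate pairs spanning each position as the observation, and makes the emissions position-specific by shrinking the standard deviation with the local number of spanning pairs. The library mean (3000), the deletion shift (1000), the mixture weights, the transition probabilities, and all function names are hypothetical values chosen for illustration.

```python
import math

STATES = ("normal", "deletion")

def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_log_density(x, weights, means, sds):
    # Log density of a Gaussian mixture; the floor avoids log(0).
    p = sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds))
    return math.log(max(p, 1e-300))

def emission_params(spanning_pairs, lib_mean=3000.0, lib_sd=300.0, shift=1000.0):
    # Assumption: the mean of n insert sizes has sd lib_sd / sqrt(n),
    # so each position gets its own emission distribution.
    sd = lib_sd / math.sqrt(spanning_pairs)
    return {
        # Normal state: a single component at the library mean.
        "normal": ((1.0,), (lib_mean,), (sd,)),
        # Deletion state: mostly enlarged insert sizes, some unaffected.
        "deletion": ((0.1, 0.9), (lib_mean, lib_mean + shift), (sd, sd)),
    }

def viterbi(observations, params_per_position, log_start, log_trans):
    n = len(STATES)
    score = [
        log_start[s]
        + mixture_log_density(observations[0], *params_per_position[0][STATES[s]])
        for s in range(n)
    ]
    back = []
    for t in range(1, len(observations)):
        new_score, pointers = [], []
        for j in range(n):
            # Best predecessor state for state j at position t.
            best_i = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            emit = mixture_log_density(
                observations[t], *params_per_position[t][STATES[j]]
            )
            new_score.append(score[best_i] + log_trans[best_i][j] + emit)
            pointers.append(best_i)
        score = new_score
        back.append(pointers)
    # Trace the most probable state path backwards.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [STATES[s] for s in path]

def demo():
    # Toy data: mean insert size and number of spanning pairs per position.
    insert_sizes = [3000, 3010, 2990, 3900, 4000, 4050, 3000, 2995]
    spanning_pairs = [10, 12, 9, 8, 10, 11, 10, 12]
    params = [emission_params(c) for c in spanning_pairs]
    log_start = [math.log(0.9), math.log(0.1)]
    log_trans = [
        [math.log(0.95), math.log(0.05)],
        [math.log(0.05), math.log(0.95)],
    ]
    return viterbi(insert_sizes, params, log_start, log_trans)

if __name__ == "__main__":
    print(demo())
```

On this toy input the decoder labels the three positions with enlarged insert sizes as deletion and the rest as normal. A fuller model along the lines described in the abstract would add a tandem duplication state and further mixture components for decreased insert size and changed mate-pair orientation.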

Conclusions: PSE-HMM effectively takes observation dependencies into account and achieves high accuracy in detecting genome-wide CNVs. MATLAB programs are available at http://bs.ipm.ir/softwares/PSE-HMM/ .

Citing Articles

CIRCNV: Detection of CNVs Based on a Circular Profile of Read Depth from Sequencing Data.

Zhao H, Li Q, Tian Y, Chen Y, Alvi H, Yuan X. Biology (Basel). 2021; 10(7).

PMID: 34202028 PMC: 8301091. DOI: 10.3390/biology10070584.
