
SM-RCNV: a Statistical Method to Detect Recurrent Copy Number Variations in Sequenced Samples

Overview
Journal Genes Genomics
Specialty Genetics
Date 2019 Feb 20
PMID 30779024
Citations 2
Abstract

Background: Copy number variation (CNV) is an important form of genomic structural variation and is linked to dozens of human diseases. Using next-generation sequencing (NGS) data and developing computational methods to characterize such structural variants is significant for understanding the mechanisms of diseases.

Objective: The objective of this study is to develop a new statistical method for detecting recurrent CNVs across multiple samples from genomic sequences.

Methods: A statistical method, referred to as SM-RCNV, is developed to detect recurrent CNVs. The method associates a statistic with each genomic location by combining the frequency of variation at that location across all samples with the correlation among consecutive locations. The weights of the frequency and correlation terms are trained using real datasets with known CNVs. A p-value is assessed for each location on the genome by permutation testing.
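The abstract does not give the exact form of the statistic, so the following is only a minimal sketch of the general idea, under stated assumptions: the input is a hypothetical samples-by-locations matrix of read-depth-like signals, "variation" means the absolute signal exceeds a fixed threshold, "correlation among consecutive locations" is taken as the mean across-sample Pearson correlation with the two adjacent locations, and the weights `w_freq` and `w_corr` are fixed placeholders rather than values trained on known CNVs. All function and parameter names are illustrative and are not taken from the SM-RCNV source code.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def location_scores(signal, threshold=0.5, w_freq=0.5, w_corr=0.5):
    """Score each location of a samples x locations matrix (needs >= 2 locations).

    Assumed form: w_freq * frequency + w_corr * neighbour correlation.
    """
    n_samples, n_loc = len(signal), len(signal[0])
    columns = [[row[j] for row in signal] for j in range(n_loc)]
    scores = []
    for j in range(n_loc):
        # Frequency: fraction of samples whose signal deviates beyond the threshold.
        freq = sum(abs(v) > threshold for v in columns[j]) / n_samples
        # Correlation: mean across-sample correlation with the adjacent locations.
        neighbours = [k for k in (j - 1, j + 1) if 0 <= k < n_loc]
        corr = mean(pearson(columns[j], columns[k]) for k in neighbours)
        scores.append(w_freq * freq + w_corr * corr)
    return scores


def permutation_pvalues(signal, n_perm=200, seed=0, **kwargs):
    """Empirical p-value per location from a pooled permutation null."""
    rng = random.Random(seed)
    observed = location_scores(signal, **kwargs)
    null = []
    for _ in range(n_perm):
        # Shuffle locations independently within each sample, which breaks
        # any alignment of variants across samples (the assumed null model).
        shuffled = [rng.sample(row, len(row)) for row in signal]
        null.extend(location_scores(shuffled, **kwargs))
    # Add-one correction keeps every p-value strictly positive.
    return [(1 + sum(s >= obs for s in null)) / (1 + len(null)) for obs in observed]
```

Calling `permutation_pvalues(signal)` on such a matrix returns one empirical p-value per location; locations where many samples vary together and agree with their neighbours should receive small values. Pooling the null scores over all locations is a simplification chosen here for brevity and may differ from the permutation scheme used in the paper.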

Results: Compared with six peer methods, SM-RCNV achieves better performance as measured by receiver operating characteristic (ROC) curves. SM-RCNV successfully identifies many consistent recurrent CNVs, most of which are known to be of biological significance and associated with disease genes. The validation rates of SM-RCNV against the Database of Genomic Variants are 258/328 (79%) for the CEU call set and 157/309 (51%) for the YRI call set.

Conclusion: SM-RCNV is a well-grounded statistical framework for detecting recurrent CNVs from multiple genomic sequences, providing valuable information to study genomes in human diseases. The source code is freely available at https://sourceforge.net/projects/sm-rcnv/ .

Citing Articles

A copy number variation detection method based on OCSVM algorithm using multi strategies integration.

Zhou M, Dong J, Jiang H, Zhao Z, Yuan T Sci Rep. 2025; 15(1):3526.

PMID: 39875521 PMC: 11775105. DOI: 10.1038/s41598-025-88143-9.


CNV-MEANN: A Neural Network and Mind Evolutionary Algorithm-Based Detection of Copy Number Variations From Next-Generation Sequencing Data.

Huang T, Li J, Jia B, Sang H Front Genet. 2021; 12:700874.

PMID: 34484298 PMC: 8415314. DOI: 10.3389/fgene.2021.700874.


A Cluster-Based Approach for the Discovery of Copy Number Variations From Next-Generation Sequencing Data.

Liu G, Zhang J Front Genet. 2021; 12:699510.

PMID: 34262604 PMC: 8273656. DOI: 10.3389/fgene.2021.699510.
